Problem: Unstable Performance While Pipelining with Hard Disk I/O

timminn · ‎04-09-2008

A strange phenomenon occurs when I run a pipeline that reads from the hard disk.

There are two filters in the pipeline. The first filter, which is set to be serial, reads a 2 MB piece of data from a file on the hard disk; the second filter, which is set to be non-serial, receives the data chunk and performs some heavy computational operations on it. There are roughly 80 pieces of data to be processed in total. The input file is always the same; it is never modified or moved from one path to another.

I timed the two filters. Generally, each filter spends about 0.4 to 0.6 seconds handling one piece of data. However, in some cases the first (reading) filter takes more than 10 seconds to read a piece of data, and at the same time the second (computing) filter takes more than 1 second, which is abnormal. These unusual times occur unpredictably and persist over several consecutive pieces of data.

Is this related to hard disk swapping behavior? Could anybody give a detailed explanation?

Environment:

Ubuntu with kernel 2.6.20.3

One 2-core CPU, 2.33 GHz, 2048 KB cache

Ext2 file system. 2 GB memory, 4 GB swap.

GCC 3.4.3

robert-reed · ‎04-10-2008

It could be some kind of hard disk behavior that's affecting your numbers. 2 MB is a pretty good chunk of data to buffer.

What happens if you remove the computing filter? It should be a simple matter just to comment it out of the pipeline construction process. Do you still see the anomalous timings if all your app does is the serial reads? Are you generating any output from the pipeline?
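
One way to run the read-only experiment entirely outside the pipeline is a small standalone timer. The following is only a minimal sketch, not the original poster's code: the helper name `time_chunk_reads` is hypothetical, the 2 MB chunk size comes from the description above, and it relies on `std::chrono`, so it assumes a C++11 compiler rather than the GCC 3.4.3 listed in the environment.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper (not part of TBB): reads `path` sequentially in
// chunks of `chunk_bytes` and returns the wall-clock seconds spent in
// each read call. Returns an empty vector if the file cannot be opened.
std::vector<double> time_chunk_reads(const std::string& path,
                                     std::size_t chunk_bytes) {
    std::vector<double> seconds;
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) {
        return seconds;
    }

    std::vector<char> buffer(chunk_bytes);
    for (;;) {
        const auto start = std::chrono::steady_clock::now();
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const auto stop = std::chrono::steady_clock::now();

        // gcount() is 0 once the file is exhausted; the final chunk
        // may be shorter than chunk_bytes but is still recorded.
        if (in.gcount() <= 0) {
            break;
        }
        seconds.push_back(
            std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

Calling `time_chunk_reads(path, 2 * 1024 * 1024)` on the input file and printing the returned values should show whether the multi-second reads still appear with no computation running. Keep in mind that a second run over the same file may be served largely from the operating system's page cache, so the first run after boot can look quite different from later ones.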