Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.
1711 Discussions

Intel IOMeter problem - why is Windows multi-threaded data fetching much faster than IOMeter?

zhaonaiy
Beginner
683 Views
Greetings,
I apologize if I am posting this in the wrong place, but I really don't know where to seek help with IOMeter (I'm not sure whether Intel still supports it, now that it has moved to SourceForge).
I tested Windows multi-threaded data fetching against IOMeter, but found that the IOPS calculated by my program is much higher than IOMeter's. I tried hard to figure out where the problem is but got frustrated.
My SSD is a PLEXTOR PX-128M3S; according to IOMeter, its maximum 512B random read IOPS is around 94k (queue depth 32).
However, my program (32 Windows threads) reaches around 500k 512B IOPS, around 5 times IOMeter's figure! I did data validation but didn't find any errors in the data fetching. Is it because my program fetches data in order?
I have pasted my code below (it mainly fetches 512B from the file and releases it; I also used 4 bytes (an int) to validate the program logic and found no problem). Can anybody help me figure out where I went wrong?
Thanks so much in advance!!
Nai Yan.

#include <stdio.h>

#include <stdlib.h>

#include <windows.h>

/*

** Purpose: Verify file random read IOPS in comparison with IOMeter

** Author: Nai Yan

** Date: Feb. 9th, 2012

**/

//Global variables

long completeIOs = 0;

long completeBytes = 0;

int threadCount = 32;

unsigned long long length = 1073741824; //4G test file

int interval = 1024;

int resultArrayLen = 320000;

int *result = new int[resultArrayLen];

//Method declarations

double GetSecs(void); //Return the current time in seconds

int InitPool(long long,char*,int); //Initialize test data for testing, if successful, return 1; otherwise, return a non 1 value.

int * FileRead(char * path);

unsigned int DataVerification(int*, int sampleItem); //Verify data fetched from pool

int main()

{

int sampleItem = 0x1;

char * fPath = "G:\\workspace\\4G.bin";

unsigned int invalidIO = 0;

if (InitPool(length,fPath,sampleItem)!= 1)

printf("File write err... \n");

//start do random I/Os from initialized file

double start = GetSecs();

int * fetchResult = FileRead(fPath);

double end = GetSecs();

printf("File read IOPS is %.4f per second.. \n",completeIOs/(end - start));

//start data validation, for 4 bytes fetch only

// invalidIO = DataVerification(fetchResult,sampleItem);

// if (invalidIO !=0)

// {

// printf("Total invalid data fetch IOs are %d", invalidIO);

// }

return 0;

}

int InitPool(long long length, char* path, int sample)

{

printf("Start initializing test data ... \n");

FILE * fp = fopen(path,"wb");

if (fp == NULL)

{

printf("file open err... \n");

exit (-1);

}

else //initialize file for testing

{

fseek(fp,0L,SEEK_SET);

for (int i=0; i

{

fwrite(&sample,sizeof(int),1,fp);

}

fclose(fp);

fp = NULL;

printf("Data initialization is complete...\n");

return 1;

}

}

double GetSecs(void)

{

LARGE_INTEGER frequency;

LARGE_INTEGER start;

if(! QueryPerformanceFrequency(&frequency))

printf("QueryPerformanceFrequency Failed\n");

if(! QueryPerformanceCounter(&start))

printf("QueryPerformanceCounter Failed\n");

return ((double)start.QuadPart/(double)frequency.QuadPart);

}

class input

{

public:

char *path;

int starting;

input (int st, char * filePath):starting(st),path(filePath){}

};

//Workers

DWORD WINAPI FileReadThreadEntry(LPVOID lpThreadParameter)

{

input * in = (input*) lpThreadParameter;

char* path = in->path;

FILE * fp = fopen(path,"rb");

int sPos = in->starting;

// int * result = in->r;

if(fp != NULL)

{

fpos_t pos;

for (int i=0; i

{

pos = i * interval;

fsetpos(fp,&pos);

//For 512 bytes fetch each time

unsigned char *c =new unsigned char [512];

if (fread(c,512,1,fp) ==1)

{

InterlockedIncrement(&completeIOs);

delete[] c; //buffer was allocated with new[], so release it with delete[]

}

//For 4 bytes fetch each time

/*if (fread(&result[sPos + i],sizeof(int),1,fp) ==1)

{

InterlockedIncrement(&completeIOs);

}*/

else

{

printf("file read err...\n");

exit(-1);

}

}

fclose(fp);

fp = NULL;

}

else

{

printf("File open err... \n");

exit(-1);

}

return 0; //thread entry is declared to return DWORD

}

int * FileRead(char * p)

{

printf("Starting reading file ... \n");

HANDLE mWorkThread[256]; //max 256 threads

completeIOs = 0;

int slice = int (resultArrayLen/threadCount);

for(int i = 0; i < threadCount; i++)

{

mWorkThread[i] = CreateThread(

NULL,

0,

FileReadThreadEntry,

(LPVOID)(new input(i*slice,p)),

0,

NULL);

}

WaitForMultipleObjects(threadCount, mWorkThread, TRUE, INFINITE);

printf("File read complete... \n");

return result;

}

unsigned int DataVerification(int* result, int sampleItem)

{

unsigned int invalid = 0;

for (int i=0; i< resultArrayLen/interval;i++)

{

if (result[i]!=sampleItem)

{

invalid ++;

continue;

}

}

return invalid;

}

2 Replies
Patrick_F_Intel1
Employee

Hello zhaonaiy,
If I understand correctly, you are comparing IOmeter's random read performance to your program's sequential read performance.
Random 'anything' (such as disk reads/writes, memory reads/writes) is generally much slower than sequential reads and writes.
The IOmeter product has a forum and you've posted this message there.
That is probably the appropriate place for the message.
Hope this helps,
Pat

SergeyKostrov
Valued Contributor II
I think the comparison is not valid because it is hard to reproduce IOmeter's algorithm for measuring I/O performance.

Is IOmeter an open-source project? If yes, you can look at what it does and how it calculates its values.

Best regards,
Sergey