<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Intel IOMeter problem - why Windows multi-threading data fetchi in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Intel-IOMeter-problem-why-Windows-multi-threading-data-fetching/m-p/811971#M872</link>
    <description>&lt;P&gt;Hello zhaonaiy,&lt;BR /&gt;If I understand correctly, you are comparing IOmeter's random read performance to your program's sequential read performance.&lt;BR /&gt;Random accesses of any kind (disk reads/writes, memory reads/writes) are generally much slower than sequential reads and writes.&lt;BR /&gt;The IOmeter product has a forum, and you've posted this message there. &lt;BR /&gt;That is probably the appropriate place for the message.&lt;BR /&gt;Hope this helps,&lt;BR /&gt;Pat&lt;/P&gt;</description>
    <pubDate>Mon, 13 Feb 2012 13:09:36 GMT</pubDate>
    <dc:creator>Patrick_F_Intel1</dc:creator>
    <dc:date>2012-02-13T13:09:36Z</dc:date>
    <item>
      <title>Intel IOMeter problem - why Windows multi-threading data fetching is much much faster than IOMeter?</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Intel-IOMeter-problem-why-Windows-multi-threading-data-fetching/m-p/811970#M871</link>
      <description>Greetings,&lt;DIV&gt;  I apologize if I am posting this in the wrong place, but I really don't know where to seek help with IOMeter (I am not sure whether Intel still supports it, since it has already moved to SourceForge).&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;  I ran some tests comparing multi-threaded data fetching on Windows against IOMeter, but found that the &lt;/SPAN&gt;IOPS calculated by my program is much, much higher than IOMeter's&lt;SPAN&gt;. I tried hard to figure out where the problem is but got frustrated.&lt;/SPAN&gt; &lt;/DIV&gt;&lt;DIV&gt;&lt;DIV id="_mcePaste"&gt;  &lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;   My SSD is a PLEXTOR PX-128M3S. According to IOMeter, its maximum 512B random read IOPS is around 94k (queue depth 32).&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;   However, my program (32 Windows threads) reaches around 500k 512B IOPS, about 5 times IOMeter's figure! I did data validation but didn't find any errors in the data fetching. Is it because my program fetches data in order?&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;  I have pasted my code below (it mainly fetches 512B from a file and releases it; I used 4 bytes (an int) to validate the program logic and didn't find a problem). Can anybody help me figure out where I went wrong?&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;  Thanks so much in advance!&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;Nai Yan.&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;P&gt;#include &amp;lt;stdio.h&amp;gt;&lt;/P&gt;&lt;P&gt;	#include &amp;lt;stdlib.h&amp;gt;&lt;/P&gt;&lt;P&gt;	#include &amp;lt;windows.h&amp;gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	/* &lt;/P&gt;&lt;P&gt;	**  Purpose: Verify file random read IOPS in comparison with IOMeter    &lt;/P&gt;&lt;P&gt;	**  Author:  Nai Yan&lt;/P&gt;&lt;P&gt;	**  Date:    Feb. 
9th, 2012&lt;/P&gt;&lt;P&gt;	**/&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	//Global variables&lt;/P&gt;&lt;P&gt;	volatile long completeIOs = 0;                           //updated with InterlockedIncrement&lt;/P&gt;&lt;P&gt;	long completeBytes = 0;&lt;/P&gt;&lt;P&gt;	int  threadCount = 32;&lt;/P&gt;&lt;P&gt;	unsigned long long length = 1073741824;                  //number of 4-byte ints written, i.e. a 4G test file&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	int interval = 1024;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	int resultArrayLen = 320000;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	int *result = new int[resultArrayLen];&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	//Function declarations&lt;/P&gt;&lt;P&gt;	double GetSecs(void);					           //Return the current time in seconds&lt;/P&gt;&lt;P&gt;	int InitPool(long long,char*,int);		     		  //Initialize the test file; returns 1 on success, another value otherwise&lt;/P&gt;&lt;P&gt;	int * FileRead(char * path);&lt;/P&gt;&lt;P&gt;	unsigned int DataVerification(int*, int sampleItem);		                 //Verify data fetched from pool&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	int main()&lt;/P&gt;&lt;P&gt;	{&lt;/P&gt;&lt;P&gt;		int sampleItem = 0x1;&lt;/P&gt;&lt;P&gt;		char fPath[] = "G:\\workspace\\4G.bin";&lt;/P&gt;&lt;P&gt;		unsigned int invalidIO = 0;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		if (InitPool(length,fPath,sampleItem)!= 1)&lt;/P&gt;&lt;P&gt;		   printf("File write err... \n");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		//start doing random I/Os on the initialized file&lt;/P&gt;&lt;P&gt;		double start = GetSecs();&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		int * fetchResult = FileRead(fPath);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		double end = GetSecs();&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		printf("File read IOPS is %.4f per second.. 
\n",completeIOs/(end - start));&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		//start data validation, for 4 bytes fetch only&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	//	invalidIO = DataVerification(fetchResult,sampleItem);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	//	if (invalidIO !=0)&lt;/P&gt;&lt;P&gt;	//	{&lt;/P&gt;&lt;P&gt;	//		printf("Total invalid data fetch IOs are %d", invalidIO);&lt;/P&gt;&lt;P&gt;	//	}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		return 0;&lt;/P&gt;&lt;P&gt;	}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	int InitPool(long long length, char* path, int sample)&lt;/P&gt;&lt;P&gt;	{&lt;/P&gt;&lt;P&gt;		printf("Start initializing test data ... \n");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		FILE * fp = fopen(path,"wb");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		if (fp == NULL)&lt;/P&gt;&lt;P&gt;		{&lt;/P&gt;&lt;P&gt;			printf("file open err... \n");&lt;/P&gt;&lt;P&gt;			exit (-1);&lt;/P&gt;&lt;P&gt;		}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		else									//initialize file for testing&lt;/P&gt;&lt;P&gt;		{&lt;/P&gt;&lt;P&gt;			fseek(fp,0L,SEEK_SET);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;			for (long long i=0; i&amp;lt;length; i++)    //write one int per iteration&lt;/P&gt;&lt;P&gt;			{&lt;/P&gt;&lt;P&gt;				fwrite(&amp;amp;sample,sizeof(int),1,fp);&lt;/P&gt;&lt;P&gt;			}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;			fclose(fp);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;			fp = NULL;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;			printf("Data initialization is complete...\n");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;			return 1;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		}&lt;/P&gt;&lt;P&gt;	}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	double GetSecs(void)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	{&lt;/P&gt;&lt;P&gt;	    LARGE_INTEGER frequency;&lt;/P&gt;&lt;P&gt;	    LARGE_INTEGER start;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	    if(! 
QueryPerformanceFrequency(&amp;amp;frequency)) &lt;/P&gt;&lt;P&gt;	        printf("QueryPerformanceFrequency Failed\n");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	    if(! QueryPerformanceCounter(&amp;amp;start))&lt;/P&gt;&lt;P&gt;	        printf("QueryPerformanceCounter Failed\n");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		return ((double)start.QuadPart/(double)frequency.QuadPart);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	class input&lt;/P&gt;&lt;P&gt;	{&lt;/P&gt;&lt;P&gt;	public:&lt;/P&gt;&lt;P&gt;		char *path;&lt;/P&gt;&lt;P&gt;		int starting;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		input (int st, char * filePath):path(filePath),starting(st){}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	};&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	//Workers&lt;/P&gt;&lt;P&gt;	DWORD WINAPI FileReadThreadEntry(LPVOID lpThreadParameter)&lt;/P&gt;&lt;P&gt;	{&lt;/P&gt;&lt;P&gt;		input * in = (input*) lpThreadParameter; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		char* path = in-&amp;gt;path;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		FILE * fp = fopen(path,"rb");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		int sPos = in-&amp;gt;starting;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	//	int * result = in-&amp;gt;r;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		if(fp != NULL)&lt;/P&gt;&lt;P&gt;		{&lt;/P&gt;&lt;P&gt;			fpos_t pos;&lt;/P&gt;&lt;P&gt;			for (int i=0; i&amp;lt;resultArrayLen/threadCount; i++)&lt;/P&gt;&lt;P&gt;			{&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;				pos = i * interval;&lt;/P&gt;&lt;P&gt;				fsetpos(fp,&amp;amp;pos);&lt;/P&gt;&lt;P&gt;				//For 512 bytes fetch each time&lt;/P&gt;&lt;P&gt;				unsigned char *c =new unsigned char [512];&lt;/P&gt;&lt;P&gt;				if (fread(c,512,1,fp) ==1)&lt;/P&gt;&lt;P&gt;				{&lt;/P&gt;&lt;P&gt;					InterlockedIncrement(&amp;amp;completeIOs);&lt;/P&gt;&lt;P&gt;					delete[] c;                    //array form matches new[]&lt;/P&gt;&lt;P&gt;				}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;				//For 4 bytes fetch each time&lt;/P&gt;&lt;P&gt;				/*if (fread(&amp;amp;result[sPos + 
i],sizeof(int),1,fp) ==1)&lt;/P&gt;&lt;P&gt;				{&lt;/P&gt;&lt;P&gt;					InterlockedIncrement(&amp;amp;completeIOs);&lt;/P&gt;&lt;P&gt;				}*/&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;				else&lt;/P&gt;&lt;P&gt;				{&lt;/P&gt;&lt;P&gt;					printf("file read err...\n");&lt;/P&gt;&lt;P&gt;					exit(-1);&lt;/P&gt;&lt;P&gt;				}&lt;/P&gt;&lt;P&gt;			}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;			fclose(fp);&lt;/P&gt;&lt;P&gt;			fp = NULL;&lt;/P&gt;&lt;P&gt;		}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		else&lt;/P&gt;&lt;P&gt;		{&lt;/P&gt;&lt;P&gt;			printf("File open err... \n");&lt;/P&gt;&lt;P&gt;			exit(-1);&lt;/P&gt;&lt;P&gt;		}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		return 0;                                     //thread exit code&lt;/P&gt;&lt;P&gt;	}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	int * FileRead(char * p)&lt;/P&gt;&lt;P&gt;	{&lt;/P&gt;&lt;P&gt;		printf("Starting reading file ... \n");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		HANDLE mWorkThread[256];                      //room for 256 handles; WaitForMultipleObjects accepts at most MAXIMUM_WAIT_OBJECTS (64)&lt;/P&gt;&lt;P&gt;		completeIOs = 0;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		int slice = int (resultArrayLen/threadCount);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		for(int i = 0; i &amp;lt; threadCount; i++)&lt;/P&gt;&lt;P&gt;		{&lt;/P&gt;&lt;P&gt;			mWorkThread[i] = CreateThread(&lt;/P&gt;&lt;P&gt;						NULL,&lt;/P&gt;&lt;P&gt;						0,&lt;/P&gt;&lt;P&gt;						FileReadThreadEntry,&lt;/P&gt;&lt;P&gt;						(LPVOID)(new input(i*slice,p)),&lt;/P&gt;&lt;P&gt;						0, &lt;/P&gt;&lt;P&gt;						NULL);&lt;/P&gt;&lt;P&gt;		}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	   WaitForMultipleObjects(threadCount, mWorkThread, TRUE, INFINITE);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	   printf("File read complete... 
\n");&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	   return result;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;	unsigned int DataVerification(int* result, int sampleItem)&lt;/P&gt;&lt;P&gt;	{&lt;/P&gt;&lt;P&gt;		unsigned int invalid = 0;&lt;/P&gt;&lt;P&gt;		for (int i=0; i&amp;lt; resultArrayLen/interval;i++)&lt;/P&gt;&lt;P&gt;		{&lt;/P&gt;&lt;P&gt;			if (result[i]!=sampleItem)&lt;/P&gt;&lt;P&gt;			{&lt;/P&gt;&lt;P&gt;				invalid ++;&lt;/P&gt;&lt;P&gt;			}&lt;/P&gt;&lt;P&gt;		}&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;		return invalid;&lt;/P&gt;&lt;P&gt;	}&lt;/P&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 10 Feb 2012 06:59:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Intel-IOMeter-problem-why-Windows-multi-threading-data-fetching/m-p/811970#M871</guid>
      <dc:creator>zhaonaiy</dc:creator>
      <dc:date>2012-02-10T06:59:44Z</dc:date>
    </item>
    <item>
      <title>Intel IOMeter problem - why Windows multi-threading data fetchi</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Intel-IOMeter-problem-why-Windows-multi-threading-data-fetching/m-p/811971#M872</link>
      <description>&lt;P&gt;Hello zhaonaiy,&lt;BR /&gt;If I understand correctly, you are comparing IOmeter's random read performance to your program's sequential read performance.&lt;BR /&gt;Random accesses of any kind (disk reads/writes, memory reads/writes) are generally much slower than sequential reads and writes.&lt;BR /&gt;The IOmeter product has a forum, and you've posted this message there. &lt;BR /&gt;That is probably the appropriate place for the message.&lt;BR /&gt;Hope this helps,&lt;BR /&gt;Pat&lt;/P&gt;</description>
      <pubDate>Mon, 13 Feb 2012 13:09:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Intel-IOMeter-problem-why-Windows-multi-threading-data-fetching/m-p/811971#M872</guid>
      <dc:creator>Patrick_F_Intel1</dc:creator>
      <dc:date>2012-02-13T13:09:36Z</dc:date>
    </item>
    <item>
      <title>Intel IOMeter problem - why Windows multi-threading data fetchi</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Intel-IOMeter-problem-why-Windows-multi-threading-data-fetching/m-p/811972#M873</link>
      <description>I don't think that comparison is &lt;SPAN style="text-decoration: underline;"&gt;valid&lt;/SPAN&gt;, because it is hard to replicate &lt;STRONG&gt;IOmeter&lt;/STRONG&gt;'s algorithm for measuring I/O performance.&lt;BR /&gt;&lt;BR /&gt;Is &lt;STRONG&gt;IOmeter&lt;/STRONG&gt; an open-source project? If &lt;STRONG&gt;yes&lt;/STRONG&gt;, you can look at what it does and how it calculates its values.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
      <pubDate>Wed, 15 Feb 2012 15:49:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Intel-IOMeter-problem-why-Windows-multi-threading-data-fetching/m-p/811972#M873</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-15T15:49:18Z</dc:date>
    </item>
  </channel>
</rss>

