Analyzers
Support for Analyzers (Intel VTune™ Profiler, Intel Advisor, Intel Inspector)

Intel VTune -- weird results

mrkf100
Beginner
305 Views

Hi everyone! 

I wrote a C++ program that computes the elements of the logistic map and profiled it with Intel VTune Profiler. One thing in the results surprised me: the row 'Cycles of 3+ Port Utilized' shows 55.6 %, but every row below it shows 0.0 %. What does this mean? Where did that 55.6 % go? (I'm a complete beginner at profiling code.)

Attached screenshot: Logisticmap_vect_question.PNG

The exact program I profiled is here: 

#include <iostream>
#include <time.h> //clock!!
#include <fstream>
#include <vector>

#include <stdio.h>


#define MAX_VECTOR_SIZE 256
#include <vectorclass.h>
#include <cstdlib> 
#include <malloc.h>


using namespace std;

int main()
{

	// Allocated on STACK
	const int Resolution = 4 * 8 * 3125; // Must be divisible by LoopUnrollFactor (4 * 8)
	const int Transients = 512;
	const int SavedPoints = 128;

	const int LoopUnrollFactor = 4 * 8;
	//double r[Resolution];
	//double x[Resolution];

	// Allocated on HEAP
	double* r = (double*)_aligned_malloc(Resolution * sizeof(double),64);
	double* x = (double*)_aligned_malloc(Resolution * sizeof(double),64);

	//double* r = new double[Resolution];
	//double* x = new double[Resolution];

	// Initialisation
	double dr = 4.0 / (Resolution - 1.0); // Only 1 type conversion!
	r[0] = 0.0;
	x[0] = 0.5;
	for (int i = 1; i < Resolution; i++)
	{
		r[i] = r[i - 1] + dr;
		x[i] = 0.5;
	}


	// Transient iterations
	clock_t SimulationStart = clock();
	Vec4d xact[8];
	Vec4d ract[8];
	for (int i = 0; i < Resolution; i += LoopUnrollFactor)
	{
		#pragma unroll
		for (int k = 0; k < 8; k++)
		{
			xact[k].load_a(x + i + k * 4); // Loading states from aligned memory
			ract[k].load_a(r + i + k * 4); // Loading parameters from aligned memory
		}

		for (int j = 0; j < Transients; j++)
		{
			#pragma unroll
			for (int k = 0; k < 8; k++)
				xact[k] = ract[k] * xact[k] * (1.0 - xact[k]);
		}

		#pragma unroll
		for (int k = 0; k < 8; k++)
		{
			xact[k].store_a(x + i + k * 4); // Storing states to aligned memory
			ract[k].store_a(r + i + k * 4); // Storing parameters to aligned memory
		}
	}
	clock_t SimulationEnd = clock();
	cout << "Simulation time: " << 1000.0*(SimulationEnd - SimulationStart) / CLOCKS_PER_SEC << "ms" << endl << endl;


	// Converged iterations and save to file
	/*ofstream DataFile;
	DataFile.open("LogisticMap.txt");
	int Width = 18;
	DataFile.precision(10);
	DataFile.flags(ios::scientific);

	for (int i = 0; i < Resolution; i++)
	{
		for (int j = 0; j < SavedPoints; j++)
		{
			x[i] = r[i] * x[i] * (1.0 - x[i]);

			DataFile.width(Width); DataFile << r[i]; //Don't place here any commas, because dlmread won't work in MATLAB!!!
			DataFile.width(Width); DataFile << x[i];
			DataFile << '\n';
		}
	}

	DataFile.close();*/

	// Release the buffers obtained from _aligned_malloc
	_aligned_free(r);
	_aligned_free(x);

	return 0;
}

Thanks for your answer in advance. 

1 Solution
Kirill_U_Intel
Employee
226 Views

Hi.

Is it possible to increase the workload's run time?

0.038 sec is too short to collect reliable microarchitecture metrics.

The metrics can come out incorrect, as in your case.

Kirill
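
For illustration, here is a minimal sketch of one way to lengthen the measured region: repeat the kernel until a minimum wall-clock time has passed. It is not code from this thread. It uses a plain scalar loop instead of the vectorclass version, and `repeat_for_at_least`, `run_transients`, and `logistic_step` are hypothetical names chosen for this example. How long is "long enough" for VTune is not stated in the thread, so `min_seconds` is left as a parameter.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// One iteration of the logistic map: x -> r * x * (1 - x).
inline double logistic_step(double r, double x)
{
    return r * x * (1.0 - x);
}

// Scalar stand-in for the transient loop in the original post.
// Returns a checksum so the compiler cannot discard the work.
double run_transients(const std::vector<double>& r, std::vector<double>& x, int transients)
{
    double checksum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        double xi = x[i];
        for (int j = 0; j < transients; ++j)
            xi = logistic_step(r[i], xi);
        x[i] = xi;
        checksum += xi;
    }
    return checksum;
}

// Hypothetical helper: repeat the kernel until at least min_seconds of
// wall-clock time has elapsed. Returns the number of repetitions performed.
int repeat_for_at_least(double min_seconds, int resolution, int transients, double& checksum_out)
{
    assert(resolution >= 2); // dr below divides by (resolution - 1)

    std::vector<double> r(resolution), x(resolution);
    const double dr = 4.0 / (resolution - 1.0);
    for (int i = 0; i < resolution; ++i)
        r[i] = i * dr; // parameters spread evenly over [0, 4]

    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    int repetitions = 0;
    checksum_out = 0.0;
    do
    {
        x.assign(x.size(), 0.5); // reset the initial state before each pass
        checksum_out += run_transients(r, x, transients);
        ++repetitions;
    } while (std::chrono::duration<double>(clock::now() - start).count() < min_seconds);
    return repetitions;
}
```

Calling, for example, `repeat_for_at_least(5.0, 100000, 512, checksum)` from `main` and printing `checksum` afterwards keeps the loop from being optimized away and gives the profiler a run of at least five seconds to sample.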

5 Replies
JananiC_Intel
Moderator
230 Views

Hi,


Thanks for posting in Intel forums.


We will try from our end and let you know.


Regards,

Janani Chandran


JananiC_Intel
Moderator
200 Views

Hi,


Is your issue resolved? Please let us know if the issue still persists.


Regards,

Janani Chandran


mrkf100
Beginner
194 Views

Hi! 

Thank you, it helped a lot! After increasing the value of Resolution, the error didn't appear.

 

mrkf100
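
One detail worth keeping in mind when raising Resolution: the main loop in the posted program advances by LoopUnrollFactor (4 * 8 = 32) and loads 32 doubles per step, so the new value should stay a multiple of 32. A minimal sketch of how that could be enforced is below. The helper names `round_up_to_multiple` and `scaled_resolution` are made up for this example and are not part of the original program.

```cpp
#include <cassert>

// Round n up to the next multiple of `multiple` (assumes n >= 0, multiple > 0).
constexpr long long round_up_to_multiple(long long n, long long multiple)
{
    return ((n + multiple - 1) / multiple) * multiple;
}

// Hypothetical helper: scale a base resolution and keep the result a multiple
// of the unroll factor, so the blocked loop never reads past the buffer end.
constexpr long long scaled_resolution(long long base, long long scale, long long unroll_factor)
{
    return round_up_to_multiple(base * scale, unroll_factor);
}

// Ten times the original Resolution (4 * 8 * 3125 = 100000) is still a multiple of 32.
static_assert(scaled_resolution(100000, 10, 32) == 1000000, "unexpected rounding");
```

With this, `const int Resolution = static_cast<int>(scaled_resolution(4 * 8 * 3125, 10, 32));` would give ten times the original amount of work while preserving the divisibility the loop relies on.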

JananiC_Intel
Moderator
174 Views

Hi,

 

Thanks for the confirmation. If you need any additional information, please submit a new question as this thread will no longer be monitored.

 

Regards,

Janani Chandran

 
