Intel® oneAPI Threading Building Blocks

VTune Amplifier can't analyze the hotspots of a program parallelized with TBB

Frank_F
Beginner
1,561 views

Hello everyone!

I have a serial program that I parallelized with Intel TBB. The serial version takes one hour, and the parallel version takes 10 minutes.

However, I can't analyze the hotspots of the parallel program with VTune Amplifier: it keeps loading the result and never finishes. In the 'data.0' folder I see a large number of files, which does not happen with the serial version. What should I do?

Thank you!

0 Kudos
17 Replies
Vladimir_P_1234567890

How long is "all the time"? Is it 10 minutes, 10 hours, or 10 days? Depending on the number of threads and the number of frames, loading the results can take many times longer.

What size is the "data.0" folder?

--Vladimir

Frank_F
Beginner

I waited 10 hours and it still had not finished, so I interrupted it. It looks like the attachments. The 'data.0' folder contains 83990 '.th' and '.cs' files; the folder's size is 140 MB, and it takes up 391 MB on disk.

 

Vladimir_P_1234567890

10 hours is a pretty long time :)

I'll forward the link to this thread to the VTune team. They will take the next steps to reproduce the issue and advise what to do.

Thanks for the report.
--Vladimir

Frank_F
Beginner

Thank you, dear friend!

Vitaly_S_Intel
Employee

What do you mean by "VTune Amplifier load the file all the time"? Which VTune version do you use?

Frank_F
Beginner

I mean that it ran for 10 hours without finishing, so I interrupted it. I use VTune Amplifier XE 2013 Update 14 for Windows.

Vitaly_S_Intel
Employee

So your application doesn't start even though you see that collection has started, right?

Can you see the application process in the list of processes?

Do you use the GUI or the command line?

Which analysis type do you use?

Frank_F
Beginner

I can see the application process in the list of processes. VTune Amplifier is running, but it never gets past the "Finalizing results" stage. The same thing happens with both Basic Hotspots and Advanced Hotspots. I don't use the GUI or the command line. Thank you!

Vitaly_S_Intel
Employee

Thanks Frank.F.

Can you attach the result directory?

If it takes up too much space, please try removing the <result_dir>/sqlite-db subfolder and archiving <result_dir>.

If it is still too big, I suggest submitting the issue via Premier.

Frank_F
Beginner

I found the cause. When I add "task_scheduler_init init;" before the parallel_for call, VTune Amplifier cannot finish. When I remove "task_scheduler_init init;", VTune Amplifier finishes. Am I using "task_scheduler_init init;" incorrectly?

Vitaly_S_Intel
Employee

Based on the data you provided above, your application with task_scheduler_init creates ~42000 threads, and VTune has trouble loading all of that. So, if you didn't expect that, this is a problem in your code. How many threads does VTune show for the case without task_scheduler_init?
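
For reference, the ~42000 figure is consistent with the file count reported earlier in this thread, assuming (this is an assumption, not something stated in the thread) that VTune writes one '.th' and one '.cs' file per thread:

```latex
\[
\frac{83990\ \text{files}}{2\ \text{files per thread}} = 41995 \approx 42000\ \text{threads}
\]
```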

 

Frank_F
Beginner

It shows a Total Thread Count of 8. My computer's CPU is a Xeon E5430; it has 4 cores and 8 threads.

Vladimir_P_1234567890

The behavior should be similar: 8 threads should be created in both cases (with or without "task_scheduler_init init;"), with or without Amplifier.
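
If the thread count does grow far beyond that expectation, one possible cause is that the task_scheduler_init object is declared inside a routine that is called many times, so that worker threads are started and torn down on every call. Whether this is what happens in Frank.F's program is not confirmed in this thread. The sketch below is not TBB code: ScopedPool, threads_created_per_call, and threads_created_hoisted are hypothetical names, and it uses only std::thread to model a handle that starts its workers on construction and joins them on destruction.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for a scheduler handle (NOT the TBB class):
// starts `workers` threads when constructed, joins them when destroyed.
class ScopedPool {
public:
    ScopedPool(int workers, std::atomic<long>& created) {
        for (int i = 0; i < workers; ++i) {
            created.fetch_add(1);  // count every thread ever started
            threads_.emplace_back([] { /* idle worker */ });
        }
    }
    ~ScopedPool() {
        for (auto& t : threads_) t.join();
    }
    ScopedPool(const ScopedPool&) = delete;
    ScopedPool& operator=(const ScopedPool&) = delete;

private:
    std::vector<std::thread> threads_;
};

// Pattern A: the handle is a local in a routine that is called many times,
// so a fresh set of workers is started on every call.
long threads_created_per_call(int calls, int workers) {
    std::atomic<long> created{0};
    for (int c = 0; c < calls; ++c) {
        ScopedPool pool(workers, created);
        // parallel work for this call would run here
    }
    return created.load();
}

// Pattern B: the handle is created once and outlives all the calls,
// so the same workers are reused.
long threads_created_hoisted(int calls, int workers) {
    std::atomic<long> created{0};
    ScopedPool pool(workers, created);
    for (int c = 0; c < calls; ++c) {
        // parallel work for this call would run here
    }
    return created.load();
}
```

With calls = 100 and workers = 4, the per-call pattern starts 400 threads in total, while the hoisted pattern starts 4. A profiler that records data per thread has correspondingly more to load in the first case.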

Vitaly_S_Intel
Employee

I think I was able to reproduce the slowdown at the loading stage by running the analysis on a synthetic application that creates thousands of threads. Thanks, Frank.F, for reporting the issue!
