We have recently upgraded from 2019.3 to 2021.6.0, and I'm investigating something that got 3x slower after the upgrade. To work with a 3rd party library, we need to initialize thread-specific data for each new TBB thread. We are using task_scheduler_observer for that. Prior to the upgrade, on_scheduler_entry was called once per core, or 10 times in my case. After the upgrade, it is called over 3000 times for the case I'm investigating. This causes the 3rd party library to have to rebuild the cache many times per thread, leading to the slowdown.
I'm not sure what the right question is, but one of these might be it:
- Is this change in behavior expected?
- Is this the right way to hook into thread creation for thread-specific setup?
- Should I be looking into task_arena to achieve this?
Thank you,
Jeff
Hi,
Thanks for reaching out to us.
Could you please provide us with a sample reproducer and the steps you have followed to reproduce the issue so that we can try it from our end?
Please let us know how you are measuring the performance of your code.
Also please provide the OS details.
Thanks & Regards,
Noorjahan
Hi Noorjahan,
In lieu of a reproducible example, I think I can provide enough details. This is happening in parallel_pipeline, and it happens throughout the execution of the pipeline. Previously, we would get one on_scheduler_entry per thread. It's just a guess, but I think now we are getting one per task. Based on the name "task_scheduler_observer", this is arguably how it is supposed to work. It just happens to be a breaking change for us.
We measure the performance with a regression test suite on a dedicated computer using wall time, and track performance trends over time. Then we use VTune to diagnose problems when a test shows an issue.
Thank you,
-Jeff
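One way to make per-thread setup tolerate repeated callbacks is to guard it with a `thread_local` flag, so the expensive work runs at most once per OS thread no matter how often the hook fires. The sketch below is a minimal illustration using only the standard library: `expensive_thread_setup`, `ensure_thread_initialized`, and `run_simulation` are hypothetical names, and the loop merely simulates a scheduler that invokes the entry callback many times per thread. It assumes the third-party setup is tied to the OS thread and does not need to be redone when a thread re-enters the scheduler.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for the third-party library's per-thread setup.
std::atomic<int> g_setup_calls{0};
void expensive_thread_setup() { g_setup_calls.fetch_add(1); }

// Safe to call any number of times; the setup runs once per OS thread.
void ensure_thread_initialized() {
    thread_local bool initialized = false;
    if (!initialized) {
        expensive_thread_setup();
        initialized = true;
    }
}

// Simulates a scheduler that fires the entry callback repeatedly on each
// thread, then returns how many times the expensive setup actually ran.
int run_simulation(int num_threads, int entries_per_thread) {
    g_setup_calls.store(0);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([entries_per_thread] {
            for (int i = 0; i < entries_per_thread; ++i) {
                ensure_thread_initialized();
            }
        });
    }
    for (auto& th : threads) th.join();
    return g_setup_calls.load();
}
```

In a TBB program, `on_scheduler_entry` would call `ensure_thread_initialized()` instead of doing the setup directly. The flag is never reset, so any per-thread teardown the library requires would need separate handling.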
Hi,
You can use task_arena to achieve this, because an observer bound to an arena only receives callbacks for threads that enter and exit that specific arena.
Please refer to page 359 of the Pro TBB textbook.
Thanks & Regards,
Noorjahan.
Hi,
We haven't heard back from you. Could you please provide an update on your issue?
Thanks & Regards,
Noorjahan.
Hi,
I have not heard back from you, so I will close this inquiry now. If you need further assistance, please post a new question.
Thanks & Regards,
Noorjahan.
Hi Noorjahan,
Sorry, I was on vacation and then caught up with other priorities. This does not solve my problem. I will follow up with a new question with example code.
-Jeff