I have a Fortran numerical simulation that I have converted to run with OpenMP. My DO loop has two iterations, which fits my 2-core CPU. A test run shows both cores working, and my run time is cut in half. However, about half of my output values are wrong. I suspect I have not set up one or more arrays properly with respect to what is shared and what is private.
At this point I am not sure how to debug parallel code. I am looking for helpful hints, or maybe someone can point me to an article or book that might provide some guidance.
2 Replies
The most common bug is that the 2nd loop iteration uses the results of the 1st iteration. Under OpenMP, you have told the program to perform the 2nd iteration without waiting for the results of the 1st. In that case, running with set OMP_NUM_THREADS=1 would fix the results.
Or you have local arrays that you expect to behave as if they had the SAVE attribute, but you have not declared them that way. Setting /Qopenmp gives local variables /Qauto (automatic, stack-allocated) status, which is required for them to be modified in the parallel region.
Most scalars that you assign inside the parallel region need to be listed in some form of private clause. If they must inherit their values from before the loop, use firstprivate; if their values must be accessible after the parallel region, use lastprivate (either of these detracts from parallel speedup).
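As a rough illustration of those three clauses, here is a minimal sketch. The program and every name in it (scale, tmp, last_val, out) are made up for the example and are not taken from the original simulation. It also adds default(none), which is not mentioned above but makes the compiler reject any variable whose sharing has not been stated, which can help when auditing a loop.

```fortran
program scalar_clauses
  implicit none
  integer, parameter :: n = 8
  integer :: i
  real :: scale, tmp, last_val
  real :: out(n)

  scale = 2.0        ! set before the loop, read inside it
  last_val = 0.0

  ! default(none): every variable used in the loop must be given an
  ! explicit sharing attribute (the loop index i is private automatically).
  !$omp parallel do default(none) shared(out) private(tmp) &
  !$omp   firstprivate(scale) lastprivate(last_val)
  do i = 1, n
     tmp = scale * real(i)   ! scratch scalar: each thread has its own copy
     out(i) = tmp + 1.0      ! each iteration writes a different element
     last_val = out(i)       ! the copy from iteration i = n is kept afterwards
  end do
  !$omp end parallel do

  print *, out
  print *, 'last_val =', last_val
end program scalar_clauses
```

Because scale is only read here, it could equally be left shared; firstprivate matters when each thread also modifies its own copy. Running the same executable with OMP_NUM_THREADS=1 and with the default thread count and comparing the printed values is a simple consistency check.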
If whole arrays are used inside the loop with independent values for each thread, those must also be designated private. There may be ways to restructure such code so that it parallelizes more efficiently.
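A minimal sketch of the array case, assuming a scratch array that every iteration fills and then reduces. The names (work, freq, total) and the sin-based computation are placeholders invented for the example; the original code is not shown in this thread.

```fortran
program private_array
  implicit none
  integer, parameter :: nfreq = 2, npts = 100
  integer :: ifreq, k
  real :: freq(nfreq), total(nfreq)
  real :: work(npts)        ! scratch array reused by every iteration

  freq = [1.0, 2.0]         ! hypothetical input values

  ! work is overwritten by each iteration, so each thread needs its own
  ! copy; if it were shared, the two threads would overwrite each other.
  !$omp parallel do default(none) shared(freq, total) private(work, k)
  do ifreq = 1, nfreq
     do k = 1, npts
        work(k) = sin(freq(ifreq) * real(k))
     end do
     total(ifreq) = sum(work)   ! one result element per iteration
  end do
  !$omp end parallel do

  print *, total
end program private_array
```

A private array starts out undefined in each thread, so this pattern only fits arrays that are fully written before they are read inside the loop. Large private arrays are placed on each thread's stack, which may require a larger stack size.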
Intel Thread Checker is meant to point out threading consistency errors. It's a useful learning tool.
Thanks.
The loops are totally independent calculations, based on the frequency of a transmitter. I will have to go through all the variables and arrays again, and make sure I have all the private and shared statements done correctly.
I will look into thread checker.