Hi all,
Currently, I'm running an experiment that splits a single transaction into
multiple small sub-transactions in 'labyrinth' from the STAMP benchmark suite.
Example code for the original TX and the multiple sub-TX version follows:

[Original TX code]

```c
TM_BEGIN()
func2(1, n-1);
TM_END()
```

[Multiple sub-TX code (ignore the consistency of the program)]

```c
long start, end;
long chunk;
while (start < n - 1) {
    end = ((start + chunk) < n - 1) ? (start + chunk) : (n - 1);
    TM_BEGIN()
    func2(start, end);
    TM_END()
    start = end;
}
```
When the chunk size is n-1, this is exactly the same as the original code,
and when the chunk size is less than 8, it runs well with TSX.
Here are the TX statistics when the chunk size is 4:
tx-start: 48474
tx-abort: 2837
tx-explicit: 1775
tx-conflict: 670
tx-capacity: 306
tx-other: 86
But the problem appears when the chunk size is 8. The results show weird counts:
tx-start: 41805
tx-abort: 19969
tx-explicit: 599
tx-conflict: 107
tx-capacity: 19154
tx-other: 109
The abort ratio is far too high, especially the capacity aborts!
As far as I know, capacity aborts occur when transactional writes are evicted
from the L1D cache due to lack of capacity.
But the results above show that a lot of capacity aborts occur even with a small TX chunk size.
Does anyone know why this happens?
Please help me solve this problem.
More detailed code is below:

```c
func1 {
    long n = getSize(pointVectorPtr);
    long start, end;
    long chunk = 8;
    while (start < n - 1) {
        end = ((start + chunk) < n - 1) ? (start + chunk) : (n - 1);
        func2(pointVectorPtr, start, end);
        start = end;
    }
}

func2(pointVectorPtr, start, end) {
    tsx_begin();
    for (i = start; i < end; i++) {
        long *gridPointPtr = (long *)pointVectorPtr->elements;
        long value = (long)(*gridPointPtr);
        if (value != -1) {
            TM_RESTART();
        }
        *gridPointPtr = -1;
    }
    tsx_end();
}
```
Hi,
According to this paper, the scalability of "labyrinth" with HTM is expected to be low due to the very large thread-local memory footprint accessed within a transaction (up to 14 MByte).
Thanks,
Roman
Also: is the data modified within the transaction 4 KByte aligned? If so, the program experiences associativity cache misses, leading to TSX aborts.
A solution is to avoid 4 KByte alignment.
Thanks,
Roman
