Hello,
For multi-thread synchronization, I'm using LOCK BTS and LOCK BTR on a bit in shared memory.
However, how can I test this bit, given that BT does not work with LOCK?
LOCK AND could be a solution, but the destination operand (the shared memory) is overwritten by the result of the logical operation.
Thanks for any help.
A hypothetical 'LOCK BT' would have the same effect as plain 'BT', because 'BT' only reads memory and normally reads it once. Plain 'BT' should therefore serve your original need.
That said, if you share more details of what you are doing, we can confirm this.
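The point above can be sketched in portable C11 rather than assembly: setting and clearing the bit are atomic read-modify-write operations, while testing it only needs a plain atomic read. This is a minimal sketch, not the poster's code; the function names are hypothetical. On x86, compilers typically implement `atomic_fetch_or` and `atomic_fetch_and` with a LOCK-prefixed instruction and `atomic_load` with an ordinary load, but that mapping is a compiler detail and an assumption here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomically set the bit and return its previous
 * state (the role the carry flag plays after LOCK BTS). */
bool bit_test_and_set(_Atomic uint32_t *word, unsigned bit)
{
    uint32_t mask = UINT32_C(1) << bit;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Hypothetical helper: atomically clear the bit and return its previous
 * state (the role the carry flag plays after LOCK BTR). */
bool bit_test_and_reset(_Atomic uint32_t *word, unsigned bit)
{
    uint32_t mask = UINT32_C(1) << bit;
    return (atomic_fetch_and(word, ~mask) & mask) != 0;
}

/* Hypothetical helper: test the bit without modifying memory.
 * A single atomic load is enough; no read-modify-write is involved. */
bool bit_test(_Atomic uint32_t *word, unsigned bit)
{
    return ((atomic_load(word) >> bit) & 1u) != 0;
}
```

The test helper leaves the shared word untouched, which is the property the original question was looking for.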
Thanks for your reply.
Although the LOCK prefix is documented as forbidden with BT (the assembler rejects it), BT serves my need and seems to be atomic in a slow-speed context (500 ms).
However, I need to write a multi-thread stress test to check whether BT is really atomic in different cases:
1- Two logical cores (HTT)
2- Two physical cores
3- Same core?
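One possible shape for such a stress test is sketched below in C11 with POSIX threads. It is an assumption about what the test could look like, not the poster's code: a producer sets a "ready" bit after writing a payload, a consumer tests the bit with a plain atomic load, checks the payload, and clears the bit. All names (`stress_ctx`, `run_stress`, `READY_MASK`) are hypothetical. Pinning the two threads to chosen logical or physical cores is left out; on Linux that would need a platform-specific call such as `pthread_setaffinity_np`.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define READY_MASK UINT32_C(1)  /* hypothetical: bit 0 means "data available" */

struct stress_ctx {
    _Atomic uint32_t sync;  /* shared synchronization word */
    uint32_t payload;       /* plain data guarded by the ready bit */
    uint32_t iterations;
    uint32_t errors;        /* written only by the consumer */
};

static void *producer(void *arg)
{
    struct stress_ctx *c = arg;
    for (uint32_t i = 1; i <= c->iterations; i++) {
        /* Wait until the consumer has cleared the bit. */
        while (atomic_load_explicit(&c->sync, memory_order_acquire) & READY_MASK)
            sched_yield();
        c->payload = i;
        /* Publish: set the bit with a read-modify-write (release). */
        atomic_fetch_or_explicit(&c->sync, READY_MASK, memory_order_release);
    }
    return NULL;
}

static void *consumer(void *arg)
{
    struct stress_ctx *c = arg;
    for (uint32_t i = 1; i <= c->iterations; i++) {
        /* Test the bit with a plain atomic load (acquire). */
        while (!(atomic_load_explicit(&c->sync, memory_order_acquire) & READY_MASK))
            sched_yield();
        if (c->payload != i)
            c->errors++;  /* stale or out-of-order payload observed */
        atomic_fetch_and_explicit(&c->sync, ~READY_MASK, memory_order_release);
    }
    return NULL;
}

/* Runs the handshake `iterations` times and returns the number of
 * mismatches seen by the consumer. Thread-creation error handling is
 * omitted to keep the sketch short. */
uint32_t run_stress(uint32_t iterations)
{
    struct stress_ctx c = { .iterations = iterations };
    pthread_t prod, cons;
    pthread_create(&prod, NULL, producer, &c);
    pthread_create(&cons, NULL, consumer, &c);
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return c.errors;
}
```

A zero result does not prove correctness; it only means no violation was observed in that run, so the iteration count and core placement matter.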
According to Intel's manual, reading a byte, a word aligned on a 16-bit boundary, or a doubleword aligned on a 32-bit boundary has been guaranteed atomic since the Intel486 processor.
So make sure that your 32-bit variable is properly aligned.
The next thing you need to pay attention to is memory ordering, since your purpose is synchronization.
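Both points (alignment and memory ordering) can be illustrated with a short C11 sketch. It assumes a hypothetical `struct slab` holding a 32-bit synchronization word and a payload; none of these names come from the thread. The release/acquire pair is one common way to make the payload visible before the bit is observed as set, not necessarily the scheme the original code uses.

```c
#include <stdalign.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SLAB_READY_MASK UINT32_C(1)  /* hypothetical: bit 0 marks the slab as available */

/* Hypothetical slab layout: the sync word is explicitly 4-byte aligned
 * so that a 32-bit read of it is a single aligned access. */
struct slab {
    alignas(4) _Atomic uint32_t sync;
    uint32_t payload;
};

_Static_assert(alignof(struct slab) >= 4, "sync word must be 32-bit aligned");

/* Producer side: write the data first, then set the bit with release
 * ordering so the payload store cannot be reordered after it. */
void slab_publish(struct slab *s, uint32_t value)
{
    s->payload = value;
    atomic_fetch_or_explicit(&s->sync, SLAB_READY_MASK, memory_order_release);
}

/* Consumer side: test the bit with an acquire load; only read the
 * payload if the bit was seen as set. Memory is not modified. */
bool slab_try_read(struct slab *s, uint32_t *out)
{
    uint32_t word = atomic_load_explicit(&s->sync, memory_order_acquire);
    if ((word & SLAB_READY_MASK) == 0)
        return false;
    *out = s->payload;
    return true;
}
```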
--
The 'BT' instruction is probably not as widely used as 'AND', so 'BT' may perform worse (longer latency, longer instruction decode time) than 'AND' on some microarchitectures.
Opcode   Instruction
21 /r    AND m32, r32 ; memory address is the destination operand
23 /r    AND r32, m32 ; memory address is the source operand
You'll get what you want.
--
BTW, beware that interrupts can arrive in the middle of an instruction (especially in the case of a REP-prefixed instruction).
Unfortunately, the assembler refuses to assemble AND with the LOCK prefix when the destination is not a memory address.
I'm facing this case because a third thread aggregates the values of all slab memories: this thread, also a consumer and not CPU-pinned, needs to test the availability of each slab using the synchronisation bit. So I believe a LOCK is required to guarantee the atomicity of the test.
Do you mean that an interrupt can "preempt" execution between the REP prefix and the rest of the opcodes?
Such as LOCK <i> AND dest, src
where <i> is the position at which the interrupt happens.
CyrIng wrote:
Unfortunately, the assembler refuses to assemble AND with the LOCK prefix when the destination is not a memory address.
I'm facing this case because a third thread aggregates the values of all slab memories: this thread, also a consumer and not CPU-pinned, needs to test the availability of each slab using the synchronisation bit. So I believe a LOCK is required to guarantee the atomicity of the test.
Do you mean that an interrupt can "preempt" execution between the REP prefix and the rest of the opcodes?
Such as LOCK <i> AND dest, src
where <i> is the position at which the interrupt happens.
To test the availability of each slab, is an atomic read necessary? Does your third thread execute on the same logical processor as the first two threads?
A LOCK-prefixed instruction costs ~70 cycles on average; is that still too expensive in your case?
--
According to Intel's manual, interrupts are taken at instruction boundaries. For a REP-prefixed instruction, an interrupt can be taken between iterations (e.g. at the 50th iteration when 100 iterations are specified).