RTM: Finding memory address at which transaction was aborted


If a page is marked Copy-on-Write, and I try to write to it inside of a transaction, the transaction aborts.  If I know the address at which it aborted, there is a trivial fix:

int v = addr[0];
__sync_bool_compare_and_swap(addr, v, v); // Force CoW.

And then retry the transaction.  Again, this only works if I know the cacheline on which the transaction was aborted.  Is there a way to find this?  Is it in a performance counter or something?

1 Solution
Roman_D_Intel
Employee

There is no perf counter or register that holds the address of the memory access that caused the abort. I think the best you can do is to retry the transaction body under a global lock instead of using TSX. In the TSX abort handler you can check the abort status (in the EAX register) to see whether the abort was persistent (RETRY bit = 0).

Roman
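
A minimal sketch of the lock fallback described above, assuming GCC or Clang. The names run_atomically, fallback_lock, increment and the TX_* macros are hypothetical and not from the post. The RTM intrinsics _xbegin, _xend and _xabort from <immintrin.h> are used only when the compiler is told RTM is available (for example with -mrtm); otherwise the sketch compiles to the lock-only path. Bit 1 of the abort status is the RETRY bit.

```c
#include <stdatomic.h>

#if defined(__RTM__)
#include <immintrin.h>
#define TX_BEGIN() _xbegin()
#define TX_END()   _xend()
#define TX_ABORT() _xabort(0xff)
#else
/* Assumption: without RTM at compile time, every attempt reports a persistent abort. */
#define TX_BEGIN() 0u
#define TX_END()   ((void)0)
#define TX_ABORT() ((void)0)
#endif

#define TX_STARTED  (~0u)      /* value returned when the transaction starts */
#define ABORT_RETRY (1u << 1)  /* RETRY bit of the abort status in EAX */

typedef void (*tx_body)(void *);

static atomic_int fallback_lock = 0;

static void lock_acquire(void) {
    int expected = 0;
    while (!atomic_compare_exchange_weak(&fallback_lock, &expected, 1))
        expected = 0;          /* the failed exchange overwrote it; reset and spin */
}

static void lock_release(void) { atomic_store(&fallback_lock, 0); }

/* Runs body(arg) atomically. Returns 1 if it committed as a transaction,
   0 if it ran under the global lock. */
static int run_atomically(tx_body body, void *arg, int max_retries) {
    for (int attempt = 0; attempt < max_retries; ++attempt) {
        unsigned status = TX_BEGIN();
        if (status == TX_STARTED) {
            /* Read the lock inside the transaction so that a lock holder
               conflicts with us; bail out if it is already taken. */
            if (atomic_load(&fallback_lock))
                TX_ABORT();
            body(arg);
            TX_END();
            return 1;
        }
        if (!(status & ABORT_RETRY))
            break;             /* persistent abort: retrying will not help */
    }
    lock_acquire();
    body(arg);                 /* non-transactional, so page faults are handled normally */
    lock_release();
    return 0;
}

/* Example body: increments the int that arg points to. */
static void increment(void *arg) { ++*(int *)arg; }
```

Calling run_atomically(increment, &counter, 3) increments counter exactly once on either path. A write to a copy-on-write page keeps aborting the transaction, so that case ends up on the lock path, where the fault is serviced normally.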

 


11 Replies
jimdempseyatthecove
Black Belt

Three options that I can think of:

a) Prior to issuing each instruction that may cause a transaction abort, write the address that may abort into a memory location that won't abort. This is somewhat the same philosophy as a try/catch. Should the transaction abort, perform __sync_fetch_and_add(loc, 0); // RMW on the recorded address

b) Prior to entering the transaction region (and inside your retry code), perform something like

     __sync_fetch_and_add(locA, 0); // RMW
     __sync_fetch_and_add(locB, 0);
     ...
     ... start transaction (if abort, go back/redo the __sync_fetch_and_add's, then retry transaction)

c) Start the transaction, then if it aborts, perform the __sync_fetch_and_add's of b) and retry the transaction.

You will have to determine if choice a), b) or c) is more efficient.

Jim Dempsey
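
The pre-touch step in options b) and c) could look roughly like the following sketch. It assumes the set of addresses the transaction will write is known up front; touch_for_write and touch_all are hypothetical helper names. __sync_fetch_and_add is the GCC/Clang builtin; adding 0 leaves the value unchanged but is still a locked read-modify-write, so any copy-on-write fault is taken here, outside the transaction.

```c
#include <stddef.h>

/* Atomic RMW that leaves *loc unchanged but requires write access to
   the page, which forces a pending copy-on-write fault to be resolved. */
static void touch_for_write(int *loc) {
    __sync_fetch_and_add(loc, 0);
}

/* Touch every location the upcoming transaction is expected to write. */
static void touch_all(int *const *locs, size_t n) {
    for (size_t i = 0; i < n; ++i)
        touch_for_write(locs[i]);
}
```

In option b) touch_all would run before every transaction attempt; in option c) it would run only after an abort, which avoids the extra atomics on the common path where the first attempt succeeds.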

Roman_D_Intel
Employee

Hi,

Before writing to a page, you can print its address using tsx_printf. Its output escapes the transaction, so it also works for transactions that abort. You need a processor with the Skylake architecture for that.

Roman

393 Views

Hi Jim and Roman,

(a) may be a workable possibility, but it won't catch all cases.  Same with tsx_printf (which is very clever, btw; I just read your blog post about it).

Let me provide some context:  I'm developing a compiler that allows the code:

xbegin;
// Do some transactional stuff.
xcommit;

If the transaction aborts, it will simply try again until it succeeds.  However, the "do some transactional stuff" may call out to C or C++, which contains code that the compiler didn't generate and can't hook.  This is expected to be a common scenario in the language.  So it could be walking a graph or some such, and the memory is non-trivial to find outside of the transaction.


jimdempseyatthecove
Black Belt

Roman,

tsx_printf is an interesting hack. I would have done something different.

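#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

#define TSX_PRINTF_BUF_PADD 1024        /* max bytes per message; spill area past the ring */
#define TSX_PRINTF_BUF_LEN (1024*1024)  /* size of the ring buffer */
char tsx_printf_str[TSX_PRINTF_BUF_LEN+TSX_PRINTF_BUF_PADD];
int  tsx_printf_fill = 0;
bool tsx_printf_wrapped = false;

/* Append a formatted message to the in-memory ring buffer (no I/O). */
int tsx_printf(const char* format, ...)
{
    va_list list;
    va_start(list, format);
    int ret = vsnprintf(tsx_printf_str+tsx_printf_fill, TSX_PRINTF_BUF_PADD, format, list);
    va_end(list);
    if(ret < 0)
        return ret;
    if(ret >= TSX_PRINTF_BUF_PADD)
        ret = TSX_PRINTF_BUF_PADD - 1;  /* the message was truncated */
    if((tsx_printf_fill += ret) >= TSX_PRINTF_BUF_LEN)
    {
        /* Wrapped: move the bytes that spilled into the padding to the front. */
        tsx_printf_wrapped = true;
        int spill = tsx_printf_fill - TSX_PRINTF_BUF_LEN;
        for(int i = 0; i < spill; ++i)
        {
            tsx_printf_str[i] = tsx_printf_str[TSX_PRINTF_BUF_LEN + i];
        }
        tsx_printf_fill = spill;
    }
    return ret;
}

/* Print the buffered text, oldest data first, then reset the buffer. */
void tsx_printf_dump(void)
{
    if(tsx_printf_wrapped)
    {
        /* The oldest data runs from the fill point to the end of the ring;
           its first message may be partially overwritten. */
        int start = tsx_printf_fill;
        if(tsx_printf_str[start] == 0)
            ++start;                    /* skip the terminator of the latest write */
        fwrite(&tsx_printf_str[start], 1, TSX_PRINTF_BUF_LEN - start, stdout);
    }
    fwrite(tsx_printf_str, 1, tsx_printf_fill, stdout);
    tsx_printf_fill = 0;
    tsx_printf_wrapped = false;
}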

Jim Dempsey

393 Views

Okay, yeah.  I was afraid of that.  Thanks Roman.

Roman_D_Intel
Employee

Jim,

Changes to your tsx_printf memory buffer will be lost in case of an abort (e.g. if the abort happens after the tsx_printf call), because writes made inside a transaction are rolled back. Intel Processor Trace, however, records instruction control flow even in aborted transactions. My tsx_printf (mis)uses it, which allows the output data to survive aborts.

Best regards,

Roman

393 Views

jimdempseyatthecove wrote:

a) Prior to issuing each instruction that may cause a transaction abort, write the address that may abort into a memory location that won't abort. This is somewhat the same philosophy as a try/catch. Should the transaction abort, perform __sync_fetch_and_add(loc, 0); // RMW on the recorded address

Hi Jim,

I've been thinking more about this.  Is there a way to specify memory that I don't want added to the transaction, or is there pre-defined memory that I can use?  Do you have a link with information on how this would work?

jimdempseyatthecove
Black Belt

Roman, oops. You are right.

William, inside a transaction either all memory accesses are transactional or none are; you cannot exclude selected locations. The instruction trace, however, is not rolled back on an abort. You could potentially use the instruction trace information in your transaction abort handler. This should be documented in a systems programming manual. Perhaps Roman could provide a link. Converting the address into a source code line number would be up to you to figure out.

Jim Dempsey

393 Views

Hey Jim,

This was what I thought.  I must have misunderstood your first comment.

Thanks!

Roman_D_Intel
Employee

Setting up Processor Trace recording and reading the results directly from the hardware is only possible in the kernel (ring 0), not from user space. I am not sure it is practical for this use case (a compiler).

Roman
