Intel® ISA Extensions
Use hardware-based isolation and memory encryption to provide more code protection in your solutions.
1099 Discussions

xresldtrk and xsusldtrk behavior & RTM_RETIRED.ABORTED_MEMTYPE

pepesy00
Beginner
256 Views

Hi,

I'm trying to understand the functioning of the TSX intrinsic functions xresldtrk and xsusldtrk for suspending and resuming a transaction. I've written a simple code where two threads read from an array of random values and sum them up within a variable named "suma" (private to each thread). Since the array is shared among threads, its reading is protected by a transaction. However, the thread with tid 1 has to wait, using a spinlock, for the thread with tid 0 to perform the sum (and set a flag to 1) before proceeding with its own sum. The spinlock is escaped using xres and xsus. Despite this, the thread with tid 1 always aborts (transactional abort type) until it commits. Hence, I'm not clear on the functioning of xsus and xres. I suspect that since only the read set is escaped with xsus, and the flag is within the write set, the escape has no effect. But I'm not entirely sure. Could you help me with this? Below is the code snippet for the transaction function:

 

#pragma omp parallel // proc_bind(close)
  {
    int tid = omp_get_thread_num();
    printf("Hello from tid: %d\n",tid);
    TX_DESCRIPTOR_INIT(); // Declares the retries variable
    long int suma = 0;
    int flag = 0;

    for (int j = 0; j < 10; j++)
    {
#pragma omp for schedule(dynamic) nowait
      for (int i = 0; i < g_xLength; i++)
      {
        BEGIN_TRANSACTION(tid, 0);
        BEGIN_ESCAPE;
        if (tid == 1)
          while (!g_flag.flag)
            ;
        END_ESCAPE;
        suma += g_x[i];
        if (tid == 0)
          g_flag.flag = 1;

        COMMIT_TRANSACTION(tid, 0);

        if (tid == 1)
        {
          g_flag2.flag = 1;
          g_flag.flag = 0;
        }
        if (tid == 0)
        {
          while (!g_flag2.flag)
            ;
          g_flag2.flag = 0;
          g_flag.flag = 0;
        }
      }
    }
    printf("Sum of thread %d is: %ld\n", tid, suma);
  }
}

 Additionally, I'm encountering an issue with another transactional code. It deals with fairly large transactions that often abort due to capacity issues. However, as the transaction size and the number of involved threads increase, there's a significant rise in aborts of the type RTM_RETIRED.ABORTED_MEMTYPE (detected using perf), which, based on the limited information available, seems to indicate a write to an incompatible memory type. Could you provide more insight into this? I'm unsure of what might be happening and can't find any solutions.

0 Kudos
1 Reply
Roman_D_Intel
Employee
96 Views

Hi,

you might still have false-sharing due to HW prefetchers and experience conflicts due to that. Are the conflicts reproducible in SDE emulator? On which CPU type are you running your test (Sappphire Rapids or Emerald Rapids)?

Also your lock elision implementation need to put the lock into the read set to be correct. I did not find any evidence that you are avoiding this issue (but I might miss something):
Not putting Lock into Read Set: https://www.intel.com/content/www/us/en/developer/articles/technical/tsx-anti-patterns-in-lock-elision-code.html

To profile the location of RTM_RETIRED.ABORTED_MEMTYPE you can try "perf record -e r4c9:ppp"

Best regards,
Roman

0 Kudos
Reply