I tested the source below on an IBM xSeries 225 with two 2.4 GHz Xeon processors.
I expected that avoiding cache false sharing would improve performance.
When I padded the data structure, that assumption proved correct.
But when I turned on Hyper-Threading in the BIOS, performance went down.
What do I need to use or change to improve performance with Hyper-Threading?
Should I increase the number of threads?
OS: Red Hat Linux 9
Compiler: icc 8.0
reference site for source :
Source
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
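#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <sys/time.h>

struct thread_param {
    // 4 longs * 4 bytes = 16 bytes (assuming 4-byte long, as on 32-bit x86)
    unsigned long thread_id;
    unsigned long v;
    unsigned long start;
    unsigned long end;
#ifdef FALSE_SHARING_FIX
    // expand to 128 bytes to avoid false sharing
    // (4 long + 28 int padding) * 4 bytes = 128 bytes
    int padding[28];
#endif
};

#define MAXLEN (1024*1024)
#define NUM_PROC 4

unsigned int array[MAXLEN];
int count = 0;

void* thread_fn(void* arg) {
    struct thread_param *p = (struct thread_param*)arg;
    int i;
    // The loop headers were lost from the post; these are a reconstruction.
    // The index lives in p->v, so every iteration writes to the struct.
    for (i = 0; i < count; i++) {
        for (p->v = p->start; p->v < p->end; p->v++) {
            array[p->v] += 1;
        }
    }
    return NULL;
}

int main(int argc, char *argv[]) {
    pthread_t tid[NUM_PROC];
    struct thread_param thread_struct[NUM_PROC];
    int i, interval;
    struct timeval start, end, result;

    if (argc != 2) {
        printf("usage: false_none count\n");
        return 0;
    }
    count = atoi(argv[1]);          // reconstructed: passes over each range
    interval = MAXLEN / NUM_PROC;   // reconstructed: elements per thread

#ifdef FALSE_SHARING_FIX
    printf("with FIX\n");
#else
    printf("without FIX\n");
#endif
    printf("total execution time for count=%d: ", count);

    // give each thread a contiguous slice; the last one takes the remainder
    for (i = 0; i < NUM_PROC - 1; i++) {
        thread_struct[i].thread_id = i;
        thread_struct[i].start = i * interval;
        thread_struct[i].end = thread_struct[i].start + interval;
    }
    thread_struct[NUM_PROC - 1].thread_id = NUM_PROC - 1;
    thread_struct[NUM_PROC - 1].start = (NUM_PROC - 1) * interval;
    thread_struct[NUM_PROC - 1].end = MAXLEN;

    // timing, create and join calls were lost from the post; reconstructed
    gettimeofday(&start, NULL);
    for (i = 0; i < NUM_PROC; i++) {
        pthread_create(&tid[i], NULL, thread_fn, &thread_struct[i]);
    }
    for (i = 0; i < NUM_PROC; i++) {
        pthread_join(tid[i], NULL);
    }
    gettimeofday(&end, NULL);
    timersub(&end, &start, &result);

    printf("%ld sec, %ld usec\n", (long)result.tv_sec, (long)result.tv_usec);
    return 0;
}
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++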
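The padding in the posted struct is counted by hand, which only works for one size of `unsigned long`. As a minimal sketch, the pad length can be derived from `sizeof` instead. The names `CACHE_LINE`, `thread_fields` and `padded_param` are hypothetical, and the 128-byte figure is simply the value from the comment in the post, not a measured property of the machine.

```c
#include <stddef.h>

/* Assumed padding target: 128 bytes, the figure used in the post. */
#define CACHE_LINE 128

/* The four fields each thread works with. */
struct thread_fields {
    unsigned long thread_id;
    unsigned long v;
    unsigned long start;
    unsigned long end;
};

/* Hypothetical padded variant: the pad length is computed from sizeof,
   so the total stays a multiple of CACHE_LINE whether long is 4 or 8 bytes. */
struct padded_param {
    struct thread_fields f;
    char pad[CACHE_LINE - sizeof(struct thread_fields) % CACHE_LINE];
};

/* Compile-time check: the array size becomes -1 (a compile error)
   if the struct size is not a multiple of CACHE_LINE. */
typedef char padded_param_size_check[
    (sizeof(struct padded_param) % CACHE_LINE == 0) ? 1 : -1];
```

Padding alone only spaces the elements apart. For adjacent elements never to share a line, the array itself would also need to start on a `CACHE_LINE` boundary, for example through a compiler alignment attribute.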
If performance dropped when you turned on HT and ran the same test with 2 threads, it doesn't look like a false sharing issue. To get an advantage from HT, you usually need to increase the number of threads to match the number of logical processors. A significant reduction in performance is likely to be a scheduling problem. I don't know whether the distros that ship 2.6 kernels will come with schedulers that work better with HT on dual-CPU systems.
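One way to match the thread count to the number of logical processors is to query it at run time rather than hard-coding `NUM_PROC`. The sketch below is only an illustration: `logical_cpu_count` and `chunk_bounds` are hypothetical helpers, and `_SC_NPROCESSORS_ONLN` is a widely available extension (glibc, the BSDs) rather than something strict POSIX guarantees.

```c
#include <unistd.h>

/* Returns the number of logical processors currently online,
   or `fallback` if the query is unsupported or fails. */
int logical_cpu_count(int fallback)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n > 0) ? (int)n : fallback;
}

/* Computes the half-open range [*start, *end) of thread `idx` when `len`
   elements are split across `nthreads` threads. The last thread absorbs
   the remainder, as in the posted code. */
void chunk_bounds(unsigned long len, int nthreads, int idx,
                  unsigned long *start, unsigned long *end)
{
    unsigned long interval = len / (unsigned long)nthreads;
    *start = (unsigned long)idx * interval;
    *end = (idx == nthreads - 1) ? len : *start + interval;
}
```

With HT enabled on a dual-Xeon system, `logical_cpu_count` would be expected to report 4, so the work would be split four ways instead of two.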
Persepone -
As Tim pointed out, if you kept the same two threads when running under HT, the OS may have scheduled both threads onto the same physical processor (its two logical HT processors). This would result in a performance drop compared to the dual-processor test without HT. Have you tried running this with four threads on a dual-processor, HT-enabled system?
-- clay
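If the scheduler does place both threads on one physical processor, one possible workaround is to pin each thread to a chosen logical CPU. The following is a sketch under assumptions, not a tested fix for this system: it relies on the glibc extension `pthread_setaffinity_np`, which may not be available on an installation as old as Red Hat 9, and `pin_thread_to_cpu` and `first_sibling_of_package` are hypothetical helpers. The CPU numbering in the second helper is a guess that has to be checked against the actual machine.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes CPU_SET and pthread_setaffinity_np in glibc */
#endif
#include <pthread.h>
#include <sched.h>

/* Restricts `thread` to a single logical CPU. Returns 0 on success,
   or a positive error number from pthread_setaffinity_np. */
int pin_thread_to_cpu(pthread_t thread, int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}

/* Hypothetical mapping: assumes logical CPUs 2k and 2k+1 are the two
   hyper-threads of physical package k. Real numbering varies; compare
   the "physical id" field in /proc/cpuinfo before relying on it. */
int first_sibling_of_package(int package)
{
    return 2 * package;
}
```

For the two-thread case, thread 0 could be pinned to `first_sibling_of_package(0)` and thread 1 to `first_sibling_of_package(1)` right after `pthread_create`, so that each one runs on its own physical processor.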