Cannot enable "branch trace store" (BTS) on Sandy Bridge

yakovxu
My CPU is an Intel Core i5-2400 (Sandy Bridge).

I enable branch trace store on one CPU (cpu0). The MSR setup is as follows:

#define IA32_DS_AREA  0x600
#define IA32_DEBUGCTL 0x1d9

dwEDX = 0;
dwEAX = (DWORD)dbgStore32;

// Set the DS memory address in the MSR.
WriteMSR(IA32_DS_AREA, dwEDX, dwEAX);

dwEAX |= (1 << BIT_BTS);
dwEAX |= (1 << BIT_TR);

// Clear the LBR bit.
dwEAX &= ~(1 << 0);

// Clear the BTINT flag in MSR_DEBUGCTLA.
dwEAX &= ~(1 << BIT_BTINT);

WriteMSR(IA32_DEBUGCTL, dwEDX, dwEAX); //dwEAX=0, dwEDX=0x000000C0


The same program runs normally on an Intel Core Solo processor.

Can anyone tell me why?
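
For reference, the IA32_DEBUGCTL value can be worked out separately from the DS address. In the snippet above, dwEAX still holds the DS area address when the control bits are ORed in, so the sketch below starts from the current MSR value instead. It is only a sketch: the helper name debugctl_enable_bts and the DEBUGCTL_* macros are made up for illustration, and the bit positions (LBR = 0, TR = 6, BTS = 7, BTINT = 8) are taken from the SDM description of IA32_DEBUGCTL.

```c
#include <stdint.h>

/* Bit masks for IA32_DEBUGCTL (MSR 0x1D9); names are illustrative. */
#define DEBUGCTL_LBR   (1u << 0)  /* last-branch-record stack */
#define DEBUGCTL_TR    (1u << 6)  /* enable branch trace messages */
#define DEBUGCTL_BTS   (1u << 7)  /* store trace messages in the BTS buffer */
#define DEBUGCTL_BTINT (1u << 8)  /* 1 = interrupt near buffer end, 0 = circular buffer */

/* Hypothetical helper: compute the new low dword of IA32_DEBUGCTL
 * from its current value, without touching unrelated bits. */
static uint32_t debugctl_enable_bts(uint32_t current)
{
    uint32_t value = current;

    value |= DEBUGCTL_TR | DEBUGCTL_BTS;       /* turn tracing on */
    value &= ~(DEBUGCTL_LBR | DEBUGCTL_BTINT); /* no LBR, circular mode */
    return value;
}
```

Starting from 0, this yields 0x000000C0, which is the value quoted in the comment above.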

Patrick_F_Intel1

Hello Yakovxu,

Does the processor report that BTS is supported?
For others' reference (from Vol. 3 of the SDM), check MSR IA32_MISC_ENABLE, bit 11:
Branch Trace Storage Unavailable (RO)
1 = Processor doesn't support branch trace storage (BTS)
0 = BTS is supported

Does the processor support the debug store facility? Check that CPUID.1:EDX[21] == 1.
And is the DS save area set up properly? (See SDM Vol. 3, section 17.4.9.)
What happens when you run your program (on both Core Solo and on Sandy Bridge)?
Thanks,
Pat
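
The two checks above come down to testing single bits. A minimal sketch, assuming the raw values of IA32_MISC_ENABLE and CPUID.1:EDX have already been read (for example with RDMSR and CPUID inside the driver); the function names are hypothetical:

```c
#include <stdint.h>

/* IA32_MISC_ENABLE bit 11 is "BTS Unavailable": 0 means BTS can be used. */
static int bts_available(uint64_t misc_enable)
{
    return ((misc_enable >> 11) & 1u) == 0;
}

/* CPUID.1:EDX bit 21 is the DS (debug store) feature flag. */
static int ds_supported(uint32_t cpuid1_edx)
{
    return (int)((cpuid1_edx >> 21) & 1u);
}
```

Note the polarity: bit 11 set means BTS is unavailable, while EDX bit 21 set means the debug store is supported.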

yakovxu

Thanks, Patrick Fay.

I checked MSR IA32_MISC_ENABLE bit 11; its value is 1.

I also used the following code to check the debug store facility:

DWORD GetFamilyNo()
{
    ULONG ulSignature = 0, ulFamily = 0;
    ULONG ulEAX = 0;
    ULONG ulEDX = 0;

    // CPUID leaf 1: EAX = processor signature, EDX = feature flags.
    __asm
    {
        MOV EAX, 1
        cpuid
        MOV ulEAX, EAX
        MOV ulEDX, EDX
    }

    // The signature is the raw EAX value (it was previously left at 0).
    ulSignature = ulEAX;

    // Get the family number, bits 8-11.
    ulFamily = ((ulEAX & 0xF00) >> 8);

    KdPrint(("CPU Signature is %x,family %x.\n", ulSignature, ulFamily));
    KdPrint(("GetFamilyNo ulEAX=%08x ulEDX=%08x\n", ulEAX, ulEDX));

    return ulFamily;
}

The reported value is as follows:
GetFamilyNo ulEAX=000206a7 ulEDX=bfebfbff //ulEDX[21] == 1
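
The signature in EAX carries more than the base family. A sketch of decoding it, assuming the standard CPUID leaf 1 layout (stepping in bits 3:0, model in 7:4, family in 11:8, extended model in 19:16, extended family in 27:20); the cpu_sig type and the decode_signature name are illustrative only:

```c
#include <stdint.h>

typedef struct {
    unsigned family;    /* display family */
    unsigned model;     /* display model */
    unsigned stepping;
} cpu_sig;

/* Decode the CPUID.1:EAX processor signature. */
static cpu_sig decode_signature(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xFu;
    unsigned base_model  = (eax >> 4) & 0xFu;
    unsigned ext_model   = (eax >> 16) & 0xFu;
    unsigned ext_family  = (eax >> 20) & 0xFFu;
    cpu_sig sig;

    sig.stepping = eax & 0xFu;
    /* The extended family is only added when the base family is 0xF. */
    sig.family = (base_family == 0xFu) ? base_family + ext_family : base_family;
    /* The extended model applies to families 6 and 0xF. */
    sig.model = (base_family == 0x6u || base_family == 0xFu)
                    ? (ext_model << 4) | base_model
                    : base_model;
    return sig;
}
```

For the value above, 0x000206a7, this gives family 6, model 0x2A, stepping 7.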

My program runs normally on the Core Solo processor, where I get the following BTS records:


004bf5d8 7c80a4c7 00000000
004bf5f4 004bf7ae 00000000
004bf7bc 7c80a4c7 00000000
004bf7e2 004bf7e4 00000000
004bf7e6 8054420c 00000000
004bf7ff 004bf806 00000000
004bf823 004bf82b 00000000
004bf848 004bf850 00000000
004bf8a6 004bf8ae 00000000
004bf8e1 004bf8e9 00000000
004bf939 76b14e4f 00000000
004bf95c 004cf2c0 00000000
004cf2ee 004bdb60 00000000
004bdb67 004cf2ee 00000000
004cf2f4 004cf300 00000000
004cf30b 004cf310 00000000
004cf31a 004cf4c0 00000000
004cf4c9 004cf512 00000000
004cf51b 004cf565 00000000
004cf56e 004cf5b3 00000000
004cf5bd 004bf95c 00000000
004bf964 004c9cb0 00000000
004c9cc4 004c9cc9 00000000
004c9ccd 004c9cf4 00000000
004c9cf9 004d12a0 00000000

But when the same program runs on the Sandy Bridge processor, I can't get any BTS records.
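
Each line of the dump above has three 32-bit fields. Assuming these are the 32-bit BTS record fields in the order described in the SDM (branch-from address, branch-to address, flags), a line can be parsed as sketched below. The struct and function names are hypothetical, and treating bit 4 of the flags dword as the branch-predicted bit is an assumption based on the SDM record layout, so verify it before relying on it.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed 32-bit BTS record layout: 12 bytes per record. */
typedef struct {
    uint32_t from;   /* linear address of the branch instruction */
    uint32_t to;     /* linear address of the branch target */
    uint32_t flags;  /* bit 4 assumed to be "branch predicted" */
} bts_record32;

/* Parse one "from to flags" line of hex values. Returns 1 on success. */
static int parse_bts_line(const char *line, bts_record32 *out)
{
    unsigned int from, to, flags;

    if (sscanf(line, "%x %x %x", &from, &to, &flags) != 3)
        return 0;
    out->from  = (uint32_t)from;
    out->to    = (uint32_t)to;
    out->flags = (uint32_t)flags;
    return 1;
}

/* Extract the assumed branch-predicted bit from a record. */
static int bts_branch_predicted(const bts_record32 *rec)
{
    return (int)((rec->flags >> 4) & 1u);
}
```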

Patrick_F_Intel1
What is the OS in both cases (Core Solo and SNB)? (64-bit vs. 32-bit, Windows version or Linux version)

yakovxu
My OS on the SNB machine is Windows XP SP3, 32-bit.

The Core Solo machine runs the same OS.

Patrick_F_Intel1
OK.
Are you allocating the debug store buffer from non-paged pool memory (with something like ExAllocatePool())?
And, if you are allocating more than one page, are the pages contiguous? I'm not sure how to ensure this requirement.
As you can tell, I'm just asking the easy questions first.
Pat
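
Whether the contiguity question matters depends on whether the buffer crosses a page boundary at all. A small sketch for checking that, assuming 4 KiB pages; pages_spanned is a hypothetical helper, not a Windows API:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumed page size */

/* Number of pages touched by the byte range [addr, addr + len).
 * Assumes addr + len does not wrap around the address space. */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
    uintptr_t first_page;
    uintptr_t last_page;

    if (len == 0)
        return 0;
    first_page = addr / PAGE_SIZE_BYTES;
    last_page  = (addr + len - 1) / PAGE_SIZE_BYTES;
    return (size_t)(last_page - first_page + 1);
}
```

For example, a buffer of 341 twelve-byte records (4092 bytes) stays within one page if it starts on a page boundary.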