Hello everyone!
I have to synchronize time between processors in a multicore system, i.e., I have to calculate the TSC differences of all processors relative to one of them.
I tried rdtsc(), but it returns the TSC of the current processor. Is there any way to get the TSC from a specific processor? Or maybe I can specify a processor ID somewhere and read the corresponding time stamp counter value?
Thanks in advance,
Roman
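
RDTSC always reads the counter of the logical processor that executes it, so a specific processor's TSC is normally read by first pinning the reading thread to that processor (for example with `SetThreadAffinityMask` on Windows or `sched_setaffinity` on Linux). Once two pinned threads can exchange timestamps, the offset between their counters can be estimated from a round trip. The sketch below is not from this thread: it applies the classic round-trip (NTP-style) estimate to samples that have already been collected, and the function names and the sample layout are hypothetical.

```python
from typing import Iterable, Tuple

# One sample is (t1, t2, t3, t4), all in TSC ticks (hypothetical layout):
#   t1: reference CPU reads its TSC and sends a request
#   t2: remote CPU reads its TSC when it sees the request
#   t3: remote CPU reads its TSC just before replying
#   t4: reference CPU reads its TSC when it sees the reply
Sample = Tuple[int, int, int, int]


def estimate_offset(sample: Sample) -> Tuple[float, int]:
    """Return (remote_minus_reference_offset, round_trip) for one sample.

    Assumes the request and the reply take roughly the same time in flight;
    any asymmetry shows up directly as an error in the offset.
    """
    t1, t2, t3, t4 = sample
    offset = ((t2 - t1) + (t3 - t4)) / 2
    round_trip = (t4 - t1) - (t3 - t2)
    return offset, round_trip


def best_offset(samples: Iterable[Sample]) -> float:
    """Pick the offset from the sample with the shortest round trip.

    Short round trips are the ones least likely to have been stretched
    by an interrupt or an SMI between the two reads.
    """
    return min((estimate_offset(s) for s in samples), key=lambda r: r[1])[0]
```

Taking many samples and keeping the one with the shortest round trip is a design choice that limits the effect of occasional long delays; the result is still only accurate to within about half the round-trip time.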
76 Replies
>>>It's not at all clear how QueryPerformanceCounter is implemented.>>>
QueryPerformanceCounter could be disassembled and statically or dynamically analyzed in order to understand its implementation. I suppose that this function could use the HPET timer.
You can call QueryPerformanceFrequency (which returns counts per second) to get an idea of whether it is implemented with the TSC or the HPET.
Thanks, Roman
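
The frequency check suggested above can be sketched as follows. This is only a heuristic under stated assumptions: the HPET commonly runs at about 14.318 MHz and the ACPI power-management timer at 3.579545 MHz, so a reported frequency close to one of those values hints at that source, while anything else may be derived from the TSC. The function names and the tolerance are hypothetical, and the `ctypes` call works on Windows only.

```python
import sys

HPET_TYPICAL_HZ = 14_318_180   # common HPET rate (assumption: not every HPET uses it)
ACPI_PM_TIMER_HZ = 3_579_545   # ACPI power-management timer rate


def guess_qpc_source(freq_hz: int, tolerance: float = 0.001) -> str:
    """Guess the timer behind QueryPerformanceCounter from its frequency."""
    def near(target: int) -> bool:
        return abs(freq_hz - target) <= target * tolerance

    if near(ACPI_PM_TIMER_HZ):
        return "ACPI PM timer (likely)"
    if near(HPET_TYPICAL_HZ):
        return "HPET (likely)"
    return "other (possibly TSC-derived)"


def query_performance_frequency() -> int:
    """Read the counter frequency through the Win32 API (Windows only)."""
    import ctypes
    freq = ctypes.c_int64()  # LARGE_INTEGER is a 64-bit signed integer
    if not ctypes.windll.kernel32.QueryPerformanceFrequency(ctypes.byref(freq)):
        raise OSError("QueryPerformanceFrequency failed")
    return freq.value


if __name__ == "__main__" and sys.platform == "win32":
    hz = query_performance_frequency()
    print(hz, guess_qpc_source(hz))
```

A match is a hint rather than proof: a TSC-derived frequency can happen to land near one of these constants, so disassembly remains the only way to be sure.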
Hi,
>>But, if I got it right, the invariant TSC in newer processors (17.13.1 in Vol. 3) guarantees that TSC values are synchronized. Well, in older processors I still can't rely on the TSCs of different cores without manual synchronization. Am I right?
Not quite. The time-stamp counter on recent Intel processors is reset to zero each time the processor package has RESET asserted. From that point onwards, the invariant TSC continues to tick at a constant rate across frequency changes, turbo mode, and ACPI C-states. All parts that see RESET synchronously will have their TSCs completely synchronized. This synchronous distribution of RESET is required for all sockets connected to a single PCH. For multi-node systems RESET might not be synchronous.

The biggest issue with TSC synchronization across multiple threads/cores/packages is the ability of software to write the TSC. The TSC is exposed as MSR 0x10, and software can use WRMSR 0x10 to set it. However, because the TSC is a moving target, writing it is not guaranteed to be precise. For example, an SMI (System Management Interrupt) could interrupt the software flow that is attempting to write the time-stamp counter immediately prior to the WRMSR. This means the value written to the TSC could be off by thousands to millions of clocks.

Hope this helps,
Roman
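
Whether a processor advertises an invariant TSC can be checked before relying on it. The SDM reports it in CPUID.80000007H:EDX[8]; on Linux the kernel surfaces the same properties as the `constant_tsc` and `nonstop_tsc` flags in `/proc/cpuinfo`. The sketch below parses that text; the function names are hypothetical, and it only looks at the first `flags` line it finds. It says nothing about whether the counters of different sockets are synchronized, which is the separate issue described above.

```python
from typing import Dict


def tsc_flags(cpuinfo_text: str) -> Dict[str, bool]:
    """Report the TSC-related flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags = set(value.split())
            return {
                "constant_tsc": "constant_tsc" in flags,  # fixed rate across P-states
                "nonstop_tsc": "nonstop_tsc" in flags,    # keeps running in deep C-states
            }
    return {"constant_tsc": False, "nonstop_tsc": False}


def looks_invariant(cpuinfo_text: str) -> bool:
    """True when both flags are present, i.e. the TSC behaves as invariant."""
    found = tsc_flags(cpuinfo_text)
    return found["constant_tsc"] and found["nonstop_tsc"]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("invariant TSC advertised:", looks_invariant(f.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```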
>>>You can call QueryPerformanceFrequency (returns counts per second) to have an idea if it is implemented with TSC or HPET.>>>
Thank you Roman.
>>...If you are interested you can test HT scaling...
This is what I'm going to do some time later.
>>...For multi-node systems RESET might not be synchronous...
I wonder how VTune gets times on a multi-node system?
Does VTune use 'QueryPerformanceCounter' Win32 API function or 'RDTSC' instruction?
Hi Sergey,
>>...For multi-node systems RESET might not be synchronous...
>>I wonder how VTune gets times on a multi-node system? Does VTune use 'QueryPerformanceCounter' Win32 API function or 'RDTSC' instruction?
I am not a VTune developer, but could you please elaborate on why you are concerned? Which type of VTune analysis would not work if the TSC has a small delta between the sockets? We are probably talking about deltas comparable to the latency of just a few remote memory accesses to another socket.
Roman
>>...I am not VTune developer, but could you please elaborate why are you concerned?
I don't have any concerns and I simply would like to know how VTune gets times. Does VTune use 'QueryPerformanceCounter' Win32 API function or 'RDTSC' instruction?
Sergey,
I recommend that you repost your question, with a reference to this thread, on the Intel VTune forum, which is monitored by VTune developers.
Thanks,
Roman
>>>This is what I'm going to do some time later.>>>
It would be great to see the results.
I bet that for a heavy floating-point load, HT scaling won't give any advantage. Any speedup will probably come from a lack of interdependencies between the various instructions being dispatched to different ports.
>>>recommend you to repost your question with the reference to this thread to the>>>
Do you think that Intel developers will reveal the exact implementation of the VTune timers?
>>>>recommend you to repost your question with the reference to this thread to the...
>>
>>Do you think that Intel developers will reveal exact implementation of the VTune timers.
Actually, I don't need details; I simply need a Yes or No answer, like 'Yes, RDTSC used' or 'No, RDTSC Not used'... Here is a link to my question on the VTune forum:
Forum topic: Does VTune use 'QueryPerformanceCounter' Win32 API function or 'RDTSC' instruction?
Web-link: http://software.intel.com/en-us/forums/topic/335541
>>>Actually, I don't need details; I simply need a Yes or No answer, like 'Yes, RDTSC used' or 'No, RDTSC Not used'...>>>
I asked this because a few weeks ago I posted a question on the MKL forum about the exact algorithm used to approximate Gamma on the problematic range [0.001, 1.0], and one of the Intel employees refused to reveal the algorithmic implementation.
>>...asked about the exact algorithm used to approximate Gamma on the problematic range [0.001, 1.0], and one of the Intel employees refused to
>>reveal the algorithmic implementation.
I'm not surprised to hear that. In many cases like yours things are working only in one direction, that is, for the benefit of a corporation. Iliya, try to ask Microsoft to release some sources and you won't get a response at all.
>>>I'm not surprised to hear that. In many cases like yours things are working only in one direction, that is, for the benefit of a corporation. Iliya, try to ask Microsoft to release some sources and you won't get a response at all.>>>
Yes, that's true. Sometimes a little bit of reversing is the only solution, albeit not the simplest or fastest one :)
Sorry, it is off the topic...
>>...asked about the exact algorithm used to approximate Gamma on the problematic range [0.001, 1.0], and one of the Intel employees refused to
>>reveal the algorithmic implementation.
Could you try to ask the same question on a GNU Scientific Library ( GSL ) forum?
>>>Could you try to ask the same question on a GNU Scientific Library ( GSL ) forum?>>>
Good question. I will ask this on their forum. GSL is open source, so they will probably come up with an exact answer.
By the way, I solved this problem with the help of Mathematica 8's minimax polynomial calculation.
@Sergey
Can I freely use my own wrappers based on the MKL library?
>>Can I freely use my own wrappers based on MKL library?
You need to review MKL's license regarding what you can do and what you can't with the library.
[ To Roman Oderov ]
Any updates? Performance results?
Hi Roman, thanks. I'll take a look at your results.
