Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Avoiding branch mispredictions

Shankar1
Beginner
I'm trying to learn about branch mispredictions and wanted to know how the Intel compiler can help me avoid them.

I have the following code
[cpp]enum Type { Type1, Type2, .., Typen };

class Data
{
public:
  const Type GetType() const { return type; }
  void setType( Type type_) { type = type_; }

  void* GetBuffer() { .. }

private:
  Type type;
};

// Here is the function which is called a million times
void ProcessData( Data& data) {
  switch( data.GetType() ) {
    case Type1 :
    //... Process Type1 data
    break;
    case Type2 :
    //... Process Type2 data
    break;
    ..
    case Typen :
    //... Process Typen data
    break;
  }
}[/cpp]
In the above switch, ProcessData will be called with millions of Data objects, and Type1 data arrives far more frequently than any other type (say, for every 1000 Type1 objects I might get 1 object of another type).

Will declaring the GetType method to return a const Type avoid branch mispredictions? If not, is there another way to avoid them?

Will the Intel compiler switch -Qixo help me in this regard? If so, how do I use it?
4 Replies
TimP
Honored Contributor III
It's reasonable to hope that normal optimization will favor the first case in the switch. If not, the prof-gen/prof-use scheme is intended to collect branch statistics and modify the compilation accordingly. An uglier method is to put the favored case in an if(){} clause and the switch for the remaining cases in else{}.
Did you use VTune or some other method to verify that you have excessive branch mis-predictions? This case would probably be taken care of easily by the hardware, if you don't have so many intervening loops that this branch history is evicted from the cache.
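
A minimal sketch of the if/else arrangement described above, for illustration only. The enum is cut down to three values, and `Counters`, `ProcessType1` and `ProcessOther` are hypothetical stand-ins for the real per-type processing; none of them come from the original code. Whether this actually beats a plain switch depends on the compiler and the hardware, so it is worth measuring before and after.

```cpp
enum Type { Type1, Type2, Type3 };  // Type3 stands in for the elided types

// Hypothetical bookkeeping so the effect of each path is observable
struct Counters {
    long common = 0;  // times the Type1 path ran
    long rare = 0;    // times any other path ran
};

// Hypothetical stand-ins for the real per-type processing
inline void ProcessType1(Counters& c) { ++c.common; }
inline void ProcessOther(Type /*type*/, Counters& c) { ++c.rare; }

void ProcessData(Type type, Counters& c) {
    if (type == Type1) {
        // Favored case: a single compare-and-branch that is almost always taken
        ProcessType1(c);
    } else {
        // Rare cases: the switch only runs for the remaining types
        switch (type) {
            case Type2: ProcessOther(type, c); break;
            case Type3: ProcessOther(type, c); break;
            default: break;
        }
    }
}

// Feeds one million items, with one non-Type1 item in every 1000
Counters RunDemo() {
    Counters c;
    for (int i = 0; i < 1000000; ++i) {
        Type type = (i % 1000 == 999) ? Type2 : Type1;
        ProcessData(type, c);
    }
    return c;
}
```

Calling `RunDemo()` exercises the same 1000-to-1 mix described in the question, which makes it a convenient small case to profile.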
jimdempseyatthecove
Honored Contributor III
The Borland compiler (ca. 1992) examined the case list. When the case values were fairly dense (not many gaps), it generated a dispatch table, so for large switch statements virtually every dispatch took the same (short) time. If the switch is instead implemented internally as a chain of if statements, the first case could potentially execute faster than it would through a branch table (which requires 2 branches: one to the case clause, and one for the break). An optimization could be made, though, where the compiler generates the equivalent of an if for the first clause and a branch-table dispatch for the remainder.
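
To make the idea concrete, here is a hand-written sketch of that combination: an explicit test for the first case, then a table of function pointers for the rest. It only illustrates the shape of the code and is not what any particular compiler emits. The three-value enum, `TypeCount`, `Totals`, the `HandleTypeN` functions and `kDispatch` are all assumptions made up for the example.

```cpp
#include <cassert>

// Dense enum with no gaps, so it can index a table directly.
// Type3 and TypeCount are illustrative additions.
enum Type { Type1, Type2, Type3, TypeCount };

// Hypothetical per-type tallies so each handler has a visible effect
struct Totals {
    long type1 = 0;
    long type2 = 0;
    long type3 = 0;
};

void HandleType1(Totals& t) { ++t.type1; }
void HandleType2(Totals& t) { ++t.type2; }
void HandleType3(Totals& t) { ++t.type3; }

using Handler = void (*)(Totals&);

// One entry per enumerator, in declaration order
static const Handler kDispatch[TypeCount] = {
    HandleType1, HandleType2, HandleType3
};

void Process(Type type, Totals& t) {
    // Fast path: the dominant type is handled by a direct call
    if (type == Type1) {
        HandleType1(t);
        return;
    }
    // Everything else costs one bounds check plus one indirect call
    assert(static_cast<int>(type) >= 0 && static_cast<int>(type) < TypeCount);
    kDispatch[type](t);
}
```

The table keeps the cost of the rare types roughly uniform, while the leading `if` keeps the common type off the indirect call entirely.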

Jim Dempsey
Shankar1
Beginner
I profiled my code using VTune, and the advice given by the Tuning Assistant reports a huge branch misprediction impact.

But I'm not able to pinpoint whether it is this switch that is causing the branch mispredictions. Is there a way to narrow down which function is causing branch mispredictions using VTune? It was my guess that this switch was the reason.

In my case, since the switch is evaluated on the result of a const function, won't the processor know the result early and predict the correct branch?

TimP
Honored Contributor III
If you have built with debug symbols, you should be able to "drill down" in VTune to determine source code locality of the reported mispredicts.
I doubt that the compiler can use information beyond the order of switch cases and profile guided data collection to optimize this.