FPGA Intellectual Property
PCI Express*, Networking and Connectivity, Memory Interfaces, DSP IP, and Video IP
Intel Support hours are Monday-Fridays, 8am-5pm PST, except Holidays. Thanks to our community members who provide support during our down time or before we get to your questions. We appreciate you!

Need Forum Guidance? Click here
Search our FPGA Knowledge Articles here.
5985 Discussions

Determining the index of the first 1 bit in a vector

Honored Contributor II

Let's say I have a 64b wire, and I want to efficiently determine the index of the first 1 bit. 


For example: 

wire [63:0] sum; 

wire [5:0] index; 


assign sum = 64'b0000000111010110....10101; 


Here I want some hardware that assigns the value 7 to index. 


I'd like to do this in one clock cycle. 


Is there an easy or well-known way to do this? 


Thank you
0 Kudos
4 Replies
Honored Contributor II

You can use a function, like the one below and hope that Quartus synthesizes it efficiently. 

It will produce a rather long combinational path, but I don't think that can be avoided. 


function [5:0] getIndex(input [63:0] data); 

integer i; 

for(i = 63; i >= 0; i = i - 1) begin 

if (data[63-i] == 1'b1) 

getIndex = i; 




assign index = getIndex(sum);
Honored Contributor II

I think the most efficient solution involves dividing your 64bit vector in halves recursively. 

So, ideally you get a minimum combinatorial path log2(64) long.
Honored Contributor II

There'a discussion about priority bit masking and priority encoding implementations in the Altera advanced synthesis cookbook. I previously did some own comparisions. A behavioral description as suggested by rbugalho doesn't result in a very fast implementation. The cookbook suggest an adder to utilize the carry chain which is quite fast. As discussed in a previous thread, Quartus has some problem to find a timing optimal carry chain solution for non-arithmetical problems. http://www.alteraforum.com/forum/showthread.php?t=28798 


With the binary tree solution, you have to re-combine the result of the branches, I doubt, that it can compete with carry chain speed. For very wide inputs, some kind of parallel processing may be reasonable. 


Regarding the speed requirements of the original post, it should work a moderate clock speeds, e.g. 40 - 50 MHz. The carry chain priority mask logic need's about 13 ns for 64 bits.
Honored Contributor II

has anyone tried 3rd party synthesis targeting Altera to see if the carry chains are taken advantage of?