PUF in an FPGA

Altera_Forum · ‎04-26-2008

I am making a circuit that runs a pulse through a large number of multiplexers. I want to create a race condition arbitrated by an RS flip-flop (latch). This basic design is then replicated once for each output bit. This is also called a Physical Unclonable Function (PUF) circuit. The purpose is to create random numbers based on the unique wire delays in different devices.

Now, I have designed the whole thing in Quartus II, and I am using a DE2 FPGA as my device. I am able to program the DE2, but both my timing simulations and the output on the FPGA's 7-segment displays show all 0s. When I choose functional simulation I get all Fs. I wasn't expecting this to work as a simulation, but it should work on the FPGA.

Apparently the same pulse is winning every time on every device. Any idea what I can do to force the FPGA to give me a more variable source of delay?

I hope I have been clear. Sorry for the long post.

Lars

Altera_Forum · ‎04-27-2008

To be honest, I don't yet see the mechanism that would create independent random states in this way. Can you give a one-bit design example that would also allow us to analyze why Quartus doesn't understand your intentions?

P.S.: I understand that the intention is to generate a unique device signature. I think this may be difficult based on FPGA delay variations.

P.P.S.: A classical PUF multiplexer circuit as presented by Daihyun Lim would be completely removed during compilation without keep attributes; see the recent discussion in this forum. Furthermore, an always-zero output simply indicates that the delay variation is smaller than the setup time of the latch used as arbiter. Finally, I fear that the systematic delay variations caused by logic cell and routing topology may be considerably larger than the contribution of random, chip-specific variation, in which case the method would fail. Designing an almost symmetric topology would require means beyond the capabilities of the Quartus user interface. A more symmetric arbiter (or at least a systematic correction for the setup time) should be used anyway.
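To illustrate the point about the arbiter, here is a minimal Python sketch of the race, not a model of the actual DE2 design. Everything in it is an assumption made for illustration: the names `make_device` and `arbiter_bits`, the Gaussian per-stage delays, and the single `arbiter_offset` value that stands in for the latch setup time or any other systematic asymmetry. It only shows that once the fixed offset is larger than the random delay spread, the same path wins for every bit.

```python
import random


def make_device(n_bits, n_stages, sigma, seed):
    """Draw per-stage delays for two racing paths per output bit.

    Each stage has a nominal delay of 1.0 (arbitrary units) plus a random,
    device-specific deviation with standard deviation sigma.
    """
    rng = random.Random(seed)
    return [
        (
            [rng.gauss(1.0, sigma) for _ in range(n_stages)],  # top path
            [rng.gauss(1.0, sigma) for _ in range(n_stages)],  # bottom path
        )
        for _ in range(n_bits)
    ]


def arbiter_bits(device, arbiter_offset=0.0):
    """Return 1 for each bit where the top path wins the race, else 0.

    arbiter_offset is a fixed handicap on the top path (hypothetical stand-in
    for latch setup time or asymmetric routing).
    """
    return [
        1 if sum(top) + arbiter_offset < sum(bottom) else 0
        for top, bottom in device
    ]


device = make_device(n_bits=32, n_stages=64, sigma=0.01, seed=1)
print(arbiter_bits(device))                      # symmetric arbiter
print(arbiter_bits(device, arbiter_offset=5.0))  # offset dominates the spread
```

With these illustrative numbers the random spread of a path difference is roughly 0.1 units, so an offset of 5.0 decides every race the same way, which matches the all-zero output described above.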

Altera_Forum · ‎04-27-2008

Thank you for your reply!

I have also come to the conclusion that a PUF based on wire delays is not suited to the CAD/FPGA environment. I am therefore going to try to implement a Ring Oscillator (RO) PUF, or an SRAM PUF, instead.

For a good explanation, please take a look at Figure 1 of this paper:

Physical Unclonable Functions for Device Authentication and Secret Key Generation (http://videos.dac.com/44th/papers/1_3.pdf)

Figure 2 shows the Oscillator based PUF. Do you think an RO PUF would work better in the environment I'm working under?

Thanks!

Altera_Forum · ‎04-27-2008

That paper was among the literature I consulted to understand the topic. I haven't yet considered the possible problems of realizing ring oscillators in an FPGA. These constructs are beyond my regular horizon of synchronous FPGA designs. I guess that the general situation could be similar to the multiplexer case. I didn't mean that they would be completely unrealizable. Your previous results are probably due to ignoring the usual behaviour of FPGA compilers. To get a logic structure exactly as you designed it, you have to switch off or block all optimization steps of the compiler. You should then be able to realize at least some elements of PUF behaviour.

Altera_Forum · ‎04-27-2008

Do you know how to turn off the optimizations in Quartus II?

Altera_Forum · ‎04-27-2008

See the recent topic "simple" delay chain mystery for a detailed discussion of how to place logic cells intentionally by using a keep synthesis attribute. You should also get familiar with the Quartus Netlist Viewer tools to check the achieved results.

Altera_Forum · ‎04-27-2008

I have tried to alter the optimization options, but to no avail. I am going for the RO PUF; hopefully this will work better. Perhaps you could help me clarify the design of the RO PUF? I have decided to create 64 ROs to output 32 bits. What is the purpose of the counter at the end of the block?

Altera_Forum · ‎04-28-2008

I guess the counter is there to let you know which oscillator is faster. Then you output a one if the first oscillator is faster and a zero if the second one is faster.

The thing is that you need a seed, which I guess is the "input" in Figure 2. Will a 32-bit input then choose two unique oscillators to compare?
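The counter-and-compare idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the 100 MHz nominal frequency, the 1 ms counting window and the fixed pairing of oscillator 0 with 1, 2 with 3, and so on are all hypothetical, and the sketch does not model the input-driven oscillator selection shown in the paper's Figure 2.

```python
import random


def make_ro_frequencies(n_ros, nominal_hz, rel_sigma, seed):
    """Draw one frequency per ring oscillator around a nominal value."""
    rng = random.Random(seed)
    return [nominal_hz * (1.0 + rng.gauss(0.0, rel_sigma)) for _ in range(n_ros)]


def count_edges(freq_hz, window_s):
    """Model a counter clocked by the oscillator for a fixed time window."""
    return int(freq_hz * window_s)


def ro_puf_bits(freqs, window_s):
    """Compare disjoint oscillator pairs: 1 if the first of a pair counted more."""
    bits = []
    for i in range(0, len(freqs) - 1, 2):
        first = count_edges(freqs[i], window_s)
        second = count_edges(freqs[i + 1], window_s)
        bits.append(1 if first > second else 0)
    return bits


# 64 oscillators compared in 32 fixed pairs give a 32-bit response.
freqs = make_ro_frequencies(n_ros=64, nominal_hz=100e6, rel_sigma=0.01, seed=7)
print(ro_puf_bits(freqs, window_s=1e-3))
```

Using each oscillator in only one pair keeps the output bits independent of each other, at the cost of needing two oscillators per bit.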

Altera_Forum · ‎04-28-2008

See the Advanced Synthesis Cookbook available at http://www.altera.com/literature/lit-manual.jsp. Figure 13-1 in "Chapter 13. Random and Pseudorandom Functions" contains a ring oscillator used to produce random numbers. The ring_counter.v file provided with the document uses the "keep" synthesis attribute that FvM mentioned. You might have to turn off "Ignore LCELL Buffers" in the "More Analysis & Synthesis Settings" dialog box.

Altera_Forum · ‎04-28-2008

You should have more on-die variation if you stretch the ring oscillator across the device. Or if you want more variation from one ring oscillator to the next on the same device, place the individual ring oscillators far apart. I would expect that any method that has more single-device on-die variation will also increase the device-to-device variation for devices in the same production lot that might tend to be very similar to each other. Usually it is not a good idea to make assignments to combinational nodes, but with the "keep" synthesis attribute the combinational node names should be repeatable from compile to compile so that the assignments will work. In a LUT-based FPGA (as opposed to a p-term CPLD) you could use either LogicLock regions or LAB location assignments to assign individual combinational nodes to widely separated places on the device. Either kind of assignment is OK to stretch out the individual nodes of a single ring oscillator. Locked-origin LogicLock regions are more convenient to place each ring oscillator in a small area with the ring oscillators spread apart from each other; simply assign an instance of the entire ring oscillator block of hierarchy to each region.

--- Quote Start ---

P.S.: I understand, that the intention is to generate an unique device signature. I think this may be difficult based on FPGA delay variations.

... Finally I fear, that the systematic delay variations caused by logic cell and routing topology may be considerably higher than the contribution of random, chip specific variation. Thus the method would fail.

--- Quote End ---

If you want a fixed unique signature for each device as opposed to simply a random number, then my suggestion to spread out the ring oscillator will make FvM's concern even worse. It would be better to place the nodes as compactly as possible to minimize the on-die variation across the individual device. Even with zero on-die variation affecting this circuit, a number that depends on device-to-device variation will also depend on supply voltage and operating temperature. The delays for a given device will vary over the operating conditions. You still won't get a fixed unique signature for each device even with zero single-device on-die variation.
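The dependence on operating conditions can be illustrated with a small Python sketch. The linear temperature model, the coefficients and the frequencies below are made-up numbers chosen only to show the effect, and `ro_frequency` and `pair_bit` are hypothetical helper names. The point is that a comparison between two oscillators that are nearly equal at one temperature can flip at another, while a well-separated pair stays stable.

```python
def ro_frequency(f0_mhz, temp_coeff_per_c, temp_c, ref_temp_c=25.0):
    """Assumed linear model: frequency drops as temperature rises."""
    return f0_mhz * (1.0 - temp_coeff_per_c * (temp_c - ref_temp_c))


def pair_bit(ro_a, ro_b, temp_c):
    """Return 1 if oscillator A is faster than oscillator B at temp_c."""
    freq_a = ro_frequency(*ro_a, temp_c)
    freq_b = ro_frequency(*ro_b, temp_c)
    return 1 if freq_a > freq_b else 0


# Each oscillator is (frequency at 25 C in MHz, temperature coefficient per C).
close_a, close_b = (100.0, 1.0e-3), (100.1, 1.1e-3)  # nearly equal pair
far_a, far_b = (100.0, 1.0e-3), (105.0, 1.1e-3)      # well-separated pair

for temp_c in (25.0, 85.0):
    print(temp_c, pair_bit(close_a, close_b, temp_c), pair_bit(far_a, far_b, temp_c))
```

In this toy model the nearly equal pair changes its bit between 25 C and 85 C, and the well-separated pair does not.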