I am using the Altera Stratix IV dev kit. The FPGA is in a PCIe slot on a system running Linux. Is there a way (i.e., some Linux commands or some other method) to disconnect/disable the PCI device from the bus so that I can load a new .sof onto the FPGA, then reconnect/re-enable the device after the FPGA has been reprogrammed? I need to find a way to reprogram the FPGA without causing the Linux kernel to panic due to PCI slot issues caused when the FPGA is reprogrammed. For the purpose of this thread, please assume that loading a .sof onto the FPGA is the only way that I can program the FPGA. Also, please assume that I do not have the luxury of rebooting the machine. Because of these constraints, I am ensuring that my PCIe BARs and config space do not change from .sof to .sof. I am considering coming out of retirement and resuming my other career if I can't get this to work. :D
Unlike USB, PCIe insertion/removal isn't handled well, but it is possible. Here's what I've seen done with a vendor X FPGA, which should behave similarly to Altera. Linux has a kernel routine, pci_save_state, which saves the config state of an already configured device. Once the state is saved, you can reconfigure the FPGA, or even unplug it and replug it into the PCIe slot, then use the pci_restore_state routine to restore the config registers, and then call pci_enable_device to re-enable the device. These routines would be called from a kernel module, and you would need a /dev device file to convey ioctl calls from a user app to the kernel module to trigger the save and restore events. I'm afraid I can't show source code, but I suspect that similar code is available, maybe in the Linux Device Drivers book, which O'Reilly allows you to read directly on their web site. This usage model presumes that the FPGA is programmed before Linux boots, so that the kernel does the required memory allocation at boot time (or maybe it uses the BIOS setup). I think that related routines may allow you to trigger setup for a newly inserted device, but we never explored that. \chuck
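The user-app side of the sequence Chuck describes could be sketched roughly as follows. This is only an illustration under stated assumptions: the device node `/dev/fpga_pcie` and the two ioctl request codes are hypothetical placeholders (the real ones would be whatever the custom kernel module defines), and the kernel module that actually calls pci_save_state, pci_restore_state, and pci_enable_device is not shown.

```python
import fcntl
import os

# Hypothetical values: the real device node and request codes are whatever
# the custom kernel module defines. None of these come from the thread.
DEVICE_NODE = "/dev/fpga_pcie"
FPGA_IOC_SAVE_STATE = 0x4601     # module would call pci_save_state()
FPGA_IOC_RESTORE_STATE = 0x4602  # module would call pci_restore_state()
                                 # followed by pci_enable_device()


def reprogram_with_saved_state(reprogram, device_node=DEVICE_NODE,
                               ioctl=fcntl.ioctl):
    """Save PCI config state, run reprogram(), then restore the state.

    `reprogram` is any callable that loads the new FPGA image and returns
    once programming has finished. `ioctl` is injectable so the ordering
    can be exercised without the real kernel module.
    """
    fd = os.open(device_node, os.O_RDWR)
    try:
        ioctl(fd, FPGA_IOC_SAVE_STATE)     # step 1: save config registers
        reprogram()                        # step 2: load the new image
        ioctl(fd, FPGA_IOC_RESTORE_STATE)  # step 3: restore and re-enable
    finally:
        os.close(fd)
```

The `reprogram` callable might, for instance, shell out to whatever programmer tool is in use and block until it exits. Note that if reprogramming raises an exception, the restore step is skipped here, which may or may not be what a real tool would want.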
Chuck, thanks. Is the process that you performed as simple as: (1) pci_save_state (2) reconfig FPGA (3) pci_restore_state? Or is there some other step between pci_save_state and reconfig FPGA? When we tried pci_save_state followed by reconfig FPGA, we still got the same old NMI bringing the server down. Don't we need some way to tell the bus that the device is no longer present before we reprogram? If not, how can we ensure that the undetermined signal values on the FPGA I/Os will not mess up the bus?
For a PCIe device in a PCIe slot, the Linux PCs that we've tried have behaved fine using regular Linux kernels (Debian Lenny and Squeeze); we haven't seen any weird events. There are no interrupt signals on the PCIe connector, so the NMI you are getting must come from the PCIe root complex or bridge and be due to the device disappearing. If you don't have any known drivers for the device, does the problem still happen? We use insmod in a script to add our kernel modules, rather than tying them to the PCIe device/vendor ID. I could imagine that if the device is registered to a driver, then the removal of the device will cause the kernel to try to involve the driver in handling the removal. In our usage, we don't currently need a strong tie between the modules and the device, because our kernel modules only do app-driven kmallocs for DMA usage. \c
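One way to check whether any driver is currently registered to the device is to look for the `driver` symlink that Linux exposes under the device's sysfs directory. A minimal sketch follows; the bus address `0000:03:00.0` in the usage note is a made-up example, and the `sysfs_root` parameter exists only so the function can be tried against a scratch directory.

```python
import os


def bound_driver(bdf, sysfs_root="/sys"):
    """Return the name of the driver bound to a PCI device, or None.

    `bdf` is the device address in domain:bus:device.function form.
    When a driver is bound, sysfs contains a `driver` symlink pointing
    at the driver's directory; when none is bound, the link is absent.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))
```

Calling `bound_driver("0000:03:00.0")` on the target machine (with the address taken from `lspci -D`) would show whether a driver is attached before trying the reprogramming experiment with and without one.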
Chuck, it turns out that the target server doesn't support hot swap, so I don't think it is possible to do what I am looking to do. I will keep your comments in mind for other servers. Thanks.
Hi, I have a Dell Precision T3500 desktop running Red Hat Linux. I want to reboot my Intel board (EP80857) from this desktop by toggling some power lines on the PCIe connector. The target Intel board is connected to the host PC through PCIe slot 3 (via a host bus adapter). Any idea how to do this reboot? Thanks, Ramu
Hi! I am facing the same issue. When I reprogram my Stratix IV 530, the kernel panics about an NMI and reboots. Did you find a software workaround, i.e., is there a way to disconnect/disable the endpoint before reprogramming the FPGA? Also, is this issue server-dependent? That is, are there some servers (or more precisely some motherboards, I think) that simply do not allow this kind of thing? Thx, Julien
Julien, I have been able to live with the issue in my current environment. As such, I have not dug into this any further. Sorry I could not be of more help. I will be sure to update this thread if I ever revisit this issue.
To be clear, the workaround I have used in a test environment is to reprogram the device using the programmer. As soon as the kernel panics, I <Ctrl-C> out of the programmer shell and reboot the machine from the command line. Since the FPGA contents are corrupted, there is no PCIe link on the card to enumerate when the OS boots. Once the reboot is complete, I re-run the programmer. After programming completes, I reboot again so that the PCIe bus will re-enumerate. Clearly this is not a good solution, just a workable one in the environment I am in.
The more normal case is that you reprogram the flash that contains the data and then reboot/power cycle the system to pick up the new FPGA image before Linux has booted. I think we actually load all our FPGA images from U-Boot.
I'm not sure I'd want to debug software if I could only load it into the FPGA image! Even if that was how the real code would get loaded, I think I'd sort out another way of doing it - and it wouldn't involve JTAG if I could do it a different way.