
Options to trigger remote reset of Arria 10 using "magic packet" approach?

munster-shug
Novice

Hi,

We have a rare failure scenario on our deployed Arria 10 based devices in which we lose connectivity back to our remote server and can no longer establish a two-way SSH connection to the device. SSH is our default means of remote access to the device. We have no serial link.

The Arria 10 and HPS still receive packets from our remote server but cannot acknowledge them to establish the link.

We suspect uplink scheduler lockup and are working to confirm this root cause.

Meanwhile, we would like a means to remotely soft reset some or all of the Arria 10 as a recovery mechanism that does not require a forced power cycle.

Is there any built-in capability to enable a "magic packet" reset, so that the FPGA could be reset by sending a defined packet from the remote server?

Any suggestions for the easiest way to implement a remote reset would be much appreciated.

Thanks

 

 

JingyangTeh
Employee

Hi Munster-shug


Sorry for the late reply as there was a public holiday here.


There is no built-in capability for the scenario that you are experiencing.


One way that I can think of is to create a script that runs in the background as a Linux systemd service and checks the connection to your remote server.

https://askubuntu.com/questions/522505/script-to-monitor-internet-connection-stability


To further understand your problem, is the connection to your remote server lost intermittently or permanently?

Is the connection lost only to your remote server, or can the device still ping external sites?


Regards

Jingyang, Teh


munster-shug
Novice

Thanks Teh,

Actually, it is a slightly strange situation.

The remote device is connected via fibre.

We are normally able to SSH into the device. However, for an unknown reason, some devices fail intermittently (the timing varies widely, from 4 to 14 hours) and SSH connectivity is lost.

Pings to the remote device get no reply.

ARP requests sent from the host server over the fibre to the remote device get no response (confirmed in Wireshark logs).

Meanwhile the remote device continues to stream packets back up the uplink fibre.

The SFP transceivers report that the link is up in both directions.

We suspect the MAC scheduler is frozen, and hence we are looking for a way to reset either the MAC scheduler or the FPGA.

We are open to any other suggestions.

 

Thanks

 

 

JingyangTeh
Employee

Hi munster-shug


For a software reset you could write to the register 0xFFD0500C.

You could trigger a cold or a warm reset with the register.

https://www.intel.com/content/www/us/en/programmable/hps/arria-10/hps.html#topic/sfo1429890572275.ht...
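As a rough illustration of the software reset described above, the following Python sketch writes a 32-bit value to a physical address through /dev/mem. Only the address 0xFFD0500C comes from this thread. The bit positions for the cold and warm reset requests are placeholders that must be checked against the Arria 10 HPS reset manager documentation, and the helper names are hypothetical. It assumes root privileges and a kernel that permits /dev/mem access. Note that Python's mmap slice assignment does not guarantee a single 32-bit bus access, so a small C helper may be more appropriate on real hardware.

```python
import mmap
import os
import struct

RSTMGR_CTRL_ADDR = 0xFFD0500C  # register address quoted in this thread

# ASSUMPTION: placeholder bit positions, not taken from the thread.
# Confirm them in the Arria 10 HPS reset manager register map before use.
SW_COLD_RESET_BIT = 0
SW_WARM_RESET_BIT = 1


def write_reg32(addr, value, mem_path="/dev/mem"):
    """Write a 32-bit little-endian value at physical address `addr`."""
    gran = mmap.ALLOCATIONGRANULARITY
    base = addr & ~(gran - 1)        # mmap offsets must be aligned
    offset = addr - base             # position of the register in the mapping
    fd = os.open(mem_path, os.O_RDWR | os.O_SYNC)
    try:
        with mmap.mmap(fd, gran, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE,
                       offset=base) as mem:
            mem[offset:offset + 4] = struct.pack("<I", value)
    finally:
        os.close(fd)


def request_reset(warm=True, mem_path="/dev/mem"):
    """Request a warm (default) or cold reset via the control register."""
    bit = SW_WARM_RESET_BIT if warm else SW_COLD_RESET_BIT
    write_reg32(RSTMGR_CTRL_ADDR, 1 << bit, mem_path)
```

The `mem_path` parameter exists only so the write logic can be exercised against an ordinary file before it is pointed at real hardware.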


For a hardware reset you could toggle "HPS_cold_resetn" and "HPS_WARM_RESETn".


I would suggest a heartbeat system in which the unit sends a status signal to your server and expects a ping back.

If the unit does not get a ping back, it could reset itself through software.
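The heartbeat idea could be sketched in Python roughly as follows. This is only an outline under assumptions: the TCP connect probe, the miss threshold, and the check interval are illustrative choices, and `on_failure` stands for whatever reset action you choose (for example, a write to the reset register mentioned above). The probe should exercise the path that actually fails in the field.

```python
import socket
import time


def server_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HeartbeatWatchdog:
    """Call on_failure() after max_misses consecutive failed probes."""

    def __init__(self, probe, on_failure, max_misses=3):
        self.probe = probe            # callable returning True when healthy
        self.on_failure = on_failure  # hypothetical reset action
        self.max_misses = max_misses
        self.misses = 0

    def tick(self):
        """Run one probe; return True if the failure action was fired."""
        if self.probe():
            self.misses = 0
            return False
        self.misses += 1
        if self.misses >= self.max_misses:
            self.misses = 0           # start counting afresh after firing
            self.on_failure()
            return True
        return False


def run(watchdog, interval_s=60.0):
    """Probe forever at a fixed interval (intended for a systemd service)."""
    while True:
        watchdog.tick()
        time.sleep(interval_s)
```

Requiring several consecutive misses before acting avoids resetting the unit on a single dropped heartbeat.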


One status value that you could include is the "stat register" of the HPS.

https://www.intel.com/content/www/us/en/programmable/hps/arria-10/hps.html#topic/sfo1429890569960.ht...

With this, your server can tell whether a reset has occurred in the unit.


Regards

Jingyang, Teh


JingyangTeh
Employee

Hi


Any update on this case?


Regards

Jingyang, Teh


munster-shug
Novice

Thanks Teh. We were heading in the same direction with the heartbeat signal, but it is really helpful to get pointers to the specific built-in registers and functions. Much appreciated.

We are still keen to find a way of doing it via a magic packet.

We cannot force a register write remotely when we cannot bring up the SSH connection.

JingyangTeh
Employee

Hi Munster-shug


An application running on the device would write to the register once it detects a loss of connection to your server.

Whenever you SSH into the remote system, you could read the status register or a log file stored on the device to check whether a reset has occurred.
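Since the device is reported to still receive packets while it cannot reply, an on-device application could in principle also listen for a "magic packet" and trigger the reset itself. The Python sketch below is one possible shape for this and is not a built-in Arria 10 feature. The packet layout (an 8-byte marker, a timestamp, and an HMAC-SHA256 tag), the UDP port, and the `on_magic` callback are all assumptions made for illustration. It only helps if inbound packets still reach user space on the HPS, and the server must be able to address the device without a working ARP exchange (for example, a static ARP entry or a broadcast). The timestamp check assumes the device clock is roughly correct.

```python
import hashlib
import hmac
import socket
import struct
import time

MAGIC = b"A10RESET"   # hypothetical 8-byte marker
TAG_LEN = 32          # HMAC-SHA256 digest size in bytes
MAX_SKEW_S = 300      # accepted clock difference, limits replay of old packets


def build_packet(key, now=None):
    """Server side: marker + 64-bit timestamp + HMAC over both."""
    ts = int(time.time() if now is None else now)
    body = MAGIC + struct.pack(">Q", ts)
    return body + hmac.new(key, body, hashlib.sha256).digest()


def is_valid_packet(data, key, now=None):
    """Device side: check length, marker, authenticity and freshness."""
    if len(data) != len(MAGIC) + 8 + TAG_LEN or not data.startswith(MAGIC):
        return False
    body, tag = data[:-TAG_LEN], data[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    (ts,) = struct.unpack(">Q", body[len(MAGIC):])
    now = time.time() if now is None else now
    return abs(now - ts) <= MAX_SKEW_S


def listen(key, on_magic, port=9999, bind_addr="0.0.0.0"):
    """Block on a UDP socket and call on_magic() for each valid packet."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_addr, port))
        while True:
            data, _addr = sock.recvfrom(2048)
            if is_valid_packet(data, key):
                on_magic()   # hypothetical reset action
```

Authenticating the packet with a shared key means an arbitrary sender on the network cannot reset the unit simply by guessing the marker.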


Regards

Jingyang, Teh

