
NVMe Hotplugging

JWoeb
Novice
1,198 Views

Hi,

I am currently trying to get NVMe hotplugging / hotswapping to work on my NUC running Ubuntu 18. Hotswapping works by removing the device and rescanning the bus. However, if I power up the NUC without a drive attached, rescanning does nothing, because the PCIe root port ( 0000:01 ) seems to have been powered down either by the BIOS or while the kernel booted.

I am leaning towards this being a BIOS problem, as I see the same behaviour under Windows. Hotswapping works there ( by scanning for hardware changes in the Device Manager ) but hotplugging does not.

If the card is attached, F2 is pressed during boot, and the card is then removed while in the BIOS setup before continuing to boot Windows, it is possible to attach an NVMe drive later. So I think the problem is the BIOS powering down that port.

Is there any way to prevent that PCIe root port from being deactivated? We would like to use the NUC to record and process huge amounts of data, so hotplugging / hotswapping is a must for our application.

Thanks

 

Hotswapping:

 

sudo lspci | grep Samsung
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983

sudo ls /sys/class/nvme
nvme0

** removing NVMe **

sudo lspci | grep Samsung
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 (rev ff)

sudo ls /sys/class/nvme

**reattaching NVMe**

sudo sh -c "echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/remove"
sudo sh -c "echo 1 > /sys/bus/pci/rescan"

sudo lspci | grep Samsung
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983

sudo ls /sys/class/nvme
nvme0
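
The remove-and-rescan sequence above can be wrapped in a small helper. This is only a sketch of the same two sysfs writes: `pci_reattach` and the `PCI_SYSFS` override are hypothetical names introduced here for illustration, and the function assumes it runs as root on a Linux system.

```shell
#!/bin/sh
# Hypothetical helper: remove a PCI function (if it is still listed) and
# rescan the bus. PCI_SYSFS is an illustrative override for dry runs; it
# defaults to the real sysfs location.
PCI_SYSFS="${PCI_SYSFS:-/sys/bus/pci}"

pci_reattach() {
    dev="$1"    # full address including domain, e.g. 0000:01:00.0
    # Drop the stale entry first, as in the transcript above.
    if [ -e "$PCI_SYSFS/devices/$dev/remove" ]; then
        echo 1 > "$PCI_SYSFS/devices/$dev/remove" || return 1
    fi
    # Ask the kernel to re-enumerate all PCI buses.
    echo 1 > "$PCI_SYSFS/rescan"
}
```

Usage, as root: `pci_reattach 0000:01:00.0 && ls /sys/class/nvme`. The helper does not change the cold-boot behaviour described above; it only repeats the manual steps.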

 

 

Edit: It is a NUC8 BKCM8I5CB8N
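
To see whether the port above the drive is still enumerated after a boot without a drive, it may help to look up the NVMe function's parent bridge in sysfs while the drive is attached, and then inspect that bridge on the next boot. A minimal sketch, assuming the usual Linux sysfs layout; `pci_parent_of` and `PCI_SYSFS` are hypothetical names, not part of any existing tool:

```shell
#!/bin/sh
# PCI_SYSFS is an illustrative override; it defaults to the real sysfs path.
PCI_SYSFS="${PCI_SYSFS:-/sys/bus/pci}"

# Print the name of the directory that physically contains the device entry.
# devices/<addr> is a symlink into the device tree, so the parent directory
# is named after the upstream bridge (or pciDDDD:BB for a device that sits
# directly on a root bus).
pci_parent_of() {
    dev="$1"    # full address, e.g. 0000:01:00.0
    ( cd -P "$PCI_SYSFS/devices/$dev" 2>/dev/null && cd .. && basename "$(pwd -P)" )
}
```

With the drive attached, `sudo lspci -vv -s "$(pci_parent_of 0000:01:00.0)"` should then show the bridge's capabilities, including its link and slot status. If that bridge address is missing from `lspci` after a boot without a drive, that would be consistent with the port having been disabled before the kernel enumerated the bus.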

13 Replies
n_scott_pearson
Super User Retired Employee
1,191 Views

AFAIK, NVMe Hotswap/Hotplug is not a supported feature in any NUC model. I could find no hits for this topic in the TPS documentation at all.

Intel Customer Support can bring this to the NUC team as a feature request, but if hardware modifications (for power management, for example) are required, it could only be considered for future products.

...S

Hans_Bausewein
New Contributor I
1,155 Views

Maybe you should consider some virtualization?

With LXD you can remotely (or locally) manage containers, even move them between hardware.

If you have more NUCs you can just move them and shutdown the NUC for maintenance.

The overhead of Linux Containers is quite low, but GPU access is still limited.

Hans

JWoeb
Novice
1,124 Views

Thanks for the tip but we are designing a physical measurement device that records and processes data. Switching out the NUCs is not possible due to them also managing some other devices in addition to recording the data.

JWoeb
Novice
1,116 Views

We had a similar problem with a different platform ( Ubuntu on NVIDIA Tegra TX2 ), where it was possible to unload and reload the PCIe host controller driver to get hotplugging to work. Would something like this be possible with Ubuntu on an Intel-based platform? If so, which driver would be best to unload without completely messing up all the other devices connected via PCIe on a NUC?
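
As a starting point for that question, sysfs shows which kernel driver is currently bound to a given PCI function. The sketch below only reads that information; `pci_driver_of` and `PCI_SYSFS` are hypothetical names. On x86 the root ports are usually bound to the built-in `pcieport` driver, but that is an assumption that would need to be verified on the NUC itself.

```shell
#!/bin/sh
# PCI_SYSFS is an illustrative override; it defaults to the real sysfs path.
PCI_SYSFS="${PCI_SYSFS:-/sys/bus/pci}"

# Print the name of the driver bound to a PCI function.
# Returns non-zero if no driver is bound (no "driver" symlink exists).
pci_driver_of() {
    link="$PCI_SYSFS/devices/$1/driver"
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}
```

For example, `pci_driver_of 0000:01:00.0` should print `nvme` while the drive is attached and working. Running it against the bridge address shows what would actually have to be unbound or reloaded.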

DeividA_Intel
Moderator
1,067 Views

Hello JWoeb,

Thank you for posting on the Intel® communities.

I would like to let you know that the Intel® NUC 8 Compute Element CM8i5CB is not designed with M.2 hot-plugging capabilities. Bear in mind that the unit does not provide easy access to the storage device, so it cannot be hot-plugged safely.


Bear in mind that the device must have the following features:


1. Protect the unit against electric shocks that may occur from connecting two charged devices.

2. The device may have some sort of shield or protection to keep components from generating static.

3. A mechanism must be in place in the operating system and the device to recognize the removal or addition of a device. It is something that Intel does not support.


A recommendation would be to look at other devices that you can connect to the NUC via Thunderbolt or USB that offer external storage and they may also include hotplugging features.



Regards,

Deivid A.

Intel Customer Support Technician


JWoeb
Novice
1,047 Views

Thanks @DeividA_Intel for your detailed response. Sorry, I forgot to mention it, but the electrical problems are not an issue for us because we are using a ToughArmor MB840M2P-B removable SSD rack. Hotswapping generally seems to work, so I think our main problem is that the PCIe root port is switched off by the BIOS. Is there any way to circumvent this, such as a hidden BIOS option or a special BIOS version?

Do you think it would be possible to "trick" the BIOS into not switching the PCIe port off somehow? If a PCIe Switch is connected to the port would that prevent the port from being switched off?

powerarmour
Valued Contributor I
1,040 Views

@JWoeb Be careful going down this road. The M.2 connection itself is very delicate: it's only officially rated for ~60 mating cycles (whereas SATA is 10k+), so mechanically it's just not built to be a hot swap solution.

They are really designed to just be installed a few times at most and left in situ tbh.

JWoeb
Novice
1,018 Views

Thanks for mentioning that. That will certainly cause some issues for us if the durability of the M.2 connection is that low. This could even short out some traces if one detaches from the PCB and cause some additional damage, couldn't it?

This does make the ToughArmor MB840M2P-B seem a little bit pointless though if M.2 connectors are actually that fragile in practice. Will try to do some testing.

 

Kind regards,

Johannes

powerarmour
Valued Contributor I
1,011 Views

@JWoeb U.2 (M.2 with a cable) is the 'official' NVMe external solution, but that's generally enterprise only gear at the moment.

But as mentioned elsewhere, you'd ideally want to look at a Thunderbolt NAS. You'd then be able to hot swap on that device itself.

JWoeb
Novice
990 Views

Thanks again. A Thunderbolt 3 NAS sounds nice, but our problem is that we need to record data at a rate > 10 Gb/s and the setup has to be somewhat rugged so that it can be used in an airborne environment, which limits our options. The ideal would be a NUC with a Thunderbolt 3 interface in a board-only form. I only managed to find the desktop PC versions of NUCs with TB3.

n_scott_pearson
Super User Retired Employee
981 Views

It is typically only the NUC Pro series that includes board products. The existing Provo Canyon (NUC 8 Pro) has TBT3 support and the forthcoming Tiger Canyon (NUC 11 Pro) will have dual TBT4 support.

The part numbers for the NUC 8 Pro are NUC8v7PNB (Core i7 vPro), NUC8v5PNB (Core i5 vPro) and NUC8i3PNB (Core i3). SimplyNUC can offer you this board in a fanless (PorCoolPine, yes I am rolling my eyes) chassis. See https://simplynuc.com/provo-canyon and https://simplynuc.com/provo-canyon-porcoolpine for more information.

The part numbers for the NUC 11 Pro are NUC11TNBi7, NUC11TNBi5 and NUC11TNBi3. I expect SimplyNUC will offer fanless chassis versions of these as well.

Hope this helps,

...S

JWoeb
Novice
892 Views

Thank you @n_scott_pearson for looking those up. Those sound like a good backup option in case we can't get it to work with M.2 directly.

JWoeb
Novice
832 Views

Hi,

One way we tried to work around this issue is to use a PCIe riser card ( https://www.delock.com/produkt/41433/pdf.html?sprache=en ). If this card is attached, the PCIe port is not switched off by the BIOS and can be re-enumerated. However, the drive cannot be mounted, because the nvme driver does not seem to bind to it.

** start without CFExpress attached

** attach CFExpress

sudo lspci | grep NVMe
 
sudo sh -c "echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/remove"
sudo sh -c "echo 1 > /sys/bus/pci/rescan"

sudo lspci | grep NVMe
03:00.0 Non-Volatile memory controller: Phison Electronics Corporation Device 5008 (rev 01)

sudo ls /sys/class/nvme

 

Edit: dmesg shows the error "Removing after probe failure". I tried setting the GRUB kernel options "pcie_aspm=off", "nouveau.modeset=0" and "nvme_core.default_ps_max_latency_us=0" without success.

 

nvme nvme0: pci function 0000:03:00.0
nvme 0000:03:00.0 enabling device ( 0000 -> 0002 )
nvme nvme0: Removing after probe failure status: -19
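
For repeated tests it can be handy to pull these failures out of the kernel log automatically. The sketch below is a plain text filter over log lines in the format shown above; `nvme_probe_failures` is a hypothetical name. For reference, status -19 corresponds to ENODEV ("No such device") in the Linux errno table.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "<controller> <status>" for every
# "Removing after probe failure" line, e.g. "nvme0 -19".
nvme_probe_failures() {
    sed -n 's/.*nvme \(nvme[0-9]*\): Removing after probe failure status: \(-\{0,1\}[0-9]*\).*/\1 \2/p'
}
```

Usage: `sudo dmesg | nvme_probe_failures`. An empty result after a rescan means the log contains no probe failure in this format; it does not by itself show that the drive came up.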