Beginner

IOMMU/VT-d initialization failure on CentOS Linux

Hi,

 

I'm trying to enable the IOMMU/VT-d on an Intel server (motherboard: S2600BP, CPU: Xeon Gold 6252). However, initialization fails to complete, and the kernel complains:

 

[ 1.283650] DMAR: Device scope type does not match for 0000:d7:00.0

The device in question is a root port:

d7:00.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port A (rev 07) (prog-if 00 [Normal decode])

The kernel seems to be complaining because the ACPI table does not agree with PCI config space (which reports header type 1, i.e. a PCI-to-PCI bridge). I see the following entries in the ACPI DMAR table with a scope that refers to this root port:

 

[0D8h 0216 2] Subtable Type : 0000 [Hardware Unit Definition]
[0DAh 0218 2] Length : 0020
[0DCh 0220 1] Flags : 00
[0DDh 0221 1] Reserved : 00
[0DEh 0222 2] PCI Segment Number : 0000
[0E0h 0224 8] Register Base Address : 00000000FBFFC000
...
[0F0h 0240 1] Device Scope Type : 02 [PCI Bridge Device]
[0F1h 0241 1] Entry Length : 08
[0F2h 0242 2] Reserved : 0000
[0F4h 0244 1] Enumeration ID : 00
[0F5h 0245 1] PCI Bus Number : D7
[0F6h 0246 2] PCI Path : 00,00
...
[198h 0408 2] Subtable Type : 0001 [Reserved Memory Region]
[19Ah 0410 2] Length : 0020
[19Ch 0412 2] Reserved : 0000
[19Eh 0414 2] PCI Segment Number : 0000
[1A0h 0416 8] Base Address : 0000000052CC8000
[1A8h 0424 8] End Address (limit) : 000000005ACCFFFF
[1B0h 0432 1] Device Scope Type : 01 [PCI Endpoint Device]
[1B1h 0433 1] Entry Length : 08
[1B2h 0434 2] Reserved : 0000
[1B4h 0436 1] Enumeration ID : 00
[1B5h 0437 1] PCI Bus Number : D7
[1B6h 0438 2] PCI Path : 00,00
...
[1D0h 0464 2] Subtable Type : 0002 [Root Port ATS Capability]
[1D2h 0466 2] Length : 0030
[1D4h 0468 1] Flags : 00
[1D5h 0469 1] Reserved : 00
[1D6h 0470 2] PCI Segment Number : 0000
...
[1F8h 0504 1] Device Scope Type : 02 [PCI Bridge Device]
[1F9h 0505 1] Entry Length : 08
[1FAh 0506 2] Reserved : 0000
[1FCh 0508 1] Enumeration ID : 00
[1FDh 0509 1] PCI Bus Number : D7
[1FEh 0510 2] PCI Path : 00,00

The RMRR entry identifies the device as "Endpoint" rather than "Bridge", which seems to be the source of the issue. What might cause this RMRR to be generated? Is there a BIOS option set incorrectly? Or is there perhaps another issue I've missed?
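
To make the suspected mismatch concrete, here is a minimal Python sketch of the kind of consistency check the kernel appears to apply to each device scope. It is a paraphrase based on my reading of the kernel's DMAR code, not a copy of it; the function name scope_type_matches and the constant names are my own, and the exact kernel logic may differ between versions.

```python
# Header-type values from PCI config space (low 7 bits of offset 0x0E).
PCI_HEADER_TYPE_NORMAL = 0x00  # type 0: endpoint
PCI_HEADER_TYPE_BRIDGE = 0x01  # type 1: PCI-to-PCI bridge
PCI_BASE_CLASS_BRIDGE = 0x06   # base class code for bridge devices

# Device scope types as they appear in the DMAR dump above.
SCOPE_ENDPOINT = 0x01
SCOPE_BRIDGE = 0x02


def scope_type_matches(scope_type: int, header_type: int, base_class: int) -> bool:
    """Return True if a DMAR scope type is consistent with the PCI device.

    Hypothetical helper: a paraphrase of the check, assumed rather than
    copied from any particular kernel release.
    """
    header_type &= 0x7F  # bit 7 is only the multi-function flag
    if scope_type == SCOPE_ENDPOINT:
        # An "endpoint" scope must point at a type 0 device.
        return header_type == PCI_HEADER_TYPE_NORMAL
    if scope_type == SCOPE_BRIDGE:
        # A "bridge" scope must point at a non-type-0 device, or at a
        # type 0 device whose class code still says "bridge".
        return (header_type != PCI_HEADER_TYPE_NORMAL
                or base_class == PCI_BASE_CLASS_BRIDGE)
    return True  # other scope types (IOAPIC, HPET, ...) are not modelled here


# The root port d7:00.0 is a type 1 device with a bridge class code.
print(scope_type_matches(SCOPE_BRIDGE, PCI_HEADER_TYPE_BRIDGE, 0x06))
print(scope_type_matches(SCOPE_ENDPOINT, PCI_HEADER_TYPE_BRIDGE, 0x06))
```

Under these assumptions the DRHD and ATSR scopes (type 02) pass for d7:00.0, while the RMRR scope (type 01) fails, which is consistent with the "Device scope type does not match" message. On a running system the header type can be read from byte 0x0E of /sys/bus/pci/devices/0000:d7:00.0/config.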

 

Thanks,

Eric

13 Replies
Moderator

Hello EBadg1,

 

Thank you for joining the community.

 

Even though this looks like an OS kernel issue, the first thing to check is that Intel Virtualization Technology is enabled under BIOS > Advanced > CPU Configuration. If it is, please let us know so we can research further.

 

Regards

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel

Beginner

Attachments: INTELVT.png, INTELVT2.png, INTELVT3.png

Hi Jose,

 

 

Intel VT is enabled, as is VT-d, and MMIO is set to 1024.

SR-IOV is also enabled.

 

 

Regards,

Donatien

 

Moderator

Hello DLEB,

 

Could you please tell us which version of CentOS you installed? Also, do you know whether the BIOS VT settings were enabled before the OS was installed or afterwards?

 

By any chance, have you looked on GitHub for any related issues?

 

Regards

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel

Beginner

This is CentOS 7. VT-d was enabled after the OS was installed. I'm not sure why I would look at GitHub specifically (what project?), but yes, I've searched for related issues and failed to find a solution.

 

Eric

Moderator

Hello EBadg1,

 

We suspect the BIOS VT settings need to be enabled before the OS installation, so that the VT hardware environment is ready and discoverable by the OS. I think it is worth trying a reinstall of the OS. As for GitHub, I was just wondering, so don't worry about it.

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel

 

Moderator

Hello EBadg1,

 

Do you have any further details, updates, questions or comments regarding this issue? This thread will be marked as resolved automatically within the next 72 hours if there is no activity.

 

Regards

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel

Beginner

Hi Jose,

 

 

We reinstalled the OS with VT-d enabled. The behaviour is exactly the same.

On a Skylake node, initialization completes without error at startup.

 

Can we assume that the issue is specific to Cascade Lake? Or is it related to the motherboard (S2600BP)?

 

 

Regards,

Donatien

Moderator

Hello DLEB,

 

Thanks for the update. Let me escalate this question, and I will get back to you as soon as we have more information.

 

Regards

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel

 

Moderator

Hello all,

 

I didn't get much from engineering, mainly because this seems to be an OS-related issue rather than a server or hardware issue. We got confirmation that both the Skylake and Cascade Lake platforms are virtualization capable, so as long as you enable VT-d prior to the OS install it should be OK. By any chance, have you tried another OS, either another Linux distribution or even Windows Hyper-V, just to rule out hardware issues?

 

Regards

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel

 

 

Beginner

Hi Jose,

 

Perhaps worth mentioning: we're not trying to run virtual machines. The IOMMU is being used entirely by the host OS.

 

Is there anyone who can read the ACPI DMAR table entries I posted initially and comment on them? Having looked at other servers (including other Cascade Lakes), I never see this mismatch in the ACPI tables (which are generated by the BIOS).
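
For comparing servers, the mismatch can be looked for mechanically. The following is a minimal Python sketch that walks a raw DMAR table and reports any bus/path that is listed as both an endpoint (01) and a bridge (02). The function names device_scopes and conflicting_scopes are hypothetical, and the byte offsets are assumptions taken from the dump in my first post and my reading of the DMAR layout; it ignores the PCI segment number and compares the start bus and path literally.

```python
import struct
from collections import defaultdict

# Assumed layout: 36-byte ACPI header, then width, flags and 10 reserved bytes.
DMAR_HEADER_LEN = 48
# Bytes that precede the device scope list in each subtable type (assumed):
# 0 = Hardware Unit Definition, 1 = Reserved Memory Region, 2 = Root Port ATS.
SCOPE_OFFSET = {0: 16, 1: 24, 2: 8}


def device_scopes(table: bytes):
    """Yield (subtable_type, scope_type, bus, path) for each device scope."""
    pos = DMAR_HEADER_LEN
    while pos + 4 <= len(table):
        sub_type, sub_len = struct.unpack_from("<HH", table, pos)
        if sub_len < 4:
            break  # malformed length; stop rather than loop forever
        end = min(pos + sub_len, len(table))
        if sub_type in SCOPE_OFFSET:
            s = pos + SCOPE_OFFSET[sub_type]
            while s + 6 <= end:
                scope_type, scope_len = table[s], table[s + 1]
                if scope_len < 6:
                    break
                bus = table[s + 5]
                raw = table[s + 6:s + scope_len]
                # The path is a sequence of (device, function) byte pairs.
                path = tuple((raw[i], raw[i + 1]) for i in range(0, len(raw) - 1, 2))
                yield sub_type, scope_type, bus, path
                s += scope_len
        pos += sub_len


def conflicting_scopes(table: bytes):
    """Map (bus, path) to its scope types when both 01 and 02 appear."""
    seen = defaultdict(set)
    for _sub_type, scope_type, bus, path in device_scopes(table):
        if scope_type in (1, 2):  # endpoint or bridge
            seen[(bus, path)].add(scope_type)
    return {dev: types for dev, types in seen.items() if len(types) > 1}
```

On Linux the raw table can usually be read as root from /sys/firmware/acpi/tables/DMAR and passed to conflicting_scopes() as bytes. On the failing server I would expect an entry for bus D7, path 00,00; on the servers that work I would expect an empty result.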

 

Eric

Moderator

Hello EBadg1,

 

One option would be to post a question in the CentOS community. Are the other Cascade Lake servers identical to the one exhibiting this behavior? What BIOS version is this particular server running? Has it been updated? Is it possible to replicate the exact hardware resources and BIOS parameters used by one of the working servers?

 

I will wait for your updates.

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel

Moderator

Hello EBadg1,

 

I am just following up to double check if you were able to gather the requested information. Otherwise let us know if you require more time to accomplish this. This support interaction will be marked as resolved automatically in the next 72 hours if no activity is received.

 

Regards

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel

Moderator

Hello EBadg1,

 

We will proceed to mark this thread as resolved. If you have further issues or questions just go ahead and create a new topic.

 

Jose A.

Intel Customer Support Technician

A Contingent Worker at Intel
