
How to ensure that SCIF is configured correctly and performing as expected?

drMikeT
New Contributor I
686 Views

Environment: MPSS 3.2.3, RHEL 6u4, FDR Mellanox fabric.

How can we check that the SCIF interface is configured correctly and that it performs as expected?

Thanks

Michael

5 Replies
Frances_R_Intel
Employee

Are you particularly concerned about SCIF functioning as an InfiniBand interface? This blog shows what to expect from ibv_devinfo and ibstatus for SCIF. In general, you can use standard Linux commands (such as ifconfig and ip) to check the status of the network interface to the card.

To check the underlying status of SCIF itself, read /sys/class/mic/micN/scif_status, which shows whether SCIF is online for a given coprocessor N. Beyond that, I think the best indication that SCIF is OK is that the things built on top of it, such as the network interface and the offload library (COI), are working.
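
For anyone who wants to script the scif_status check, here is a minimal sketch. The helper name check_scif_status and its optional directory argument are my own assumptions (the argument is only there so the function can be pointed at a test directory); the only thing taken from the reply above is the /sys/class/mic/micN/scif_status path and the "online" value.

```shell
# Sketch: report scif_status for every coprocessor found under a sysfs root.
# check_scif_status is a hypothetical helper, not part of MPSS.
check_scif_status() {
    root="${1:-/sys/class/mic}"
    rc=0
    found=0
    for f in "$root"/mic*/scif_status; do
        # If the glob matched nothing, the literal pattern is left behind; skip it.
        [ -r "$f" ] || continue
        found=1
        card=$(basename "$(dirname "$f")")
        status=$(cat "$f")
        echo "$card: $status"
        # Any value other than "online" makes the overall check fail.
        [ "$status" = "online" ] || rc=1
    done
    if [ "$found" -eq 0 ]; then
        echo "no scif_status files found under $root" >&2
        return 2
    fi
    return "$rc"
}
```

Calling check_scif_status with no argument on the host prints one line per card and returns non-zero if any card is not online, so it can be dropped into a node health check.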

drMikeT
New Contributor I

Hello Frances, 

thanks for the reply.

I can see the scif0 devices on the host and on the Phis (see below). Can I then assume that the SCIF subsystem is working properly?

HOST # cat /sys/class/mic/mic?/scif_status 
online
online

HOST # ibv_devices
    device                 node GUID
    ------              ----------------
    scif0               4c79bafffe300521
    mlx4_0              24be05ffff918e50

## ssh $(hostname)-mic1 ibv_devices
    device                 node GUID
    ------              ----------------
    mlx4_0              24be05ffff918e50
    scif0               4c79bafffe18127e

## ssh $(hostname)-mic0 ibv_devices
    device                 node GUID
    ------              ----------------
    mlx4_0              24be05ffff918e50
    scif0               4c79bafffe300520

 

regards

Michael

Frances_R_Intel
Employee

Yes. For practical purposes, if SCIF comes up, OFED comes up, and the IB interfaces are correct, then SCIF is working properly.
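
If it helps, the "is the scif0 device there" part of that check can be automated by parsing ibv_devices output. This is only a sketch: has_ib_device is a hypothetical helper name, and it assumes the two-column layout shown in the output posted earlier in this thread.

```shell
# has_ib_device NAME: read ibv_devices-style output on stdin and succeed
# if NAME appears in the first (device) column. Hypothetical helper.
has_ib_device() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit (found ? 0 : 1) }'
}
```

On the host this could be used as `ibv_devices | has_ib_device scif0`, and for a card as `ssh $(hostname)-mic0 ibv_devices | has_ib_device scif0`. It only confirms that the device is listed, not that traffic over it performs well.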

Are you asking this question because you have reason to believe things are not ok?

drMikeT
New Contributor I

I just wanted to make sure that SCIF is in good shape. I ran strace on micctrl with the --adduser option and noticed that when the cards are online, it keeps polling a file descriptor on a SCIF device, and progress is so slow that the command effectively stalls. We have an open service call with Premier about this micctrl --adduser failure (while the cards are online). I guess in our case SCIF is OK, and the culprit is the user-level software using the SCIF devices?

Frances_R_Intel
Employee

My recommendation is to always stop the mpss service before modifying anything in the configuration. Actually, I'm surprised micctrl didn't complain. Of course, the folks answering the questions in Premier will know better than I do and may have a different response.
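
As a rough sketch of that stop, modify, restart sequence: the wrapper name with_mpss_stopped and the SERVICE_CMD override are my own inventions for illustration, and it assumes the stock `service mpss stop` / `service mpss start` commands on RHEL 6. It runs whatever configuration command you pass it; I am not assuming anything about the exact micctrl arguments.

```shell
# with_mpss_stopped CMD [ARGS...]: stop mpss, run the command, start mpss again.
# SERVICE_CMD is a hypothetical override for dry runs; it defaults to the
# `service` command used on RHEL 6.
with_mpss_stopped() {
    svc="${SERVICE_CMD:-service}"
    $svc mpss stop || return 1
    rc=0
    # Run the caller's configuration command and remember its exit status.
    "$@" || rc=$?
    # Restart the service even if the configuration command failed.
    $svc mpss start || return 1
    return "$rc"
}
```

You would run it as root with your usual configuration command as the arguments. Setting SERVICE_CMD=echo first prints the service calls instead of executing them, which is a harmless way to check the ordering.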

Is there some reason you want to add users to the coprocessors while they are running? 
