
SCIF_FENCE_INIT_SELF vs SCIF_FENCE_RAS_SELF

Mark_L_1
Beginner
410 Views

The SCIF PDF mentions SCIF_FENCE_RAS_SELF, while the scif.h header file mentions SCIF_FENCE_INIT_SELF, as an option for the flags argument of scif_fence_mark( .. int flags ...).

Which one is it?

As a follow-up question: if the answer is SCIF_FENCE_INIT_SELF, then what does the flag SCIF_FENCE_RAS_SELF do? What function does it call?

-Mark

2 Replies
Mark_L_1
Beginner
410 Views

Let me clarify my question. The SCIF PDF documentation disagrees with the SCIF header file on the allowable options to scif_fence_mark. I need to know which documentation to trust.

Evan_P_Intel
Employee
410 Views

Mark L. wrote:

Let me clarify my question. The SCIF PDF documentation disagrees with the SCIF header file on the allowable options to scif_fence_mark. I need to know which documentation to trust.

Trust the documentation within the SCIF header file over other sources.
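For illustration, here is a minimal sketch of marking and waiting on locally initiated RMAs with the flag the header documents. It assumes the user-space signatures scif_fence_mark(epd, flags, &mark) and scif_fence_wait(epd, mark) as declared in scif.h. The helper name wait_for_local_rmas is hypothetical, and the stand-in declarations in the #ifndef branch are assumptions that exist only so the sketch compiles on a machine without SCIF installed; they are not the real library.

```c
#include <stdio.h>

/* Use the real header when it is available. */
#if defined(__has_include)
#  if __has_include(<scif.h>)
#    include <scif.h>
#    define HAVE_SCIF_H 1
#  endif
#endif

#ifndef HAVE_SCIF_H
/* Stand-ins (assumptions, NOT the real library) so the sketch
 * compiles without SCIF. They only mimic success and failure. */
typedef int scif_epd_t;
#define SCIF_FENCE_INIT_SELF 1

static int scif_fence_mark(scif_epd_t epd, int flags, int *mark)
{
    (void)epd;
    if (flags != SCIF_FENCE_INIT_SELF)
        return -1;          /* stand-in accepts only this flag */
    *mark = 1;              /* dummy mark value */
    return 0;
}

static int scif_fence_wait(scif_epd_t epd, int mark)
{
    (void)epd;
    (void)mark;
    return 0;               /* pretend the marked RMAs completed */
}
#endif

/* Hypothetical helper: mark the RMAs initiated through the local
 * endpoint so far, then block until they have completed.
 * Returns 0 on success, -1 on failure. */
int wait_for_local_rmas(scif_epd_t epd)
{
    int mark = 0;

    if (scif_fence_mark(epd, SCIF_FENCE_INIT_SELF, &mark) < 0) {
        perror("scif_fence_mark");
        return -1;
    }
    if (scif_fence_wait(epd, mark) < 0) {
        perror("scif_fence_wait");
        return -1;
    }
    return 0;
}
```

With the real library, epd would be a connected endpoint and the call would follow one or more RMA operations (for example scif_writeto) issued on that endpoint.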

My original design proposal for the SCIF API included the ability to fence RMA transfers with extremely fine granularity; for example, it was possible to "mark" partial transfers too, and the fence APIs took an offset and a length as arguments for this purpose. The flags SCIF_FENCE_RAS_SELF and SCIF_FENCE_RAS_PEER indicated whether these arguments should be interpreted relative to the registered address space of the indicated SCIF endpoint or of its connected peer. The "RAS" in their names is an acronym for "registered address space", a term introduced in section 3.5 of the SCIF User Guide.

After gaining some implementation experience and interacting with internal development teams, however, it became obvious that this functionality was both difficult to extract performance benefits from and challenging for application developers to reason about. It was ultimately removed.

I imagine that these flags were overlooked and survived in the documentation because they appear related to the SCIF_RAS_PORT_n constants elsewhere in the header file, where "RAS" instead refers to reliability, availability, and serviceability, an umbrella term covering, among other things, the SCIF "daemon" which uses those ports to communicate ECC failures etc. to the host computer.

I'll pass a note about this thread to the author of the SCIF User Guide.
