scif offset question

Vladimir_Dergachev

I have a question about assignment of offsets by the scif library.

Some background: due to poor performance of file transfer over NFS (http://software.intel.com/en-us/forums/topic/404743#comment-1746053), I have written a daemon that provides NFS-like services over the SCIF interface. The daemon runs fast and reliably, easily achieving speeds in excess of 100MB/sec.

On startup, the daemon first listens on a SCIF endpoint for incoming connections. After a successful accept, the daemon forks. The main process closes the accepted endpoint and continues listening.

The child process allocates receive/transmit buffers and calls scif_register. The function succeeds, but the physical offsets returned are always the same, even when several simultaneous connections to different nodes are in progress.

Do I understand correctly that the physical offset is specific to the computer and to the particular instance of the process? Why do I receive the same offsets even though several different processes make the call on different memory (all allocated using calloc() after fork())?

Here is the init_buffers function I use:

[cpp]

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <scif.h>

#define SI_PAGE_SIZE 4096

/* Zeroed buffer starting on a page boundary. The pointer is rounded UP so the window
 * stays inside the calloc() block; the raw pointer is not kept, so do not free() the result. */
static char *alloc_window(size_t size)
{
    char *raw=calloc(size+SI_PAGE_SIZE, 1);
    if(raw==NULL)return NULL;
    return (char *)(((uintptr_t)raw+SI_PAGE_SIZE-1) & ~(uintptr_t)(SI_PAGE_SIZE-1));
}

static void init_buffers(SCIF_IO_CONTEXT *ctx)
{
    fprintf(stderr, "Initializing transmit/receive buffers\n");
    ctx->receive_window_size=10*1024*1024;
    ctx->receive_window_free=0;
    ctx->receive_window=alloc_window(ctx->receive_window_size);

    ctx->transmit_window_size=10*1024*1024;
    ctx->transmit_window_free=0;
    ctx->transmit_window=alloc_window(ctx->transmit_window_size);
    fprintf(stderr, "Receive =0x%016llx Transmit =0x%016llx\n", (unsigned long long)(uintptr_t)ctx->receive_window, (unsigned long long)(uintptr_t)ctx->transmit_window);

    /* Offset argument 0 without SCIF_MAP_FIXED: the library picks the registered offset. */
    ctx->receive_window_po=scif_register(ctx->epd, ctx->receive_window, ctx->receive_window_size, 0, SCIF_PROT_WRITE, 0);
    if(ctx->receive_window_po==-1)perror("receive_window");
    ctx->transmit_window_po=scif_register(ctx->epd, ctx->transmit_window, ctx->transmit_window_size, 0, SCIF_PROT_READ, 0);
    if(ctx->transmit_window_po==-1)perror("transmit_window");
    fprintf(stderr, "Receive PO=0x%016llx Transmit PO=0x%016llx\n", (unsigned long long)ctx->receive_window_po, (unsigned long long)ctx->transmit_window_po);
}

[/cpp]
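
A note on the alignment step, since scif_register expects a page-aligned address: masking a calloc() pointer with ~(SI_PAGE_SIZE-1) rounds it down, which can land before the start of the allocation, so the pointer has to be rounded up instead (or the allocator asked for aligned memory directly). The following is only a minimal sketch of both ideas using plain POSIX calls; alloc_page_aligned, round_down_to_page and round_up_to_page are hypothetical helper names, not part of SCIF.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SI_PAGE_SIZE 4096

/* Round an address down to the previous page boundary (what "& ~(PAGE-1)" does). */
static uintptr_t round_down_to_page(uintptr_t addr)
{
    return addr & ~(uintptr_t)(SI_PAGE_SIZE - 1);
}

/* Round an address up to the next page boundary; aligned addresses are unchanged. */
static uintptr_t round_up_to_page(uintptr_t addr)
{
    return (addr + SI_PAGE_SIZE - 1) & ~(uintptr_t)(SI_PAGE_SIZE - 1);
}

/* Hypothetical helper: zeroed, page-aligned buffer of at least `size` bytes.
 * The size is rounded up to whole pages. Returns NULL on failure.
 * Unlike a manually adjusted calloc() pointer, the result can be passed to free(). */
static void *alloc_page_aligned(size_t size)
{
    void *p = NULL;
    size_t rounded = (size + SI_PAGE_SIZE - 1) & ~(size_t)(SI_PAGE_SIZE - 1);

    /* The alignment must be a power of two for the masks above to be valid. */
    assert((SI_PAGE_SIZE & (SI_PAGE_SIZE - 1)) == 0);

    if (rounded == 0 || rounded < size)
        return NULL;                      /* zero-sized request or overflow */
    if (posix_memalign(&p, SI_PAGE_SIZE, rounded) != 0)
        return NULL;
    memset(p, 0, rounded);                /* match calloc()'s zero fill */
    return p;
}
```

With a helper like this, the receive and transmit windows could be allocated as alloc_page_aligned(10*1024*1024) and released later with free().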

Here is an example log from the daemon:

Aug 29 17:48:21 ypsilon1 scif_daemon: Connection to node 4 ended
Aug 29 17:48:22 ypsilon1 scif_daemon: Connection to node 2 ended
Aug 29 17:48:23 ypsilon1 scif_daemon: Connection to node 1 ended
Aug 29 17:48:32 ypsilon1 scif_daemon: accepted connection request from node:1 port:1089 user=1023
Aug 29 17:48:32 ypsilon1 scif_daemon: Connection to node 1: allocated buffers tx=0x4000000000a00000 rx=0x4000000000000000
Aug 29 17:48:32 ypsilon1 scif_daemon: accepted connection request from node:2 port:1089 user=1023
Aug 29 17:48:32 ypsilon1 scif_daemon: Connection to node 2: allocated buffers tx=0x4000000000a00000 rx=0x4000000000000000
Aug 29 17:48:32 ypsilon1 scif_daemon: accepted connection request from node:4 port:1089 user=1023
Aug 29 17:48:32 ypsilon1 scif_daemon: Connection to node 4: allocated buffers tx=0x4000000000a00000 rx=0x4000000000000000

As you can see, the physical offsets are all the same, even though scif_register was called from different processes.

2 Replies
Vladimir_Dergachev

I double-checked, and I also get the same physical offset on the Xeon Phi if I start multiple instances of the same application simultaneously.

Vladimir_Dergachev

After reading the driver source, it appears that the combination (endpoint, physical offset) identifies the actual aperture, and the "physical offset" is really a virtual quantity. Thus there is nothing wrong with getting the same physical offsets as long as the endpoints are different.
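
To illustrate the idea, here is a toy model in plain C. It is not the driver's code or its actual allocation algorithm; it only assumes that every endpoint has its own private offset space that starts at the same base (the value 0x4000000000000000 is taken from the log above) and hands out offsets sequentially. All toy_* names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed starting offset, copied from the first offset seen in the log. */
#define TOY_BASE 0x4000000000000000ULL
#define TOY_PAGE 4096

/* Each endpoint carries its own, independent offset space. */
typedef struct {
    int id;
    uint64_t next_offset;
} toy_endpoint;

/* A registered window is named by the pair (endpoint, offset), not by the offset alone. */
typedef struct {
    int endpoint_id;
    uint64_t offset;
} toy_window_key;

static void toy_endpoint_init(toy_endpoint *ep, int id)
{
    ep->id = id;
    ep->next_offset = TOY_BASE;
}

/* Hand out the next free offset in this endpoint's private space. */
static uint64_t toy_register(toy_endpoint *ep, size_t len)
{
    uint64_t off = ep->next_offset;

    assert(len % TOY_PAGE == 0);   /* windows are whole pages in this model */
    ep->next_offset += len;
    return off;
}

/* Two keys name the same window only if BOTH fields match. */
static int toy_same_window(toy_window_key a, toy_window_key b)
{
    return a.endpoint_id == b.endpoint_id && a.offset == b.offset;
}
```

In this model, registering a 10 MiB receive window and then a 10 MiB transmit window on two separate endpoints gives 0x4000000000000000 and 0x4000000000a00000 on both of them, matching the log, while the (endpoint, offset) keys still differ and so refer to different windows.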
