I've noticed that with the H.264 encoder Media Foundation sample (mostly implemented in the following two files: mf_venc_plg.cpp and mfx_mft_h264ve.cpp), the sample does not use video/GPU memory for either IOPATTERN_IN or IOPATTERN_OUT, even if the hardware implementation is being used. The m_uiInSurfacesMemType class variable starts off with a value of MFX_IOPATTERN_IN_SYSTEM_MEMORY, and even though the m_uiOutSurfacesMemType class variable is initially set to MFX_IOPATTERN_OUT_VIDEO_MEMORY, it is changed to MFX_IOPATTERN_OUT_SYSTEM_MEMORY on line 1177 via the MEM_IN_TO_OUT() macro. According to the Intel Media SDK Developer's Guide, video/GPU memory is best for the hardware implementation, although the guide doesn't explicitly state how costly it is to use system memory with the hardware implementation (e.g., does it result in worse performance than using system memory with the software implementation?).
As far as I can tell, the only way to cause the encoder to use video memory is to call the InitPlg() method of the custom IMFDeviceDXVA interface. The corresponding H.264 decoder Media Foundation plug-in makes use of this method, but InitPlg() is not called anywhere for the encoder, so possibly the assumption is that it should be called by the user. I think this will set things up properly such that video memory will be used for input (supplied by the caller of ProcessInput()) and also for output (supplied by the encoder plug-in, I believe). It's somewhat unclear to me how to make use of the IMFDeviceDXVA interface, however. The InitPlg() method takes as input an IMfxVideoSession object and an mfxVideoParam object. However, these objects have already been created, and really, all I want to do is cause the encoder to use hardware memory. What is the recommended approach for doing so?
The sample MF encoder plugin will use video memory only if it is connected upstream to the sample MF decoder plugin, which produces video memory media samples. In all other cases system memory is used. You have traced it almost right: the encoder plugin shares a pointer to itself (IMFDeviceDXVA) with the upstream plugin in the topology through the MF_MT_D3D_DEVICE attribute of its input type (you saw that the decoder plugin gets this interface and saves it in m_pDeviceDXVA). The decoder plugin then calls the InitPlg method of this interface, which shares the decoder's mfxSession, mfxVideoParam, and number of surfaces with the encoder plugin. The encoder plugin replaces its own session with the decoder's session and uses many of the parameters to adjust its own configuration; for example, InitCodec is called with the memory pattern taken from the decoder's mfxVideoParam. So if video memory was used on the decoder side, the encoder will use video memory, too.
I hope this helps. None of those methods are intended to be called by the user/application; all of this is decided by the plugins themselves when they connect.