As far as I understand, the FPGA and the ARM cores use the same physical memory (except for some small memory blocks on the FPGA). I've been implementing some simple OpenCL programs on the FPGA, and the kernels themselves run fast. The problem is that transferring data from the host (CPU) to the OpenCL buffers (FPGA) is extremely slow, and by slow I mean unusable. I created a thread on the OpenCL board (http://www.alteraforum.com/forum/showthread.php?t=50785) about this, but then realized that this may be a board-specific issue that has nothing to do with OpenCL. Does anyone here have an idea what could cause this extreme performance problem? As far as I understand it, the transfer should be about as fast as a normal memcpy, since both sides share the same physical memory. Am I missing something?
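To make the "as fast as a normal memcpy" comparison concrete, here is a minimal sketch of how the memcpy baseline could be measured on the ARM side. The helper names `now_seconds` and `memcpy_bandwidth_mib_s` are made up for illustration, and the sketch assumes a POSIX system with `clock_gettime`. It contains no OpenCL calls; the idea is that the same timing pattern could be wrapped around a blocking `clEnqueueWriteBuffer` call to get a number for the OpenCL transfer.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/*
 * Hypothetical helper: copies `bytes` bytes `repeats` times with memcpy and
 * returns the measured throughput in MiB/s, or a negative value if the
 * buffers could not be allocated.
 */
double memcpy_bandwidth_mib_s(size_t bytes, int repeats)
{
    unsigned char *src = malloc(bytes);
    unsigned char *dst = malloc(bytes);
    volatile unsigned char sink = 0; /* read back so the copies are not optimized away */

    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }

    /* Touch both buffers first so page faults are not part of the timing. */
    memset(src, 0xA5, bytes);
    memset(dst, 0, bytes);

    double start = now_seconds();
    for (int i = 0; i < repeats; ++i) {
        src[0] = (unsigned char)i;      /* make each copy distinct */
        memcpy(dst, src, bytes);
        sink = (unsigned char)(sink + dst[bytes - 1]);
    }
    double elapsed = now_seconds() - start;

    /* Sanity check: the last copy really arrived in dst. */
    assert(memcmp(dst, src, bytes) == 0);

    free(src);
    free(dst);

    if (elapsed <= 0.0) {
        elapsed = 1e-9; /* guard against a zero reading from the timer */
    }
    return ((double)bytes * (double)repeats) / (1024.0 * 1024.0) / elapsed;
}
```

Calling, for example, `memcpy_bandwidth_mib_s((size_t)64 << 20, 10)` gives a throughput figure for plain memory copies that the OpenCL buffer write can be compared against. The buffer size and repeat count here are arbitrary choices, not values from my actual test.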