Hi
Is there a way to make an opencl_node (or a custom streaming_node) that has more output ports than input ports?
I have tried, but I cannot get the graph to execute: it seems to want me to call try_put() on the output ports as well before it will run.
I have this example, which doesn't work:
graph g;
gpu_device_selector gpu_selector;
opencl_program<> program("myclprogram.cl");

opencl_node< tuple<opencl_buffer<cl_uchar>, opencl_buffer<cl_uchar>, opencl_buffer<cl_uchar>> >
    myopenclnode(g, program.get_kernel("clCopy2"), gpu_selector);

join_node < tuple<opencl_buffer<cl_uchar>, opencl_buffer<cl_uchar>>> join_node(g);

function_node< tuple<opencl_buffer<cl_uchar>, opencl_buffer<cl_uchar>> >
    myOutputWriter(g, unlimited, [](const tuple<opencl_buffer<cl_uchar>, opencl_buffer<cl_uchar>>& input) {
        opencl_buffer<cl_uchar> buffer1 = std::get<0>(input);
        opencl_buffer<cl_uchar> buffer2 = std::get<1>(input);
        printf("'%s' '%s'\r\n", buffer1.data(), buffer2.data());
    });

make_edge(output_port<1>(myopenclnode), input_port<0>(join_node));
make_edge(output_port<2>(myopenclnode), input_port<1>(join_node));
make_edge(join_node, myOutputWriter);

const char str[] = "Hello world";
opencl_buffer<cl_uchar> a(sizeof(str));
std::copy_n(str, sizeof(str), a.begin());
opencl_buffer<cl_uchar> b(sizeof(str));
opencl_buffer<cl_uchar> c(sizeof(str));

myopenclnode.set_range(std::deque<int>{sizeof(str)});
myopenclnode.set_args(port_ref<0>(), b, c);
input_port<0>(myopenclnode).try_put(a);
g.wait_for_all();
The kernel just copies argument 1 to arguments 2 and 3.
However, the kernel is never executed in this example.
If I do a try_put() on input_port<1> and input_port<2>, it works fine.
Hi Nikolaj,
The implementation of opencl_node waits for input on every input port before it starts executing the kernel.
Let's try to understand the use case in a bit more detail. Since the kernel copies the first argument to the second and the third, the memory for the last two arguments also has to be provided somehow, right? Otherwise, how would the node know where to copy the data coming from the first parameter? Calling try_put() on all of its ports is the way to tell the node about all the memory it needs to execute its encapsulated kernel.
If you have a use case where this logic does not apply, please tell us the details so we can better understand it and discuss.
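To make the "waits on every input port" rule concrete, here is a rough toy model in plain C++. It does not use TBB or OpenCL and is not how opencl_node is actually implemented; ToyKernelNode, copy_to_both and runs_after_feeding are hypothetical names made up for this sketch, and std::string stands in for opencl_buffer. It only mimics the observable behaviour: the kernel runs once all three ports have received a buffer, and not before.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Toy model only. This is NOT how TBB implements opencl_node; it just mimics
// the rule "run the kernel once every input port has received a value".
class ToyKernelNode {
    static constexpr std::size_t kPorts = 3;

public:
    // Hypothetical kernel signature: one source and two destination buffers.
    using Kernel = std::function<void(const std::string&, std::string&, std::string&)>;

    explicit ToyKernelNode(Kernel kernel) : kernel_(std::move(kernel)) {}

    // Deliver a buffer to input port 0, 1 or 2; returns false for a bad index.
    bool try_put(std::size_t port, std::string buffer) {
        if (port >= kPorts) return false;
        buffers_[port] = std::move(buffer);
        filled_[port] = true;
        run_if_all_ports_filled();
        return true;
    }

    // Number of times the kernel has run so far.
    int run_count() const { return run_count_; }

    // Contents of the buffer on `port` after the most recent kernel run.
    const std::string& result(std::size_t port) const {
        assert(port < kPorts);
        return results_[port];
    }

private:
    void run_if_all_ports_filled() {
        for (bool f : filled_) {
            if (!f) return;  // at least one port is still waiting for input
        }
        kernel_(buffers_[0], buffers_[1], buffers_[2]);
        results_ = buffers_;   // every port's buffer is forwarded as an output
        filled_.fill(false);   // wait for a fresh set of inputs
        ++run_count_;
    }

    Kernel kernel_;
    std::array<std::string, kPorts> buffers_;
    std::array<bool, kPorts> filled_{};  // value-initialized: all false
    std::array<std::string, kPorts> results_;
    int run_count_ = 0;
};

// Stand-in for the clCopy2 kernel from the question: copy src into both destinations.
inline void copy_to_both(const std::string& src, std::string& dst1, std::string& dst2) {
    dst1 = src;
    dst2 = src;
}

// Feed only the first `ports_fed` ports and report how many times the kernel ran.
inline int runs_after_feeding(std::size_t ports_fed) {
    ToyKernelNode node(copy_to_both);
    for (std::size_t p = 0; p < ports_fed && p < 3; ++p) {
        node.try_put(p, p == 0 ? "Hello world" : std::string(11, '\0'));
    }
    return node.run_count();
}
```

In this model, feeding only port 0 (as in the original example) leaves the run count at zero, while feeding all three ports runs the copy once and leaves the source text in the buffers on ports 1 and 2. That matches the observation that the graph only executes after try_put() has been called on ports 1 and 2 as well.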
Regards, Aleksei