Conformance and performance of multi-threaded OpenCL processing
Hi, I have questions about the conformance and performance of the following two use cases:
1. There are N (N >= 2) host-side threads, each of which sets up OpenCL and does its processing independently (everything from clGetPlatformIDs, clGetDeviceIDs, clCreateContext, and clCreateCommandQueue through clEnqueueNDRangeKernel...).
2. The same number of host threads (N). OpenCL is set up (clGetPlatformIDs, clGetDeviceIDs, clCreateContext, clCreateCommandQueue) before the threads are created, and the threads then all call clEnqueueNDRangeKernel on the same command queue. In other words, the host threads share one cl_context and one cl_command_queue.

My questions are:
Is there any conformance issue in case 1?
Will case 1 incur a performance loss (higher CPU or GPU usage) compared with case 2?