Currently, anybody could write a DLL to intercept the clBuildProgram call and get our precious IP for free!
We can obfuscate the code (not very effective) or precompile the kernel (a pain if you have to precompile it for 30 different GPU models).
Perhaps you could add a service to your website that takes the kernel source code and returns it encrypted with a private key. In your drivers, just decrypt it. OK, it won't be perfect and somebody could reverse-engineer the key (though it's hard to debug in kernel mode)... and, yes, I assume everything can be hacked, but it's better than nothing...
Not by much. First, this implies you trust Intel with your precious IP (no offense meant to Intel ;-), and that you also trust the communication between you and the web server. And anyway, as soon as the code has been decrypted, it is in clear form in the driver memory, which in most OSes can be read by any super-user. I don't think OpenCL is supported on Trusted Solaris :-) Finally, anyone could submit cleartext of their choosing and get back any amount of data encrypted with the exact same key as your code, which is a serious problem in cryptography. And it would be Intel-specific anyway.
A marginally better way would be an extension with some sort of clBuildProgramTrustedX509, where:
1) the driver supplies an X.509 certificate to the application so that the application can validate the driver's origin
2) the application uses the certificate's public key to encrypt its own encrypted code and the decryption key
3) the application sends the newly encrypted data to the driver
4) the driver uses the private key matching the certificate to access the application code & key
5) the driver can decrypt the code and compile it
... in the end you still have the unencrypted code in the driver memory, which is pretty much unavoidable at the moment. You could conceivably add the following:
1B) the application and the driver use SSL & X.509 to exchange newly generated random data
1C) the application XORs the code with the random data
5B) the code is still XORed with the random data (perfect encryption unless you know the random data) and unreadable, but the driver can do a character-by-character XOR during parsing and never expose the unencrypted code as a whole.
There's probably some flaw in there (quick'n'dirty answers are like that), but it's a step. One would need to reverse-engineer the whole driver to find the beginning of the random data and of the XORed code to be able to extract the whole code, or look up every unencrypted byte one by one. Still doable, but likely harder than dumping the driver's memory and looking for cleartext code.
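To make step 5B a bit more concrete, here is a minimal Python sketch of the idea, not how any real driver works. `decode_stream` and `count_identifiers` are hypothetical names, and the "parser" is a toy that only counts identifier-like runs. The point is that the consumer pulls one decoded byte at a time and the full cleartext is never assembled in a single buffer.

```python
from typing import Iterable, Iterator


def decode_stream(masked: Iterable[int], pad: Iterable[int]) -> Iterator[int]:
    """Yield cleartext bytes one at a time by XORing masked code with the pad."""
    for m, p in zip(masked, pad):
        yield m ^ p


def count_identifiers(chars: Iterator[int]) -> int:
    """Toy stand-in for a parser: count runs of [A-Za-z0-9_] characters.

    It consumes the stream byte by byte and keeps only a counter and a flag,
    so the decoded source never exists as a whole in memory.
    """
    count = 0
    in_word = False
    for b in chars:
        is_word = b < 128 and (chr(b).isalnum() or b == ord("_"))
        if is_word and not in_word:
            count += 1
        in_word = is_word
    return count
```

A real compiler front end keeps tokens, symbol tables and an AST around, so this only narrows the window in which cleartext is visible; it does not close it.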
Disclaimer: I'm not a cryptographer :-)
You are raising a very valid concern. However, it is a cross-vendor issue and should be handled at the spec level. We (Intel) are conceptually in favor of a cross-vendor solution. Hence, I would like to encourage you to submit this issue to the public Khronos forums:
Khronos Forum on "OpenCL Suggestions for next release": http://www.khronos.org/message_boards/viewforum.php?f=41&sid=6f9de79d60b408c63a867f7bc7e2e425
Khronos Public Bugzilla : http://www.khronos.org/bugzilla/
Btw, somebody posted it at
I think the programmer should just test for a few IHVs (the ones your program is certified to run OK on) like ATI, NVIDIA, Intel, etc... If an Intel OpenCL implementation is detected, just pass Intel's key to clBuildProgram; if you detect NVIDIA, just pass its key, etc... You really should only need to add 3 lines of code to your app.
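For what it's worth, the vendor test could look roughly like the Python sketch below. Everything here is an assumption for illustration: `VENDOR_KEYS` and `key_for_vendor` are made-up names, the key values are placeholders, and the vendor string is assumed to be whatever the platform reports for CL_PLATFORM_VENDOR. How a key would actually be handed to clBuildProgram is exactly the part that does not exist in the spec today.

```python
from typing import Optional

# Placeholder keys, one per IHV the application is certified on (hypothetical).
VENDOR_KEYS = {
    "intel": "INTEL-KEY-PLACEHOLDER",
    "nvidia": "NVIDIA-KEY-PLACEHOLDER",
    "advanced micro devices": "AMD-KEY-PLACEHOLDER",
}


def key_for_vendor(platform_vendor: str) -> Optional[str]:
    """Pick the key for a platform vendor string, or None if unsupported.

    Matching is a case-insensitive substring test. Short needles such as
    "ati" are avoided on purpose: they would also match "Corporation".
    """
    vendor = platform_vendor.lower()
    for needle, key in VENDOR_KEYS.items():
        if needle in vendor:
            return key
    return None
```

When `key_for_vendor` returns `None`, the application would have to fall back to plain source or refuse to run, which is a product decision rather than a technical one.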
At least the second part is flawed. You need the unencrypted code app-side for XORing, and you don't have it by definition. It could be stored pre-XORed in the application, with both the code & the random data sent to the driver, but that means the random data is stored in the app. Even encrypted, it's a security risk.
Adding a layer of XOR makes it even more complicated: code C & random data R1 (chosen by the developer) are XORed; (C^R1) and R1 are stored encrypted in the application. Random data R2 is negotiated with the driver using SSL. The application computes R1^R2, then decrypts C^R1 (without needing to exchange that key with the driver) and sends (C^R1)^(R1^R2) to the driver, which only needs to XOR with R2 to get the code, hopefully character-by-character during parsing. The full code is never human-readable.
Again, Q'n'D, so this may (or, more likely, will not) work. Security is hard.
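The XOR algebra above can be checked with a few lines of Python. This is only a sketch of the byte arithmetic: the function names are made up, the SSL negotiation of R2 and the encryption of (C^R1) and R1 at rest are left out, and R1 and R2 are assumed to be exactly as long as the code.

```python
import secrets
from typing import Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def developer_prepare(code: bytes) -> Tuple[bytes, bytes]:
    """Done once at build time: return (C^R1, R1) to ship with the app."""
    r1 = secrets.token_bytes(len(code))
    return xor_bytes(code, r1), r1


def app_blind(stored: bytes, r1: bytes, r2: bytes) -> bytes:
    """App side: compute R1^R2 first, then (C^R1)^(R1^R2); C never appears."""
    return xor_bytes(stored, xor_bytes(r1, r2))


def driver_unblind(wire: bytes, r2: bytes) -> bytes:
    """Driver side: XOR with R2 recovers C (done in one go here for brevity)."""
    return xor_bytes(wire, r2)
```

Since (C^R1)^(R1^R2) = C^R2, what goes over the wire is just the code masked with R2, so the scheme is only as strong as the secrecy of R2 and of the stored R1.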
Software vendors already have means to protect their "precious IP" (which anyone with half a brain can come up with independently without even looking at the prior art) by means of software patents.
The cost of software protection should not be transferred to hardware vendors because it will make hardware more expensive for everyone, while not everyone will need to use those few applications that require such protection.
Moreover, I would really hate it if I had to pay for a certificate in order to write OpenCL code, and it is blindingly obvious that suggestions such as this one would lead to that sooner or later, just like you have to pay for a certificate if you want to write a kernel driver, even if you need it for personal use only or for an application you intend to release into the public domain.
Finally, everything that runs can be cracked.