Different Ways To Optimize With GPU PyOpenCL A Python Code : Extern Function Inside Kernel GPU/PyOpenCL
I used the following command to profile my Python code: python2.7 -m cProfile -o X2_non_flat_multiprocessing_dummy.prof X2_non_flat.py Then, I can visualize globally the rep
Solution 1:
- Is there a way to implement a GPU/OpenCL layer in this routine, especially for CubicSpline or the whole Pobs_C function?
In all probability, no. Most of the time in the profile appears to be spent in 12 million polynomial evaluations, and each evaluation call takes only 6 microseconds on the CPU. It is unclear whether that operation exposes enough embarrassingly parallel work to be worth offloading, and GPUs pay off only on tasks of that kind.
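To put those two figures in perspective, here is a back-of-the-envelope calculation. It uses only the numbers quoted above (12 million calls, 6 microseconds each); the variable names are illustrative and nothing here is measured from the original script.

```python
# Figures quoted from the profile discussed above.
n_calls = 12_000_000          # number of polynomial evaluations
seconds_per_call = 6e-6       # 6 microseconds per evaluation on the CPU

# Total CPU time attributable to these evaluations.
total_seconds = n_calls * seconds_per_call
print("total time in polynomial evaluations: %.1f s" % total_seconds)
```

The total comes to roughly 72 seconds, spread over millions of very short calls. A GPU port would only help if many of those evaluations could be grouped into a few large, independent batches.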
- So, can I call an extern function from inside the kernel code (I mean a function that is not inside the kernel, i.e. one from the classical part of the code, the so-called host code)?
No, that is impossible: kernel code runs on the device and cannot call back into host code. It is also hard to see what benefit that could bring, given that the Python code has to run on the host CPU anyway.
- Could I instead declare this extern function inside the kernel, by writing the declaration explicitly there?
No.
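What a kernel can call is a helper written in OpenCL C and compiled as part of the same program source. The sketch below illustrates that distinction only; it is not the asker's code. The names `eval_cubic`, `eval_all`, `eval_cubic_host` and `eval_on_device` are hypothetical, the cubic is a stand-in for a single spline segment, and running `eval_on_device` assumes NumPy, PyOpenCL and a working OpenCL device are available.

```python
# OpenCL C source. The helper eval_cubic is compiled together with the
# kernel, which is the only reason the kernel is allowed to call it.
KERNEL_SRC = """
float eval_cubic(float a, float b, float c, float d, float x)
{
    /* Horner form of a*x^3 + b*x^2 + c*x + d */
    return ((a * x + b) * x + c) * x + d;
}

__kernel void eval_all(__global const float *x, __global float *y,
                       float a, float b, float c, float d)
{
    int i = get_global_id(0);
    y[i] = eval_cubic(a, b, c, d, x[i]);
}
"""


def eval_cubic_host(a, b, c, d, x):
    """Pure-Python reference for the same cubic, useful for checking results."""
    return ((a * x + b) * x + c) * x + d


def eval_on_device(xs, a, b, c, d):
    """Evaluate the cubic for every element of xs on an OpenCL device."""
    # Imported lazily so this module loads even without an OpenCL runtime.
    import numpy as np
    import pyopencl as cl

    x = np.ascontiguousarray(xs, dtype=np.float32)
    y = np.empty_like(x)

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags
    x_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=x)
    y_buf = cl.Buffer(ctx, mf.WRITE_ONLY, y.nbytes)

    # The helper and the kernel are built from one source string.
    program = cl.Program(ctx, KERNEL_SRC).build()
    program.eval_all(queue, x.shape, None, x_buf, y_buf,
                     np.float32(a), np.float32(b),
                     np.float32(c), np.float32(d))
    cl.enqueue_copy(queue, y, y_buf)
    return y
```

Calling `eval_on_device([0.0, 1.0, 2.0], 1.0, 2.0, 3.0, 4.0)` should agree with `eval_cubic_host` applied element by element, up to single-precision rounding. The design point is that any function the kernel needs has to be rewritten in OpenCL C and shipped in the kernel source; a Python or SciPy function on the host cannot be referenced from it.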