
Different Ways to Optimize Python Code with GPU/PyOpenCL: Extern Function Inside a Kernel

I have used the following command to profile my Python code : python2.7 -m cProfile -o X2_non_flat_multiprocessing_dummy.prof X2_non_flat.py Then, I can visualize globally the rep

Solution 1:

  1. Is there a way to implement a GPU/OpenCL layer in this routine, especially for CubicSpline or the whole Pobs_C function?

In all probability, no. Most of the time in the profile appears to be spent in 12 million polynomial evaluations, and each call takes only about 6 microseconds on the CPU. It is unclear whether that operation exposes enough embarrassingly parallel work to justify a GPU, and GPUs are only worthwhile for embarrassingly parallel tasks.
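As a rough check of the figures quoted in this answer, the two numbers can be multiplied to estimate the total time spent in polynomial evaluation. This is only back-of-the-envelope arithmetic, assuming the call count and per-call cost are the approximate values stated above; the variable names are illustrative.

```python
# Approximate figures quoted in the answer (not re-measured here).
n_calls = 12_000_000       # polynomial evaluations reported by the profiler
seconds_per_call = 6e-6    # roughly 6 microseconds per evaluation on the CPU

# Total time is dominated by the number of calls, not by the cost of one call.
total_seconds = n_calls * seconds_per_call
print(f"Estimated time in polynomial evaluation: about {total_seconds:.0f} s")
```

The estimate comes to roughly 72 seconds, so any speed-up would have to come from reducing the number of calls or their overhead rather than from making a single evaluation faster.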

  2. So, can I declare, inside the kernel code, a call to an extern function (that is, a function outside the kernel, in the classical part of the code, called the host code)?

No. That is impossible. It is also hard to see what benefit it could bring, given that the Python code has to run on the host CPU anyway.

  3. Maybe I can declare this extern function inside the kernel: is that possible by making the declaration explicitly inside it?

No.
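The restriction above concerns host-side (Python) functions. What OpenCL C does allow is a helper function written in OpenCL C itself, placed in the same program source as the kernel and called from it. The following is a minimal sketch of that pattern, not the asker's code: the names cubic_eval, eval_cubics, cubic_eval_host and run_on_device are hypothetical, and the flat coefficient layout (four floats per polynomial) is an assumption made for illustration.

```python
# Hypothetical example: all function and kernel names here are illustrative.

KERNEL_SOURCE = r"""
// Device-side helper: plain OpenCL C, compiled together with the kernel.
float cubic_eval(float c3, float c2, float c1, float c0, float dx)
{
    // Horner's rule for c3*dx^3 + c2*dx^2 + c1*dx + c0
    return ((c3 * dx + c2) * dx + c1) * dx + c0;
}

// One work-item evaluates one cubic; coeffs holds 4 floats per polynomial.
__kernel void eval_cubics(__global const float *coeffs,
                          __global const float *dx,
                          __global float *out)
{
    int i = get_global_id(0);
    out[i] = cubic_eval(coeffs[4 * i], coeffs[4 * i + 1],
                        coeffs[4 * i + 2], coeffs[4 * i + 3], dx[i]);
}
"""


def cubic_eval_host(c3, c2, c1, c0, dx):
    """Pure-Python reference of the same Horner evaluation, for checking results."""
    return ((c3 * dx + c2) * dx + c1) * dx + c0


def run_on_device(coeffs, dx):
    """Run the kernel; needs numpy, pyopencl and an available OpenCL device."""
    import numpy as np
    import pyopencl as cl

    coeffs = np.ascontiguousarray(coeffs, dtype=np.float32)  # shape (n, 4)
    dx = np.ascontiguousarray(dx, dtype=np.float32)          # shape (n,)
    out = np.empty_like(dx)

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags
    coeffs_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=coeffs)
    dx_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=dx)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, out.nbytes)

    # The helper and the kernel are built from the same source string.
    program = cl.Program(ctx, KERNEL_SOURCE).build()
    program.eval_cubics(queue, dx.shape, None, coeffs_buf, dx_buf, out_buf)
    cl.enqueue_copy(queue, out, out_buf)
    return out
```

In this sketch the logic that would otherwise live in a host function is rewritten in OpenCL C, which is the only form the device compiler can see. Whether doing so pays off for the profiled routine is a separate question, as the first answer notes.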

