Is It Unsafe To Run Multiple Tensorflow Processes On The Same GPU?
I only have one GPU (Titan X Pascal, 12 GB VRAM) and I would like to train multiple models, in parallel, on the same GPU. I tried encapsulated my model in a single python program (
Solution 1:
In short: yes it is safe to run multiple procceses on the same GPU (as of May 2017). It was previously unsafe to do so.
Solution 2:
Answer
Depending on video memory size, it will be allowed or not.
For my case I have total video memory of 2GBs while the single instance reserves about 1.4GB. When I have tried to run another tensorflow code while I was running already the speech recognition training.
2018-08-28 08:52:51.279676: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1405] Found device 0 with properties:
name: GeForce 940MX major: 5 minor: 0 memoryClockRate(GHz): 1.2415
pciBusID: 0000:01:00.0
totalMemory: 2.00GiB freeMemory: 1.65GiB
2018-08-28 08:52:51.294948: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1484] Adding visible gpu devices: 0
2018-08-28 08:52:55.643813: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-28 08:52:55.647912: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0
2018-08-28 08:52:55.651054: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:984] 0: N
2018-08-28 08:52:55.656853: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1409 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute
capability: 5.0)
I got the following error in speech recogntion, which completely terminated the script: (I think according to this is related to out of video memory)
2018-08-28 08:53:05.154711: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:1108] could not synchronize on CUDA context: CUDA_ERROR_LAUNCH_FAILED ::
Traceback (most recent call last):
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1278, in _do_call
return fn(*args)
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Python35\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: GPU sync failed
Post a Comment for "Is It Unsafe To Run Multiple Tensorflow Processes On The Same GPU?"