Running Keras Model For Prediction In Multiple Threads
Solution 1:
Multithreading in Python doesn't necessarily make better use of your resources, because Python's global interpreter lock (GIL) lets only one native thread execute Python bytecode at a time.
In Python you would usually reach for multiprocessing to utilize your resources, but since we're talking about Keras models, I'm not sure even that is the right thing to do: loading several copies of a model in several processes has its own overhead, and you could simply increase the batch size, as others have already pointed out.
Alternatively, if you have a heavy pre-processing stage, you could preprocess your data in one process and predict in another (although I doubt that would be necessary either), as in the sketch below.
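A minimal sketch of that split, assuming a hypothetical pre-processing step and a model saved at `model.h5` (both placeholders): the worker process does the CPU-bound pre-processing, while the main process keeps the single model instance and runs prediction.

```python
import numpy as np
from multiprocessing import Process, Queue

def preprocess_worker(raw_batches, queue):
    # Hypothetical CPU-bound pre-processing, running in its own process
    for batch in raw_batches:
        queue.put(np.asarray(batch, dtype="float32") / 255.0)
    queue.put(None)  # sentinel: no more batches

if __name__ == "__main__":
    # Importing TF under the main guard keeps spawned workers from re-importing it
    from tensorflow import keras

    model = keras.models.load_model("model.h5")        # placeholder model path
    raw_batches = [np.random.rand(32, 28, 28) for _ in range(4)]  # dummy data

    queue = Queue()
    worker = Process(target=preprocess_worker, args=(raw_batches, queue))
    worker.start()

    while True:
        batch = queue.get()
        if batch is None:
            break
        predictions = model.predict(batch, verbose=0)

    worker.join()
```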
Solution 2:
It's a bad idea to predict data in multiple threads. Use a larger batch_size in model.predict when you predict data offline (see the sketch below), and use TensorFlow Serving when you predict data online.
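For the offline case, a minimal sketch; the toy model, data shapes, and batch_size value here are placeholders to tune for your own setup:

```python
import numpy as np
from tensorflow import keras

# Toy stand-in model; substitute your own trained model.
model = keras.Sequential([keras.layers.Dense(10)])

data = np.random.rand(50_000, 100).astype("float32")

# One call with a large batch_size lets the backend batch the work
# internally, instead of splitting predictions across Python threads.
predictions = model.predict(data, batch_size=1024)
```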
Solution 3:
Keras is not thread safe; for predicting a large batch, you can use batch_size to cap how much goes through the model at once. If you are deploying to production, the ideal approach is to convert the model weights to the TensorFlow protobuf (SavedModel) format and then use TensorFlow Serving.
You can follow this blog post: http://machinelearningmechanic.com/keras/2019/06/26/keras-serving-keras-model-quickly-with-tensorflow-serving-and-docker-md.html
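A minimal sketch of the export step, assuming a trained model in `model` and the path `/models/my_model/1` (TensorFlow Serving expects a numbered version directory; both names are placeholders):

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.layers.Dense(10)])  # stand-in for your model
model.predict(np.zeros((1, 100), dtype="float32"), verbose=0)  # build it once

# In TF 2.x, saving to a directory path (no file extension) writes the
# SavedModel protobuf format that TensorFlow Serving loads.
# (With Keras 3, model.export("/models/my_model/1") produces the same artifact.)
model.save("/models/my_model/1")
```

The blog post above covers the serving side; the stock `tensorflow/serving` Docker image can then load the model from that directory.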
Solution 4:
Because of Python's Global Interpreter Lock, you should consider using multiprocessing instead of threading. Ray is a great API for building distributed applications in Python, and it already ships a reinforcement learning framework called RLlib. I would highly recommend taking a look at Ray, especially for reinforcement learning applications.
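For plain parallel prediction, a minimal Ray sketch (the model path `model.h5`, actor count, and dummy data are placeholders): each actor lives in its own process and loads its own copy of the model, so the GIL is never shared.

```python
import numpy as np
import ray

ray.init()

@ray.remote
class Predictor:
    """Actor holding its own copy of the model in a separate process."""

    def __init__(self, model_path):
        from tensorflow import keras  # import inside the worker process
        self.model = keras.models.load_model(model_path)

    def predict(self, batch):
        return self.model.predict(batch, verbose=0)

# Two actors = two processes, each with an independent model copy
predictors = [Predictor.remote("model.h5") for _ in range(2)]
batches = [np.random.rand(32, 100).astype("float32") for _ in range(2)]

# Dispatch one batch per actor in parallel and gather the results
results = ray.get([p.predict.remote(b) for p, b in zip(predictors, batches)])
```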