Elephas Not Loaded in PySpark: No Module Named elephas.spark_model
I am trying to distribute Keras training on a cluster using Elephas. But when I run the basic example from the Elephas docs (https://github.com/maxpumperla/elephas), the job fails with: No module named elephas.spark_model.
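For context, this is roughly the basic example from the Elephas README that triggers the error (a minimal sketch; the toy data and model are mine, and depending on your Elephas version the Keras imports may come from keras rather than tensorflow.keras):

import numpy as np
from pyspark import SparkConf, SparkContext
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# The import below is the one that fails on the cluster:
from elephas.spark_model import SparkModel
from elephas.utils.rdd_utils import to_simple_rdd

sc = SparkContext(conf=SparkConf().setAppName('elephas-basic'))

# Toy data and model, just to keep the example self-contained
x_train = np.random.rand(1000, 10)
y_train = np.random.randint(0, 2, size=(1000, 1))
model = Sequential([Dense(16, activation='relu', input_shape=(10,)),
                    Dense(1, activation='sigmoid')])
model.compile(optimizer='adam', loss='binary_crossentropy')

rdd = to_simple_rdd(sc, x_train, y_train)
spark_model = SparkModel(model, frequency='epoch', mode='asynchronous')
spark_model.fit(rdd, epochs=5, batch_size=32, verbose=0, validation_split=0.1)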
Solution 1:
I found out how to properly ship a virtual environment to the master and all the worker nodes:
# Make the existing virtualenv relocatable, then zip its contents
virtualenv venv --relocatable
cd venv
zip -qr ../venv.zip *
cd ..
# "#SP" unpacks venv.zip as ./SP on each YARN container, matching PYSPARK_PYTHON
PYSPARK_PYTHON=./SP/bin/python spark-submit --master yarn --deploy-mode cluster --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./SP/bin/python --driver-memory 4G --archives venv.zip#SP filename.py
More details in the GitHub Issue: https://github.com/maxpumperla/elephas/issues/80#issuecomment-371073492
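To confirm the archive really reaches the executors, one quick sanity check (my own sketch, not from the issue) is to attempt the import inside a task, since that runs under the executor's Python rather than the driver's:

from pyspark import SparkContext

sc = SparkContext.getOrCreate()

def try_import(_):
    # Executes on an executor, i.e. under ./SP/bin/python from the shipped venv
    try:
        import elephas.spark_model
        return ['ok: ' + elephas.spark_model.__file__]
    except ImportError as e:
        return ['failed: %s' % e]

print(sc.parallelize(range(4), 4).mapPartitions(try_import).collect())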
Solution 2:
You should add the elephas library to your spark-submit command via the --py-files argument.
Citing the official guide:

For Python, you can use the --py-files argument of spark-submit to add .py, .zip or .egg files to be distributed with your application. If you depend on multiple Python files we recommend packaging them into a .zip or .egg.
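If you prefer to do this programmatically, SparkContext.addPyFile is the in-code counterpart of --py-files; the archive name below is a placeholder for a zip of the elephas package that you build yourself:

from pyspark import SparkContext

sc = SparkContext.getOrCreate()

# Placeholder path: zip the installed elephas package directory yourself,
# e.g. from site-packages: zip -r elephas.zip elephas
sc.addPyFile('elephas.zip')

# Import after addPyFile so the archive is on the module search path
# (on recent Spark versions this also works on the driver)
from elephas.spark_model import SparkModel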