ERROR PythonUDFRunner: Python worker exited unexpectedly (crashed)
I am running a PySpark job that calls UDFs. I know UDFs use a lot of memory and are slow because of serialization/deserialization, but in our situation we have to use them. The dataset is 60GB and
Solution 1:
The error can be caused by various issues. You have to find the root cause first.
First, make sure you have a try-except block at the highest level in your UDF and log exceptions there.
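A minimal sketch of that first step, assuming a plain Python function is used as the UDF body. The names log_exceptions and parse_price are hypothetical and only serve as illustration; the decorator logs the failing input and the traceback to stderr, then re-raises so the task still fails visibly.

```python
import functools
import logging
import sys
import traceback

# Log to stderr so messages can show up in the executor logs.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("udf")


def log_exceptions(func):
    """Wrap func so any Python exception is logged with its inputs, then re-raised."""
    @functools.wraps(func)
    def wrapper(*args):
        try:
            return func(*args)
        except Exception:
            log.error(
                "UDF %s failed for args=%r\n%s",
                func.__name__, args, traceback.format_exc(),
            )
            raise
    return wrapper


@log_exceptions
def parse_price(raw):
    # Hypothetical UDF body: convert a string column value to a float.
    return float(raw.strip())


# In a PySpark job the wrapped function would then be registered as usual, e.g.:
#   from pyspark.sql.functions import udf
#   from pyspark.sql.types import DoubleType
#   parse_price_udf = udf(parse_price, DoubleType())
```

Note that this only catches Python-level exceptions. A crash inside native code (for example a segmentation fault in a C extension) kills the worker process before any except block runs.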
Second, register a fault handler with import faulthandler; faulthandler.enable()
as early as possible in both the driver code and the worker code (the UDF). If the cause is a segmentation fault, the handler prints a Python stack trace, which is then visible in the logs.
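A minimal sketch of the second step. The helper enable_fault_handler and the UDF body my_udf_body are hypothetical names; the only library calls used are faulthandler.is_enabled() and faulthandler.enable() from the Python standard library. Calling the helper inside the UDF is one way to make sure it runs in the Python worker process and not only on the driver.

```python
import faulthandler
import sys


def enable_fault_handler():
    """Enable faulthandler once per process; safe to call repeatedly."""
    if not faulthandler.is_enabled():
        # On a fatal signal such as SIGSEGV, dump the Python tracebacks
        # of all threads to stderr.
        faulthandler.enable(file=sys.stderr, all_threads=True)


# Driver side: call this at the very top of the job script.
enable_fault_handler()


def my_udf_body(value):
    # Worker side: the UDF runs in a separate Python worker process,
    # so the handler has to be enabled there as well.
    enable_fault_handler()
    return value * 2
```

The is_enabled() check keeps the per-row overhead small, since enable() is only executed the first time the function runs in a given worker process.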
Both approaches help you narrow down the underlying issue.