
ERROR PythonUDFRunner: Python worker exited unexpectedly (crashed)

I am running a PySpark job that calls UDFs. I know UDFs use a lot of memory and are slow because of serialization and deserialization, but in our situation we have to use them. The dataset is 60GB and

Solution 1:

The error can be caused by various issues. You have to find the root cause first.

First, make sure you have a try/except block at the highest level of your UDF and log exceptions there.
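A minimal sketch of that first step, assuming the UDF is a plain Python function. The decorator name log_exceptions, the logger name, and the example function parse_amount are hypothetical and used only for illustration. The wrapper logs the failing input with a full traceback to stderr and then re-raises, so the task still fails but the cause is recorded.

```python
import functools
import logging
import sys

# Log to stderr; the Python worker's stderr normally ends up in the executor logs.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger("udf_debug")  # hypothetical logger name


def log_exceptions(func):
    """Wrap func so any exception is logged with its inputs, then re-raised."""
    @functools.wraps(func)
    def wrapper(*args):
        try:
            return func(*args)
        except Exception:
            # logger.exception records the message plus the full traceback.
            logger.exception("UDF %s failed for args=%r", func.__name__, args)
            raise
    return wrapper


@log_exceptions
def parse_amount(raw):  # hypothetical UDF body
    return float(raw.strip())
```

In a real job the wrapped function would then be registered as usual, for example with pyspark.sql.functions.udf(parse_amount, DoubleType()). Note that this only catches Python-level exceptions; a hard crash of the interpreter will not reach the except clause.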

Second, register a fault handler, for example with import faulthandler; faulthandler.enable(), as early as possible in both the driver code and the worker code (the UDF). If the cause is a segmentation fault, the handler prints a stack trace that is visible in the logs.
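The fault handler step could look roughly like this. The helper enable_crash_traces and the example UDF double_value are hypothetical names. The sketch assumes that calling the helper at the start of the UDF is early enough in the worker process, which may not hold if the crash happens while a native library is being imported at module level.

```python
import faulthandler


def enable_crash_traces():
    """Enable faulthandler once per process; tracebacks go to stderr on fatal signals."""
    if not faulthandler.is_enabled():
        # Dumps the Python traceback on SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
        faulthandler.enable()
    return faulthandler.is_enabled()


# Driver side: call once at the top of the job script.
enable_crash_traces()


def double_value(value):  # hypothetical UDF
    # Worker side: cheap to call on every row, it only enables the handler once.
    enable_crash_traces()
    return value * 2
```

Because the worker is a separate Python process from the driver, enabling the handler only in the driver script is not enough; it has to run inside the code that executes on the executors as well.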

Both approaches help you to understand the underlying issue.
