
'SparkSession' Object Has No Attribute 'serializer' When Evaluating A Classifier In PySpark

I am using Apache Spark in batch mode. I have set up an entire pipeline that transforms text into TF-IDF vectors and then predicts a boolean class using logistic regression: # Chain

Solution 1:

For posterity's sake, here's what I did to fix this. When I initialized the Spark session and the SQL context, I was doing this, which is not right:

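from pyspark.sql import SparkSession, SQLContext
# Despite its name, `sc` is a SparkSession, and it is passed here as SQLContext's first argument (sparkContext)
sc = SparkSession.builder.appName('App Name').master("local[*]").getOrCreate()
sqlContext = SQLContext(sc)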

This problem was resolved by doing this instead:

# Pass the session's underlying SparkContext and the session itself by keyword
sc = SparkSession.builder.appName('App Name').master("local[*]").getOrCreate()
sqlContext = SQLContext(sparkContext=sc.sparkContext, sparkSession=sc)

I'm not sure why that needed to be explicit, and would welcome clarification from the community if someone knows.
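A possible explanation, offered as a sketch rather than a confirmed diagnosis: the first positional parameter of SQLContext is sparkContext, so SQLContext(sc) stores the SparkSession where a SparkContext is expected. Nothing checks the type at construction time, so the failure only shows up later, when some code path asks that stored object for its serializer attribute. The toy model below uses plain Python stand-ins (FakeSparkContext, FakeSparkSession, FakeSQLContext and describe are hypothetical names, not PySpark classes) to reproduce that mechanism without needing a Spark installation.

```python
class FakeSparkContext:
    """Hypothetical stand-in for a SparkContext: it has a serializer."""
    serializer = "pickle"


class FakeSparkSession:
    """Hypothetical stand-in for a SparkSession: wraps a context, has no serializer."""
    def __init__(self):
        self.sparkContext = FakeSparkContext()


class FakeSQLContext:
    """Hypothetical stand-in for SQLContext: first positional argument is the context."""
    def __init__(self, sparkContext, sparkSession=None):
        self._sc = sparkContext  # stored as-is, with no type check
        self.sparkSession = sparkSession

    def serializer_in_use(self):
        # Later code assumes self._sc is a real context.
        return self._sc.serializer


def describe(ctx):
    """Return the serializer, or the error text if the lookup fails."""
    try:
        return ctx.serializer_in_use()
    except AttributeError as exc:
        return f"AttributeError: {exc}"


session = FakeSparkSession()

# Mirrors SQLContext(sc): the session lands in the sparkContext slot.
wrong = FakeSQLContext(session)

# Mirrors SQLContext(sparkContext=sc.sparkContext, sparkSession=sc).
right = FakeSQLContext(sparkContext=session.sparkContext, sparkSession=session)

print(describe(wrong))
print(describe(right))
```

If this reading is correct, the keyword form works simply because each object ends up in the slot the constructor expects. Naming the session variable spark rather than sc would also make the mix-up harder to repeat.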

