The Spark job fails with an exception like the following while reading Parquet files:
Error in SQL statement: SparkException: Job aborted due to stage failure: Task 20 in stage 11227.0 failed 4 times, most recent failure: Lost task 20.3 in stage 11227.0 (TID 868031, 10.111.245.219, executor 31): java.lang.UnsupportedOperationException: org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainDoubleDictionary
    at org.apache.parquet.column.Dictionary.decodeToLong(Dictionary.java:52)
This java.lang.UnsupportedOperationException occurs when one or more Parquet files in the folder were written with an incompatible schema. In this case, the stack trace shows decodeToLong failing on a PlainDoubleDictionary: a column that Spark expects to contain long values was written as double in at least one file.
To resolve the error, find the offending Parquet files and rewrite them with the correct schema. To identify them, try reading the Parquet dataset with schema merging enabled:
spark.conf.set("spark.sql.parquet.mergeSchema", "true")
spark.read.parquet(path)
If the folder does contain Parquet files with incompatible schemas, the snippet above fails with an error that names the file whose schema does not match, telling you which files need to be rewritten.