Forum Discussion
Access dedicated SQL pool from notebook
I have some notebooks where I use the com.microsoft.spark.sqlanalytics library to fetch data from the dedicated SQL pool. Everything was working fine until a couple of days ago, when we started getting errors that are not very helpful.
The error looks like this:
Py4JJavaError: An error occurred while calling o4062.count.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 21) (vm-00321611 executor 2): java.lang.IllegalArgumentException: For input string: 'null'
The code had been working without issues up until a couple of days ago, and there were no new deployments prior to that. The error occurs when the data is accessed. Here is an excerpt:
dfData = spark.read.option(Constants.DATABASE, "sql_db_pool1").synapsesql(query)
cnt = dfData.count()
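For completeness, the surrounding cell is roughly this (a minimal sketch: the imports follow the connector's documented usage, and the query text here is just a placeholder rather than the real one):

import com.microsoft.spark.sqlanalytics
from com.microsoft.spark.sqlanalytics.Constants import Constants

# Placeholder query; the real one is a plain SELECT against the dedicated pool
query = "SELECT * FROM dbo.some_table"

# Read from the dedicated SQL pool and materialize a count
dfData = (spark.read
          .option(Constants.DATABASE, "sql_db_pool1")
          .synapsesql(query))
cnt = dfData.count()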
The error comes from deep within the library, and there is no way to determine which argument is null.
Has anybody run into an issue like this?
Regards
3 Replies
- ljupche (Copper Contributor)
In my case they changed the backend image for our Synapse workspaces, and the Spark version we were using (3.3) started acting weird. I updated the version to 3.4 and everything is back to normal.
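If you want to double-check which runtime the session actually picked up after changing the pool, a quick sanity check from any notebook cell (using the spark session object that Synapse provides) is:

# Print the Spark runtime version the current session is running on
print(spark.version)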
- ricardo_flores (Copper Contributor)
I'm facing the same issue.
Checked the data in the table and there is no cell with the string value "null".
- ricardo_flores (Copper Contributor)
Hi ljupche,
In my case, changing the Spark version of the notebook solved the problem. I was still using 3.3. After switching to 3.4 the problem vanished.