Forum Discussion

ljupche
Copper Contributor
Mar 29, 2025

Access dedicated SQL pool from notebook

I have some notebooks where I use the com.microsoft.spark.sqlanalytics library to fetch data from the dedicated SQL pool. Everything was working fine until a couple of days ago, when we started getting errors that are not very helpful.

The error is like this:
Py4JJavaError: An error occurred while calling o4062.count.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 21) (vm-00321611 executor 2): java.lang.IllegalArgumentException: For input string: 'null'

The code was working without issues until a couple of days ago, and there were no new deployments before that. The error occurs when the data is accessed. Here is an excerpt:

dfData = spark.read.option(Constants.DATABASE, "sql_db_pool1").synapsesql(query)
cnt = dfData.count()

The error comes from deep inside the library, and there is no way to determine which argument is null.
Has anybody run into an issue like this?

Regards

3 Replies

  • ljupche
    Copper Contributor

    In my case, they changed the backend image for our Synapse workspaces, and the Spark version we were using (3.3) started misbehaving. I updated to Spark 3.4 and everything is back to normal.

  • ricardo_flores
    Copper Contributor

    I'm facing the same issue.
    I checked the data in the table, and there is no cell with the string value "null".

    • ricardo_flores
      Copper Contributor

      Hi ljupche

      In my case, changing the Spark version of the notebook solved the problem. I was still using 3.3. After switching to 3.4, the problem vanished.
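
Both replies point to the Spark runtime version as the trigger. A minimal sketch for checking which version a notebook session is running, assuming `spark.version` returns a dotted string such as "3.3.1". The helper names here are hypothetical, and the 3.4 threshold reflects only the reports in this thread, not a documented fix:

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a dotted version string like '3.3.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_older_than(version: str, threshold: Tuple[int, int] = (3, 4)) -> bool:
    """Return True if the version is below the threshold.

    The default (3, 4) is an assumption taken from this thread:
    Spark 3.3 showed the error, Spark 3.4 did not.
    """
    return parse_major_minor(version) < threshold


# In a Synapse notebook, where `spark` is the active SparkSession:
#     if is_older_than(spark.version):
#         print(f"Running Spark {spark.version}; consider a 3.4 pool.")
```

If the check reports an older runtime, the change itself is made on the Spark pool the notebook is attached to, not in code.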
