Unable to configure log directory of Spark History Server to Storage Blob when deployed on AKS


I am trying to deploy Spark History Server on AKS and want to point its log directory to Azure Blob Storage. To do this, I have put the following configuration in my values.yaml file:

 

wasbs:
  enableWASBS: true
  secret: azure-secrets
  sasKeyMode: false
  storageAccountKeyName: azure-storage-account-key
  storageAccountNameKeyName: azure-storage-account-name
  containerKeyName:  azure-blob-container-name
  logDirectory: wasbs:///test/piyush/spark-history

pvc:
  enablePVC: false
nfs:
  enableExampleNFS: false

 

First, I create the azure-secrets secret using the command below:

 

kubectl create secret generic azure-secrets --from-file=azure-storage-account-name --from-file=azure-blob-container-name --from-file=azure-storage-account-key
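For reference, here is a minimal sketch of preparing the three files that the command above reads. The values shown (mystorageaccount, mycontainer, and the key string) are hypothetical placeholders, not my real ones. kubectl uses each file name as the secret key and the file contents as the value, so printf is used to keep a trailing newline out of the stored value. This is unrelated to the ClassNotFoundException further down; it only documents the setup.

```shell
# Work in a scratch directory so the secret files do not end up in a repo.
workdir=$(mktemp -d)
cd "$workdir"

# File names must match the key names referenced in values.yaml.
# All three values below are hypothetical placeholders.
printf '%s' 'mystorageaccount' > azure-storage-account-name
printf '%s' 'mycontainer' > azure-blob-container-name
printf '%s' 'REPLACE_WITH_ACCOUNT_KEY' > azure-storage-account-key

# Next step: run the kubectl create secret command from this directory.
```

Run from this directory, the kubectl create secret command picks up each file by its relative name.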

 

After that, I run the following commands:

 

helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm install stable/spark-history-server --values values.yaml --generate-name 

 

But while doing so, I get the following error:

 

2020-10-05 19:17:56 INFO  HistoryServer:2566 - Started daemon with process name: 12@spark-history-server-1601925447-57c5476fb-5wh6q
2020-10-05 19:17:56 INFO  SignalUtils:54 - Registered signal handler for TERM
2020-10-05 19:17:56 INFO  SignalUtils:54 - Registered signal handler for HUP
2020-10-05 19:17:56 INFO  SignalUtils:54 - Registered signal handler for INT
2020-10-05 19:17:56 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-10-05 19:17:56 INFO  SecurityManager:54 - Changing view acls to: root
2020-10-05 19:17:56 INFO  SecurityManager:54 - Changing modify acls to: root
2020-10-05 19:17:56 INFO  SecurityManager:54 - Changing view acls groups to:
2020-10-05 19:17:56 INFO  SecurityManager:54 - Changing modify acls groups to:
2020-10-05 19:17:56 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
2020-10-05 19:17:56 INFO  FsHistoryProvider:54 - History server ui acls disabled; users with admin permissions: ; groups with admin permissions
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:280)
        at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azure.NativeAzureFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:364)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:117)
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:86)
        ... 6 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azure.NativeAzureFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
        ... 16 more

 

Any help or suggestions would be appreciated.

Thanks in advance!

 

Regards,

Piyush Thakur

4 Replies
You mean I need to change or rename my values.yaml file to the py.yaml file?
I encountered the same issue. It seems like the library is faulty. I was able to get this up and running by using an Azure file share as a PVC. Refer to this guide to mount PVCs: https://docs.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv

@Piyush_Thakur 

 

Not sure if you are still facing this issue, but you can add the jars below to your Dockerfile:

 

# Add the azure-storage dependency required by hadoop-azure
ADD https://repo1.maven.org/maven2/com/microsoft/azure/azure-storage/2.0.0/azure-storage-2.0.0.jar $SPARK_HOME/jars/

# Add hadoop-azure to access Azure Blob Storage
ADD https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-azure/2.7.3/hadoop-azure-2.7.3.jar $SPARK_HOME/jars/
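After rebuilding the image, it may help to confirm that both jars actually landed in the jars directory before redeploying. The following is only a rough sketch: check_azure_jars is a hypothetical helper name, and it assumes the jars keep their Maven file names (hadoop-azure-*.jar and azure-storage-*.jar).

```shell
# check_azure_jars DIR
# Reports whether the hadoop-azure and azure-storage jars exist in DIR.
# Returns 0 if both are present, 1 if either is missing.
check_azure_jars() {
  dir=$1
  missing=0
  for prefix in hadoop-azure azure-storage; do
    # An unmatched glob stays literal, so ls fails and the jar counts as missing.
    if ls "$dir"/"$prefix"-*.jar >/dev/null 2>&1; then
      echo "found: $prefix"
    else
      echo "missing: $prefix"
      missing=1
    fi
  done
  return "$missing"
}

# Example (run in a shell inside the rebuilt image):
#   check_azure_jars "$SPARK_HOME/jars"
```

If either jar is reported missing, the History Server will most likely fail again with the same ClassNotFoundException for org.apache.hadoop.fs.azure.NativeAzureFileSystem.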