Forum Discussion
Unable to configure log directory of Spark History Server to Storage Blob when deployed on AKS
I am trying to deploy Spark History Server on AKS and want to point its log directory to Storage Blob. To achieve this, I am putting my configs in the values.yaml file as below:-
wasbs:
  enableWASBS: true
  secret: azure-secrets
  sasKeyMode: false
  storageAccountKeyName: azure-storage-account-key
  storageAccountNameKeyName: azure-storage-account-name
  containerKeyName: azure-blob-container-name
  logDirectory: wasbs:///test/piyush/spark-history
pvc:
  enablePVC: false
nfs:
  enableExampleNFS: false
First, I am creating the azure-secrets using the below command:-
kubectl create secret generic azure-secrets --from-file=azure-storage-account-name --from-file=azure-blob-container-name --from-file=azure-storage-account-key
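A side note on the secret files: with `--from-file=<name>`, kubectl reads a file of that name from the current directory and uses the file name as the key, copying the content byte for byte. The sketch below is one way to prepare the three files without a trailing newline ending up inside the values. The account name, container name, and key shown are hypothetical placeholders, and the scratch directory is an assumption, not something the question specifies.

```shell
#!/bin/sh
# Hypothetical placeholder values; substitute your own.
ACCOUNT_NAME="mystorageaccount"
CONTAINER_NAME="mycontainer"
ACCOUNT_KEY="REPLACE_WITH_ACCOUNT_KEY"

# Scratch directory (an assumption) so the key file does not linger in the project tree.
secret_dir="$(mktemp -d)"

# printf '%s' writes the value with no trailing newline; echo would append one,
# and that newline would become part of the stored secret value.
printf '%s' "$ACCOUNT_NAME"   > "$secret_dir/azure-storage-account-name"
printf '%s' "$CONTAINER_NAME" > "$secret_dir/azure-blob-container-name"
printf '%s' "$ACCOUNT_KEY"    > "$secret_dir/azure-storage-account-key"

# Print the byte count of each file so a stray newline is easy to spot.
wc -c "$secret_dir"/azure-*
```

After this, change into `$secret_dir` and run the `kubectl create secret generic azure-secrets ...` command from the question unchanged. This only rules out malformed secret values; it does not address the ClassNotFoundException in the log further down.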
After that, I am running the following set of commands:-
helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm install stable/spark-history-server --values values.yaml --generate-name
But while doing so, I am getting the following error:-
2020-10-05 19:17:56 INFO HistoryServer:2566 - Started daemon with process name: 12@spark-history-server-1601925447-57c5476fb-5wh6q
2020-10-05 19:17:56 INFO SignalUtils:54 - Registered signal handler for TERM
2020-10-05 19:17:56 INFO SignalUtils:54 - Registered signal handler for HUP
2020-10-05 19:17:56 INFO SignalUtils:54 - Registered signal handler for INT
2020-10-05 19:17:56 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-10-05 19:17:56 INFO SecurityManager:54 - Changing view acls to: root
2020-10-05 19:17:56 INFO SecurityManager:54 - Changing modify acls to: root
2020-10-05 19:17:56 INFO SecurityManager:54 - Changing view acls groups to:
2020-10-05 19:17:56 INFO SecurityManager:54 - Changing modify acls groups to:
2020-10-05 19:17:56 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2020-10-05 19:17:56 INFO FsHistoryProvider:54 - History server ui acls disabled; users with admin permissions: ; groups with admin permissions
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:280)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azure.NativeAzureFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:364)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:117)
    at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:86)
    ... 6 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azure.NativeAzureFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
    ... 16 more
Any type of help or suggestion would be appreciated.
Thanks in advance!
Regards,
Piyush Thakur
- Felicia_Ann_Kelley500 (Copper Contributor): try changing out the piyush for py.yaml
- Piyush_Thakur (Copper Contributor): You mean I need to change or rename my values.yaml file to the py.yaml file?
- SarthakAgrawal (Microsoft): I encountered the same issue. Seems like the dll is erroneous. I was able to get this up and running using an Azure file share as a PVC. Refer to this guide to mount PVCs: https://docs.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv
- Renu21 (Copper Contributor):
Not sure if you are still facing this issue, but you can add the jars below to your Dockerfile:
# Add dependency for hadoop-azure
ADD http://repo1.maven.org/maven2/com/microsoft/azure/azure-storage/2.0.0/azure-storage-2.0.0.jar $SPARK_HOME/jars
# Add hadoop-azure to access Azure Blob Storage
ADD http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-azure/2.7.3/hadoop-azure-2.7.3.jar $SPARK_HOME/jars
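The missing class in the stack trace, org.apache.hadoop.fs.azure.NativeAzureFileSystem, is shipped in the hadoop-azure jar, which is why adding the two jars above is a plausible fix. As a rough check, the sketch below looks for both jars in the Spark jars directory when run inside the container. The `has_jar` helper and the `/opt/spark` fallback path are hypothetical and only for illustration; adjust the path to wherever `$SPARK_HOME` points in your image.

```shell
#!/bin/sh
# has_jar DIR PREFIX: succeed if DIR contains a jar whose file name starts with PREFIX.
# (Hypothetical helper, not part of Spark or the Helm chart.)
has_jar() {
  dir="$1"
  prefix="$2"
  for f in "$dir"/"$prefix"*.jar; do
    # If nothing matches, the pattern stays literal and -e fails.
    [ -e "$f" ] && return 0
  done
  return 1
}

# /opt/spark is an assumed fallback for images that do not export SPARK_HOME.
jars_dir="${SPARK_HOME:-/opt/spark}/jars"

for name in hadoop-azure azure-storage; do
  if has_jar "$jars_dir" "$name"; then
    echo "$name: present in $jars_dir"
  else
    echo "$name: MISSING from $jars_dir"
  fi
done
```

If hadoop-azure is reported missing, the History Server cannot resolve the `wasbs://` scheme, which matches the ClassNotFoundException in the original post.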