How can I use NiFi to ingest data from/to ADLS?

marshal.tito01
New Contributor

I would like to use NiFi to connect to ADLS. My scenario is this: NiFi is installed and running on a Windows machine. I want to move data from a local Windows directory to ADLS. I am not using any Hadoop components for now. From ADLS, I then want to move that data to SQL Server, which is also in Azure.
How can I connect NiFi running on Windows to ADLS? All the instructions I found involve configuring core-site.xml files and copying the JARs to a NiFi-specific folder. But I don't have Hadoop running (so I don't have a core-site.xml file). In that case, how can I connect NiFi to ADLS?

Can anyone please share some pointers on how this can be done?

Thanks in advance.

4 Replies
Solution

No Hadoop is needed. For ADLS Gen1 and Gen2 you need a couple of JAR files and a simplified core-site.xml. I am currently working with NiFi 1.9.0 (released Feb 2019).

 

For ADLS Gen1 I am using:

  • azure-data-lake-store-sdk-2.3.1.jar
  • hadoop-azure-datalake-3.1.1.jar

These JARs are available in the Maven Central repository.

My core-site.xml (replace the $< >$ placeholders with your values):

 

<configuration>
<property>
<name>fs.defaultFS</name>
<value>adl://$<adls storage account name>$.azuredatalakestore.net</value>
</property>
<property>
<name>dfs.adls.oauth2.access.token.provider.type</name>
<value>ClientCredential</value>
</property>
<property>
<name>dfs.adls.oauth2.refresh.url</name>
<value>https://login.microsoftonline.com/$<tenant id>$/oauth2/token</value>
</property>
<property>
<name>dfs.adls.oauth2.client.id</name>
<value>$<client id>$</value>
</property>
<property>
<name>dfs.adls.oauth2.credential</name>
<value>$<key>$</value>
</property>
</configuration>
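
Since there is no Hadoop installation to supply a core-site.xml, the file has to be written by hand or generated. The following is a minimal Python sketch, not part of the original answer, that builds the Gen1 file above from four values. The function names `build_core_site` and `gen1_properties` and the sample argument values are hypothetical; the property names and URL patterns are copied from the configuration above.

```python
import xml.etree.ElementTree as ET


def build_core_site(properties):
    """Build a Hadoop-style <configuration> document from a dict of name -> value."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent only exists on Python 3.9+; skip pretty-printing on older versions.
    if hasattr(ET, "indent"):
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


def gen1_properties(account, tenant_id, client_id, credential):
    """Property set for ADLS Gen1, mirroring the core-site.xml shown in the post."""
    return {
        "fs.defaultFS": f"adl://{account}.azuredatalakestore.net",
        "dfs.adls.oauth2.access.token.provider.type": "ClientCredential",
        "dfs.adls.oauth2.refresh.url":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
        "dfs.adls.oauth2.client.id": client_id,
        "dfs.adls.oauth2.credential": credential,
    }


if __name__ == "__main__":
    # Hypothetical sample values; replace them with your own.
    print(build_core_site(gen1_properties("myaccount", "my-tenant-id",
                                          "my-client-id", "my-secret")))
```

Writing the returned string to a file named core-site.xml gives a file that can be referenced from the NiFi processor settings.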

For ADLS Gen2 I am using:

  • hadoop-azure-3.2.0.jar
  • wildfly-openssl-1.0.4.Final.jar

My core-site.xml (replace the $< >$ placeholders with your values):

 

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>abfss://$<container>$@$<account>$.dfs.core.windows.net</value>
  </property>
  <property>
    <name>fs.azure.account.key.adbstorgen2.dfs.core.windows.net</name>
    <value>$<storage key>$</value>
  </property>
  <property>
    <name>fs.adlsGen2.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem</value>
  </property>
  <property>
    <name>fs.abfss.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.adlsGen2.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.Abfs</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.abfss.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.Abfss</value>
  </property>
  <property>
    <name>fs.azure.check.block.md5</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.azure.store.blob.md5</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.azure.createRemoteFileSystemDuringInitialization</name>
    <value>true</value>
  </property>
</configuration>

For the PutHDFS, ListHDFS, and FetchHDFS processors, you just need to set Hadoop Configuration Resources to the path of your core-site.xml and Additional Classpath Resources to the folder that contains the Azure JAR files. Enjoy!
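
Before pointing the processor at these two locations, it can help to confirm that the files are actually there. The following is a small Python sketch, not part of the original answer, that checks a core-site.xml path and a JAR folder. The function name `preflight` and the paths in the usage line are hypothetical; the JAR file names are the Gen2 ones listed above and would need adjusting for Gen1 or other versions.

```python
from pathlib import Path

# JAR file names as listed in the post for ADLS Gen2; adjust for Gen1 or other versions.
GEN2_JARS = ["hadoop-azure-3.2.0.jar", "wildfly-openssl-1.0.4.Final.jar"]


def preflight(core_site_path, jar_folder, required_jars=GEN2_JARS):
    """Return a list of problems; an empty list means all expected files were found."""
    problems = []
    if not Path(core_site_path).is_file():
        problems.append(f"core-site.xml not found: {core_site_path}")
    folder = Path(jar_folder)
    if not folder.is_dir():
        problems.append(f"JAR folder not found: {jar_folder}")
        return problems
    for jar in required_jars:
        if not (folder / jar).is_file():
            problems.append(f"missing JAR: {jar}")
    return problems


if __name__ == "__main__":
    # Hypothetical locations; use the same values you enter in the processor properties.
    for problem in preflight("C:/nifi/conf/core-site.xml", "C:/nifi/azure-jars"):
        print(problem)
```

This only checks that the files exist. It does not verify that NiFi can load them or that the credentials are valid.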

 

Hi @Bruce Nelson,

Thanks for the tips.

I tried to reproduce the steps you mentioned, but NiFi doesn't load the PutHDFS processor properly (it's the only one I'm testing at the moment).
The error I get is:

o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=...] HDFS Configuration error - Configuration property [account name].dfs.core.windows.net not found.: Configuration property [account name].dfs.core.windows.net not found.
org.apache.hadoop.fs.azurebfs.contracts.exceptions.ConfigurationPropertyNotFoundException: Configuration property [account name].dfs.core.windows.net not found.
at org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:342)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:812)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:149)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:108)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3288)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:473)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:225)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor$1.run(AbstractHadoopProcessor.java:434)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor$1.run(AbstractHadoopProcessor.java:431)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.getFileSystemAsUser(AbstractHadoopProcessor.java:431)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.resetHDFSResources(AbstractHadoopProcessor.java:393)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.abstractOnScheduled(AbstractHadoopProcessor.java:251)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:142)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:130)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:75)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:52)
at org.apache.nifi.controller.StandardProcessorNode.lambda$initiateStart$4(StandardProcessorNode.java:1515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:844)

 

I looked around, but I have no idea what it refers to.

I'm currently using NiFi 1.9.1 on Ubuntu 18.04 Server, with the JAR versions you specified.

Could you help me?

Thanks in advance,

Michele

OK, I figured out where the problem was by reading http://docs.wandisco.com/bigdata/wdfusion/adls/.
In Bruce's answer there is a small mistake in the ADLS Gen2 core-site.xml.
The correct one should be (the difference is the account name in the fs.azure.account.key property name):

 

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>abfss://$<container>$@$<account>$.dfs.core.windows.net</value>
  </property>
  <property>
    <name>fs.azure.account.key.$<account>$.dfs.core.windows.net</name>
    <value>$<storage key>$</value>
  </property>
  <property>
    <name>fs.adlsGen2.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem</value>
  </property>
  <property>
    <name>fs.abfss.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.adlsGen2.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.Abfs</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.abfss.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.Abfss</value>
  </property>
  <property>
    <name>fs.azure.check.block.md5</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.azure.store.blob.md5</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.azure.createRemoteFileSystemDuringInitialization</name>
    <value>true</value>
  </property>
</configuration>

Tested using:

  • hadoop-azure-3.2.0.jar
  • wildfly-openssl-1.0.4.Final.jar
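
The mismatch described in this reply can be caught before starting the processor. The following is a minimal Python sketch, not from the thread, that reads a filled-in core-site.xml and checks that a fs.azure.account.key.<account host> property exists for the account host used in fs.defaultFS. The function names are hypothetical, and the check assumes storage-key authentication as in the configuration above; it would report a false problem for other authentication setups.

```python
import re
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse a Hadoop-style <configuration> document into a dict of name -> value."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


def check_abfs_account_key(xml_text):
    """Return a list of problems; empty means the key property matches fs.defaultFS."""
    props = load_properties(xml_text)
    default_fs = props.get("fs.defaultFS") or ""
    # Expected shape: abfs[s]://<container>@<account>.dfs.core.windows.net
    match = re.match(r"^abfss?://[^@/]+@([^/]+)", default_fs)
    if not match:
        return [f"fs.defaultFS is not an abfs/abfss URI: {default_fs!r}"]
    expected = f"fs.azure.account.key.{match.group(1)}"
    if expected not in props:
        return [f"missing property {expected}"]
    return []
```

Run against a file where the key property names a different account than fs.defaultFS, as in the first version of the Gen2 configuration in this thread, the function reports the missing property name. The placeholders must already be replaced with real values, because the literal $< >$ markers are not valid XML.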

Apologies, I missed the account name when cleaning up the core-site.xml to send.

I also have the NiFi *HDFS processors working with RBAC and ACLs for ADLS Gen2; I am working on a write-up.
