SOLVED

How can I use NiFi to ingest data from/to ADLS?

marshal.tito01
New Contributor

I would like to use NiFi to connect with ADLS. My scenario is like this: NiFi is installed and running on a Windows machine. Now I want to move data from my Windows local directory to ADLS. I am not using any Hadoop component for now. From ADLS I then want to move that data to SQL Server, which is also in Azure.
How can I connect the Windows-based NiFi to ADLS? All the instructions I found involve configuring core-site.xml files and copying the JARs into a NiFi-specific folder. But since I don't have Hadoop running (so I don't have a core-site.xml file), how can I connect NiFi to ADLS in that case?

Can anyone please share some pointers on how it can be done?

Thanks in advance.

4 Replies
Solution

No Hadoop is needed .. For ADLS Gen1 and Gen2 you need a couple of JAR files and a simplified core-site.xml. I am currently working with NiFi 1.9.0 (released Feb 2019).

 

For ADLS Gen1 I am using:

  • azure-data-lake-store-sdk-2.3.1.jar
  • hadoop-azure-datalake-3.1.1.jar 
  • These JARs are available in the Maven Central repository
  • My core-site.xml (replace the $< >$ placeholders with your values):

 

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>adl://$<adls storage account name>$.azuredatalakestore.net</value>
  </property>
  <property>
    <name>dfs.adls.oauth2.access.token.provider.type</name>
    <value>ClientCredential</value>
  </property>
  <property>
    <name>dfs.adls.oauth2.refresh.url</name>
    <value>https://login.microsoftonline.com/$<tenant id>$/oauth2/token</value>
  </property>
  <property>
    <name>dfs.adls.oauth2.client.id</name>
    <value>$<client id>$</value>
  </property>
  <property>
    <name>dfs.adls.oauth2.credential</name>
    <value>$<key>$</value>
  </property>
</configuration>

For ADLS Gen2 I am using:

 

  • hadoop-azure-3.2.0.jar
  • wildfly-openssl-1.0.4.Final.jar
  • My core-site.xml (replace the $< >$ placeholders with your values):

 

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>abfss://$<container>$@$<account>$.dfs.core.windows.net</value>
  </property>
  <property>
    <name>fs.azure.account.key.adbstorgen2.dfs.core.windows.net</name>
    <value>$<storage key>$</value>
  </property>
  <property>
    <name>fs.adlsGen2.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem</value>
  </property>
  <property>
    <name>fs.abfss.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.adlsGen2.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.Abfs</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.abfss.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.Abfss</value>
  </property>
  <property>
    <name>fs.azure.check.block.md5</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.azure.store.blob.md5</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.azure.createRemoteFileSystemDuringInitialization</name>
    <value>true</value>
  </property>
</configuration>

For the Put/List/Fetch HDFS flows, you just need to fill in the Hadoop Configuration Resources property with the path to your core-site.xml and Additional Classpath Resources with the folder that holds the Azure JAR files. Enjoy!
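If you want to sanity-check the core-site.xml and JAR folder outside NiFi first, a minimal standalone sketch using the plain Hadoop FileSystem API could look like the one below. This is only an assumption of how you might verify things, not part of the original answer; the file path is a placeholder, and it needs hadoop-common plus the Azure JARs listed above on the classpath.

// Standalone smoke test: reads the same core-site.xml you would hand to NiFi.
// fs.defaultFS in that file decides whether this talks to adl:// (Gen1) or
// abfss:// (Gen2); the client code is identical either way.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AdlsSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same file you would point the Hadoop Configuration Resources property at.
        conf.addResource(new Path("/path/to/core-site.xml"));

        try (FileSystem fs = FileSystem.get(conf)) {
            // List the root of the data lake as a connectivity check.
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }
}

If this prints the contents of the store's root, the same core-site.xml and JAR folder should also work in the *HDFS processors.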

 

Hi @Bruce Nelson,

thanks for the tips.

I tried to reproduce the steps you mentioned, but NiFi doesn't load the PutHDFS flow properly (it's the only one I'm testing at the moment).
The error I get is:

o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=...] HDFS Configuration error - Configuration property [account name].dfs.core.windows.net not found.: Configuration property [account name].dfs.core.windows.net not found.
org.apache.hadoop.fs.azurebfs.contracts.exceptions.ConfigurationPropertyNotFoundException: Configuration property [account name].dfs.core.windows.net not found.
at org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:342)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:812)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:149)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:108)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3288)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:473)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:225)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor$1.run(AbstractHadoopProcessor.java:434)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor$1.run(AbstractHadoopProcessor.java:431)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.getFileSystemAsUser(AbstractHadoopProcessor.java:431)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.resetHDFSResources(AbstractHadoopProcessor.java:393)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.abstractOnScheduled(AbstractHadoopProcessor.java:251)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:142)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:130)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:75)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:52)
at org.apache.nifi.controller.StandardProcessorNode.lambda$initiateStart$4(StandardProcessorNode.java:1515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:844)

 

I tried searching around, but I have no idea what it refers to.

I'm currently using NiFi 1.9.1 and the JARs at the versions you specified, on Ubuntu 18.04 Server.

Could you help me?

Thanks in advance,

Michele

Ok, I figured out where the problem was by reading http://docs.wandisco.com/bigdata/wdfusion/adls/.
In Bruce's answer there is a small mistake in the ADLS Gen2 core-site.xml.
The correct one should be as follows (the difference is the account placeholder in the fs.azure.account.key property name):

 

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>abfss://$<container>$@$<account>$.dfs.core.windows.net</value>
  </property>
  <property>
    <name>fs.azure.account.key.$<account>$.dfs.core.windows.net</name>
    <value>$<storage key>$</value>
  </property>
  <property>
    <name>fs.adlsGen2.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem</value>
  </property>
  <property>
    <name>fs.abfss.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.adlsGen2.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.Abfs</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.abfss.impl</name>
    <value>org.apache.hadoop.fs.azurebfs.Abfss</value>
  </property>
  <property>
    <name>fs.azure.check.block.md5</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.azure.store.blob.md5</name>
    <value>false</value>
  </property>
  <property>
    <name>fs.azure.createRemoteFileSystemDuringInitialization</name>
    <value>true</value>
  </property>
</configuration>

Tested using:

  • hadoop-azure-3.2.0.jar
  • wildfly-openssl-1.0.4.Final.jar
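
As a side note on why the original entry failed: the ABFS driver resolves the storage key from a property whose name embeds the storage account taken from the abfss:// URI, so if the key entry still carries someone else's account name you get the "Configuration property ... not found" error shown earlier. A small illustrative sketch of that rule (values are placeholders, not from this thread):

import org.apache.hadoop.conf.Configuration;

public class AbfsAccountKey {
    public static void main(String[] args) {
        String account = "$<account>$";        // placeholder, as in the XML above
        String storageKey = "$<storage key>$"; // placeholder

        Configuration conf = new Configuration();
        // Programmatic equivalent of the corrected fs.azure.account.key.* entry:
        // the property name must contain the account that the abfss:// URI points at.
        conf.set("fs.azure.account.key." + account + ".dfs.core.windows.net", storageKey);

        System.out.println(conf.get("fs.azure.account.key." + account + ".dfs.core.windows.net"));
    }
}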

Apologies .. I missed the account name when cleaning up the core-site.xml before sending it.

 

I also have the NiFi *HDFS processors working with RBAC and ACLs for ADLS Gen2 as well .. working on a write-up.