Aug 16 2022 07:26 PM
Hi,
I have mounted a container in a Synapse workspace and need to list all the files in its subfolders from a Synapse notebook. The same code works on Windows but not in Synapse. For example, on my Windows machine, when I run the command below:
from pathlib import Path
list(Path("C:/Users/sutripathi/Documents/PySpark/archive").rglob(f'{year}/{month}/{day}/*.csv'))
o/p :
[WindowsPath('C:/Users/sutripathi/Documents/PySpark/archive/2022/08/16/20220804000000_availabilityzones_v0.csv'),
WindowsPath('C:/Users/sutripathi/Documents/PySpark/archive/2022/08/16/20220804010000_availabilityzones_v0.csv')]
But when I use the same approach to get the files in Synapse, it returns an empty list:
list(Path("synfs:/8/mnt/qoscontainer/publish/xxxx/yyyy/zzzzz").rglob('2022/08/16/*.parquet'))
o/p : []
Could you please check and help?
Sep 11 2022 06:23 AM
Hi @sutripathi,
Have you tried the following?
mssparkutils.fs.ls('abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<path>')
For details see https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/microsoft-spark-utilities?pivots=prog...
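Note that pathlib only walks the local file system, and a string starting with synfs:/ is not a local absolute path, which is likely why rglob finds nothing. Also, mssparkutils.fs.ls lists a single directory level, so to cover subfolders you would have to recurse yourself. Below is a minimal sketch of that idea, not a definitive implementation. The helper name list_files_recursive is hypothetical, and it assumes each entry returned by ls exposes name, path and isDir attributes, as in the documented mssparkutils example. The listing function is passed in as a parameter so the helper can also be tried outside Synapse.

```python
from fnmatch import fnmatch


def list_files_recursive(ls, root, pattern="*"):
    """Return sorted paths of all files under `root` whose name matches `pattern`.

    `ls` is any callable that takes a directory path and returns entries
    with `name`, `path` and `isDir` attributes (assumed to match what
    mssparkutils.fs.ls returns).
    """
    matches = []
    stack = [root]  # directories still to be listed
    while stack:
        current = stack.pop()
        for entry in ls(current):
            if entry.isDir:
                # Descend into the subfolder on a later iteration.
                stack.append(entry.path)
            elif fnmatch(entry.name, pattern):
                matches.append(entry.path)
    return sorted(matches)


# Usage in a Synapse notebook (assumption: mssparkutils is available there):
# from notebookutils import mssparkutils
# base = "abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<path>"
# files = list_files_recursive(mssparkutils.fs.ls, base + "/2022/08/16", "*.parquet")
```

Pointing root at the dated folder (for example the 2022/08/16 subfolder) keeps the walk small instead of listing the whole container.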