How to get date part of a filename

Copper Contributor

Hi. I have a requirement where I have to copy multiple files with the naming convention filename_us_ddmmyyyy.csv from a blob container. I have to extract the date part of each of these files and decrement them by one and use it as an additional column in the copy activity.

How do i get each filename and then extract the date part of the filename and feed into the copy activity. I am using GetMetadata and SetVariable.

Please help!!

2 Replies
Assuming you have a blob container with multiple files named in the format filename_us_ddmmyyyy.csv, you can use the following steps to extract the date part of the filename and decrement it by one day:

Create a pipeline in Azure Data Factory and add a "Get Metadata" activity to list all the files in the blob container. In the "Blob container" field, select the name of the container that contains the files you want to copy. In the "Child items" field, enter the file name pattern you want to search for, for example, filename_us_*.csv. In the "Metadata" tab, select the "Child items" property and the "Name" property. This will retrieve the filename of each file in the container.

Add a "Foreach" activity to iterate over the list of files returned by the "Get Metadata" activity. In the "Items" field, select the output of the "Get Metadata" activity. This will iterate over each file in the list.

Inside the "Foreach" activity, add a "Set Variable" activity to extract the date part of the filename and decrement it by one day. In the "Name" field, enter a name for the variable that will store the decremented date. In the "Value" field, enter the following expression to extract the date from the filename and decrement it by one day:

@{addDays('1900-01-01',int(substring(item().name, item().name.IndexOf('_')+4, 2))*365+int(substring(item().name, item().name.IndexOf('_')+2, 2))*30+int(substring(item().name, item().name.IndexOf('_')+6, 4)),-1)}

This expression assumes that the date in the filename is in the format "ddmmyyyy". The expression first extracts the day, month, and year from the filename using the "substring" function. It then converts the day and month to days using simple multiplication and adds them to the year in days. Finally, it subtracts one day using the "addDays" function, which takes a starting date of '1900-01-01' and the number of days to add or subtract.

Map the output of the "Set Variable" activity to an additional column in the copy activity. In the copy activity, select the file you want to copy from the blob container, and in the "Mapping" tab, select the "Additional column" property. In the "Column name" field, enter a name for the additional column that will store the decremented date. In the "Value" field, select the output of the "Set Variable" activity. This will add a column to the copied file that contains the decremented date.

Run the pipeline to copy the files and create an additional column with the decremented date. When you run the pipeline, it will iterate over each file in the blob container, extract the date from the filename, and create an additional column with the decremented date for each file.
Thank you for your reply.