Forum Discussion
Select text from split function
Hi, I hope someone can help (and I hope I can explain this issue).
I created a pipeline to bring in a CSV, store it in blob storage, then modify it and load it into a SQL database.
But while using a data flow to tidy the contents up I've come unstuck. I created a derived column to split rdfsLabel, which contains names in different languages, each separated by a |. The issue is that the order of the languages is not consistent, and it can change at the source each time I run the pipeline.
Can someone give me a pointer on how to populate a column with the text from the string that has @en at the end? Once I have this, I can duplicate it for each of the languages and then create another derived column to trim out the language identifiers.
I'm hoping it's something really silly that I've missed.
Thanks in advance
John
Hey John Dorrian, I tried the expression builder and here you go.
Hope this is what you were looking for and I might have resolved your issue.
If so, kindly mark this reply as an answer or upvote here!
Thanks and regards,
Sunaina Lalwani
- SLalwani, Copper Contributor
John Dorrian, can you share some sample records for this field from the source, along with the final target fields, so I can see how you want the data to be inserted into the destination?
- John Dorrian, Brass Contributor
It's an open data set and the link I'm using is https://data.food.gov.uk/codes/reference-number/authority?_format=csv&_view=with_metadata
Note that Data Factory doesn't like the "@id" title, so to get round this I created a SQL table and then deleted the first row.
I was going to create two more fields, Name and NameCY, to hold the contents of the arrays, but this is where I'm having issues.
Thanks for offering to look
John.
- SLalwani, Copper Contributor
John Dorrian, I can see various values in the specified field, as follows:
'Asiantaeth Safonau Bwyd'@cy|'Food Standards Agency'@en, 'Adur District Council'@en, ...
Please confirm that you just need to extract the substring tagged with the language @en, i.e.,
'Food Standards Agency', 'Adur District Council', ...
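For illustration only, the extraction being confirmed here can be sketched outside Data Factory. The Python below is a minimal sketch, not a data flow expression: the function name label_for_language is hypothetical, and it assumes every entry has the form 'text'@lang with | as the separator, as in the sample values above.

```python
def label_for_language(raw, lang):
    """Return the label tagged with @lang from a '|'-separated field, or None.

    Hypothetical helper for illustration; assumes entries look like 'text'@lang.
    """
    suffix = "@" + lang
    for part in raw.split("|"):
        part = part.strip()
        if part.endswith(suffix):
            # Remove the language tag from the end of the entry.
            text = part[: -len(suffix)]
            # Remove the surrounding single quotes if both are present.
            if len(text) >= 2 and text[0] == "'" and text[-1] == "'":
                text = text[1:-1]
            return text
    # No entry carries the requested language tag.
    return None


sample = "'Asiantaeth Safonau Bwyd'@cy|'Food Standards Agency'@en"
print(label_for_language(sample, "en"))
print(label_for_language(sample, "cy"))
```

Because the match is made on the language tag rather than on position, the order of the entries in the source does not matter. The same idea would apply to the array that split produces in the derived column: pick the element ending in @en instead of relying on its index.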
Regarding your note that Data Factory doesn't like headers starting with '@': rather than creating a SQL table, you can just set 'skip n rows' to 1 in the blob dataset settings.
Regards,
Sunaina