Forum Discussion
HELP! How do I automate data cleaning from a continually updating source
- Jan 04, 2023
(SergeiBaklan Thanks for joining)
That is why I decided to manually download the csv file instead of using the IT teams Power BI dataset, as the source data is in a tabular table form, however I will have to pull the data weekly and drop it into an excel sheet and then use Power Query to automate the cleaning process
NO. Once you have a CSV you don't have to "drop it into an Excel sheet", that step is unecessary
From Excel, you go to Data (tab) > From Text/CSV
you then select your CSV (the Power Query Editor opens) and you begin your cleaning process
Once you have completed the cleaning process and output/returned the query result to an Excel sheet, next time you download a CSV you copy it (overwrite the previous CSV) in the same folder and Refresh your query
Thank you for your quick reply! I have been advised to connect to the data via a Power BI database within my organisation using a Pivot Table. The team who manage the database get it from the 3rd party source and do a basic clean of it before uploading it to the database. I also have the ability to download the raw data as a csv file from the third party source, however the data isn't in great shape and the data team clean it before uploading it to their Power BI database. They just don't have the product knowledge or time to go to the level of correcting the issues I have mentioned.
Unfortunately, I am unable to locate or connect to the database via Power Query, and I believe this is a common issue. I have considered work arounds such a loading the data into a Pivot Table as advised by the data team, and then using Power Query to transform the data to tabular form and edit it.
I am fairly new to Excel, so want to make sure I don't end up with long complex data chains if there is a more simple solution.
If your data source is actually a PowerBI Dataset not sure I can help as I have a limited understanding (mainly due to the fact that I don't have access to something like that) of what it is and how it can be used within Excel. So, instead of saying something wrong I prefer to stop there
However, reading this page I'm under the impression your Data Team could provide you something easier to consume (i.e. an Excel workbook with Table(s)) with Get & Transform than a PivotTable
Despite its name a PivoTable isn't a Table. Connecting Power Query to the worksheet containing a PivotTable is doable but I would take this approach as the (almost) last option. If your Data Team can provide you an Excel workbook with Table(s) or a CSV file that would be better