Forum Discussion
Cleaning messy excel/csv form ERP system
Some interesting questions. I feel your pain. I regularly import financial data from
- a credit card account with American Express
- investment accounts with Fidelity
The former, when I use their XLSX format, is utterly frustrating because whoever designed their excel version chose to make it a "pretty" spreadsheet--merged cells, colors, multiple lines of things like address. As a result, the CSV file is far more useful, but I would love to see them make the Excel useful as an Excel table.
The latter does allow me to export a good Excel table of data, but at the top they have a huge merged cell, followed by a blank row, then the headers, then the data, beginning at row 4. Then at the bottom of the useful table there are two blank rows followed by "Disclosure" and a paragraph of legalese. To import only the data in the table, I've written:
- a formula that finds the row in which the word "Disclosure" appears in that raw export file
- another formula that imports only the table of data from that raw export file. My working workbook uses a dynamic array function to import the data, but knowing that I always want to ignore the first three rows, I set the import function to import $A$4 through the final column ($T) and that row determined by the prior formula. In other words, I'm able to ignore the header and footer that Fidelity is legally obligated to clutter their file with.
Other than that, I often have to convert text dates to Excel dates, or text representations of dollar values into actual numeric fields.