Nov 16 2023 08:36 AM
Hello! I have an excel sheet that I need to connect to a PowerBI - but it needs to be reformatted. I wasn't sure what the best way to format it is.
I have different tabs (all formatted the same way) for each year of data, and then I have a matrix created to have all of my data. I need to be able to input this into a PowerBI and report on Planned OR and Line Result for each Plant and Line and then have the date on the x-axis. (basically create trendlines/analysis for this data in an easier template)
I have attached the excel sheet - thanks in advance!!
Nov 16 2023 10:28 AM
Hi @nattiej101
#1 I have different tabs (all formatted the same way)
Yes and then No. At the bottom of each sheet you have a couple of extra (inconsistent) rows and not at the same positions (row number). This prevents automating the restructuring process ==> extra rows will be deleted in the file you shared
#2 In reality how many sheets/years do you have to restructure?
Nov 16 2023 10:41 AM
If I were to delete those extra rows (they are just totals I believe) would that help? I have attached the newer/modified report.
I have about 10 years of data (2013-2023)
Nov 16 2023 11:15 AM
If I were to delete those extra rows (they are just totals I believe) would that help?
That's exactly what I said. It's actually a must, especially with 10 years to restructure
Attached is 2022 restructured. Could you check and confirm this reflects what you need per sheet/year?
If not, what (exactly) is missing/wrong?
Nov 16 2023 11:22 AM
Nov 16 2023 12:05 PM
That looks perfect - how did you do that/how can I recreate that for all of the other years?
I'm preparing you a workbook to do it BUT we have an issue and you have to take a decision
Issue: In [Planned OR%] and/or [Line Result] you have some N/A (Text value). Such value can't be converted to a Number and as you are going to load the data to PowerBI, those columns must have a Type decimal number (an UnTyped column is Typed Text when data loads in PowerBI or Excel Data Model)
Question: In [Planned OR%] & [Line Result] should I replace the N/As with nothing (null), 0, another numeric value???
Nov 16 2023 12:09 PM
Nov 16 2023 12:34 PM
I would like them to be null values That's what I anticipated 🙂
Attached file contains what you need to combine your sheets, assuming:
- They're named 2013, 2014... 2023 (a Numeric name)
- You have deleted all the extra rows at the bottom of each sheet
#1 Go the the PARAM sheet and update the file path with yours (path to the file containing sheets 2013, 2014, .... 2023)
#2 Switch to sheet PlantsLines
#3 Go to Data (tab) > Queries & Connections (the corresponding pane opens on the right)
#4 Right-click in the green table (still in sheet PlantsLines) > Refresh
then wait and see...
When the query refresh is complete, in the Queries & Connections pane, check that no error is reported under query name PlantLines. This should look like (w/diff. #rows of course):
If you have errors you can follow this article to understand the reasons and hopefully fix them yourself. Worst case scenario upload your file and I'll have a look at it (I'll be off in +/- 1h from now)
Nov 16 2023 12:45 PM
Nov 16 2023 12:53 PM
Here is the error that I'm getting:
I have even attached the file I was trying to connect to with all of the tabs.
Nov 16 2023 01:37 PM
You're not serious 😉 Not all your sheets are formatted the same way - very quick 1st analysis...:
- Nothing in A1-C1 (easy to fix) in some sheets
- 2011-2013 have extra empty rows in the middle
- Before 2016 some columns are not in the same order as what you shared before or even don't exist
...
#1 I'll try to fix issues in 2016-2023 myself as they look quite consistent but are execptions obviously as I get a bunch of errors with them
#2 For 2014-2015 reorder (& rename) the columns so they're in the same order as what you previously shared (and reshare your file) please
#3 For 2011-2013 either forget about those years or create columns (in the right order+name) with dummy data
Nov 16 2023 02:02 PM
2016-2023: Extra rows at the bottom were not deleted
2014-2018: Have empty columns in-between others:
In the meantime 2019-2023 with no error is attached
Nov 17 2023 01:39 AM
SolutionHi @nattiej101
As commited tried to get 2016-2023 data
#1 2019-2023 was previously shared. Included as a separate sheet in PQ_Restructured_PlantLine2016-2023
=> Data source file: Source_OperationRateReport2019-2023
#2 2016-2018 Included as sheet as a separate sheet PQ_Restructured_PlantLine2016-2023:
In October Operation Rate Report (DSM) sheet 2018 has info. from Apr 2013 - Dec 2018
- Exported that sheet to a new workbook
=> Data source file: Source_OperationRateReport2016-2018
- Duplicated the sheet
- Renamed it '2018 AFTER'
- Deleted all columns between Apr 2013 and Dec 2015 (too many inconsistencies* and #DIV/0!, #REF! errors)
- Filled a bunch of empty cells with NA (all colored in orange in the sheet)
- Created a specific query to import the data from sheet '2018 AFTER'
* Going through that sheet I saw an (maybe there are others...) interesting =AVERAGE(H53,H55,H57,H53) that necessarily returns an inaccurate Avg...
+ Some values that seem to be Avg are not calculated and sometimes stored as Text values due to extra space(s) with the Number
#3 Combined/Appended #1 & #2 as a single table named '2016-2023' in PQ_Restructured_PlantLine2016-2023
==> 5,564 records (0 errors)
Hope this helps
Nov 20 2023 05:53 AM
You are incredible and deserve a raise!!! Thank you so so much. You just saved me probably 40 hours worth of reorganizing these files.
I hope you have a blessed rest of your year!! @Lorenzo
Nov 20 2023 12:56 PM
Glad I could help @nattiej101 . Thanks for posting back & Nice EOY...
Nov 17 2023 01:39 AM
SolutionHi @nattiej101
As commited tried to get 2016-2023 data
#1 2019-2023 was previously shared. Included as a separate sheet in PQ_Restructured_PlantLine2016-2023
=> Data source file: Source_OperationRateReport2019-2023
#2 2016-2018 Included as sheet as a separate sheet PQ_Restructured_PlantLine2016-2023:
In October Operation Rate Report (DSM) sheet 2018 has info. from Apr 2013 - Dec 2018
- Exported that sheet to a new workbook
=> Data source file: Source_OperationRateReport2016-2018
- Duplicated the sheet
- Renamed it '2018 AFTER'
- Deleted all columns between Apr 2013 and Dec 2015 (too many inconsistencies* and #DIV/0!, #REF! errors)
- Filled a bunch of empty cells with NA (all colored in orange in the sheet)
- Created a specific query to import the data from sheet '2018 AFTER'
* Going through that sheet I saw an (maybe there are others...) interesting =AVERAGE(H53,H55,H57,H53) that necessarily returns an inaccurate Avg...
+ Some values that seem to be Avg are not calculated and sometimes stored as Text values due to extra space(s) with the Number
#3 Combined/Appended #1 & #2 as a single table named '2016-2023' in PQ_Restructured_PlantLine2016-2023
==> 5,564 records (0 errors)
Hope this helps