Filling in a Table with Intermittent Dates

Copper Contributor

I am trying to fill in a table with large amounts of data. The table I am filling has an unbroken list of dates through the year, as I am tracking a building's energy usage throughout the year. However, the building I am collecting this data from experiences power outages on a frequent basis, putting holes in my source data.

 

I have been manually filtering the data to see where the holes are, and copying and pasting the data into the table (source data is spread through several different files and worksheets), but this is getting quickly tedious as my table grows larger through the months. Is there a way for me to fill in my table with data that will put 0's in dates where there is no data for me to fill in? Using the Get Data options in Excel would be preferrable.

 

I have included the table I am filling in "Collins MS Data", and the kind of data files I am getting as reference.

10 Replies

@lmelton 

 

Based on what I see in the existing sheets, it's clear that you're not a novice with Excel. So let me just ask whether you're familiar with the newest Dynamic Array functions? Notably FILTER, which might well allow you to do the filtering of the data from your source file, bringing the desired data into your various School summary sheets.

 

That function does require the newest version of Excel, so if you don't currently have it, it would be worth while getting it.

 

Here's a YouTube video that serves as a good introduction to FILTER, but also some of the associated concepts and functions. https://www.youtube.com/watch?v=9I9DtFOVPIg

Learn everything about our brand-new Dynamic Arrays and how you can use them to build advanced spreadsheets. Arrays (CSE) have long been present in Excel, but were limited to power users. With Dynamic Arrays we have rebuilt the calc engine, effectively turning all formulas into array formulas ...

@lmelton 

Let me clarify - do you mean you'd like to generate the table as in Oak Hill Data sheet from the bunch of added from time to time files like Wing XXX Load, syncing resulting data by datetime?

@Sergei Baklan 

That is correct. The full monthly calendar in Oak Hill Data would be populated according to the incomplete dates in the data sheets, hopefully.

@lmelton 

I see, thank you. Format/structure of the source files is exactly the same as in sheets, or you modified them somehow?

@Sergei Baklan 

I don't know how to check the minor going ons behind the scenes in Excel, but the format of both dates fields should be mm/dd/yy hh:mm, sampled every 15 minutes. 

@lmelton 

I mean more data structure

- is each source file have only two column?

- is datetime always in first column and data in second or they could be shifted?

- are columns headers same for each file or they could be different?

- If different for sources when are they they same for monthly files for the same source?

- is there any naming convention for file names?

- are time intervals the same for all sources (starts from 00:00 with every 15 minutes if not skipped, or that could 00:10, 00:25 as well, etc.)

@Sergei Baklan 

- is each source file have only two column?

No, the source files will sometimes have 2 or 3 columns, the first will however always be date and time

 

- is datetime always in first column and data in second or they could be shifted?

Date/time is always first in source data, can't be moved before I get my hands on them

 

- are columns headers same for each file or they could be different?

Every column header is different for the data (Wing A, Wing B, Wing C, Gas, Electric, etc.) first column in each worksheet is date.

 

- If different for sources when are they they same for monthly files for the same source?

Monthly files for the same category ( M.S. HVAC, like was linked in this post), will always have the same headers

 

- is there any naming convention for file names?

Yes, they all follow a "Energy Usage" "Month" "Year" format

 

- are time intervals the same for all sources (starts from 00:00 with every 15 minutes if not skipped, or that could 00:10, 00:25 as well, etc.)

Time intervals are always the same (0:00, 0:15, 0:30, 0:45, 1:00, etc.) unless an interval is skipped, but will always be a multiple of 15 minutes.

 

I have discovered a workaround with Pivot Tables. Relating the broken data to a Pivot Table with data for a full calendar with no breaks, forces the broken data to display blanks where there is no data for the power outages (gaps in date/time). I can change the Pivot Table to display zeroes in blank cells, and then copy and paste into the main table. Still time intensive, but nowhere near as bad as before.

 

I would still like to hear if you have any further thoughts on simplifying this process.  

@lmelton 

Thank you for the clarification. I may try to generate Power Query solution if you submit 2-3 raw files with their own names and un-touched columns. It's not necessary to keep all records for each file, max couple of hundreds will be enough, just ones where the breaks in data appear appear.

 

Idea is to to keep all raw files in one folder, adding them from time to time, and use standard From Folder connector to query and transform them landing result into master file. Standard connector from the box won't work, it will require to modify generated by it supporting scripts, but that's feasible.

@Sergei Baklan 

Thanks for your help, here are 2 months of different electrical system use. These have not been cleaned at all, and are as I would receive them from our contractor.

@lmelton 

Main issue here is with source files - most probably they generated automatically and not very correctly. In any case Power Query never ends the process if use them as it is. 

I cleaned them manually - on each sheet with data go to last row (Ctrl+Down), select next row, select till end (Ctrl+Shift+Down), Home->Clear->Clear All, save the file. Size of the files will be significantly less.

 

Here are such file and master one which generates summary file. Raw files are to placed in some folder, it's name shall be defined in parameter sheet of Master file (this file shall be kept separately). After that Data->Refresh All.

 

Some things are hardcoded within power query, please check first if that works as needed, after that we may add some cosmetic.