Collating tricky data

Copper Contributor
Ok team, I’m rally needing help.

I have one years worth of 1/2hourly electricity usage data. I want to condense it to hourly usage then compare hours of each day of the week in respective months.

I haven’t found a good formula to achieve this. I’m hoping some smart cookie out there will know how to crack the code
12 Replies

@KITSONA 

The values in the date/time column are text values. I converted them to real date/time values, then created a pivot table.

See the attached version.

Years and months are shown in the Rows area of the pivot table, hour of the day in the Columns area, and sum of usage in the Values area.

@Hans Vogelaar 

 

Hi Hans,

 

That's an awesome effort (thank you). I didnt think about pivot tables (I was too focused on vlookups).

 

My only question now is how I can take the data provided and turn it into days of the week?

 

I have manually been collating this data below (very time consuming). This is data from the month of February. Below is an example of what I am trying to achieve. See the below graph - the graph shows the average hourly power usage by day.

 

 MONDAYTUESDAYWEDNESDAYTHURSDAYFRIDAYSATURDAYSUNDAY
12:30:00 AM8.167.727.72587.5757.857.76
1:30:00 AM7.87.967.97.857.757.6757.68
2:30:00 AM7.968.087.98.1257.7257.9757.82
3:30:00 AM8.528.47.9257.758.158.2258.12
4:30:00 AM12.0612.4215.119.7517.57.957.66
5:30:00 AM21.9229.934.824.4522.2257.8257.9
6:30:00 AM28.5834.9840.37528.22525.28.0758.52
7:30:00 AM29.933.634.929.67529.0757.97511.8
8:30:00 AM34.9636.136.42531.27529.8259.17511.72
9:30:00 AM37.2234.9635.631.12529.659.510.58
10:30:00 AM36.4834.4833.67530.2528.77510.0259.36
11:30:00 AM37.736.134.3533.72531.0759.9759.06
12:30:00 PM39.5835.5435.57540.4753410.8259.22
1:30:00 PM40.1635.7841.47540.52536.49.57510.56
2:30:00 PM42.8836.439.1541.335.52598.96
3:30:00 PM40.6636.2636.72539.92535.88.1258.06
4:30:00 PM38.433.8832.82538.333.47.758
5:30:00 PM28.782826.17529.8524.657.257.16
6:30:00 PM18.8619.7418.619.12517.4257.858.42
7:30:00 PM15.415.1217.6516.22511.47.4258.28
8:30:00 PM12.3613.4412.4511.1510.57.959.02
9:30:00 PM7.929.468.757.89.5758.0758.32
10:30:00 PM7.88.1487.657.858.1258.48

 

KITSONA_0-1615086709567.png

 

@KITSONA 

In the attached version, I have added a Day of Week column to the source data, and used that in the pivot table.

I created a chart based on the pivot table.

 

Chart.png

@Hans Vogelaar Hi Hans! You may have noticed that I'm a great fan of Power Query (and Power Pivot). That's why I decided to "attack" this question using these tools. The March graph that I produced indicated clear discrepancies compared with yours. Most prominent for Saturdays and Sundays. Your lines came out much higher than mine. Have attached my file.

 

When you converted the text "date/time" to real dates and times, the date that I believe to be Sunday 1 March 2020 became Friday 3 January 2020 in your data (see picture). And that happened for all the dates where the day and month numbers are 12 or below.

Screenshot 2021-03-07 at 13.20.37.png

Up to @KITSONA to decide which approach he wants to take. Yours with corrected dates (probably easier) or mine with PQ/PP (more secure but probably more difficult to grasp if you are not familiar with it).

@Riny_van_Eekelen 

Hi Riny, my conversion is definitely incorrect

I did specify d/m/y but I must have done something wrong.

After correcting it, I get the same results as you.

 

Anyway, PowerQuery is a very good approach.

Thanks @Hans Vogelaar & @Riny_van_Eekelen. I thought I was good on excel but you have both shown me up!! I do still have one concern. It appears that data provided is 1/2 hourly data, not hourly. For example, on 24th February the peak hourly usage is 45.2kW (see below).

 

22.924/02/2020 16:001-Monday
22.324/02/2020 16:301-Monday

 

According to the pivot table the peak is 27.80 on Wednesday at 1500 hours. Have I missed something?

 

 

 

 

MonthsFeb       
 MondayTuesdayWednesdayThursdayFridaySaturdaySundayGrand Total
004.444.454.494.654.814.494.364.52
014.074.334.304.314.534.194.084.25
024.374.494.654.744.794.434.454.55
034.334.364.414.594.534.614.244.43
045.936.595.816.556.195.194.455.82
0513.5013.6015.7116.7916.465.244.3012.27
0617.5818.5617.8418.1416.144.934.2414.04
0718.5319.3818.4818.2517.104.554.5314.54
0820.8720.2819.5819.7419.545.665.2316.01
0922.8521.3521.1822.1121.655.795.1617.35
1023.8821.3921.3422.5821.865.245.6917.65
1123.7723.1823.6824.9424.905.415.5518.95
1224.7723.7324.4623.5624.765.296.0919.15
1325.4825.3825.8425.1525.855.755.0019.97
1425.3224.4425.8925.4425.895.814.9119.87
1526.6126.3827.8027.5426.515.404.6420.90
1624.0923.4825.0125.4424.295.064.6419.04
1715.4215.8616.1316.6914.494.694.8612.69
188.018.568.049.006.414.384.587.03
197.108.617.538.265.354.194.356.51
207.257.557.047.535.264.334.406.23
214.864.434.804.884.254.144.184.52
224.424.214.464.364.244.004.114.26
234.324.154.394.454.264.054.154.26

@KITSONA 

The workbook in my previous reply calculates the average. Here is a version that calculates the sum again. And it filters by year and by month. 

 

S0182.png

Thanks so much Hans. That looks correct.

Moving forward, when processing similar sets of data, I can use the =WEEKDAY(B40,2)&"-"&TEXT(B40,"dddd") formula to sort the date data. The next question is how to run pivot tables or powerquery and to condense the 1/2hourly data into hourly data. Would you have a "template" I could use in a case like this?

Again, thanks so much for your contribution!!

@KITSONA 

Pivot tables have an option to group date fields by years, months, days, hours, minutes and seconds.

You'll see these options when you right-click any of the dates or times in the pivot table and select Group... from the context menu.

S0183.png

For PowerPivot, I refer you to Riny van Eekelen.

Thanks Hans. I will also be interested to see Riny reply too.

Many thanks

@KITSONA Like Hans, my first file averaged the half hourly numbers for each hour. I guess we weren't clear on what you asked for. The attached file contains a revised summary that first sums total KWh for each hour and then averages these hourly totals for each hour of the day. In PQ you can group records on whatever criteria you choose. As long as you have a column in your table to refer to. The advantage of PQ, in my view, is that for data you are dealing with, it's very easy to create new columns by extracting date and or time segments from a date/time column.

For more complex models, where you'd add a Calendar table that creates a date hierarchy that you can then use in pivot tables. So, in summary, with PQ and PP you can achieve everything that you can achieve with regular PT's and much more.

Hi @Riny_van_Eekelen thanks for your contribution and showing me different tools I can use.

 

I have never used PQ before, so may have some learnings to do. 

 

The reason I am collating this data is I am trying to show what size solar system these people need. So the ideal situation is for me to trade off their usage data with production data. Happy to attach that if you are willing to challenge yourself? 

 

After that, the best scenario for me is to use what you have provided as a template for what I need to do for other people.

 

Let me know if you are keen to challenge yourself by overlaying this with generation data?