Forum Discussion
Conditional Data Accumulation for Sport Analytics
I'm working on a dataset (Excel version 16.54 on Mac Big Sur 11.5.2) to eventually analyze the effect of shots, shots on target, corners, fouls, and cards on betting odds for a college assignment. As seen in the file, I have data from all English Premier League games from the last 10 seasons. Each game tells the home and away teams along with their game statistics listed as either home or away. I'm hoping to fill the blank columns with accumulated data from throughout each season. For example, I would like to know how many shots a team like Arsenal had accumulated coming into each game by adding their shot totals from each of the previous games in the season. This is difficult because the teams are listed in two different columns (home or away), so the game statistics also vary based on if the team was the home team or away for each given week. Please let me know of any more clarifications needed and/or if there is a formula to do this, as I don't have the time or will power to manually add the data for 3910 rows of games.
- Riny_van_EekelenPlatinum Contributor
evwill616 It might take 10 -20 minutes copying, pasting and deleting to prepare the data. Not all that bad. Just make sure that the original data set is not changed so that you can always come back to it to check or start from scratch if something went wrong. You will have to create two rows of data for each game, each row identifying if it is for the Home or Away team.
Create two separate sheets with all game data. On each, add a column that will hold a game index number from 1 to 3910. Call one sheet HOME, the other AWAY. On each of them, delete the columns that contain information that do not relate to Home or Away, so that you end up with two data sets. One for the Home teams and one for the Away teams. On each sheet, insert a column marking the teams either as Home or Away. Now you can copy both data sets onto another sheet below each other and sort by game number (then by H or A). Rename the column headers to neutral names (i.e. without a reference to "H" or "A". It will then look something like this (just the first 10 columns shown).
(In Excel for Windows, you could use Power Query for this. It will be more sturdy and less prone to error. And, once set-up you can use it over and over again. So, if you can get your hands on a PC, do that instead.)
By the way, you need to clean-up your data first as the EPL data is inconsistent. From row 3042 and down, all data shifts one column because of the time information inserted there. Either insert blank space in the rows above it or delete the time information from the rows below it. All columns need to be lined up correctly to begin with.
- evwill616Copper ContributorThank you, this worked!
- apksportzfytvCopper Contributor
Conditional data accumulation allows Sportzfy TV to gather real-time sports analytics, enhancing viewer experience by displaying key stats during live matches. This ensures fans stay updated with the most relevant game data.