Combining same ID from multiple rows into single row

Copper Contributor

Hi there, 

 

I have a pretty substantial data set that resembles the following: 

 

IDCity StateRaceGraduationCollegeEnroll_BeganEnroll_EndClass_LevelMajor
12345CincinattiOHWhite2015Vanderbilt08/15/201512/15/2015FreshmanElectrical Engineering
12345CincinattiOHWhite2015Vanderbilt01/15/201605/15/2016FreshmanElectrical Engineering
12345CincinattiOHWhite2015Ohio State University08/15/201612/15/2016SophomoreElectrical Engineering
12345CincinattiOHWhite2015Ohio State University01/15/201705/15/2017SophomoreElectrical Engineering
67890St. LouisMOAsian2016University of Missouri08/15/201612/15/2016FreshmanArt History
67890St. LouisMOAsian2016University of Missouri01/15/201705/15/2017FreshmanArt History
67890St. LouisMOAsian2016University of Missouri08/15/201712/15/2017SophomoreBiology
67890St. LouisMOAsian2016University of Missouri01/15/201805/15/2018SophomoreBiology
67890St. LouisMOAsian2016University of Missouri08/15/201812/15/2018JuniorBiology

 

As you can see, I have multiple rows for one individual. There are pieces of information that will remain the same in each row with that ID, but there are also parts that may differ (such as College, Beginning and End Dates, Class Level, and Major). In these examples I included someone who changed from one college to another (12345), and another individual who changed their major (67890). My goal is to get all of this information into a single row per ID. I only want to have the static pieces of information to be shown once, but then I want a new column for each new piece of information. Does anyone have any idea how to go about this in Excel? I am suspecting that it may be too complex for Excel and I will have to go to code, but any feedback would be helpful.

 

Thank you!

4 Replies

@cma123 

 

As a person who was at one time (I'm retired now nearly two decades) the director of the HR/Payroll database of a major corporation, I have to ask you to step back and consider whether you really want to do this. What you currently have could be described as an accurate history of these students' academic life, including the dates when changes were made. History is better kept as a single row per transaction or per status change. You'd be potentially messing up that history by getting everything for a given student into a single row.

 

So unless there's a compelling reason for doing that (and as you might guess, I'm skeptical about any such reason), I think you'd be better served leaving the history as it is, looking for ways to create a dashboard where you can see the history of any given individual, school changes, major changes, degrees accomplished, etc. 

@mathetes 

While I do appreciate your input and recognize the need to see data overtime as an individual row, the current goal for this project is to identify an individual's first enrollment in a post-secondary institution. All of the original data will be retained. After identifying an individual's first enrollment, we are going to be doing things that include frequencies for m/f and race/ethnicity for 2 and 4-year institutions, as well as frequencies of STEM majors by demographic breakdown. The organization we are working with is tracking students after completion of a high school program, and at this moment in time they only want to have information on the very first post-secondary enrollment that students of their program have. 

@cma123 

 

You wrote:

The organization we are working with is tracking students after completion of a high school program, and at this moment in time they only want to have information on the very first post-secondary enrollment that students of their program have

 

So here's a way to get, very specifically, at that last. Thank you for a specific goal to achieve.

 

My formula sifts through your database to get the first enrollment of each student, and does so primarily using the FILTER function, first to get the earliest enrollment date for each student, then to get the rest of that row for each student, showing the institution and major associated with each student's first enrollment date.

 

The formula looks like this:

=FILTER(Table1[[City ]:[Major]],(Table1[ID]='First Enrollment'!A2)*(Table1[Enroll_Began]=MIN(FILTER(Table1[Enroll_Began],Table1[ID]='First Enrollment'!A2))))

 

I've attached the sample spreadsheet.

 

@cma123 

 

And if you like shorter formulas, I was able to make it a bit shorter using the LET function to eliminate a couple of redundancies.

=LET(Iden,Table1[ID]='First Enrollment'!A2,FiEn,Table1[Enroll_Began],FILTER(Table1[[City ]:[Major]],Iden*(FiEn=MIN(FILTER(FiEn,Iden)))))