Group rows based on similar text

Brass Contributor

I have a very large spreadsheet with one column that start with similar text but pretty much every line is slightly different. Below is a rough idea on what I mean (image two columns):

 

ID   Software
1    Adobe Acrobat 1
2    Adobe Acrobat 2
3    Adobe Acrobat 3
4    Adobe Flash 1
5    Adobe Flash 2
6    Adobe Flash 3
7    Google Chrome 1
8    Google Chrome 2
9    Google Chrome 3


I want to group these based on say the first two or three words in each row. So looking at the above example I would want a group for Adobe Acrobat, a group for Adobe Flash and a group for Google Chrome.

 

Plus I'd like to use subtotal function also to get a count of how many in each group.

 

My issue is I don't know how to get it to group based on part of the text e.g. the first two word.

 

Any help?

4 Replies

@lfk73 

I'd create a helper column:

S0517.png

The formula in C2 is

=TRIM(LEFT(SUBSTITUTE(B2," ",REPT(" ",255),2),255))

Then click in the table and select Subtotal on the Data tab of the ribbon.

Specify that you want to add a subtotal at each change of Group (the helper column) use Count as summary function, and add the subtotal to the Group column. Then click OK.

S0519.png

Result:

S0520.png

@lfk73 The three words in your post "very large spreadsheet" would warrant to look into Power Query as a variant to the formula based solution with sub-totals as suggested by @Hans Vogelaar 

But since your example give only a "rough idea" of your real data, I might be wrong.

@lfk73 

 

I have a very large spreadsheet...I want to group...Plus I'd like to use subtotal function also to get a count of how many in each group

Another option assuming you run a Windows version of Excel >/= 2010: a Power Query + Pivot Table is attached (done w/Excel 365)

 

For Excel 2010/2013 the Power Query add-in must be installed first, paying attention to select the appropriate download (32 vs 64 bit) according to the installed version of Excel

@Riny_van_Eekelen 

 

All relative I suppose.  The initial spreadsheet is 115k lines.  Is that large to you?  To me its the largest I've dealt with.  I crunch it down a lot in various ways.  However maybe the method you suggest would add extra benefits in the crunching that I am doing also.  However for now I don't have the time to invest is trying it out, just looking for a simpler solution to the grouping.