Dec 14 2021 08:16 PM
Hi folks.
Working with a (trimmed) dataset with over 25,000 rows of data.
Objective: Determine the top 100 most common route_taken for each user_type (only two user_type i.e. casual & member).
Details: Over 6,980 number_unique_routes.
Explanation: Trying to figure out either by using a pivot_table (tried but really don't know how to input data correctly) or by using a function/formula how to retrieve the most common routes taken for members and casual users. Would like to first understand how to apply the formula/function to retrieve the most common routes how many times they appear ordered by Z-A (to retrieve the top 100 rows of data). Not sure how to implement the =COUNTIF function in this scenario to count recurring text strings and how I can apply it. An explanation would really wrap my mind around it, thanks in advance.
e.g. something like this maybe? (open to suggestions on better ways to structure this)
*Note with some route_taken you might get an instance where both members and casual riders use that route but members or casual riders could be the more common user. (See frequent_rider_route columns)
I think I made my problem quite clear - if any clarification is needed, please let me know!
Dec 15 2021 12:18 AM
Hi @zgoldflo
here are two options:
Option 1. Pivot table:
Use the route_taken in the rows area, the user_type in the columns area and again the route_taken in the values area of the Pivot table. This will automatically apply a count.
To sort the Pivot table, open the filter icon for route_taken and select "More Sort Options..."
Then choose "Descending" and select "Count of route_taken" from the dropdown list
In addition/alternatively you can apply a value filter to get only the top X values:
Option 2. Formula solution with COUNTIFS
Then do a manual sort on the Grand Total column.
I would recommend the Pivot table, because it's much more flexible