Comparing lists and returning values

New Contributor



I am working on a project where I need to compare two data set from different workbooks. I need to compare data point x of set B with the entirety of set A. If a match is found, I need it to display the value of the matching data point. I need the program to do this for the entire set B. What function(s) would I use to accomplish this?

5 Replies
best response confirmed by colbyclem (New Contributor)

IF you have Excel 365 you can do the whole set using something like:
=XLOOKUP( [list of values in B], [range to lookup in A], [range to return from A], "match not found")
if you don't have Excel 365 then something like =IFERROR(VLOOKUP(setB!A1, setA!$A:$B, 2), "match not found")      and then fill down (unless it is in a table in which case you would use [@value] and it would automatically fill down)
but if the lookup column is after the return value column then you will need to use INDEX(MATCH()) combination instead of VLOOKUP (or HLOOKUP)

@mtarler Thank you! In your formula, is set B and set A just the amount of cells in each set? Also, I would really like to know how these functions work, can you explain why the formula looks like that?

I don't know how deep to dive into the explanation so apologies if I go too basic or not basic enough.
Most functions have parameters, which are values (or references to values) that are passed to the function to do some operation upon.. In many cases those parameters are references to a cell or a range of cells. In many cases an array may be used instead of a cell range, but not always. In Excel, there are a few ways to reference a range:
a) the classic or direct refence. This is often 'A1' style and looks like A1:B10 to reference the range of cells from A1 to B10. An alternative is the 'RC' style which uses Row and Column references but I won't get into here. Also with 'A1' style is relative vs absolute references so A1 is relative while $A$1 is absolute. If you copy a cell with A1 in the formula down 1 row and to the right 1 row the new cell will have B2 in that formula, but if you had $A$1 then it would stay $A$1
b) you can define a Name. A name for a range can be created by highlighting the range and typing in the box to the LEFT of the formula entry box where it normally shows which cell you are in. By typing a name there you define a name for that range. To edit or add other name you use Name Manager. Once you define a name you can use that name in place of a traditional reference.
c) You can use Table References. If you highlight a table of data you can select from the Home tab the button 'Format as a Table'. Although this looks like it is just a quick way to create a color/formatting scheme it is actually much more because that range is now defined as a table object. It is basically a special case of (b) above as you can define the Table Name (under the now visible Table menu) and you can reference the whole table or various parts of that table. Let's say you create a table called Data and have columns called Time, X, Y, Z then you can refer to whole column of Time data using the Data[Time]. As you add data to the table that reference will still be the whole set of time data. You no longer need to reference A:A and have excel looking at tons of blank rows and you don't have to use A2:A100 and worry about fixing it when you data goes past row 100.
so in my response above I used things like: [list of values in B] meaning you put in a reference to the range of all the value in the B set and that reference could be Sheet1!A1:A999 or Table1[Bvalues] or some custom name you made like ColbyDataSet. In the second formula I use: setB!A1 which is technically saying cell A1 on sheet named "setB" but meant it as a generic reference to the first cell in your dataset B. And then I used setA!$A:$B as a generic reference to the whole table of data set A starting with the lookup column to at least the column with the data you want returned. In the above case I followed that with ,2 meaning return the 2nd column but if the return data is in column D then it might be $A:$D, 4 meaning the table is in columns A through D and you want column D to return.
If you have Excel 365 I recommend the 1st formula as you don't need to 'fill down' because it is uses dynamic arrays and creates the whole array of answers using that 1 formula and all the results 'spill' down.
Again, not sure what parts you are confused about but I hope the above helps.
@mtarler This helps! Thank you so much!
I have Excel 2016, so I will have to use the second formula. With the second formula, the VLOOKUP function is basically saying (look for x, inside of y, if you find it return that same value)?
Vlookup format is
VLOOKUP (lookup_value, table_array, col_index_num, [range_lookup])
which is look for "lookup_value" in the first column of this "table_array" and then return value in column "col_index_num"
the [range_lookup] is optional but should be included as 0 (zero) to say you want an exact match, otherwise the list must be in order and then you get the closest value.
more info I recommend many google sites but here is Microsoft's official: