If you are a frequent user of VLOOKUP
, HLOOKUP
or INDEX + MATCH
, then you will find the LOOKUP + SEARCH
formula to classify data incredibly useful.
- Excel level: Power user
- Last updated: October 2020
Are you a returning reader? Skip straight to the Excel download or the Classification formula if you wish.
Classification Problem
A word or group of words within a search phrase may need to be classified. We build out search phrase data models that contain properties of an item. For example a jacket has:
- Colour (Red, Black, Blue, Green, …)
- Materials (Leather, Duffel, …)
- Size (Small, Medium, Large, …)
- Type (Bomber, Sports, …)
Those list values can be used for variable substitution.
Search | Model |
coats | [coats] |
jackets | [jackets] |
blue coat | [colour] [coats] |
black coat | [colour] [coats] |
duffle jacket | [material] [jackets] |
buy duffle jacket | [purchasing] [material] [jackets] |
small brown winter coat | [size] [colour] [season] [coats] |
For a keyword analysis of 50,000+ phrases, being able to filter the data in Excel can be tricky using just the ‘contains’.
Classification Solution
PPC Campaigns are structured as a tree node, whereby a word group belongs to a campaign. This does not allow for a keyword phrase to appear under multiple nodes of the same data set. Instead, a better solution is to classify words rather than entire search phrases so filters can be applied to important properties.
Lists are a powerful way of dynamically classifying the search phrases. Here are some list examples:
- Genders (men’s, man, women’s, woman, ladies)
- Materials (denim, leather, suede)
- Occasions (birthday, christmas, easter)
- Relationships (husband, wife, brother, sister)
- Seasons (spring, summer, autumn, winter)
- Timeframes (monday, tuesday, weekend, weekday, january, february)
Tip: If you are working at an agency, ask your clients to send you lists of data, such as categories and product properties.
Ready to work through the classification formula? Here is an example to get you started. Study the formulas in the sheets along with this tutorial.
The combination of a LOOKUP
vector with SEARCH
allows Excel users to find a group of characters within another group of characters (substring matching).
LOOKUP
is not case sensitive, however it is good practice to keep the data uniform. Make sure you TRIM
the data to remove any spaces at the beginning or end.
Excel ARRAY
s are extremely difficult to use, often cause bugs and very slow, so instead we create a searchable vector.
SEARCH Function
This is done by using the SEARCH
function with the premade list as the “find_text
” attribute, and the string for the “within_text
” attribute that we are trying to lookup. It is important to use absolute cell references.
For example, assuming the keywords we wish to search are on the current tab, in cell G2
the list is on a tab called Genders and the list has 5 items in Column A, the formula would be:
=SEARCH(Genders!$A$1:$A$5,$G2)
This will not work yet, so expect an #N/A
result.
LOOKUP + SEARCH Function
Now wrap that SEARCH
inside a LOOKUP
, using a “lookup_value
” of 2^10
, “lookup_vector
” is the SEARCH
you just performed and the “result_vector
” is the same Genders list as in the the SEARCH
function, with absolute references.
=LOOKUP(2^10,SEARCH(Genders!$A$1:$A$5,$G2),Genders!$A$1:$A$5)
Error Handling
This formula now show substrings that are within the list and an #N/A
value if a match is not found. To make this more friendly, wrap it in an IFERROR
statement so that a hyphen is presented if no match is found.
=IFERROR(LOOKUP(2^10,SEARCH(Genders!$A$1:$A$5,$G2),Genders!$A$1:$A$5),"-")
Grouping Items of a List
If your list contains variations of a topic, such as gender, each item will printed individually, i.e. “women’s“, “woman“, “ladies“. To collect those variations under a single category of “Women’s“, add a column in the Genders tab, Column B with the grouping.
Search Phrase | Model |
Women’s | Women’s |
Women | Women’s |
Ladies | Women’s |
Woman | Women’s |
Men’s | Men’s |
Men | Men’s |
Man | Men’s |
You can now use VLOOKUP
against the value received from LOOKUP + SEARCH
to power VLOOKUP
. See the tab Keyword List (#2) for examples.
The Final Grouping Classification Formula
=VLOOKUP((IFERROR(LOOKUP(2^10,SEARCH(Genders!$A$1:$A$5,$G2),Genders!$A$1:$A$5),"-")),Genders!$A:$B,2,0)
You will need to scroll to see the entire code. You now have a dynamically generated classification through substring matching.
Important Notes
The -
Hack
If you decide to extend any of the classifications, you will need to insert a new line. Even with absolute cell references, they can often shift from $A$1
to $A$2
if you prepend or append to the list. By using a hyphen at the beginning and end, insert inbetween the hyphens to automatically extend the list, without the need for editing the absolute cell references.
Empty Cells
Empty cells are not allowed in the list, it will break the formula.
Substrings
This classification will match substrings. This means if you have a word such as “coat“, it will match “coat“, “coats“, “overcoat“, but also “coating“. This could be a problem if you have substrings with multiple meanings.
Make the List Absolute Cell References Dynamic
This could be done with the use of COUNTA
, INDIRECT
and CONCATENATION
functions. However, this would need to be computed upon each cell unless the value is stored somewhere. This also makes the formula almost unusable and a lot more complicated than just using “The -
Hack“.
Double Classifications
Some listings may have more than one match. For example, a keyword term could be “black and white dresses“, so the first colour “black” would be matched, unless there was a “black and white” classification.
Have any questions? Send me an email or find me on Twitter @ryansiddle.