If you are a frequent user of VLOOKUPHLOOKUP or INDEX + MATCH, then you will find the LOOKUP + SEARCH formula to classify data incredibly useful.

  • Excel level: Power user
  • Last updated: October 2020

Are you a returning reader? Skip straight to the Excel download or the Classification formula if you wish.

A short form video / GIF showing the classification of keyword terms for a set of fashion terms including Black Dress.

Classification Problem

A word or group of words within a search phrase may need to be classified. We build out search phrase data models that contain properties of an item. For example a jacket has:

  • Colour (Red, Black, Blue, Green, …)
  • Materials (Leather, Duffel, …)
  • Size (Small, Medium, Large, …)
  • Type (Bomber, Sports, …)

Those list values can be used for variable substitution.

SearchModel
coats[coats]
jackets[jackets]
blue coat[colour] [coats]
black coat[colour] [coats]
duffle jacket[material] [jackets]
buy duffle jacket[purchasing] [material] [jackets]
small brown winter coat[size] [colour] [season] [coats]

For a keyword analysis of 50,000+ phrases, being able to filter the data in Excel can be tricky using just the ‘contains’.

A short form video / GIF showing ways in which to filter keyword terms in excel in order to help with Keyword Classification

Classification Solution

PPC Campaigns are structured as a tree node, whereby a word group belongs to a campaign. This does not allow for a keyword phrase to appear under multiple nodes of the same data set. Instead, a better solution is to classify words rather than entire search phrases so filters can be applied to important properties.

Lists are a powerful way of dynamically classifying the search phrases. Here are some list examples:

  • Genders (men’s, man, women’s, woman, ladies)
  • Materials (denim, leather, suede)
  • Occasions (birthday, christmas, easter)
  • Relationships (husband, wife, brother, sister)
  • Seasons (spring, summer, autumn, winter)
  • Timeframes (monday, tuesday, weekend, weekday, january, february)

Tip: If you are working at an agency, ask your clients to send you lists of data, such as categories and product properties.

Ready to work through the classification formula? Here is an example to get you started. Study the formulas in the sheets along with this tutorial.

The combination of a LOOKUP vector with SEARCH allows Excel users to find a group of characters within another group of characters (substring matching).

LOOKUP is not case sensitive, however it is good practice to keep the data uniform. Make sure you TRIM the data to remove any spaces at the beginning or end.

Excel ARRAYs are extremely difficult to use, often cause bugs and very slow, so instead we create a searchable vector.

SEARCH Function

This is done by using the SEARCH function with the premade list as the “find_text” attribute, and the string for the “within_text” attribute that we are trying to lookup. It is important to use absolute cell references.

For example, assuming the keywords we wish to search are on the current tab, in cell G2 the list is on a tab called Genders and the list has 5 items in Column A, the formula would be:

=SEARCH(Genders!$A$1:$A$5,$G2)

This will not work yet, so expect an #N/A result.

LOOKUP + SEARCH Function

Now wrap that SEARCH inside a LOOKUP, using a “lookup_value” of 2^10, “lookup_vector” is the SEARCH you just performed and the “result_vector” is the same Genders list as in the the SEARCH function, with absolute references.

=LOOKUP(2^10,SEARCH(Genders!$A$1:$A$5,$G2),Genders!$A$1:$A$5)

Error Handling

This formula now show substrings that are within the list and an #N/A value if a match is not found. To make this more friendly, wrap it in an IFERROR statement so that a hyphen is presented if no match is found.

=IFERROR(LOOKUP(2^10,SEARCH(Genders!$A$1:$A$5,$G2),Genders!$A$1:$A$5),"-")

Grouping Items of a List

If your list contains variations of a topic, such as gender, each item will printed individually, i.e. “women’s“, “woman“, “ladies“. To collect those variations under a single category of “Women’s“, add a column in the Genders tab, Column B with the grouping.

Search PhraseModel
Women’sWomen’s
WomenWomen’s
LadiesWomen’s
WomanWomen’s
Men’sMen’s
MenMen’s
ManMen’s

You can now use VLOOKUP against the value received from LOOKUP + SEARCH to power VLOOKUP. See the tab Keyword List (#2) for examples.

The Final Grouping Classification Formula

=VLOOKUP((IFERROR(LOOKUP(2^10,SEARCH(Genders!$A$1:$A$5,$G2),Genders!$A$1:$A$5),"-")),Genders!$A:$B,2,0)

You will need to scroll to see the entire code. You now have a dynamically generated classification through substring matching.

Important Notes

The - Hack

If you decide to extend any of the classifications, you will need to insert a new line. Even with absolute cell references, they can often shift from $A$1 to $A$2 if you prepend or append to the list. By using a hyphen at the beginning and end, insert inbetween the hyphens to automatically extend the list, without the need for editing the absolute cell references.

Empty Cells

Empty cells are not allowed in the list, it will break the formula.

Substrings

This classification will match substrings. This means if you have a word such as “coat“, it will match “coat“, “coats“, “overcoat“, but also “coating“. This could be a problem if you have substrings with multiple meanings.

Make the List Absolute Cell References Dynamic

This could be done with the use of COUNTAINDIRECT and CONCATENATION functions. However, this would need to be computed upon each cell unless the value is stored somewhere. This also makes the formula almost unusable and a lot more complicated than just using “The - Hack“.

Double Classifications

Some listings may have more than one match. For example, a keyword term could be “black and white dresses“, so the first colour “black” would be matched, unless there was a “black and white” classification.

Have any questions? Send me an email or find me on Twitter @ryansiddle.