(Google Search Console + Regex)|(GSC API)?

by andrewsho


Well, it finally happened. Google Search Console’s Search Performance report now supports regular expressions. This is great because you can now do cool searches like:

[Image: Alan Bleiweiss Asshat Regex]

But while we waited six years to be able to correlate alan bleiweiss with asshat (it’s not causation, folks), our crack TechOps team decided to figure out how to do this ourselves.

We SEO types are never satisfied with the free tools that Google gives us, because with a little extra work there’s usually a way to get something much more useful. Of course, I’m talking about the Google Search Console API. If you haven’t figured out how to use it yet, I strongly recommend you prioritize making it happen. Here are some SEO tech nerd reasons why:

  1. The API already lets you do regex searches! Put the API data into something like Google BigQuery and you can run complex SQL queries over large data sets, much faster than via GSC.
  2. You get way more data from the GSC API than in the GSC interface. Ask Noah Learner for his recent LocalU deck to see what someone creative can do with that data. Here’s an example:
    GSC shows 1,000 keywords for this site for the term “pci”:
    [Image: Google Search Console Queries]
    Now here’s what we get using the API with Google BigQuery:
    [Image: Google Search Console in Google Big Query]
    We found 3,281 terms, more than 2K more than we get via the GSC interface. If you are not using the API, you may be missing roughly 70% of the data (2,281 of 3,281 terms). How can you make decisions based on that?
  3. If you have set up a data warehouse, you can access more than 16 months’ worth of data.
  4. Google’s support of regex is usually pretty limited. For example, the GSC regex currently does not support negative lookaheads, or at least I couldn’t figure out how to do it (RE2, the syntax GSC uses, has no lookarounds at all). We have a client that unintentionally ranks for a ton of porn queries, which obscures a lot of the legit queries. There is no way to filter the GSC data within the interface to remove these queries, which makes it basically unusable.
  5. Pandas (the Python data analysis library) already lets you do fancier stuff natively, like data frames and pivot tables, and lets you compare things against each other within a data frame.
  6. That means you can compare two regex filters against each other, in GBQ or Pandas, to see which subset is performing better.
  7. With GSC data loaded into Google BigQuery, you can use analytic functions within Postgres to easily compare time frames and see where growth is (or isn’t) in your query corpus over time. By segmenting the data into buckets, you can then suss out details that are otherwise lost to GSC GUI users. For example, using a LAG() + OVER clause, you can generate a report that shows the % month-over-month change in clicks and impressions for a group of queries, like:

[Image: GSC Month Over Month]
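To make points 4 and 6 concrete, here is a minimal Pandas sketch of excluding unwanted queries (the job a negative lookahead would do) and then comparing two regex segments against each other. Everything in it is an assumption for illustration: the sample rows, the column names, the `UNWANTED` and `SEGMENTS` patterns, and the helper functions `exclude` and `compare_segments` are all hypothetical, not our production code.

```python
import pandas as pd

# Hypothetical sample of GSC API rows (query, clicks, impressions); not real data.
rows = pd.DataFrame(
    {
        "query": [
            "pci compliance checklist",
            "pci dss requirements",
            "what is pci",
            "xxx pci video",
            "pci compliance cost",
        ],
        "clicks": [40, 25, 10, 3, 12],
        "impressions": [900, 700, 1500, 4000, 300],
    }
)

# Hypothetical patterns; swap in your own.
UNWANTED = r"\b(?:xxx|porn)\b"
SEGMENTS = {
    "compliance": r"\bcompliance\b",
    "dss": r"\bdss\b",
}


def exclude(df, pattern):
    """Drop rows whose query matches the pattern (stands in for a negative lookahead)."""
    matches = df["query"].str.contains(pattern, case=False, regex=True)
    return df[~matches]


def compare_segments(df, segments):
    """Total clicks, impressions, and CTR for each named regex segment."""
    summary = {}
    for name, pattern in segments.items():
        hits = df[df["query"].str.contains(pattern, case=False, regex=True)]
        clicks = int(hits["clicks"].sum())
        impressions = int(hits["impressions"].sum())
        summary[name] = {
            "clicks": clicks,
            "impressions": impressions,
            "ctr": clicks / impressions if impressions else 0.0,
        }
    return pd.DataFrame.from_dict(summary, orient="index")


clean = exclude(rows, UNWANTED)
report = compare_segments(clean, SEGMENTS)
print(report)
```

Inverting a boolean mask with `~` gets you the "does not match" filter without needing lookahead support in the regex engine at all, which is the whole appeal of doing this outside the GSC interface.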

For all you Postgres geeks, the code for this month-over-month report is available on the LSG GitHub.
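If you live in Pandas rather than SQL, the same idea can be sketched with a grouped `shift`, which plays the role of `LAG()` over a partition. This is not the code from our repo, just a rough equivalent under assumptions: the monthly totals, the column names, and the `month_over_month` helper are hypothetical.

```python
import pandas as pd

# Hypothetical monthly totals per query group; not real GSC data.
monthly = pd.DataFrame(
    {
        "query_group": ["brand", "brand", "brand", "pci", "pci", "pci"],
        "month": ["2021-01", "2021-02", "2021-03"] * 2,
        "clicks": [100, 120, 90, 50, 75, 75],
        "impressions": [1000, 1100, 1210, 400, 500, 450],
    }
)


def month_over_month(df, metrics=("clicks", "impressions")):
    """Add <metric>_mom_pct columns: % change vs. the previous month within each group."""
    out = df.sort_values(["query_group", "month"]).reset_index(drop=True)
    for metric in metrics:
        # shift(1) within each group mirrors
        # LAG(metric) OVER (PARTITION BY query_group ORDER BY month).
        previous = out.groupby("query_group")[metric].shift(1)
        out[f"{metric}_mom_pct"] = ((out[metric] - previous) / previous * 100).round(1)
    return out


report = month_over_month(monthly)
print(report)
```

The first month in each group has no previous row, so its change comes out as NaN, the same way `LAG()` returns NULL for the first row of a partition. A previous month with zero clicks would produce an infinite value, so you may want to handle that case before reporting.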

If you have just spent a few hours trying to figure out RE2 regex, do yourself a favor. Find someone who can hook you up to the GSC API. You’ll be happier, as will your SEO.

