I just got back from #TechSEOBoost and spent a lot of time engaged in amazing conversations about data analysis, math and, very importantly, sharing data. So, after thinking about it, I decided I’m going to open source the data we used for the 2017 Local SEO Ranking Factors.
This data is pretty interesting, and honestly would be pretty expensive to put together yourself. It’s ~150k rows of data (each row representing a different business listing on Google My Business). That data was scraped by Places Scout and joined with a bunch of their own data, as well as link API data from Ahrefs and Majestic. All in all there are ~150 data points per listing/business, and you can find out more about them here in this data dictionary.
For those curious what we did with the data, we employed two statistical methods. First, Kendall’s Tau-b was used to analyze ordinal variables (continuous or integer independent variables), while the Kruskal-Wallis test was used for categorical independent variables. These tests were run before we started doing more complicated linear regressions and other modeling, so I’m kinda excited to see what people will do with the data.
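If you want a feel for what those two tests compute before you dig into the real file, here is a minimal sketch in plain Python. To be clear, this is not the code we ran for the study: the numbers are toy values I made up, and names like rank_position, review_count, ranks_claimed and ranks_unclaimed are hypothetical, not columns from the data dictionary.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's Tau-b: rank correlation with a correction for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Collect the 1-based positions of each distinct value so tied
    # values can share the average of the ranks they occupy.
    positions = {}
    for i, v in enumerate(pooled, start=1):
        positions.setdefault(v, []).append(i)
    avg_rank = {v: sum(p) / len(p) for v, p in positions.items()}
    h = 12 / (n * (n + 1)) * sum(
        sum(avg_rank[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    ties = sum(len(p) ** 3 - len(p) for p in positions.values())
    correction = 1 - ties / (n ** 3 - n)
    return h / correction if correction else float("nan")


# Toy data (made up): does review count move with ranking position?
rank_position = [1, 2, 3, 4, 5]
review_count = [50, 40, 40, 10, 5]
tau = kendall_tau_b(rank_position, review_count)

# Toy data (made up): ranking positions split by a categorical flag.
ranks_claimed = [1, 2, 4]
ranks_unclaimed = [3, 5, 6]
h = kruskal_wallis_h([ranks_claimed, ranks_unclaimed])

print(f"tau-b = {tau:.4f}, H = {h:.4f}")
```

This only gives you the test statistics, not p-values. On the real ~150k rows you would almost certainly reach for scipy.stats.kendalltau and scipy.stats.kruskal instead, which report p-values too and are much faster than the pairwise loop above.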
So, why am I doing this? Well, first, as someone who is a constant critic of the way other people conduct research, I felt it was time to put up instead of shutting up (as people who know me know, I’m not very good at that). Also, I just spent a lot of time getting help from amazing members of the community. I have also had the benefit of having people like Andrew, who constantly gave me free help for basically no reason, long before we started working together. There are some amazing parts of this community to counteract the ones that aren’t so great, and I’m gonna start contributing more there.
The data is at the hyperlink below, in a Google Cloud bucket:
2017 Local SEO Ranking Factors Data
Also, I have the forthcoming analysis for the 2019 Local SEO Ranking Factors, which will be going live later this week! So much data!
3 Response Comments
Am waiting for your forthcoming Local SEO Guide 2019.
Dan – you’re too kind. This is AWESOME! I will be evaluating this data for a long time!
I truly think there are a lot of ways that the SEO industry can drastically improve, instead of relying on outdated, easy-to-implement strategies. We need more information retrieval specialists and mathematicians evaluating SEO.
Dang what a data dump. I looked at the stats methods and they’re both way over my head. So I’ll just enjoy the results of the tests and conclusions 😀
Us more-in-the-trenches SEOs live off of the results of this stuff, so we appreciate it.
Looking forward to 2019’s