How To Find a Domain’s # of Indexed Pages In Google Post-Caffeine
In the olden days, as in before this week, you could get an idea of how many pages you had in Google’s index by searching “site:<yourdomain>”. The results page would say something like “results 1-10 of 1,390,000”, which, while not entirely accurate, gave you a general idea of how well indexed your site was. Now, with the official launch of Google Caffeine (update: I stand corrected, this is not a Caffeine issue but a new GOOG UI issue that I neglected to stay on top of; thanks Rhaghavan), the site: query no longer displays the total number of results (update: at least it doesn’t for me, but as you can see in the comments, others have not experienced this yet).
While many people were unduly obsessed with this number, it did have its uses. Big swings in the reported number, say from 10,000,000 to 236,000, were scary but irrelevant, whereas small changes seemed to be more in sync with SEO problems or fixes.
So if you still want to find out how many pages your domain has in the index, how do you do it?
- Sign up for Google Webmaster Tools (GWT) and submit XML sitemaps covering every URL on your domain. The Sitemaps report in GWT will then show the number of indexed URLs from your sitemaps (btw, it’s not clear that this number is accurate either). My guess is that getting more XML sitemaps submitted was one of the primary reasons GOOG stopped reporting this number. That, and maybe saving the bandwidth from all of those site: queries that nervous site owners ran all day long.
- If you don’t want to give GOOG your data via GWT, you can still do a fake site: query by using “inurl:<yourdomain>”. Make sure you don’t use “www” in the query (e.g. inurl:localseoguide.com). This isn’t a perfect query: other sites that incorporate your domain into their URLs will show up too (e.g. www.alexa.com/siteinfo/localseoguide.com), but for most sites this shouldn’t be a huge number of URLs. It’s hard to judge how accurate this query is, but I have tried it for several client sites and it seems to square up pretty well with how many pages they appear to have.

If anyone has any other ideas, feel free to add them to the comments and/or put them on your blog and link back here, and it will show up in the trackbacks.
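For anyone who wants to script the two approaches above, here is a rough Python sketch. It is only an illustration, not an official tool: the helper names `build_sitemap` and `inurl_query` are made up for this example, the sitemap contains only the bare `<loc>` entries from the sitemaps.org format, and the page URLs you feed in are assumed to come from your own crawl or CMS.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by the sitemaps.org XML format.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (hypothetical helper) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def inurl_query(domain_or_url):
    """Build the inurl: query (hypothetical helper), dropping scheme, path and 'www.'."""
    text = domain_or_url.strip()
    if "//" not in text:
        # urlparse only recognizes the host when the string starts with '//'.
        text = "//" + text
    host = urlparse(text).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return "inurl:" + host


if __name__ == "__main__":
    pages = ["http://www.localseoguide.com/", "http://www.localseoguide.com/about/"]
    print(build_sitemap(pages))
    print(inurl_query("http://www.localseoguide.com/"))
```

You would save the sitemap output to a file, upload it to your site, and submit it in GWT; the second function just gives you the query string to paste into the search box.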