‘Index bulk’ is one of the most common issues SEOs see on larger websites.
It happens innocently over time — putting up a new marketing landing page over here, forgetting to take down an old product page over there, and writing a couple of very similar blog articles.
Next thing you know, BAM! You have an excess of pages indexed by Google that you never wanted there in the first place. Where do you begin?
Let’s take it from the top…
What is index bulk?
Index bulk refers to excess pages from your website showing up in Google’s index. Examples include:
- Indexed pages that are no longer live on your site and therefore shouldn’t show up in results. For example, a 404 page that still appears in search results.
- Indexed pages that are not useful. Such a page may have little content, few keyword rankings or backlinks, or be a near duplicate of another page that is performing well.
- Indexed pages that were never meant to be live in the first place, such as a draft page that was accidentally pushed live.
Why is index bulk bad?
Every website has what is referred to as a crawl budget, meaning the limited amount of time and resources that Googlebot will spend crawling the pages of your site before moving on to another site.
“Asking Googlebot to crawl too many un-useful pages on your website lessens the size of your crawl budget, while simultaneously harming your site’s ability to perform well in search and grow its organic traffic.”
Tyler Emerson, SEO Lead at Entertainment Weekly
The most disastrous thing that can happen is that your website gets hit by Google’s Panda algorithm (often called a ‘Panda penalty’), which demotes sites that have too many low-quality pages in Google’s index!
You are ‘asking’ Googlebot to crawl and index un-useful pages any time you:
- Neglect Google Search Console errors
- Don’t perform a routine ‘site:’ search to spot check problems
- Don’t check for broken links (internal and external)
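Part of this routine can be automated. Here is a minimal Python sketch, assuming you already have a list of URLs to test (for example, from a crawl export). The function names `fetch_status` and `find_broken` are made up for illustration and do not come from any SEO tool.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for url, or None if the request fails.

    Uses a HEAD request to avoid downloading the body. Some servers reject
    HEAD, so a real audit may need to fall back to GET.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; keep the code.
        return error.code
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and so on.
        return None


def find_broken(statuses):
    """Given a dict of URL -> status code (or None), return the URLs to fix."""
    return sorted(
        url for url, code in statuses.items() if code is None or code >= 400
    )
```

A typical use would be to build the dictionary with `{url: fetch_status(url) for url in urls}` and pass it to `find_broken`, then review the flagged URLs by hand.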
Google is currently indexing about 83,700 pages on Fox.com.
Yet, some of the pages that appear in Google’s index are for old show episodes that are no longer available to watch on the site. (Screenshot from page 10 of search results.)
The page serves a 200 OK status code. However, when users click on the listing, the page says the episode isn’t available. The end result is poor UX for people looking to watch the show, higher bounce rates, and lower SEO authority for the site as a whole.
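Pages like this, which return 200 OK while telling the visitor the content is gone, are often called soft 404s. A rough way to spot them is to look for an ‘unavailable’ message in the body of pages that return 200. The sketch below assumes you have already fetched each page’s status code and text; the phrases in `UNAVAILABLE_PHRASES` are hypothetical and would need to match the wording your own site uses.

```python
# Hypothetical wording; replace with the messages your site actually shows.
UNAVAILABLE_PHRASES = (
    "episode is not available",
    "no longer available",
    "page not found",
)


def looks_like_soft_404(status, body_text, phrases=UNAVAILABLE_PHRASES):
    """Flag pages that return 200 OK but whose text says the content is gone."""
    if status != 200:
        # A real 404 or 410 is already telling search engines the truth.
        return False
    text = body_text.lower()
    return any(phrase in text for phrase in phrases)
```

This is only a heuristic: a page that merely mentions one of the phrases will also be flagged, so treat the results as a list to review rather than a final verdict.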
How do I fix index bulk?
There are four ways to fix index bulk on a website:
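A page you want to keep on your site, but leave out of search results:
- Add a ‘noindex, nofollow’ meta tag to the page you want left out.
- Add a ‘rel canonical’ tag to the page you want left out, pointing to the version of the page you do want indexed.
- Add a ‘disallow’ rule in your robots.txt file for the page you want left out. Keep in mind that a disallow rule blocks crawling rather than indexing, so a page that is already indexed may not drop out right away.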
Learn more about the right solution for your needs here.
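To make the three options above more concrete, here is a small Python sketch that reads a page’s HTML and a robots.txt file and reports which of these signals are present. It uses only the standard library. The sample page, the URLs, and the names `IndexSignalParser` and `index_signals` are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical examples of what each signal looks like.
SAMPLE_PAGE = """
<head>
  <meta name="robots" content="noindex, nofollow">
  <link rel="canonical" href="https://www.example.com/shows/">
</head>
"""
SAMPLE_ROBOTS_TXT = """
User-agent: *
Disallow: /drafts/
"""


class IndexSignalParser(HTMLParser):
    """Collects robots meta directives and the canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots += [p.strip().lower() for p in content.split(",") if p.strip()]
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")


def index_signals(html, url, robots_txt, user_agent="Googlebot"):
    """Report the noindex, canonical, and robots.txt signals that apply to url."""
    parser = IndexSignalParser()
    parser.feed(html)
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return {
        "noindex": "noindex" in parser.robots,
        "canonical": parser.canonical,
        "crawl_blocked": not rules.can_fetch(user_agent, url),
    }
```

Calling `index_signals(SAMPLE_PAGE, "https://www.example.com/drafts/old-page", SAMPLE_ROBOTS_TXT)` reports all three signals at once. In practice you would usually pick one: if a page is blocked in robots.txt, Googlebot cannot crawl it to see a noindex tag on it.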
A page that you want left out of search results, as well as taken off your site:
- Create a 301 redirect to a relevant, helpful page.
Using the example from above, we would create a 301 redirect from the old Simpsons episode page to the main show page for The Simpsons.
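How you set up a 301 depends on your server or CMS, and most sites will do it in their server config or a redirect plugin rather than in code. Purely as an illustration of the idea, here is a minimal Python sketch with a redirect map and a standard-library request handler. The paths in `REDIRECTS` are hypothetical, not Fox.com’s real URLs.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical paths: retired page -> relevant, helpful replacement.
REDIRECTS = {
    "/the-simpsons/old-episode": "/the-simpsons",
}


def resolve(path):
    """Return (status, location): 301 plus a target for retired pages, else 200."""
    # Treat "/page" and "/page/" as the same URL.
    target = REDIRECTS.get(path.rstrip("/") or "/")
    if target:
        return 301, target
    return 200, None


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers GET requests, sending a permanent redirect for retired pages."""

    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location:
            self.send_header("Location", location)
        self.end_headers()
        if status == 200:
            self.wfile.write(b"Page content goes here")
```

To try it locally, you could serve the handler with `http.server.HTTPServer(("localhost", 8000), RedirectHandler).serve_forever()` and request the old path. Keeping the old-to-new mapping in one place also makes it easier to review your redirects later.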
What are the benefits to finding and fixing instances of index bulk on my website?
The primary benefit of finding and fixing index bulk on a website is that it encourages Googlebot to spend more time on your site’s most useful and important pages, rather than sifting through mountains of website gunk until it gets frustrated and leaves.
In turn, this will help improve the quantity and quality of your organic keyword rankings and traffic, leading to more opportunities to connect with your target audience and get them to convert.
Have a question or another tip? Leave them in the comments.
SEO Lead at FOX Networks Group. Founder of TEKKI.digital blog. Contributing author at SearchEngineJournal.com.