
Prevent Search Engines From Crawling and Indexing Website or Specific Website Pages
As we all know, the key to search engine optimization is having search engines crawl and index your website.
Today, we’re going to discuss the opposite: how not to allow search engines to crawl and index your website or a specific page.
Maybe you have duplicate content throughout your website that you don’t want to delete, yet you also don’t want search engines to index it and penalize your website.
Or have you ever wanted to have a private page that is not visible to search engines but made available to users?
For instance, let’s say you want a quick and simple page that withholds content from users until they give their email address.
This is a very popular approach for those trying to build a mailing list: they offer a free resource of some sort in exchange for your email address.
From complex registrations requiring usernames and passwords to password-protecting files in a directory using .htaccess to the simple use of meta tags on pages or robots.txt files, there are many ways to hide pages from users and search engines.
Below is a simple example, derived from A Practical Guide To Effective SEO, that gives you a quick and easy way to keep certain pages or entire websites from being indexed by search engines.
Be sure to include the following line of meta code in the head of each page you want to keep search engines from indexing:
<meta name="robots" content="noindex,nofollow,noarchive" />
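As a minimal sketch of where the tag goes, assuming a simple static page (the title and body text here are placeholders, not taken from any real site):

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Ask compliant search engines not to index this page, follow its links, or cache it -->
  <meta name="robots" content="noindex,nofollow,noarchive" />
  <title>Free Guide Download (placeholder title)</title>
</head>
<body>
  <p>Placeholder content for the private page.</p>
</body>
</html>
```

Note that the attribute values use plain straight quotes; curly quotes pasted from a word processor will keep the tag from being read correctly.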
However, this is not a foolproof method. The “noindex” value is only a request, so it keeps *some*, not all, search engines from indexing the page; a crawler that chooses to ignore it can still index the page.
In addition, “nofollow” tells search engines not to follow any links on the page, and “noarchive” asks them not to show a cached copy of it. Be careful with nofollow: you may want to change it to follow so that bots and engines still follow links from the noindexed page.
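If you do want links on the hidden page to be followed, the tag might look like the sketch below. Keeping noarchive here is an assumption; drop it if you don’t mind a cached copy being shown.

```html
<!-- Keep the page out of the index, but let crawlers follow its links -->
<meta name="robots" content="noindex,follow,noarchive" />
```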
Also, noindex does not stop users from simply forwarding the link to others for direct access to the page; however, there are simple coding techniques to stop direct access.
Remember, the meta tag method is a simple, less expensive option if you have a static website or use a content management system and don’t care to pay the development costs for password-protected portals or plugins.
If you use a content management system (CMS) like WordPress, Joomla, or Drupal, you’ll want to programmatically bulk noindex/nofollow pages or use their respective SEO plugins, which offer you an easy interface for limiting which pages are crawled and indexed by search engines.
In closing, there are a number of other methods, such as using a robots.txt file and .htaccess, which we will cover later.