Transfer Domain Names

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Tuesday, 6 October 2009

Reunifying duplicate content on your website

Posted on 15:14 by Unknown
Handling duplicate content within your own website can be a big challenge. Websites grow; features get added, changed and removed; content comes—content goes. Over time, many websites collect systematic cruft in the form of multiple URLs that return the same contents. Having duplicate content on your website is generally not problematic, though it can make it harder for search engines to crawl and index the content. Also, PageRank and similar information found via incoming links can get diffused across pages we aren't currently recognizing as duplicates, potentially making your preferred version of the page rank lower in Google.

Steps for dealing with duplicate content within your website
  1. Recognize duplicate content on your website.
    The first and most important step is to recognize duplicate content on your website. A simple way to do this is to take a unique text snippet from a page and to search for it, limiting the results to pages from your own website by using a site:query in Google. Multiple results for the same content show duplication you can investigate.
  2. Determine your preferred URLs.
    Before fixing duplicate content issues, you'll have to determine your preferred URL structure. Which URL would you prefer to use for that piece of content?
  3. Be consistent within your website.
    Once you've chosen your preferred URLs, make sure to use them in all possible locations within your website (including in your Sitemap file).
  4. Apply 301 permanent redirects where necessary and possible.
    If you can, redirect duplicate URLs to your preferred URLs using a 301 response code. This helps users and search engines find your preferred URLs should they visit the duplicate URLs. If your site is available on several domain names, pick one and use the 301 redirect appropriately from the others, making sure to forward to the right specific page, not just the root of the domain. If you support both www and non-www host names, pick one, use the preferred domain setting in Webmaster Tools, and redirect appropriately.
  5. Implement the rel="canonical" link element on your pages where you can.
    Where 301 redirects are not possible, the rel="canonical" link element can give us a better understanding of your site and of your preferred URLs. The use of this link element is also supported by major search engines such as Ask.com, Bing and Yahoo!.
  6. Use the URL parameter handling tool in Google Webmaster Tools where possible.
    If some or all of your website's duplicate content comes from URLs with query parameters, this tool can help you to notify us of important and irrelevant parameters within your URLs. More information about this tool can be found in our announcement blog post.

What about the robots.txt file?

One item which is missing from this list is disallowing crawling of duplicate content with your robots.txt file. We now recommend not blocking access to duplicate content on your website, whether with a robots.txt file or other methods. Instead, use the rel="canonical" link element, the URL parameter handling tool, or 301 redirects. If access to duplicate content is entirely blocked, search engines effectively have to treat those URLs as separate, unique pages since they cannot know that they're actually just different URLs for the same content. A better solution is to allow them to be crawled, but clearly mark them as duplicate using one of our recommended methods. If you allow us to crawl these URLs, Googlebot will learn rules to identify duplicates just by looking at the URL and should largely avoid unnecessary recrawls in any case. In cases where duplicate content still leads to us crawling too much of your website, you can also adjust the crawl rate setting in Webmaster Tools.

We hope these methods will help you to master the duplicate content on your website! Information about duplicate content in general can also be found in our Help Center. Should you have any questions, feel free to join the discussion in our Webmaster Help Forum.

Posted by John Mueller, Webmaster Trends Analyst, Google Zürich
Email ThisBlogThis!Share to XShare to Facebook
Posted in crawling and indexing | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • Switching to the new website verification API
    Webmaster level: advanced Just over a year ago we introduced a new API for website verification for Google services. In the spirit of keepi...
  • Structured Data dashboard: new markup error reports for easier debugging
    Since we launched the Structured Data dashboard last year, it has quickly become one of the most popular features in Webmaster Tools. We’ve...
  • "It's on Google! YAY!" - Getting webmaster help in our forum
    Webmaster level: all It's been a bit more than five years now that our Webmaster Help Forum has been up and running, helping webmasters...
  • Supporting rel="canonical" HTTP Headers
    Webmaster level: Advanced Based on your feedback, we’re happy to announce that Google web search now supports link rel="canonical"...
  • Getting started with structured data
    Webmaster level: All If Google understands your website’s content in a structured way, we can present that content more accurately and more ...
  • Responsive design – harnessing the power of media queries
    Webmaster Level: Intermediate / Advanced We love data, and spend a lot of time monitoring the analytics on our websites. Any web developer d...
  • Introducing the Structured Data Dashboard
    Webmaster level: All Structured data is becoming an increasingly important part of the web ecosystem. Google makes use of structured data in...
  • Tell us what you think!
    (Cross-posted on the Google Product Ideas Blog ) The Webmaster Central team does our best to support the webmaster community via Webmaster T...
  • Improving URL removals on third-party sites
    Webmaster level: all Content on the Internet changes or disappears, and occasionally it's helpful to have search results for it updated ...
  • Protect your site from spammers with reCAPTCHA
    Webmaster Level: All If you allow users to publish content on your website, from leaving comments to creating user profiles , you’ll likely...

Categories

  • advanced
  • beginner
  • crawling and indexing
  • events
  • feedback and communication
  • general tips
  • hacked sites
  • hreflang
  • images
  • intermediate
  • localization
  • malware
  • mobile
  • performance
  • products and services
  • search results
  • sitemaps
  • structured data
  • url removals
  • verification
  • video
  • webmaster guidelines
  • webmaster tools

Blog Archive

  • ►  2014 (2)
    • ►  January (2)
  • ►  2013 (35)
    • ►  December (6)
    • ►  November (1)
    • ►  October (2)
    • ►  September (2)
    • ►  August (4)
    • ►  July (2)
    • ►  June (4)
    • ►  May (3)
    • ►  April (2)
    • ►  March (6)
    • ►  February (2)
    • ►  January (1)
  • ►  2012 (55)
    • ►  December (3)
    • ►  November (1)
    • ►  October (5)
    • ►  September (2)
    • ►  August (5)
    • ►  July (5)
    • ►  June (6)
    • ►  May (7)
    • ►  April (7)
    • ►  March (6)
    • ►  February (2)
    • ►  January (6)
  • ►  2011 (75)
    • ►  December (7)
    • ►  November (2)
    • ►  October (5)
    • ►  September (8)
    • ►  August (10)
    • ►  July (5)
    • ►  June (10)
    • ►  May (8)
    • ►  April (6)
    • ►  March (6)
    • ►  February (5)
    • ►  January (3)
  • ►  2010 (81)
    • ►  December (9)
    • ►  November (9)
    • ►  October (4)
    • ►  September (8)
    • ►  August (6)
    • ►  July (2)
    • ►  June (6)
    • ►  May (6)
    • ►  April (12)
    • ►  March (11)
    • ►  February (1)
    • ►  January (7)
  • ▼  2009 (52)
    • ►  December (7)
    • ►  November (9)
    • ▼  October (13)
      • Using RSS/Atom feeds to discover new URLs
      • Help us make the web better: An update on Rich Sni...
      • Verifying a Blogger blog in Webmaster Tools
      • One million YouTube views!
      • Dealing with low-quality backlinks
      • Let's make the mobile web faster
      • Managing your reputation through search results
      • Fetch as Googlebot and Malware details -- now in W...
      • A proposal for making AJAX crawlable
      • Reunifying duplicate content on your website
      • New parameter handling tool helps with duplicate c...
      • Google Friend Connect: No more FTP... just get sta...
      • Changes to website verification in Webmaster Tools
    • ►  September (8)
    • ►  August (6)
    • ►  July (5)
    • ►  June (4)
Powered by Blogger.

About Me

Unknown
View my complete profile