Transfer Domain Names

  • Subscribe to our RSS feed.
  • Twitter
  • StumbleUpon
  • Reddit
  • Facebook
  • Digg

Tuesday, 6 December 2011

Tips for hosting providers and webmasters

Posted on 09:39 by Unknown
Webmaster level: All
Some webmasters on our forums ask about hosting-related issues affecting their sites. To help both hosting providers and webmasters recognize, diagnose, and fix such problems, we’d like to share with you some of the common problems we’ve seen and suggest how you can fix them.
  • Blocking of Googlebot crawling. This is a very common issue usually due to a misconfiguration in a firewall or DoS protection system and sometimes due to the content management system the site runs. Protection systems are an important part of good hosting and are often configured to block unusually high levels of server requests, sometimes automatically. Because, however, Googlebot often performs more requests than a human user, these protection systems may decide to block Googlebot and prevent it from crawling your website. To check for this kind of problem, use the Fetch as Googlebot function in Webmaster Tools, and check for other crawl errors shown in Webmaster Tools.
    We offer several tools to webmasters and hosting providers who want more control over Googlebot’s crawling, and to improve crawling efficiency:
    • We have detailed help about how you control Googlebot’s crawling using the robots exclusion protocol and configuring URL parameters.
    • If you’re worried about rogue bots using the Googlebot user-agent, we offer a way to verify whether a crawler is actually Googlebot.
    • If you would like to change how hard Googlebot crawls your site, you can verify your website in Webmaster Tools and change Googlebot’s crawl rate. Hosting providers can verify ownership of their IP addresses too.
    We have more information in our crawling and indexing FAQ.
  • Availability issues. A related type of problem we see is websites being unavailable when Googlebot (and users) attempt to access the site. This includes DNS issues, overloaded servers leading to timeouts and refused connections, misconfigured content distribution networks (CDNs), and many other kinds of errors. When Googlebot encounters such issues, we report them in Webmaster Tools as either URL unreachable errors or crawl errors.
  • Invalid SSL certificates. For SSL certificates to be valid for your website, they need to match the name of the site. Common problems include expired SSL certificates and servers misconfigured such that all websites on that server use the same certificate. Most web browsers will try warn users in these situations, and Google tries to alert webmasters of this issue by sending a message via Webmaster Tools. The fix for these problems is to make sure to use SSL certificates that are valid for all your website’s domains and subdomains your users will interact with.
  • Wildcard DNS. Websites can be configured to respond to all subdomain requests. For example, the website at example.com can be configured to respond to requests to foo.example.com, made-up-name.example.com and all other subdomains.
    In some cases this is desirable to have; for example, a user-generated content website may choose to give each account its own subdomain. However, in some cases, the webmaster may not wish to have this behavior as it may cause content to be duplicated unnecessarily across different hostnames and it may also affect Googlebot’s crawling.
    To minimize problems in wildcard DNS setups, either configure your website to not use them, or configure your server to not respond successfully to non-existent hostnames, either by refusing the connection or by returning an HTTP 404 header.
  • Misconfigured virtual hosting. The symptom of this problem is that multiple hosts and/or domain names hosted on the same server always return the contents of only one site. To rephrase, although the server hosts multiple sites, it returns only one site regardless of what is being requested. To diagnose the issue, you need to check that the server responds correctly to the Host HTTP header.
  • Content duplication through hosting-specific URLs. Many hosts helpfully offer URLs for your website for testing/development purposes. For example, if you’re hosting the website http://a.com/ on the hosting provider example.com, the host may offer access to your site through a URL like http://a.example.com/ or http://example.com/~a/. Our recommendation is to have these hosting-specific URLs not publicly accessible (by password protecting them); and even if these URLs are accessible, our algorithms usually pick the URL webmasters intend. If our algorithms select the hosting-specific URLs, you can influence our algorithms to pick your preferred URLs by implementing canonicalization techniques correctly.
  • Soft error pages. Some hosting providers show error pages using an HTTP 200 status code (meaning “Success”) instead of an HTTP error status code. For example, a “Page not found” error page could return HTTP 200 instead of 404, making it a soft 404 page; or a “Website temporarily unavailable” message might return a 200 instead of correctly returning a 503 HTTP status code. We try hard to detect soft error pages, but when our algorithms fail to detect a web host’s soft error pages, these pages may get indexed with the error content. This may cause ranking or cross-domain URL selection issues.
    It’s easy to check the status code returned: simply check the HTTP headers the server returns using any one of a number of tools, such as Fetch as Googlebot. If an error page is returning HTTP 200, change the configuration to return the correct HTTP error status code. Also, keep an eye out for soft 404 reports in Webmaster Tools, on the Crawl errors page in the Diagnostics section.
  • Content modification and frames. Webmasters may be surprised to see their page contents modified by hosting providers, typically by injecting scripts or images into the page. Web hosts may also serve your content by embedding it in other pages using frames or iframes. To check whether a web host is changing your content in unexpected ways, simply check the source code of the page as served by the host and compare it to the code you uploaded.
    Note that some server-side code modifications may be very useful. For example, a server using Google’s mod_pagespeed Apache module or other tools may be returning your code minified for page speed optimization.
  • Spam and malware. We’ve seen some web hosts and bulk subdomain services become major sources of malware and spam. We try hard to be granular in our actions when protecting our users and search quality, but if we see a very large fraction of sites on a specific web host that are spammy or are distributing malware, we may be forced to take action on the web host as a whole. To help you keep on top of malware, we offer:
    • Safe Browsing Alerts for Network Administrators, useful for hosting providers
    • Malware notifications in Webmaster Tools for individual websites
    • A Safe Browsing API for developers
We hope this list helps both hosting providers and webmasters diagnose and fix these issues. Beyond this list, also think about the qualitative aspects of hosting like quality of service and the helpfulness of support. As always, if you have questions or need more help, please ask in our Webmaster Help Forum.
Written by Pierre Far, Webmaster Trends Analyst
Email ThisBlogThis!Share to XShare to Facebook
Posted in | No comments
Newer Post Older Post Home

0 comments:

Post a Comment

Subscribe to: Post Comments (Atom)

Popular Posts

  • Switching to the new website verification API
    Webmaster level: advanced Just over a year ago we introduced a new API for website verification for Google services. In the spirit of keepi...
  • Structured Data dashboard: new markup error reports for easier debugging
    Since we launched the Structured Data dashboard last year, it has quickly become one of the most popular features in Webmaster Tools. We’ve...
  • "It's on Google! YAY!" - Getting webmaster help in our forum
    Webmaster level: all It's been a bit more than five years now that our Webmaster Help Forum has been up and running, helping webmasters...
  • Supporting rel="canonical" HTTP Headers
    Webmaster level: Advanced Based on your feedback, we’re happy to announce that Google web search now supports link rel="canonical"...
  • Getting started with structured data
    Webmaster level: All If Google understands your website’s content in a structured way, we can present that content more accurately and more ...
  • Responsive design – harnessing the power of media queries
    Webmaster Level: Intermediate / Advanced We love data, and spend a lot of time monitoring the analytics on our websites. Any web developer d...
  • Introducing the Structured Data Dashboard
    Webmaster level: All Structured data is becoming an increasingly important part of the web ecosystem. Google makes use of structured data in...
  • Tell us what you think!
    (Cross-posted on the Google Product Ideas Blog ) The Webmaster Central team does our best to support the webmaster community via Webmaster T...
  • Improving URL removals on third-party sites
    Webmaster level: all Content on the Internet changes or disappears, and occasionally it's helpful to have search results for it updated ...
  • Protect your site from spammers with reCAPTCHA
    Webmaster Level: All If you allow users to publish content on your website, from leaving comments to creating user profiles , you’ll likely...

Categories

  • advanced
  • beginner
  • crawling and indexing
  • events
  • feedback and communication
  • general tips
  • hacked sites
  • hreflang
  • images
  • intermediate
  • localization
  • malware
  • mobile
  • performance
  • products and services
  • search results
  • sitemaps
  • structured data
  • url removals
  • verification
  • video
  • webmaster guidelines
  • webmaster tools

Blog Archive

  • ►  2014 (2)
    • ►  January (2)
  • ►  2013 (35)
    • ►  December (6)
    • ►  November (1)
    • ►  October (2)
    • ►  September (2)
    • ►  August (4)
    • ►  July (2)
    • ►  June (4)
    • ►  May (3)
    • ►  April (2)
    • ►  March (6)
    • ►  February (2)
    • ►  January (1)
  • ►  2012 (55)
    • ►  December (3)
    • ►  November (1)
    • ►  October (5)
    • ►  September (2)
    • ►  August (5)
    • ►  July (5)
    • ►  June (6)
    • ►  May (7)
    • ►  April (7)
    • ►  March (6)
    • ►  February (2)
    • ►  January (6)
  • ▼  2011 (75)
    • ▼  December (7)
      • Download search queries data using Python
      • Website user research and testing on the cheap
      • Rich Snippets Instructional Videos
      • Introducing smartphone Googlebot-Mobile
      • Clicks and impressions for authors
      • Tips for hosting providers and webmasters
      • New markup for multilingual content
    • ►  November (2)
    • ►  October (5)
    • ►  September (8)
    • ►  August (10)
    • ►  July (5)
    • ►  June (10)
    • ►  May (8)
    • ►  April (6)
    • ►  March (6)
    • ►  February (5)
    • ►  January (3)
  • ►  2010 (81)
    • ►  December (9)
    • ►  November (9)
    • ►  October (4)
    • ►  September (8)
    • ►  August (6)
    • ►  July (2)
    • ►  June (6)
    • ►  May (6)
    • ►  April (12)
    • ►  March (11)
    • ►  February (1)
    • ►  January (7)
  • ►  2009 (52)
    • ►  December (7)
    • ►  November (9)
    • ►  October (13)
    • ►  September (8)
    • ►  August (6)
    • ►  July (5)
    • ►  June (4)
Powered by Blogger.

About Me

Unknown
View my complete profile