Google


SEO Guide: Canonical Domains, Apache & HTTP 301 Redirects

By hagrin - Posted on 05 January 2007

Posted By: hagrin
Create Date: 14 December 2005
Last Updated: 4 January 2006

Overview:
Search Engine Optimization (SEO) remains the ultimate goal of the webmaster, blog publisher, e-commerce seller, AdSense user and pageview junkie. By tweaking and modifying your website's layout, design and content, a domain owner can increase his listing rank when terms are searched on the major search engines (for the purpose of these articles, the major search engines are Google, Yahoo! and MSN). One SEO hint/tip/issue that website owners need to deal with is duplicate content penalties resulting from a canonical domain issue. This article will talk about what exactly this problem is and how to resolve it.

What Exactly is a Canonical Domain Name?:
Webopedia defines a canonical name (CNAME) as:

Short for canonical name, also referred to as a CNAME record, a record in a DNS database that indicates the true, or canonical, host name of a computer that its aliases are associated with. A computer hosting a Web site must have an IP address in order to be connected to the World Wide Web. The DNS resolves the computer’s domain name to its IP address, but sometimes more than one domain name resolves to the same IP address, and this is where the CNAME is useful. A machine can have an unlimited number of CNAME aliases, but a separate CNAME record must be in the database for each alias.

I'm sure many of you are saying "English (or your first language) please!". Basically, when you purchased your domain name (for instance, I bought hagrin.com), you also purchased the ability to add CNAMEs (sometimes called "parking a subdomain"). By default, the "www" CNAME is usually created automatically when you purchase the domain. Therefore, right away, users have two ways of navigating to your site - through http://www.hagrin.com (with the "www") and http://hagrin.com (just the domain name). Giving users two ways to get to your site seems beneficial without any drawbacks. However, if users can get to your site by two different URLs, search engine crawlers can also crawl your content by both URLs. If this does occur (and you have no preventive measures in place), then search engines may collect two copies of the same data at two different URLs, potentially causing a "duplicate content" penalty for your site.

How do I know if I have a problem? Well, you can use the Search Engine Friendly Redirect Checker to diagnose any potential problems your site may have. As a note, don't test only the home page; try testing some pages that are not in the root directory to make sure all of your URLs redirect in a search engine friendly manner. So how can you prevent this from happening or fix it once you have diagnosed a problem?
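If you would rather check from a script than from a web form, the same test can be roughed out in Python. This is only a sketch under my own assumptions: the function names are made up for illustration, it handles plain http:// URLs only, and "friendly" is taken to mean a 301 whose Location header points at the host you consider canonical.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status_and_location(url):
    """Send a HEAD request (http.client never follows redirects) and
    return the status code plus the Location header, if any."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_friendly_redirect(status, location, canonical_host):
    """True only for a permanent (301) redirect that lands on the canonical host."""
    if status != 301 or not location:
        return False
    return urlsplit(location).netloc.lower() == canonical_host.lower()
```

You would call fetch_status_and_location() on the bare-domain version of a few deep URLs and pass the result to is_friendly_redirect() along with your "www" host name. A 302 or a 200 on the bare domain means the duplicate content risk is still there.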

The Fix:
I encountered this problem recently and wanted to make sure that I wasn't having my site split into two or having my content duplicated, causing me to drop in the search rankings. Therefore, I started looking around for a way to redirect my users from the plain hagrin.com to www.hagrin.com for all documents on my server. Hagrin.com runs on a Linux machine using Apache as its web server software, so the fix below is specific to Apache. After browsing the web for a few hours, I came to the conclusion that I needed to perform an HTTP 301 redirect from my hagrin.com pages to their www.hagrin.com equivalents. Knowing that I was using Apache, I created a .htaccess file in the web root directory (/www) of my web server and added the following lines of code:

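# Turn on mod_rewrite for this directory
RewriteEngine On
# Match requests whose Host header is the bare domain (NC = case-insensitive)
RewriteCond %{HTTP_HOST} ^hagrin\.com$ [NC]
# Send a permanent (301) redirect to the same path on www; L = stop processing rules
RewriteRule ^(.*)$ http://www.hagrin.com/$1 [R=301,L]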

So what exactly does this code do? Well, if a user requests http://hagrin.com/rss/hagrin_atom.xml, the user is redirected to http://www.hagrin.com/rss/hagrin_atom.xml instead. Both host names, hagrin.com and www.hagrin.com, now lead to the same URL, which prevents any duplicate content penalties. If you aren't using Apache, the fix for this issue may be very different and I would suggest doing a Google search on HTTP 301 redirects to resolve any canonical domain name issues you may be having.
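For anyone who reads Python more easily than mod_rewrite syntax, here is a rough model of what the rule decides for a given URL. It is an illustration only and not how Apache works internally; the function name is my own invention, and the host pattern and target are simply copied from the .htaccess lines above.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Same pattern as the RewriteCond; IGNORECASE plays the role of [NC].
BARE_HOST = re.compile(r"^hagrin\.com$", re.IGNORECASE)


def canonical_redirect(url):
    """Return (301, new_url) when the host is the bare domain, else None."""
    parts = urlsplit(url)
    if BARE_HOST.match(parts.netloc):
        # Keep the path and query string, swap in the www host.
        target = urlunsplit(("http", "www.hagrin.com", parts.path or "/",
                             parts.query, parts.fragment))
        return 301, target
    return None  # already on www (or another host): no redirect
```

Feeding it http://hagrin.com/rss/hagrin_atom.xml gives a 301 to the www version of the same path, while any www.hagrin.com URL comes back as None and is served normally.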

Resources:

  1. Webopedia CName Definition
  2. Search Engine Friendly Redirect Checker
  3. Social Patterns - "Cleaning Up Canonical URLs With Redirects"
  4. Matt Cutts on Canonical Domain Issues

Version Control:

  1. Version 1.1 - 4 January 2006 - Updated Resources to include Matt Cutts' Canonical Domain Issues post
  2. Version 1.0 - 14 December 2005 - Original Article

SEO: Using "Nofollow" for External Links & Preserving Page Rank

By hagrin - Posted on 05 January 2007

SEO Guide: Using "Nofollow" for External Links & Preserving Page Rank

Posted By: hagrin
Date: 20 December 2005

Overview:
Search Engine Optimization (SEO) remains the ultimate goal of the webmaster, blog publisher, e-commerce seller, AdSense user and pageview junkie. By tweaking and modifying your website's layout, design and content, a domain owner can increase his listing rank when terms are searched on the major search engines (for the purpose of these articles, the major search engines are Google, Yahoo! and MSN). One SEO hint/tip/issue that website owners should adhere to is preserving page rank through careful selection of external links. This article will define a lot of the terms used such as page rank, external links, etc., explain how the rel="nofollow" attribute works in preserving page rank, the possible drawbacks and an implementation plan.

What are External Links & How Does it Affect my Page Rank?
A major concern for website owners trying to optimize their sites deals with page rank within search engine result pages. Page rank is defined by Google as:

PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."

Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search. Of course, important pages mean nothing to you if they don't match your query. So, Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page's content (and the content of the pages linking to it) to determine if it's a good match for your query.

If you thought that was a mouthful, you can read the explanation of page rank offered by Iprcom (this resource is for the math lovers only. Another good resource for explaining page rank can be found here at Web Workshop). So with Google's definition and a formula for calculating page rank, what does page rank have to do with how we post links to other websites on our site or blog? Web Workshop describes the potential harm that outbound links cause to our page rank as the following:

Outbound links are a drain on a site's total PageRank. They leak PageRank. To counter the drain, try to ensure that the links are reciprocated. Because of the PageRank of the pages at each end of an external link, and the number of links out from those pages, reciprocal links can gain or lose PageRank. You need to take care when choosing where to exchange links.
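To make the "links as votes" idea a little more concrete, here is a toy calculation in Python. Please treat it as a sketch: it is the simplified textbook form of the PageRank iteration, not whatever Google actually runs, and the 0.85 damping factor, the function name and the three-page example site are all assumptions of mine.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to.
    Every link target must also appear as a key."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank
    for _ in range(iterations):
        # Every page gets a small baseline share each round.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links spreads its vote over every page.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page site: "blog" has no incoming links.
ranks = pagerank({"home": ["about"], "about": ["home"], "blog": ["home"]})
```

In this tiny example "home" ends up with the highest value because two pages vote for it, and "blog" ends up with only the baseline share because nothing links to it.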

So, we see that we want to maximize our incoming links from other sites while limiting the number of outbound links to other sites. This may prove difficult for some sites that report news since most of the content will come from outside sources. In addition, even original content writers use resources and it's generally good practice to list your references. Well, it seems that we are between a rock and a hard place. However, in 2005, the major search engines adopted a new attribute for the anchor tag - rel=nofollow.

Using the "rel=nofollow" Attribute
What exactly does the rel=nofollow attribute do and how do we use it? Well, if you choose to make a link to another website and add the rel=nofollow attribute to the anchor tag, then search engines (when crawling your page) will not count these links as outbound links. They will act as functional text links to users, but as no more than text to the search engine. Obviously, the benefit of this comes from being able to build highly informative web pages without enduring the page rank leakage from including external links. How do you actually use nofollow? Well, let's look at the example code below:

<a href="http://www.hagrin.com" rel="nofollow">Hagrin.com</a>

As you can see, it's very simple. Just make a link as you would normally do and then just add the rel="nofollow" attribute. It's really that easy.
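If you have a lot of existing pages, you may want a quick way to see which external links are still missing the attribute. The following is a minimal sketch using Python's standard html.parser module; the class and function names are hypothetical, and "external" is assumed to mean any absolute link whose host differs from your own.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ExternalLinkAudit(HTMLParser):
    """Collects external links that do not carry rel="nofollow"."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host.lower()
        self.followed = []  # external hrefs without nofollow

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        host = urlsplit(href).netloc.lower()
        if not host or host == self.own_host:
            return  # relative or internal link, nothing to check
        # rel may hold several space-separated tokens
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" not in rel_tokens:
            self.followed.append(href)


def followed_external_links(html, own_host):
    audit = ExternalLinkAudit(own_host)
    audit.feed(html)
    return audit.followed
```

Calling followed_external_links(page_source, "www.hagrin.com") returns the list of outbound links that search engines would still count, so you can decide link by link whether nofollow is appropriate.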

Potential Drawbacks:
As with most SEO tricks and tips, there are potential drawbacks. Although no site talks about penalties directly associated with overuse of nofollow, the blogging community frowns heavily upon using nofollow, even in cases of trying to combat comment spam. In addition, many people have come up with CSS snippets that highlight nofollow links as they browse a page, making it clearly visible that nofollow is being used. The CSS used by some would look something like this:

a[rel~="nofollow"] {
  border: thin dashed firebrick !important;
  background-color: rgb(255, 200, 200) !important;
}

This will alert readers to your use of nofollow and potentially cause "bad karma" for your site. Therefore, you may want to consider how heavily you use nofollow and which sites you will use it for. Hopefully, with extremely directed usage and a little thought, you will be able to maximize your page rank by controlling the external links off of your site.

Resources:

  1. Official Google Technology
  2. Iprcom Page Rank Explanation
  3. Web Workshop Page Rank Explanation
  4. Matt Cutts' Nofollow CSS

Version Control:

  1. Version 1.0 - 20 December 2005 - Original Article

Does Digg Belong in Google's Index?

By hagrin - Posted on 04 January 2007

I have to thank Search Engine Journal for posing one of the better questions so far of 2007 - does Digg belong in Google's index? (Actually, as you read the SEJ article, Allen Stern seems to have posed this question first.)

So, Does Digg belong in Google's search index?

First, a lot of people have weighed in on this topic since the question was first posed, and almost all of them are just plain wrong, not because of where they sit on the issue, but because the facts they use to support their arguments do not make sense or are completely false. What are some of the arguments for both sides and what are the misconceptions?

Pros

  • Helping Users Find Content - This would be the strongest argument for including Digg results within the Google index. Although many people seem to be incorrectly using the term "pagerank" (see here), the general idea is solid. Some pages on lesser authoritative sites (based on not only PR, but backlinks, keyword density, domain age, robots.txt exclusions, etc.) that hold the original content may get lost in the Google index and having the Digg result appear in the index improves the chance that the Google user will find the content he/she is looking to find. Generally, the rule states that you want to do anything that improves the user experience and helping users find the content they need should be the goal of any search engine.
  • Digg Mirroring - One of the greatest benefits of having the Digg version of a story appear in the search results is if a story disappears from the original site, very often the Digg comments will contain a link to a mirror of the original content keeping it alive past just the lifespan of the original website. However, would the average user know that? Obviously, no since most average users have never even clicked on the "Cached" link within the Google search results.
  • Don't Like It? Customize Google - Many people suggested using the -site:digg.com operator with all your searches; however, that's extremely inefficient unless you're using a Greasemonkey script. But why not just create your own Custom Search Engine and put Digg on your excluded sites list? The fix is easy and more people should really take advantage of the CSE offering from Google.

    Cons

  • Other Indices Do Not Exist within Google SERPs - Probably the most compelling argument for why Digg shouldn't be included in the Google search results is that other indices, like Yahoo!'s search results, do not appear in the Google index. This is obvious because search indices don't have "value added" or original content - they contain the page's title and a short description. Of course you're saying - but is Digg an index? Many will argue that yes, Digg is nothing more than an informative/popular index of links. Although there are no hard numbers to confirm this, it would appear that most Digg stories are submitted with the title and the description 100% copied from the original, linked site. Even if we could prove the previous statement, many would say that the comments associated with the Digg submission provide the original content that differentiates Digg from other indices. However, the prevailing opinion of the Digg commenting system is so low that many, including myself, consider it broken, highly useless and completely inferior to similar sites like Slashdot.
  • I'm Tired of Clicking - Another popular argument holds that the user experience is diminished because, to actually reach the desired content, the user has to click on the Google search result and then on the title on the Digg page. In addition, users unfamiliar with the Digg interface may not understand that the content actually exists after one more "hop". Although the Digg interface is similar to the Google interface (a blue link followed by a short description), I would like to see the functionality improved to something like Reddit's RSS feed where the direct link takes you to the original story and there is a "More" option to read through Reddit comments - something I almost never do.
  • Original Content < Digg Scraped Content? - As a webmaster, I see something inherently wrong with what amounts to no more than a user powered scraper site ranking higher in Google SERPs than the original content. However, it's important to note that this is not a deficiency of Digg, but more a "feature" of Google's search indexing algo.
  • Duplicate Content - Let's say that the original content URL and the Digg link both appear within the Top 10 results for a search term. How is duplicate content enhancing the user experience? Answer - it's not. There's little difference between this scenario and those generated by spam blogs.

    Misconceptions

  • Digg is more informative than most sites - The Digg fanboys will be all over this point, but the fact remains that many of the stories submitted to Digg are either blog spam, incorrect or written by authors looking to profit through their site. The reason why Digg attracts a high percentage of these types of sites is because of Digg's power - its massive, fanatical user base, the backlinks a promoted story will receive and the other benefits that relate to increasing your site's popularity. However, just because something is popular doesn't mean that it should be considered an "authoritative source" for certain topics. This isn't a problem specific to Digg, but to much of the Internet and its users - no one is sure exactly who they should trust.

    Conclusion
    So, if you've been reading carefully, you've noticed that I haven't taken a side. Where do I stand on the issue? Simple - create and use Google's Custom Search Engine option. Now, Google has to do a better job of promoting this highly valuable tool to "Joe Internet" because many of the usability issues deal with the average user and the CSE option isn't known by more than 1% of the Internet population I would gather (percentage not based on any facts, just perception). How could it be made more mainstream? If you've ever used a site like Match.com to look for hot steamy love, you can filter out your searches by eliminating certain people from continually showing up in search results with a simple click of an X. Google could implement something similar and save those preferences based on a user the same way they save search history, CSE optimizations, etc. Therefore, whether or not you agree or disagree with Digg's inclusion, you should know that you can put your own solution into action and determine Digg's influence in your Google searches.

    SEO: Using Descriptive, Creative & Efficient Titles

    By hagrin - Posted on 25 December 2006

    SEO: Using Descriptive, Creative & Efficient Titles

    Posted By: hagrin
    Create Date: 27 December 2005
    Last Updated: 1 July 2010

    Overview:
    Search Engine Optimization (SEO) remains the ultimate goal of the webmaster, blog publisher, e-commerce seller, AdSense user and pageview junkie. By tweaking and modifying your website's layout, design and content, a domain owner can increase his listing rank when terms are searched on the major search engines (for the purpose of these articles, the major search engines are Google, Yahoo! and MSN). A major SEO tip that should be adhered to by everyone is using descriptive, yet creative and efficient titles for all your pages. This article breaks down what title we are actually talking about, tips for writing efficient titles and how to use available tools for figuring out title keywords.

    Title? ... Which Title?
    For many, the term "title" is so vague that they don't know exactly where to focus their SEO efforts. For the purpose of this discussion, we're talking about the phrase displayed in the browser's title bar. The title bar is located at the very top of the browser window and would look something like this (Figure 1):

    Now that we have identified what title we're talking about, let's examine the image. The title bar's value is composed of two parts - the actual title of the page and the browser's "branding" which appears at the top of every page. Search engine crawlers are only concerned with the first part - in this instance "Google News". I chose this title for my Google News archive page for a few reasons which are discussed below.

    Choosing Your Words Carefully
    A title really can make or break your SEO ranking and page traffic. As you can see from the Figure below (Figure 2), search engines use your title as the "headline" for your stories.

    So, after seeing the importance of your title, what rules should you follow when creating your headlines? Try following these simple guidelines:

    • Concise Word Choice / Eliminating Unnecessary Words - Probably the most important rule to follow. To maximize your keyword density, don't clutter your titles with unnecessary words. For instance, there was no reason for me to label my page "Hagrin's Google News" since my domain name will catch all queries using the term hagrin. If I did include the word Hagrin, I would no longer directly match user requests for the search query "Google News" and my ranking would most likely drop even further for this highly competitive term. Go through your entire site and check all your titles to see if there are any unproductive words that you could remove to improve your page's title.

    • Keywords to the Front - In addition to choosing concise words, there appears to be some weight being applied to the order of the words in the title. Therefore, you want your keywords closer to the beginning of the title than the end.

    • Remain Creative / Use Proper Grammar - Almost contradictory to the previous point. With so many web pages out there all competing for users, you also need to make your title stand out from the rest on the page. Therefore, make sure to use proper grammar to make your titles easy to read and don't just put random words in your title that will match a lot of user search requests. In addition, some creativity while maintaining your keyword density could help improve your page view numbers as users are drawn to your site when presented with 9 other similar looking options. However, title creativity can be detrimental to your SEO efforts so choose your approach wisely.

    • Watch for Duplicates - Although not entirely proven, the general concept is sound. Differentiating all your pages so that they are not grouped (and then removed from the initial viewable search results) should be considered a good practice which will identify all of your content as unique. Using unique titles also helps with certain blogging software applications like Wordpress which will use the title as part of the path. Since Wordpress uses the title as part of the path, Wordpress also has to append the date to the file name - a practice that is unnecessary and again lowers density.

    Following these four simple rules will definitely improve your rankings and help drive higher traffic to your site.
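    A couple of the rules above are easy to check mechanically across a whole site. The snippet below is a small sketch, not a real SEO tool: both function names are hypothetical, it assumes you already have your page titles in a dictionary keyed by URL, and it only looks at where a keyword sits in a title and which titles are duplicated.

```python
from collections import Counter


def keyword_position(title, keyword):
    """Character offset of the keyword in the title (case-insensitive);
    0 means the title starts with it, -1 means it is missing."""
    return title.lower().find(keyword.lower())


def duplicate_titles(titles):
    """titles maps URL -> title; returns the URLs that share a title."""
    def normalize(title):
        return title.strip().lower()

    counts = Counter(normalize(title) for title in titles.values())
    return sorted(url for url, title in titles.items()
                  if counts[normalize(title)] > 1)
```

    For example, keyword_position("Google News", "Google News") is 0, while the same keyword in "Hagrin's Google News" sits at offset 9, which is the "keywords to the front" rule expressed as a number.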

    Title Tools
    Everyone gets writer's block at some point in time. Therefore, tools exist that help you determine keyword saturation and search frequency. You can then use this information to pick the best title for your content. The Yahoo! Overture Keyword Selector Tool is one such tool. Just plug in a generic term for your content and have Yahoo! spit some suggestions back to you. You do not have to be a current Yahoo! advertiser to use this tool, and it seems to be free for everyone. Another tool is the Google AdWords Keyword Tool, which is only available to AdWords users (however, all you have to do is pay the $5 signup fee and then you have access to all the AdWords tools). Using the above information is a first, major step in search engine optimization. Making sure that you have concise, efficient, creative and unique titles should be the first step in ensuring your success on the web.

    Resources

    1. Google AdWords Keyword Tool
    2. Yahoo! Overture Keyword Selector Tool

    Version Control

    1. Version 1.0 - 27 December 2005 - Original Article

    Search Engine Optimization Guide

    By hagrin - Posted on 25 December 2006

    Posted By: hagrin
    Created: 14 December 2005
    Last Updated: 1 July 2010

    Overview:
    Search Engine Optimization (SEO) remains the ultimate goal of the webmaster, blog publisher, e-commerce seller, AdSense user and pageview junkie. By tweaking and modifying your website's layout, design and content, a domain owner can increase his listing rank when terms are searched on the major search engines (for the purpose of these articles, the major search engines are Google, Yahoo! and MSN). The following articles will assist you in your quest to finely tune your website into a high ranking Internet source.

    Table of Contents / Index

    Domain Names, Software & Other High Level Issues:

    The Nitty Gritty - Low Level SEO Concerns:

    Software Specific SEO Concerns:

    Google Ends Support for SOAP Search API

    By hagrin - Posted on 19 December 2006

    Well, this news puts the lockdown to the SEO project I was working on.

    Google has very quietly ended support for the SOAP Search API which allowed SEOs like myself to query the search engine through a SOAP call and take those results and build reports for those looking to measure their site's performance within the search engine. Now, the results from the search engine definitely weren't accurate as manual searches for terms displayed much different results than the SOAP generated results. However, it still provided a valuable tool to SEOs to paint a broad picture of their site's performance.

    No more SOAP API keys are being issued and it is unclear as to when the SOAP server will be taken offline rendering all these SOAP driven applications useless.

    Google is now directing their coders to the AJAX Search API, which is more restrictive than the previous SOAP API: Google now requires that a valid URL send the request to the Google server, and you can no longer run background processes to gather these results. This means an end to the SEO report generator as we know it and a switch to "illegal" page scraping will most likely occur. I'll reserve judgement on all of this until I fully explore the alternatives, but I definitely see this as an unnecessary restriction that just encourages the more intensive page scraping.

    Buy Your Domain Name Using Google

    By hagrin - Posted on 15 December 2006

    Although Google has been a certified registrar for a few years I believe, Google just announced that you can buy your domain names from them for $10 per year. However, and this might scare some off, they partnered with GoDaddy and eNom to provide this service as opposed to doing it themselves.

    What's worse than the partnership?

    If you don't go through Google and go through their partners directly, it's actually cheaper. So, what exactly is the benefit from registering your domain with Google (or in this case GoDaddy or eNom)? Answer - none that I can see unless your domains suddenly become more trusted in their ranking algorithm. What are the potential drawbacks? Well, for those black hat SEOs looking to link trade between their sites, Google will now be able to link sites owned by the same individual/holding company and treat those links with a lower weight than backlinks from independent sites (although as I wrote this, I wonder if they have already had access to this information since they are a certified registrar - I would assume yes).

    However, even with all that said, I'll probably move my domains to Google because it's actually cheaper than my current registrar - Register.com (brutal decision by me, I know, but it's been easier to renew there than run the risk of having the domain transfer not go smoothly). Go Google - I tie my life into you just a little more today.

    Google Patent Search

    By hagrin - Posted on 14 December 2006

    Google has announced on their official corporate blog the availability of their newest product - the Google Patent Search. While sporting an interface very similar to their other search pages, the Google Patent Search also has a neat feature that displays the technical drawings of a few patents below the search box.

    Important to note is the "what types of patents are available" section of the Patent Search Help document. There it states that "We don’t currently include patent applications, international patents, or U.S. patents issued over the last few months, but we look forward to expanding our coverage in the future" which means that this isn't a real-time list currently so looking for ground breaking patent information is pretty fruitless at this point. However, if at some point Google does get patents indexed as soon as they are granted, valuable tools could be built showing breaking patent news and discussing their potential impact.

    Google's Matt Cutts' Best SEO Tips of 2006

    By hagrin - Posted on 07 December 2006

    SEO Egghead posted a great synopsis of Matt Cutts' best SEO Tips of 2006 in a nice, comprehensive list for SEO marketers, designers and enthusiasts to refresh their "straight from Google's mouth" SEO information. I usually don't link to blog spam; however, they really did a great job compiling, shortening and organizing the best tips offered by Matt throughout the year on his blog. So what are some of the better tips that Matt stated web developers/designers should incorporate into their sites?

    • Use dashes instead of underscores in the URL for a page.
    • Make sure that Googlebot can crawl all the content you want indexed. Content behind secured pages isn't crawled and you should offer free versions of valuable content so that it can be indexed.
    • "Assign unique, descriptive title tags and headings to every page" (direct quote from SEO Egghead)
    • URLs with descriptive document names will rank better than nondescript URLs (i.e. windows-vista.php vs. 2006-12-08article.php)

    Most of the other tips are generally useful, but I found the above tips highly useful when thinking about my own sites. Thanks Matt and SEO Egghead for the great information and organization.
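    The first and last tips in that list (dashes instead of underscores, descriptive document names) can be combined into one small helper. This is just a sketch of one way to do it in Python; the function name and the default ".php" extension are assumptions chosen to match the windows-vista.php example.

```python
import re


def slugify(title, extension=".php"):
    """Build a descriptive, dash-separated file name from a page title."""
    # Keep runs of letters and digits; spaces, underscores and
    # punctuation all act as word separators.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words) + extension
```

    With this, slugify("Windows Vista") produces windows-vista.php, and a title typed with underscores such as "Windows_Vista" produces the same dash-separated name.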

    Google Talk - Offline Messaging Added

    By hagrin - Posted on 01 November 2006

    Finally, they added it!

    Google Talk became the best IMing program in the world today when they added support for offline messaging. Now, if you send a message to a user who is offline, that user will receive the message when they sign online again or check their Gmail (the best feature). When users check their Gmail accounts, chats will show up as unread messages (how cool is that?), and this integrates amazingly well with the chat logging between Gmail and GTalk. All you need to do is make sure you have chat logging enabled and you'll be receiving offline messages in no time! A wonderful addition by Google which leverages the power of Gmail and GTalk.

    Now, I need to stop being lazy and hook up Google Talk for my Blackberry 7290.