Archive for December 22nd, 2006

December
22

Duplicate Content Issues

A few days ago, Adam Lasnik advised webmasters to “not worry too much about duplicate content.” Essentially, Google has ways of identifying similar pages and will, in most cases, index only one of them.

I think this statement is exactly why webmasters worry “a lot” about duplicate content. With all the scrapers and content hijackers out there, they don’t want the thief’s version getting indexed while their own version is left out in the dark.

Some common issues are:

  • You can have two identical pages on your site, which is common in blogs where the pages for a particular date are more or less identical to the regular post pages.
  • You can have content that is grabbed from another site (with or without the original author’s knowledge).
  • You can have “printer friendly” or “mobile friendly” pages with the same content as your web-based content.
  • Different URLs may point to the same page (www.seorevolution.com, www.seorevolution.com/index.html, seorevolution.com, etc.)
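
The last item, several URLs resolving to the same page, can be illustrated with a minimal Python sketch that collapses those variants into one preferred form. The preferred host (the "www" version), the list of index file names, and the function name `canonical_url` are assumptions made for this example, not anything Google prescribes.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch: which host is preferred and which
# file names count as a directory index page.
PREFERRED_HOST = "www.seorevolution.com"
BARE_HOST = "seorevolution.com"
INDEX_FILES = {"index.html", "index.htm"}


def canonical_url(url: str) -> str:
    """Collapse common variants of a URL into one preferred form."""
    if "://" not in url:
        url = "http://" + url  # assume http when no scheme is given
    parts = urlsplit(url)

    host = (parts.hostname or "").lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST  # add the missing "www."
    if parts.port:
        host = f"{host}:{parts.port}"

    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail.lower() in INDEX_FILES:
        path = head + "/"  # /index.html -> /

    # Drop the fragment; keep the query string as-is.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


for variant in ("www.seorevolution.com",
                "www.seorevolution.com/index.html",
                "seorevolution.com"):
    print(variant, "->", canonical_url(variant))
```

All three variants from the list map to the same string, which is the form you would then use consistently in internal links and as the target of your redirects.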

Adam also stated that if you use duplicate content in hopes of influencing the rankings to your benefit, Google will make adjustments. “In the rare cases in which we perceive that duplicate content … we’ll also make adjustment in the indexing and ranking of the sites involved.”

Nowhere does the article state what that “perception” may be or what the pending “adjustment” would be.

Let’s go over the main points of the article that you need to know:

What is duplicate content? “It generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”
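
To get a rough feel for what “appreciably similar” might mean on your own pages, here is a small Python sketch that compares two blocks of text by their overlapping three-word sequences (word “shingles”) and reports a Jaccard similarity score. This is a common textbook technique chosen for illustration; the article does not say how Google measures similarity, and the function names and the shingle size of three are assumptions.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        # Short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts: shared shingles / all shingles."""
    sa, sb = shingles(a, size), shingles(b, size)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


post_page = "the quick brown fox jumps"
date_page = "the quick brown fox sleeps"
print(similarity(post_page, date_page))  # 2 shared of 4 total shingles: 0.5
```

A score near 1.0 between two of your own URLs suggests they are candidates for blocking or redirecting; where exactly to draw the line is a judgment call.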

A good tip is that Google will NOT consider the same article written in English and in Spanish to be duplicate content. Also, “snippets” of text or “quotes” aren’t seen as duplicate content either.

Here is the issue: when Google serves results, they want to serve unique results with distinct information. “This filtering means, for instance, that if your site has articles in “regular” and “printer” versions and neither set is blocked in robots.txt or via noindex meta tag, we’ll choose one version to list.” Believe me, you don’t want to rely on a bot to make the right choice. Adam does state that they prefer to filter rather than make ranking adjustments.

What can you do?

  • Block Appropriately: Instead of relying on Google to make the best choice, you should make that choice. If you have two versions of a document, such as a regular version and a printer version, place the secondary versions in their own folder and disallow that folder in the robots.txt file.
  • Use 301s: If you have “dead ends” in your site, use 301 Redirects in your .htaccess file to smartly reroute users and bots. You might also consider using a custom 404 page with the main categories listed in a “site map” format.
  • Show Consistency in Linking: Keep things consistent so you aren’t linking to /page/, /page, and page.htm.
  • Use TLDs: If you use TLDs (Top Level Domains) to handle country-specific content, Google will recognize that .de indicates Germany-focused content more easily than it would from a subdomain such as de.domain.com.
  • Syndicate Carefully: If you syndicate your content on other sites, make sure each syndicated article contains a link back to the original article. This will help Google recognize which version is the original and index that one.
  • Minimize “Repetition”: If you have lengthy (emphasis on lengthy) copyright text at the bottom of every page, include a brief summary and then a link for more details. The lengthy text could trigger duplicate filters.
  • Avoid Publishing “Stubs”: No one likes seeing “empty” pages and that includes GoogleBot, so avoid having “placeholders” whenever possible. If you have a review site, block pages that have zero reviews. If you have a real estate site, block pages that have zero listings. Nothing is more annoying than a page that states, “Below you’ll find a superb list of all the great rental opportunities in [insert city name]” – but the page contains no listings.
  • Understand Your CMS: Make sure you are familiar, or you hire someone who is familiar, with how the content of your site is displayed. This includes your blog, your forums, or related systems that often show the same content in multiple formats.
  • Understand the “Big Picture”: Don’t fret too much about sites that scrape your content. Though annoying, it’s highly unlikely that such sites can negatively impact your presence in Google. If there is a major issue, let Google know through a DMCA request.

Now that you have a better understanding of duplicate content, you can fix the current problems on your site and spend more time focusing on revenue generation.
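
To make the “Block Appropriately” tip concrete, here is a short Python sketch that uses the standard library’s `urllib.robotparser` to confirm that a robots.txt rule blocks the printer versions while leaving the regular pages crawlable. The `/print/` folder name, the example.com URLs, and the helper name `is_crawlable` are made up for this example; substitute whatever folder your own site uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: printer versions live under /print/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


regular = "http://www.example.com/articles/duplicate-content.html"
printer = "http://www.example.com/print/duplicate-content.html"

print(is_crawlable(ROBOTS_TXT, regular))  # True: regular page stays open
print(is_crawlable(ROBOTS_TXT, printer))  # False: printer copy is blocked
```

Running a check like this before you upload a new robots.txt is a cheap way to make sure you haven’t blocked the version you actually want indexed.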

December
22

Yahoo! Update Complete and Stable

Yahoo! updated their index earlier this month, around December 10th. Over the last few weeks we have detected plenty of fluctuations, but things seem to have settled down. Members of the SEO Revolution will be getting the latest algorithm update and will know exactly what was changed and what to do.

December
22

Yahoo! Paid Inclusion

For some reason, there is still talk about paid inclusion from Yahoo! and their bot Inktomi. I have used paid inclusion for many years – and each time I have used it I have been less than impressed.

Basically, it allows you to get all of your URLs fed into the index and you pay either 15 cents or 30 cents per click. You can also cut a deal for 10 cents a click, but only if your site has a ton of quality content that qualifies you for their “library rate”.

A lot of companies rely on it because it allows their duplicate content to be indexed properly, and I have consulted for many companies in the past that swore by Paid Inclusion. However, back when I took clients, we did a complete mod_rewrite for one of them and their URLs were quickly spidered and indexed by Inktomi for free. After a month or so, they cut out paid inclusion, kept their traffic levels steady and saved nearly $20,000 per month.

Did we get a bonus check for that one? No. Another reason why I hated doing client work.

December
22

Idiot Callers

So, I came into the office and our receptionist, Elizabeth, who is incredible by the way, handed me a message. “He wouldn’t leave a voice mail,” she said.

Hmmm.

After getting things settled in my office, I called this gentleman, identified myself, and said I was returning his phone call. I then asked him what he needed.

“A million dollars in small bills,” was what I heard on the other end.

“Excuse me?”

He repeated himself, to which I replied with a dial tone. I don’t have time to play games on the phone.

He called back. Elizabeth answered and he blasted her for us not having a sense of humor.

Her response? Simple and to the point.

We don’t know who he is and since there is no relationship, there is no room for humor in a professional business environment. Our time is valuable and he showed a lack of respect, and is continuing to do so.

She then invited him not to call back again.

She rocks. The day is always better when you have good staff who make good decisions.

Let this serve as a lesson. When you call a company, either for information or to possibly do business, spend time planning what you will say and get right to the point to be respectful of the other person’s time.

I, for one, would be grateful.