Domain Hotsheet
 
DomainHotsheet.com
Tuesday, March 09, 2010


Playing in Googlebot's Sandbox with Slurp, Teoma, & MSNbot - Spiders Display Differing Personalities

There has been endless webmaster speculation and worry about
the so-called "Google Sandbox" - the indexing time delay for
new domain names - rumored to last for at least 45 days from
the date of first "discovery" by Googlebot. This recognized
listing delay came to be called the "Google Sandbox effect."

Ruminations on the algorithmic elements of this sandbox time
delay have ranged widely since the indexing delay was first
noticed in spring of 2004. Some believe it comes down to a
single element of good search engine optimization, such as
linking campaigns. Link building has been the focus of most
discussion, but others point to the size of a new site, its
internal linking structure, or simple time delays as the
most relevant algorithmic elements.

Rather than contribute to this speculation and further
muddy the Sandbox, we'll look at a case study of a site on a
new domain name, established May 11, 2005, along with its
specific site structure, submission activity, and external
and internal linking. We'll see how this plays out in search
engine spider activity vs. indexing dates at the top four
search engines. Ready?

We'll give dates and crawler action in daily lists and
see how this all plays out on this single new site over time.

* May 11, 2005 - Basic text on large site posted on newly
  purchased domain name and going live by day's end. Search
  friendly structure implemented with text linking making
  full discovery of all content possible by robots. Home
  page updated with 10 new text content pages added daily.
  Submitted site at Google's "Add URL" submission page.
* May 12 - 14 - No visits by Slurp, MSNbot, Teoma or Google.
  (Slurp is Yahoo's spider and Teoma is from Ask Jeeves.)
  Posted link on WebSite101 to new domain at Publish101.com.
* May 15 - Googlebot arrives and eagerly crawls 245 pages
  on new domain after looking for, but not finding, the
  robots.txt file. Oooops! Gotta add that robots.txt file!
* May 16 - Googlebot returns for 5 more pages and stops.
  Slurp greedily gobbles 1480 pages and 1892 bad links!
  Those bad links were caused by our email masking meant
  to keep out bad bots. How ironic that Slurp likes these.
* May 17 - Slurp finds 1409 more masking links & only 209
  new content pages. MSNbot visits for the first time and
  asks for robots.txt 75 times during the day, but leaves
  when it finds that file missing! Finally get around to
  adding robots.txt by day's end to stop Slurp crawling
  email masking links and let MSNbot know it's safe to
  come in!
* May 23 - Teoma spider shows up for the first time and
  crawls 93 pages. Site gets slammed by BecomeBot, a spider
  that hits a page every 5 to 7 seconds and strains our
  resources with 2409 rapid-fire requests for pages. Added
  BecomeBot to robots.txt exclusion list to keep 'em out.
* May 24 - MSNbot has not shown up for a week since
  finding the robots.txt file missing. Slurp is showing up
  every few hours looking at robots.txt and leaving again
  without crawling anything now that it is excluded from
  the email masking links. BecomeBot appears to be honoring
  the robots.txt exclusion but asks for that file 109 times
  during the day. Teoma crawls 139 more pages.
* May 25 - We realize that we need to re-allocate server
  resources and rework database design, and this requires
  changes to URLs, which means all previously crawled pages
  are now bad links! Implement subdomains and wonder what
  now? Slurp shows up and finds thousands of new email
  masking links as the robots.txt was not moved to new
  directory structures. Spiders are getting error pages
  upon new visits. Scampering to put out fires after
  wide-ranging changes to site, we miss this for a week.
  Spider action is spotty for 10 days until we fix
  robots.txt.
* June 4 - Teoma returns and crawls 590 pages! No others.
* June 5 - Teoma returns and crawls 1902 pages! No others.
* June 6 - Teoma returns and crawls 290 pages. No others.
* June 7 - Teoma returns and crawls 471 pages. No others.
* June 8-14 - Odd spider behavior, looking at robots.txt only.
* June 15 - Slurp gets thirsty, gulps 1396 pages! No others.
* June 16 - Slurp still thirsty, gulps 1379 pages! No others.

So we'll take a break here at the five-week point and take note
of the very different behavior of the top crawlers. Googlebot
visits once and looks at a substantial number of pages but
doesn't return for over a month. Slurp finds bad links and
seems addicted to them, as it stops crawling good pages until
it is told to lay off the bad liquor, er, that is, links, by
getting robots.txt to slap Slurp to its senses. MSNbot visits
looking for that robots.txt and won't crawl any pages until
told what NOT to do by the robots.txt file. Teoma just crawls
like crazy, takes breaks, then comes back for more.

This behavior may imitate the differing personalities of the
software engineers who designed them. Teoma is tenacious and
hard working. MSNbot is timid, needs instruction and some
reassurance it is doing the right thing, and picks up pages
slowly and carefully. Slurp has an addictive personality and
performs erratically on a random schedule. Googlebot takes a
good long look and leaves. Who knows whether it will be back,
and when?

Now let's look at indexing by each engine. As of this writing
on July 7, each engine shows differing indexing behavior as
well. Google shows no pages indexed although it crawled 250
pages nearly two months ago. Yahoo has three pages indexed
in a clear aging routine that doesn't list any of the nearly
8,000 pages it has crawled to date (not all itemized above).
MSN has 187 pages indexed while crawling fewer pages than
any of the others. Ask Jeeves has crawled more pages to date
than any search engine, yet has not indexed a single page.

Each of the engines will show the number of pages indexed if
you use the query operator "site:publish101.com" without the
quotes. MSN 187 pages, Ask none, Yahoo 3 pages, Google none.

The daily activity not listed above for the three weeks since
June 16 has not varied dramatically, with Teoma crawling a bit
more than other engines, Slurp erratically up and down and
MSN slowly gathering 30 to 50 pages daily. Google is absent.

Linking campaign has been minimal, with posts to discussion
lists, a couple of articles and some blog activity. Looking
back over this time it is apparent that a listing delay is
actually quite sensible from the view of the search engines.
Our site restructuring and bobbled robots.txt implementation
seem to have abruptly stalled crawling, but the indexing
behavior of each engine displays distinctly differing policy
by each major player.

The sandbox is apparently not just Google's playground, but
it is certainly tiresome after nearly two months. I think I'd
like to leave for home, have some lunch and take a nap now.

Back to class before we leave for the day, kiddies. What did
we learn today? Watch early crawler activity and be certain
to implement robots.txt early and adjust often for bad bots.
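
To make the robots.txt lesson concrete, here is a minimal sketch in Python of checking a rule set before uploading it. The /mask/ path for the email masking links and the sample page paths are hypothetical, and the rules are only a guess at the kind of file described above, not the actual robots.txt from the site. It uses the standard library's urllib.robotparser to ask whether a given spider may fetch a given path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the /mask/ path and this layout are assumptions,
# not the real robots.txt used on the case-study site.
ROBOTS_TXT = """\
User-agent: BecomeBot
Disallow: /

User-agent: *
Disallow: /mask/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network fetch)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# (user agent, path) pairs to test; the paths are made up for illustration.
checks = [
    ("BecomeBot", "/articles/page1.html"),
    ("Slurp", "/mask/contact"),
    ("Slurp", "/articles/page1.html"),
    ("msnbot", "/articles/page1.html"),
]

for agent, path in checks:
    verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
    print(f"{agent:10} {path:25} {verdict}")
```

Running a check like this after every site restructuring is one way to notice that the rules no longer match the new directory layout, which is the kind of slip that went unnoticed for a week here.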

Oh yes, and the sandbox belongs to all search engines.

Mike Banks Valentine is a search engine optimization specialist
who operates http://WebSite101.com and will continue reports on
this case study chronicling search indexing of
http://Publish101.com
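
For anyone who wants to keep a similar daily record of spider visits, counts like those in this case study can be approximated from a raw access log. The sketch below assumes the server writes the common "combined" log format and that each spider identifies itself with one of the substrings listed in BOTS; both are assumptions, and the sample lines are made up for illustration rather than taken from the site's logs.

```python
import re
from collections import Counter

# Substrings expected in each spider's User-Agent header.
# The article names the bots; the exact header strings are assumed.
BOTS = ("Googlebot", "Slurp", "msnbot", "Teoma", "BecomeBot")

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def tally(lines):
    """Count requests per (day, bot) for the spiders listed in BOTS."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        for bot in BOTS:
            if bot.lower() in agent:
                counts[(match.group("day"), bot)] += 1
                break
    return counts


# Made-up sample lines, for illustration only.
SAMPLE = [
    '192.0.2.1 - - [15/May/2005:10:00:00 -0700] "GET /robots.txt HTTP/1.1" 404 0 "-" "Googlebot/2.1"',
    '192.0.2.1 - - [15/May/2005:10:00:05 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
    '192.0.2.2 - - [16/May/2005:08:15:00 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
    '198.51.100.7 - - [16/May/2005:09:30:00 -0700] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

# Note: keys sort as plain strings, not chronologically.
for (day, bot), hits in sorted(tally(SAMPLE).items()):
    print(f"{day}  {bot:10} {hits}")
```

To run it against a real log, pass an open file object to tally() in place of SAMPLE. Counting robots.txt requests separately from page requests would also show the "looks at robots.txt and leaves" pattern described in the daily lists.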

Author:
Mike Valentine





 
Copyright © 2000-2006 DomainHotsheet.com. All Rights Reserved.