State of Blogosphere Search, Part IV - The RSS Blog
RSS, OPML and the XML platform.
 
Copyright 2003-5 Randy Charles Morin
Wed, 26 Jul 2006 15:00:43 GMT
State of Blogosphere Search, Part IV

A couple of weeks ago, I asked my readers to provide some blogosphere search queries that I would then use to test the effectiveness of the blogosphere search engines: Technorati, Google blog search, Ice Rocket and others. The lone respondent was Sterling Camden (Blogger of the day, June 11 2006), so I'll be exclusively using the two searches he provided in this test. He suggested a search for his two primary blogging domains: chipsquips.com and chipstips.com. Here's the result grid. I tried to create the best search query string for each search engine. Feel free to suggest better search strings and other search engines, which I'll quickly add to the list. Finally, feel free to add your own interpretations of the results in the comments.

Search engine | chipsquips.com | chipstips.com | Grade
Blogdigger | http://blogdigger.com/linkSearch.jsp?link=http%3A%2F%2Fwww.chipsquips.com%25 | http://blogdigger.com/linkSearch.jsp?link=http%3A%2F%2Fwww.chipstips.com%25 | C+
Bloglines | http://www.bloglines.com/search?q=Bcite:chipsquips.com | http://www.bloglines.com/search?q=Bcite:chipstips.com | A
BlogPulse | http://www.blogpulse.com/search?query=chipsquips.com | http://www.blogpulse.com/search?query=chipstips.com | C+
Google blog search | http://blogsearch.google.com/blogsearch?q=chipsquips.com | http://blogsearch.google.com/blogsearch?q=chipstips.com | B
Feedster | http://www.feedster.com/search/chipsquips.com | http://www.feedster.com/search/chipstips.com | C
Ice Rocket | http://blogs.icerocket.com/search?q=www.chipsquips.com | http://blogs.icerocket.com/search?q=www.chipstips.com | C
PubSub | http://www.pubsub.com/site_stats.php | http://www.pubsub.com/site_stats.php | F
Technorati | http://www.technorati.com/search/chipsquips.com | http://www.technorati.com/search/chipstips.com | B
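
For anyone who wants to rerun the test against their own domain, here is a minimal Python sketch that rebuilds the query URLs in the grid above. The URL patterns are copied from the grid; the names TEMPLATES and build_queries are made up for illustration, and PubSub is left out because its entry in the grid is not a per-domain query. None of this is an official API for any of these engines, just string formatting over the URLs I used.

```python
from urllib.parse import quote

# URL patterns copied from the result grid. {d} is the bare domain;
# {link} is the percent-encoded "http://www.<domain>%" pattern that
# the Blogdigger link search URL in the grid uses.
TEMPLATES = {
    "Blogdigger": "http://blogdigger.com/linkSearch.jsp?link={link}",
    "Bloglines": "http://www.bloglines.com/search?q=Bcite:{d}",
    "BlogPulse": "http://www.blogpulse.com/search?query={d}",
    "Google blog search": "http://blogsearch.google.com/blogsearch?q={d}",
    "Feedster": "http://www.feedster.com/search/{d}",
    "Ice Rocket": "http://blogs.icerocket.com/search?q=www.{d}",
    "Technorati": "http://www.technorati.com/search/{d}",
}


def build_queries(domain):
    """Return {engine name: query URL} for one bare domain."""
    # safe="" forces ':' and '/' to be encoded as well, matching the grid.
    link = quote("http://www." + domain + "%", safe="")
    return {name: t.format(d=domain, link=link) for name, t in TEMPLATES.items()}


if __name__ == "__main__":
    for engine, url in build_queries("chipsquips.com").items():
        print(engine, url)
```

Swapping in a different domain only requires changing the argument to build_queries.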

Blogdigger

No new links in the last week, but it is capturing some links, including links posted to del.icio.us, and they seem to be spam-free. Grade: C+.

Bloglines

Capturing a lot of links, including ones posted to del.icio.us and my comments. Some are self-referential and none contain spam. Grade: A.

BlogPulse

No new links in the last week, but it is capturing some links, including links posted to del.icio.us, and they seem to be spam-free. Almost identical to Blogdigger. Grade: C+.

Google blog search

Capturing a lot of links. Some are self-referential and none contain spam. The Next button wasn't working for me. That's a major problem. Grade: B.

Feedster

Capturing some incoming links, but mostly self-referential links. That's a major problem. Grade: C.

Ice Rocket

No new links in the last week, but it is capturing some links and the results are spam-free. Grade: C.

PubSub

Broken. Grade: F.

Technorati

Capturing a lot of links. The referring URL is always the homepage of the blog and not the blog entry. This makes it very difficult to actually find the link. Sometimes the link has moved off the blog homepage, and then it becomes nearly impossible to find. That's a major problem. Results are spam-free. Grade: B.

Conclusion

It's amazing how every time I run this test I get completely different results. In the past, Ice Rocket and Google blog search have done well, but Ice Rocket seems to be degrading and Google blog search had a major bug. On the other hand, Technorati and Bloglines have usually shown poor results, but performed much better for Sterling. Thanks, Sterling, for the independent starting point.

Addendum

Actually, a second reader responded to my request for blogosphere search queries. That was David Sifry, the founder and CEO of Technorati. Dave asked for a sample query that was full of SERP (search engine result page) spam. I responded with http://www.technorati.com/search/%22Yahoo%21%20Finance%20Widget%22. Over a week later, that query remains spam-ridden. It seems any time someone posts anything negative about Technorati, Dave is there to ask how he can make it better. The problem? Technorati never seems to act on the user feedback. But then, maybe this time it did get better.

Links

Previous state of blogosphere search blog entries.

Update: It appears Technorati's index is missing all entries for the last 28 days. I did a dozen+ searches and all results are now at least 28 days old. Also, the Next button is working for Google blog search again. I won't change the original post, but considering this new information, Google blog search should be rated A- and Technorati B-.

Reader Comments
Randy,

What queries have you run since Bloglines relaunched search in early June that "have usually shown poor results"?

I know that before we relaunched search (and Ask.com Blog Search launched) we had many problems, but I believe the new search is the best out there.


Thanks,

Paul Querna
Bloglines Engineer
The reason I never responded to your request for search strings was (a) because I typically only search for myself or our company name, which aren't very representative search strings, and (b) because I very rarely see any spam (probably because neither I nor my company is worth spamming).

However, there are a few points I thought might be worth passing on about the search engines (although not related to spamming).

In Snarfer we have a web search plugin that basically aggregates the results from several search engines. We search everything on your list except PubSub and Technorati (which I believe both require some kind of sign up to obtain RSS results so they aren't of any use to us). In addition we search Sphere (www.sphere.com), MSN Search and Yahoo! Search (although the latter two aren't really blog searches they do seem to include blog results and more importantly they have RSS feeds).

Yahoo! I've recently stopped using because they keep mucking with their URLs and redirecting through yahoo servers which screws with our duplicate removal. Bloglines seems to have problems with my international searches (searching for my website URL) so I disable it for those particular searches. The rest seem pretty much the same to me - what the one misses another will find.

If I had to choose a favourite though, I'd have to go with Feedster. I don't know what their coverage is like (because, as I said, if they miss something, there's always someone else that will find it), but their summaries seem to be way more detailed than anyone else's. That can be really helpful if you're trying to figure out if the content is likely to be of interest without having to click through.

Anyway, I'm sorry if this message is slightly off topic. If nothing else, at least consider it a suggestion to check out Sphere. :)

Regards
James Holderness
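
The aggregation and duplicate removal James describes might look something like the Python sketch below. This is not Snarfer's actual code; normalize and merge_results are hypothetical names, and the normalization rules (lowercase host, drop a leading "www.", drop a trailing slash and any fragment) are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a result URL to a comparison key (assumed rules, see above)."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key  # the fragment is deliberately ignored


def merge_results(*result_lists):
    """Merge result URLs from several engines, keeping the first copy of each."""
    seen = set()
    merged = []
    for results in result_lists:
        for url in results:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged
```

A result URL that is routed through a search engine's own redirect server would get a different key from the direct URL, so this kind of comparison would miss the duplicate. That is the sort of breakage James mentions with Yahoo!; the sketch makes no attempt to unwrap redirects.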

Paul,
The poor results I refer to were pre-June. My last state of blogosphere search entry was April.

James,
This is great stuff. Thanks! And completely on topic.

Randy

One other thing I meant to add. Blogdigger, Google and IceRocket all make some attempt to include an author in their feeds which can be quite useful for filtering. For example, when doing a vanity search you probably aren't interested in the results where you're the author.

Maybe there aren't that many other uses, but I'm all for more metadata if it's available.

Blogdigger also shows up categories/tags from time to time. They're rare, but then again they're not that common in regular feeds either.

- James Holderness
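
To make the vanity search point concrete, here is a small Python sketch that drops feed items whose author matches your own name. It assumes an RSS 2.0 style feed where the author appears in either an author element or a Dublin Core dc:creator element; which element each engine actually uses is not something this post confirms, and items_not_by is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Dublin Core creator element in ElementTree's {namespace}tag notation.
DC_CREATOR = "{http://purl.org/dc/elements/1.1/}creator"


def items_not_by(rss_xml, me):
    """Return the links of feed items whose author does not contain `me`."""
    root = ET.fromstring(rss_xml)
    kept = []
    for item in root.iter("item"):
        # Fall back from <author> to <dc:creator>, then to an empty string.
        author = item.findtext("author") or item.findtext(DC_CREATOR) or ""
        if me.lower() not in author.lower():
            kept.append(item.findtext("link"))
    return kept
```

Items with no author at all are kept, since there is nothing to filter on.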
Based on your recommendation, Randy, I subscribed to RSS feeds for Bloglines searches. I must say I am amazed. Within minutes after posting an entry that linked to my blog, it showed up in the search feed.
--SterlingCamden
I've been meaning to test for time to index on the various blog search engines.  Anecdotally I find that Google Blogsearch is the fastest.
-Marshall
Marshall,
Completely agree, Google is the fastest. But 1/10th of a second versus 5 seconds really doesn't do it for me anymore.

Randy