In this post I’m going to explain the reason behind my decision to introduce Lucene.net into Subtext to power the internal search engine.

The problem: high bounce rate

It all started a few weeks ago, when I noticed that I get a lot of visitors from search engines (around 70%) but that they rarely look at more than one page (only 15% read a second page).

I wanted to know whether this was a problem specific to my blog or a general problem of all tech/dev-oriented blogs. So I ran a quick poll on Twitter and found out that I’m not alone: 66% of the people who responded said that they have an average of fewer than 2 page views per visit on their developer-oriented blog. The remaining 33% said that they have more than 2 pages per visit (24% even report more than 3). But this is probably due to different metrics or different ways of counting pages per visit (I get 7.3 pages per visit if I look at the stats my hosting provider produces with AWStats).

In short, developer-focused blogs don’t retain the reader for more than one or two pages.

The possible options

I started thinking about ways to reduce the bounce rate, both as a pure SEO exercise and because it will help readers who come from search engines find more posts about the keywords they were interested in. Some possible solutions are:

  • Show the best or latest posts of the blog more prominently
  • Show a list of posts similar to the one the visitor is reading
  • If the visitor comes from a search engine, show other posts that match the same keywords

The first option is not something that can be built into a blogging engine, as it “only” requires a change to the design of the blog skin. It also won’t benefit readers much, because the “best” or “latest” posts might not be about what the reader is looking for.
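The third option depends on recovering the visitor’s search keywords from the HTTP referrer. Here is a minimal sketch in Python of how that could work. It is only an illustration, not Subtext code: the function name and the table of search engines are made up for the example, and it assumes the search engine still exposes the query in the referrer URL, which is not always the case.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical table: fragment of the search engine host name ->
# query-string parameter that carries the search keywords.
SEARCH_PARAMS = {"google.": "q", "bing.com": "q", "yahoo.": "p"}


def keywords_from_referrer(referrer):
    """Return the search keywords found in a referrer URL, or [] if there are none."""
    parsed = urlparse(referrer)
    host = parsed.netloc.lower()
    for marker, param in SEARCH_PARAMS.items():
        if marker in host:
            # parse_qs decodes "+" and %-escapes and returns a list per parameter
            values = parse_qs(parsed.query).get(param, [])
            return values[0].lower().split() if values else []
    # Not a known search engine: nothing to match on
    return []
```

The returned keywords could then be passed to the blog’s search engine to find other posts on the same topic.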

The solution

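So I decided to focus on the other two options, which can easily be introduced into Subtext (or any other blog engine) and which will give a bigger benefit to the reader. I’m going to develop two “widgets”: one that lists posts similar to the one the visitor is reading, and one that shows other posts matching the keywords a visitor searched for before landing on the blog.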

Why Lucene.net?

Subtext already has an internal search engine that I could have leveraged to power those two widgets, so some of you might ask why I’m planning to use Lucene.net. The reason is quite simple: Lucene.net is a powerful full-text search engine, with advanced features that allow fast and consistent searches, and it allows the kind of “more like this” search that I need for one of the two widgets. Subtext already has a “Similar Posts” control, but it relies on the categories of posts, so it is not really that accurate.
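To give an idea of what a content-based “more like this” search does, as opposed to matching on categories, here is a minimal sketch in Python. It is not how Lucene.net implements the feature, just one common way to express the idea: each post becomes a vector of TF-IDF term weights, and posts are ranked by cosine similarity. All function names are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(posts):
    """posts maps title -> text; returns title -> {term: TF-IDF weight}."""
    term_counts = {title: Counter(tokenize(text)) for title, text in posts.items()}
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())  # count each term once per post
    n = len(posts)
    return {
        title: {t: tf * math.log(1 + n / doc_freq[t]) for t, tf in counts.items()}
        for title, counts in term_counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def more_like_this(title, posts, limit=3):
    """Return up to `limit` titles of the posts most similar to `title`."""
    vectors = tfidf_vectors(posts)
    target = vectors[title]
    scored = [(cosine(target, vec), other) for other, vec in vectors.items() if other != title]
    scored.sort(reverse=True)
    # Posts that share no terms with the target score 0 and are left out
    return [other for score, other in scored[:limit] if score > 0]
```

Because the ranking is based on the words in the posts themselves, two posts can be related even if they were filed under different categories.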

And last but not least, Lucene.net will make the internal search engine more accurate, so I’m planning to completely change the search implementation of Subtext.

What’s next

I have never used Lucene.net in a real application, so this will also be an interesting journey that I’ll take with you, my dear readers. I’m planning to write on this blog about how the integration of Lucene.net in Subtext is going, and I’m also going to write a series of posts about how to use Lucene.net. So, stay tuned!