First time here? You are looking at the most recent posts. You may also want to check out older archives or the tag cloud. Please leave a comment, ask a question and consider subscribing to the latest posts via RSS. Thank you for visiting! (hide this)

During the last weeks I noticed that my blog was becoming slower and slower (and if you visited my blog you might have noticed it as well).

After a bit of troubleshooting on the server (which was kind of painful due to the slow connection via RDC from Europe to my server in the US) I figured out that most of the slowness came from my SQL Server Express instance that was taking up too much memory.

And I found out that I had 3 tables that were extremely big in size:

  • Subtext_Log - logs all errors happening on the server
  • Subtext_URLs - all the referrals' urls
  • Subtext_Referrals - a 1:1 mapping table that counts how many visits per referral

The first was about 20k row in size, and the other two, same size, over 670k rows. It is not a big number by itself, but if you contextualize it on the (virtual) hardware it runs on (a tiny VM instance of half-core and 614Mb of RAM that hosts both IIS and SQL Server) it was a huge number. The overall size of the database was around 350MB.

The size of these tables are pretty easy to explain: each new visit from coming from a link or from a search engine creates a new record (and subsequent just increase the count) and I never cleaned the tables since I launched my blog which received more or less 1150k page views of that kind.

Googling around I also found I'm not the only one with those problems:

I tried to use the page Travis wrote, but with a the very slow site his page always timed out. I also tried Phil’s approach, cleaning the tables from the search referrals, but, not sure why, the delete statement failed after one hour of execution. So I fell back to the more brutal but quicker and safer "truncate table" approach. And after cleaning all tables I rebuilt the indexes and shrunk the database. From the original 350MB it went down to 30MB, which are much more easy to handle for my tiny VM instance.

I lost the stats about the urls that brought in the most visitors, but I guess I could retrieve the same information with a bit of practice on Google Analytics.

UPDATE: At least on SQL Server Express, after you do the shrinking you have to restart the SQL Server instance otherwise the server will stay as slow as before.

This feature, the referrals’ logging, was introduced in 2003 in the original .Text, when there was not free stats analyzer. But now there is Google Analytics and many other similar services, so I think it's time for a redesign.

A possible solution for this problem might be adding some configuration options to fine tune the behavior of the referrals logging:

  • don't track referrals at all
  • track only referrals that don't come from search engines (based on regex)
  • track referrals only for the latest posts
  • automatically purge referrals older than 1 month
  • track everything

And also introduce some kind of auto-purge of Log entries after a period of time or when the size reaches a specified number of rows.

If you have Subtext, please comment and let me know what you think about this possible solution to this problem.

Final suggestion: if you have a Subtext-powered blog, periodically check the size of these 3 tables and clean them if they grow too much.

posted on Monday, June 20, 2011 12:39 AM

Comments on this entry:

# re: When Subtext DB gets out of control (and how to fix it)

Left by Justin at 6/20/2011 3:14 PM

I haven't launched any Subtext blogs yet but this is definiately a post I will bookmark and keep incase I run into this. Thanks for posting!

# re: When Subtext DB gets out of control (and how to fix it)

Left by David Gardiner at 6/20/2011 3:48 PM

You rebuilt the indexes and then shrunk the database?

Might be worth reading Paul Randal's blog on why this is probably contradictory -

Shrinking more than likely just undid all your nice 'rebuild index' work.


# re: When Subtext DB gets out of control (and how to fix it)

Left by Simone Chiaretta at 6/20/2011 3:54 PM

Shrinking the DB might not affect performance if you have enough memory on the server.
I guess that going down from 350Mb to 30Mb was kind of a big deal for my tiny instance.

Anyway I rebuilt the index also after the shrinking :)

# re: When Subtext DB gets out of control (and how to fix it)

Left by Travis Illig at 6/20/2011 5:38 PM

It's nice to have the data of referrals, but if I'm honest with myself, I have never used it except once, in trying to figure out why one specific blog entry was so popular.

I think Google Analytics is probably a reasonable replacement. I'm not sure I've looked at the referrals since setting up Google Analytics.

If the database maintenance page times out, you may need to add a longer timeout to your DB connection string or modify the AJAX service page to delete fewer records in each batch. I had a similar timeout problem the very first time I ran it, which is how I got to the "only delete 1000 referrals at a time" limit.

Anyway, I'd vote for either "Don't track referrals at all" or "track referrals that don't match spam/search engine patterns." "Track everything" is where we're at now and is troublesome; the others equate to "track an arbitrary subset of the data" which isn't useful.

# re: When Subtext DB gets out of control (and how to fix it)

Left by Simone Chiaretta at 6/20/2011 5:40 PM

I think the best solution is to stop tracking referrals or at least have an option that says:
- Track Everything (but you have to deal with the growth of that tables manually)
- Don't track anything
and default to don't track anything

# re: When Subtext DB gets out of control (and how to fix it)

Left by Simon Philp at 6/21/2011 10:30 AM

I use have created a maintenance page that sites in a protected directory that is called at 12.01 every night that purges the data in all the 3 tables you mention above. I think this was also influenced from Travis's blog post.

I think we should maybe look at analytics from a plugin perspective as this would mean we can use any 3rd party provider..

# re: When Subtext DB gets out of control (and how to fix it)

Left by Igor A. at 6/23/2011 6:29 PM

Although shrinking was helpful in your case, it's not the best practice in the SQLServer maintenance.
SQLServer will allocate files (Data and Transaction Logs) again.
Finally, I guess the Autogrow setting on Transaction Log is set to grow in percent: this could bring SQLServer to increase the file exponentially (also in timing).
If that's true, set to a finite set of MegaBytes (e.g. 10 or 100) so you have linear grow.

Hoping to help setup SQLServer the best way for you.

# re: When Subtext DB gets out of control (and how to fix it)

Left by Simone Chiaretta at 6/24/2011 10:18 AM

I did the shrinking because I knew I removed 90% of the contents of the database, from 350Mb to 30Mb.
Sure, it will grow again, but will take other 1M page-views to reach the same size :)

Comments have been closed on this topic.