Reddnet Scribbles

It makes me want to gouge my eyes out with a cheese grater!

Graffiti Beta 1 - Review (part 2): Performance

Stephen M. Redd
Monday, January 07 2008

I finally ran some simple performance tests on Graffiti Beta 1. By simple I mean that only a stopwatch and a web browser were used. The purpose was to get a general "feel" for Graffiti's performance with a large number of posts in the database.

I ran the test on a Dell Inspiron laptop (2GHz Core Duo, 2GB RAM, 7200 RPM HDD) running Windows Vista 32-bit, IIS 7, and SQL Express 2005.

Since I was using my own development laptop, and it has limited free HDD space, I changed the DB recovery mode to "Simple". No other database tuning was performed. 

To generate the data I just created a SQL Schema project in Visual Studio 2008 Team System then created a simple data generation plan. The generated data included 5 categories, 100,000 posts, and 200,000 comments. The posts were split evenly among the 5 categories. Comments were also split evenly among the posts (2 comments to a post).  
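
The split described above can be sketched in a few lines of Python. This is only an illustration of the distribution, not the Visual Studio data generation plan itself; the function name plan_rows and the tuple layout are hypothetical.

```python
from collections import Counter


def plan_rows(num_posts=100_000, num_categories=5, comments_per_post=2):
    """Return (posts, comments) as lists of ID tuples, split evenly."""
    # Each post is (post_id, category_id); round-robin gives an even split.
    posts = [(post_id, post_id % num_categories + 1)
             for post_id in range(1, num_posts + 1)]
    # Each comment is (comment_id, post_id); every post gets the same number.
    comments = []
    for post_id, _category_id in posts:
        for _ in range(comments_per_post):
            comments.append((len(comments) + 1, post_id))
    return posts, comments


posts, comments = plan_rows()
per_category = Counter(category_id for _, category_id in posts)
print(len(posts), len(comments), dict(per_category))
```

With the default arguments, each of the 5 categories ends up with 20,000 posts.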

Results:

100k posts and 200k comments:

  • Graffiti was able to display the user facing pages with only a minor performance impact.
     
  • The initial load of the site's home page took nearly 5 seconds. 
      
  • Refresh of the home page took less than 1 second. 
     
  • The initial load of a category page took about 2 seconds, with refreshes being nearly instant.
     
  • Clicking the "older posts" button in any content list took about 4 seconds. Once loaded though, revisiting those pages was nearly instant.
     
  • Loading a content item's view took about 1 second on first load, and was nearly instant thereafter.
     
  • Loading the control panel resulted in a SQL Timeout error every time.

    Looking at the queries in SQL Profiler, it performs a SELECT COUNT against the posts table, probably for use by the summary chart on the dashboard. 
       
    Testing the query manually: it took about 1 minute and 30 seconds to count through the table using the same WHERE and GROUP BY clauses. Counting the posts without any WHERE or GROUP BY clauses still took around 30 seconds. HDD throughput was at 100% for the entire duration of the query's execution.
      
    This is an area where a full version of SQL server on a dedicated piece of server hardware would make a huge difference. This kind of query bottlenecks on the low throughput of the laptop's hard drive... Low HDD performance is a typical problem on any laptop though.
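
For anyone who wants to repeat the manual comparison, here is a rough sketch of the two query shapes, timed from Python. It runs against an in-memory SQLite table rather than SQL Express, and the table layout, column names, and WHERE clause are made up for illustration; they are not Graffiti's real schema or its actual dashboard query.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the posts table; not Graffiti's real schema.
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, category_id INTEGER, is_published INTEGER)"
)
conn.executemany(
    "INSERT INTO posts (category_id, is_published) VALUES (?, ?)",
    ((i % 5 + 1, 1) for i in range(10_000)),
)


def timed(sql):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start


# Filtered, grouped count (assumed shape of a dashboard summary query).
grouped, grouped_secs = timed(
    "SELECT category_id, COUNT(*) FROM posts "
    "WHERE is_published = 1 GROUP BY category_id ORDER BY category_id"
)
# Plain count with no WHERE or GROUP BY clause.
plain, plain_secs = timed("SELECT COUNT(*) FROM posts")

print(f"grouped: {grouped} in {grouped_secs:.4f}s")
print(f"plain:   {plain} in {plain_secs:.4f}s")
```

The timings from an in-memory table say nothing about a disk-bound laptop; the point is only the shape of the two queries being compared.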

50k posts and 100k comments:

  • User facing pages loaded with no noticeable performance problems. Speeds were a little snappier than with 100k posts, especially on initial page loads and when using the pager in content lists.
     
  • The admin dashboard page loaded in 17 seconds on the first load, and was nearly instant on subsequent loads. The "posts" tab of the admin section was the slowest, taking about 28 seconds on the initial load, and about 9 seconds on subsequent loads. The "comments" tab of the admin tools was surprisingly snappy, taking only 3 seconds to load initially.  

Conclusion:

Graffiti Beta 1, without special optimization, should be able to serve the needs of most realistic sites in its target market. 

Even on a modestly equipped laptop, performance did not significantly degrade until the number of posts exceeded the 50k mark. On real server hardware it would probably be able to handle well over 100,000 posts with little problem for the end users. The admin tools will degrade more significantly than user-facing pages.

Oddly, the number of comments seemed to have little impact on overall performance. 

Looking at the queries that had the most significant performance problems, it will be rather trivial for Telligent to either optimize the queries for larger numbers of posts, or provide alternate configurations that trade convenience features (such as daily report generation rather than real-time reporting) for better performance.
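
To make the trade-off concrete, here is a minimal sketch of serving a stored count and only recomputing it once a day. It is a generic caching pattern written in Python, not anything Graffiti or Telligent actually ships; the class name CachedCount and the stand-in function slow_post_count are hypothetical.

```python
import time


class CachedCount:
    """Serve a stored count and recompute it only after max_age_seconds."""

    def __init__(self, compute, max_age_seconds, clock=time.monotonic):
        self._compute = compute
        self._max_age = max_age_seconds
        self._clock = clock
        self._value = None
        self._computed_at = None

    def get(self):
        now = self._clock()
        stale = (self._computed_at is None
                 or now - self._computed_at >= self._max_age)
        if stale:
            # Only this branch pays for the expensive query.
            self._value = self._compute()
            self._computed_at = now
        return self._value


calls = []


def slow_post_count():
    """Stand-in for the expensive COUNT query behind the dashboard."""
    calls.append(1)
    return 100_000


# A fake clock makes the example deterministic.
fake_now = [0.0]
daily = CachedCount(slow_post_count, max_age_seconds=24 * 60 * 60,
                    clock=lambda: fake_now[0])

first = daily.get()          # computes
second = daily.get()         # served from the stored value
fake_now[0] = 24 * 60 * 60   # a day later
third = daily.get()          # recomputes
print(first, second, third, len(calls))
```

The dashboard numbers would be up to a day old, which is the convenience being traded away.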

Reality Checks:

  • If you actually have anywhere near 100k posts, you probably are not going to be using any off-the-shelf product without significant customization and DB tuning anyway. Keep in mind that it takes about 274 posts a day (100,000 / 365) to get to 100k in a single year. Only a large news outlet is likely to generate that quantity of content, and most of them would still take a few years to break the 100k mark. 
      
    However, 100k comments isn't too terribly unrealistic for a very popular site. 
     
  • There is one major difference between this test and a production environment of Graffiti. In a production situation, Graffiti would have created one folder on the file system for each post that was created via the GUI tools. Within each generated folder would be a generated "default.aspx" file. The main purpose of these files is to give IIS a default document to use, since the Graffiti URLs don't specify actual page names. Graffiti also stores a little bit of metadata in these files that can help it render the posts faster (it doesn't seem to have trouble rendering the post if these files are absent though). 
     
    I was unable to quickly find a convenient way to mimic how Graffiti generates these folders and files without resorting to writing a custom app of my own... but this is a simple test, so I deemed it not to be worth the effort.
     
    I personally would not want to deploy Graffiti on a large scale without a way to disable the creation of these files and folders. Strictly speaking, this feature is only necessary on IIS 6, or on IIS 7 when using the classic asp.net pipeline. It would not be unreasonable to expect that Graffiti will support the IIS 7 integrated pipeline in the future and allow the per-post folders and files feature to be disabled (please!). 
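
For a future test, the folder-per-post layout could be roughly mimicked with a small script along these lines. This is a sketch in Python under loud assumptions: the slug names, the flat folder layout, and the placeholder file contents are all made up, and the metadata Graffiti writes into its real default.aspx files is not reproduced.

```python
import tempfile
from pathlib import Path


def create_post_folders(site_root, slugs, placeholder="<%-- placeholder --%>\n"):
    """Create one folder per slug, each holding a default.aspx file."""
    created = []
    for slug in slugs:
        folder = Path(site_root) / slug
        folder.mkdir(parents=True, exist_ok=True)
        page = folder / "default.aspx"
        # Placeholder text only; not what Graffiti actually generates.
        page.write_text(placeholder, encoding="utf-8")
        created.append(page)
    return created


# Use a temporary directory so the example cleans up after itself.
with tempfile.TemporaryDirectory() as root:
    slugs = [f"post-{n}" for n in range(1, 101)]  # hypothetical slugs
    pages = create_post_folders(root, slugs)
    folder_count = sum(1 for p in Path(root).iterdir() if p.is_dir())
    all_exist = all(p.is_file() for p in pages)

print(folder_count, all_exist)
```

Scaling the slug list up to 100,000 would put that many folders and files on disk, which is exactly the overhead this test left out.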
Filed under: Code
3 Comments


» Comments

  1. Shiva

    Stephen, thank you for doing this evaluation!

    I too think that the per post folders should be done away with.

    Also, can you tell me how you "seeded" the database with the 100k posts? I have been trying to download Wikipedia's free content database to generate realistic test posts to stress test Subtext, BlogEngine.NET, and now Graffiti, but have not had much success. Let me know if you get a chance at haathikb[at]gmail[d0t]com

    -Shiva

    Shiva — January 9, 2008 10:41 AM
  2. Stephen M. Redd

    The per post folders are an unfortunate necessity given that Graffiti uses URLs that don't map to a file extension... so IIS doesn't know to send the request over to the asp.net worker process. So creating fake folders and files allows IIS to figure out that the request maps to something handled by asp.net.

    With IIS 7 you can specify that IIS should send ALL requests to asp.net no matter what. Even better, you can set that option up in the web.config file without even needing to manually configure anything in IIS. This allows asp.net HttpModules and HttpHandlers to get a crack at all requests, even if the requests aren't aimed at asp.net resources. This is a set of features that should have been added to IIS 10 years ago... but for whatever reason weren’t.

    Telligent has to support IIS 6, and they chose the dummy files and folders technique. The alternative would be that all URLs would have to specify a file name ending in .aspx in order for IIS to correctly send the request to asp.net.

    Honestly, requiring a page name isn't really that bad, and is the technique I prefer myself. But both have serious limitations and problems. Only IIS 7 really "solves" the problem in an elegant way.

    What I hope is that Graffiti continues with the dummy folders for IIS 6 and earlier, but gives the option to use the IIS 7 integrated pipeline and turn off dummy files and folders if you happen to be lucky enough to be running on IIS 7.

    As for the data I generated. Here is what I did:

    First I copied my own database from this site (it had 50 posts in it and no comments at the time) as a starting point.

    Then I used a "Data Generation Plan" in Visual Studio. The data generation plan is a feature that was added with the "Visual Studio Team Edition for Database Professionals" or the Team Suite combo version. It doesn't generate "realistic" data, but it does throw everything in the book in there. If the field allows international characters it will throw in a dump-truck load. If the field allows 500 characters it will create a record that uses all of them. The data is uuuuuuugly, but it is also very thorough for testing applications written against the data.

    The last thing I had to do was clean up some of the fields in the generated data via manual update queries so Graffiti wouldn't blow up. For example setting all posts to "published", setting the post's type to "text/html", setting the created by and username fields all to my own user ID instead of random ones, etc... I didn't keep the exact queries I used though, so I can't be more specific.

    And that's how I made data for testing.

    Stephen M. Redd — January 9, 2008 1:14 PM
  3. Shiva

    Ok. Thanks for the info on the data generation. I'll check out the Visual Studio Team Edition for DBs. That should help me for now.

    Shiva — January 9, 2008 2:14 PM
