August 20, 2004

Developer migrates from hobbyist to fantasy baseball titan

Author: Blane Warrene

There are all manner of white papers and architecture documents on the features and benefits of migrating from MySQL 3 to MySQL 4. However, sometimes the best example is to see it done on the front lines of the Web
business. I recently had a chance to chat with Tyson Lowery, a developer and proprietor of the online fantasy baseball site SimDynasty.com, and learn about his experience in making the MySQL move, marketing the site, and his views on Java and PHP. This was not a task for the slight of heart, considering Lowery's Internet property receives 75,000 to 100,000 daily page views and includes a corral of 25,000 registered users.

Can you give us a little background on yourself?

I grew up in Akron, Ohio before attending Boston College where I majored in Computer Science and Business. After that I started working for GE Capital (a division of General Electric), where I worked on a variety of projects. Most of the technology was of the Microsoft variety, with some Java and Oracle projects sprinkled here and there.

I started my own technology consulting practice, Teeloh Technology, after leaving GE. My specialty has become database-driven Web applications, particularly in the financial services and real estate industries. But as of late, I have been doing some project work for gaming companies as well.

What drove you to design and launch the fantasy baseball site?

I really became attracted to some of the open source technologies being developed, particularly Java and MySQL. I started my baseball simulation site, www.simdynasty.com, mostly as a way to experiment with Java and MySQL in the evenings while working at GE. Little did I know that it would grow to become a large hit on the Web.

Can you expand on the site's technology platform?

Sure, let me tell you a little bit about the site, so you can follow the examples and understand how we use MySQL a little bit better. Sim Dynasty is a baseball simulation that challenges people in the team ownership and
manager aspects of baseball. Each team owner drafts a team of fictitious baseball players and acts as the owner and manager of the team. They can trade players, set lineups, groom prospects in the minor leagues, etc.

Games are simulated 3 times a day and you can watch the games live, similar to ESPN GameCast. The coolest part about the game is that you manage the team over several seasons and have to decide which player to select in the amateur draft, or which rookie to call up to replace the veteran that retired in the off-season, or whether to sacrifice the future of your team by trading a few prospects for a top-of-the-line pitcher.

So you can imagine when the games are simulated, this causes a large number of records to be updated and inserted into the database to account for player stats, team records, and the play-by-play. But the users are still banging away at the site with reads to check stats, view game results, and look over the waiver wire. Users are also doing their share of database writes with setting lineups, offering trades, and even changing players' names. So you can see we have a lot of different competitors for MySQL's resources - both on the read and write sides.

The site runs using Java, JSP, and MySQL 4.0 on a Linux server - we do use PHP a little bit for a forum and a couple other small areas. We are currently on MySQL 4.0.18. I can't wait until 4.1 becomes production ready -- subselects are finally here!

When you started the site, it sounds like you were on MySQL 3.2x? How did you plan the migration to 4.x?

Probably like a lot of people out there, we made the change out of necessity. We desperately needed some
features in 4.0 to help the performance of the site. We started with 3.23 simply because that was the version that came preinstalled on the server and it worked well for a while.

What factors led you to make the decision to migrate?

The major reason we upgraded was so that InnoDB could be leveraged. I noticed that pages were really slow, and realized the problem when monitoring the number of locked queries in SHOW STATUS.

We have an interesting situation that most sites don't have. A lot of MySQL sites have a ton of Select queries, but not many Inserts, Deletes, or Updates. Or if they do have those, they are done during off-peak hours.

The nature of our site has just about every table being updated hundreds or thousands of times per hour. This made getting to 4.0 critical as our site grew so that we could leverage InnoDB tables. Using InnoDB for the
frequently updated tables has been a game changer for us because it uses row-level locking instead of table locking when doing updates. This allows our Java programs and JSP pages to update tables while our visitors are viewing data from the tables, or even updating the data.

Can you share some MySQL tips and tricks discovered in the process?

Some of the tricks I have learned aren't intuitive. If you are using joins between two tables, performance is greatly improved if the sizes of the columns are the same. For example, if you have one table called customer with a primary key of customer_id that is joined with an order table on a field called order_customer_id, make sure you don't have something like int(10) for one field and int(5) for the other. Changing them both to int(10) even will help the speed of these joins.

The other feature we have leveraged in 4.0 is query caching. Even though the tables are updated many times per hour, we still see good numbers in terms of cached queries being used.

I recommend that you turn on the slow query log for MySQL. This was a good starting point to identify which queries were running slowly. When I do find a slow query, I use MySQL's Analyze query that shows you how
many rows are being read from each table and how they are being joined. You can usually find a better way to get the same result set without scanning so many rows, either by adding indexes on tables or by joining the query differently.

Any hiccups and lessons learned along the way?

Deciding which tables to move to InnoDB from MyISAM is an ongoing process. MyISAM tends to perform better on reads - if you can avoid table-wide locking. Our main database has 34 tables, 7 of which are InnoDB tables now.

There are a few other tables that may be candidates to be changed over in the future. We monitor queries to see if locking becomes an issue with any of them. A nice tool that we found is called mytop. It kind of takes what you would see in MySQL from SHOW STATUS and a few other commands and presents it in a format similar to the Linux top command. You can see
the queries being executed right in front of you and see how long they are taking, and whether any queries are locked.

Another wrinkle we've noticed is that in some instances a table may experience a few days every month or so where it is really getting hit hard with updates. For example, the playoffs for all our leagues happen around the same time. So there are certain tables that will get read and written quite frequently, but this only lasts for about 3 days and
it only happens every other month. During this time we actually switch the table to InnoDB to leverage the row-level locking, and revert it to MyISAM the rest of the time - when it's generally not written to.

The key to the whole process is trial and error. Change a setting slightly or change the index, test and measure the difference, and then decide whether to keep the change or discard it. It can be a tedious process, but the results have been amazing. A few people tried to tell me to invest in a bigger server, but I'm not sure how much that would
have helped our problems.

How about results - have you tracked any metrics on performance since migrating?

Sorry, I wish I had some numbers. I can just say from a customer standpoint that the site flies. It used to be so slow when we had our simulation games running during the day that we had to run the free games overnight so the rest of the site wasn't overly slow. This is no longer a problem.

The number of players we have playing the game has gone through the roof, and I think a lot of this has to do with better server performance. Sometimes certain pages would take up to 10 seconds to load in the past - this can get tiresome for a person to deal with day in and day out and I think we lost a lot of people because of that.

You use Java and PHP - is that correct? Any advice or techniques you have picked up with using Java on MySQL in contrast to PHP?

Both JSP and PHP have their advantages and disadvantages. If you're going to need programs to run on the server, my philosophy has been you should use a J2EE solution so that you can share code between your executable
programs and the Servlets or JSP. But PHP generally puts less stress on a server than Tomcat and there seems to be less moving parts with PHP in terms of server setup and configuration - meaning less things can go wrong.

In terms of interfacing with MySQL, I haven't noticed many differences in terms of performance. PHP is slightly less complex for beginners to figure out how to connect to a database. But in terms of performance, I
haven't seen much of a difference. Would be an interesting experiment to execute the same queries using PHP and JSP and measure the results. Maybe something I can do on a rainy day.

How have you marketed the site? Has it resulted in more subscribers?

The site has primarily been marketed using paid search engine listings such as Google Adwords and Overture. We also advertise on other baseball-related sites. We normally get 75,000 to 100,000 page views a day on our site, with nearly 25,000 registered users.

We have a refer-a-friend program as well, so the site spreads quite a bit through word of mouth.

Any other valuable tools like mytop you have discovered in the process of building and maintaining the site?

We started out using some monitoring scripts on the server to let us know when there were site problems. But one of the problems we found with this was that if there were a major issue, the scripts would stop running as well. We now use a company called Alertra to monitor our sites. For a few bucks a month, they can page us if there is a problem with the site being down.

Webmasterworld is a catch-all Web site with great tips on anything from finding a hosting company, to how to market your site, to HTML syntax questions. Really a valuable Web site if you haven't seen it before.

For marketing, using a banner ad exchange may seem like you're back in 1996, but it really has helped us bring in new customers and get rid of our extra banner ad inventory. After experimenting with a number of these, we exclusively use BCentral.com's exchange that is now owned by Microsoft.

Blane is a writer and researcher focusing on Apple and Open Source technologies. Prior to this, he helped found a commercial software and consulting venture, and worked in the financial services sector as a director of technology and in varying technical roles.

Click Here!