Details of our MySQL Query Analyzer Use Case
November 20, 2008
By Jeff Freund
Last week I was interviewed by Charlie Babcock from InformationWeek about the MySQL Query Analyzer. I love tools that provide immediate, actionable information, and the Query Analyzer is just that.
His article ( http://weblog.infoworld.com/openresource/archives/2008/11/mysql_query_ana.html ) presents a few facts from our results and I chose to publish a more detailed description about it as I think there are some very interested things to learn from our experience.
Optimizing a SaaS platform is a never ending task. I'd like to think that after 6+ years of running our platform in production, and scaling it multiple orders of magnitude, that we've eliminated the obvious bottlenecks. There are just not any killer queries left that are simply too slow.
Having eliminated those low hanging fruit over time, we were left with writing Perl scripts to parse SQL logs and other rudimentary analysis mechanisms. From the first use of the Query Analyzer, it was apparent that the data available in it was opening up a new door of analysis to us, focused less on manual inspection and more on collected statistical information.
Below is a description of our first use of the tool several months ago while it was in alpha.
1) We pointed one of our production application servers (a website publisher) to MySQL Query Analyzer instead of the MySQL database server. The Query Analyzer proxied the requests to the database, capturing statistics and metadata on the fly.
2) We let the Query Analyzer gather statistics over a 20-30 minute period of time. This was live traffic - not a controlled test environment or benchmark test.
3) We analyzed the statistics to determine what was the most "expensive" query. This was determined by looking at several of the statistics that the Query Analyzer records, such as Most Frequently Run, Most Records Return, Largest Result Set (Bytes), and Most Processing Time.
In the end, the chosen query was NOT the most frequently run query, nor was it the slowest query. Rather it was the one the collectively used the most processing time, returned the largest results sets and also ranked in the top 10 in terms of frequency.
We had a software engineer take a look at the code and figure out how to optimize it.
The query in question was intended to load the "placement list" for a website section. It is a single table query that filters based on date ranges and performs a sort based on a rank value. The query is well indexed and very efficient for small result sets.
However, we realized that the query was unbounded and that over time, some of the result sets have grown from the hundreds to the tens of thousands of records. There were two optimizations that the engineer coded:
1) The sort was removed from the SQL query and performed in the java code once the result set was returned as we like to push work out from the databases and distribute it in the application layer when possible.
2) We added logic to the code such that a bounded query was run first. If this satisfied the data need, great! If not, the unbounded query was run. In the end, we end up running more queries, but the average result set size (and cost) is much, much lower.
We have a benchmark test that we run against the website publishing application. It serially hits the server with several hundred URLs from real websites that we publish. This isn't a perfect "real world" test as it doesn't take into account concurrency, but by running the test against a cold server (nothing cached in memory) it does provide an indication of raw speed of execution.
The baseline benchmark times before any optimization was done, averaged over five minutes long at 5:27.
After the code was optimized, the benchmark times dropped to an average of 2:54, a 1.9x performance improvement!
For the amount of time put into the analysis and ultimate optimizations (less than two days of total analysis and engineering time), this represents a HUGE win from a scalability, performance, and cost saving perspective. It was only possible from the database usage and load nuances provided by the deep inspection of the Query Analyzer.