
MySQL remains a little bit of a mystery to me. I can only spend so much time on it, I only have so many database servers to play with and the necessity to tweak the performance comes from the fact the database server runs on a VMware virtualization platform with huge performance penalties, more so then sub-optimal performance of the database server or engine itself.
However, look at the following picture:
It shows you the amount of queries performed on a MySQL database server, per second average over a 5 minute poll interval. This is nice, and has shown us more then once that somewhere between 1000 and 1200 queries per second, the VMware guest is at its maximum capacity. This doesn't really help us to anticipate future usage of the database server though, as the graph just visualizes last day's usage. Luckily, Munin has a year graph too ;-)
Now, if you look at this graph, you can obviously see the trend go up and up and up. However, the trend is too vague. How steep is this usage going up exactly, and deriving from these statistics, how soon am I going to run into the guest's maximum capacity? One can make the trend go up very steep using this graph, but one can also think the usage is barely going up at all.
Admittedly, these numbers aren't all that high and the volume isn't all that huge. I would love for some admins and DBA's of some real databases to pop up and say their thing ;-)
Anyway, the workaround I came up with is to monitor and graph the Queries per Second Average. If graphed over a longer period of time (I've only recently started doing this), this does somewhat represent long-term usage trends. So, the following picture shows the Queries per Second Average (total number of queries over uptime):
Nothing really exciting in the day graph, is there? It's the year graph that, over time, I hope to get more valuable input from:
I'm hoping that this average of close to 500 now, is going to become an average of closer to 600 some time in the (near) future, and that I can accurately predict when it's going to hit 700 or so. Given that this number represents overall usage, including off-peak hours, its trend rising does mean something; it's just not directly related to the peak-hour usage.
Suppose that with an average close to 400 sub-optimal performance due to maxing out the system resources is very incidental. Suppose that with an average close to 500 this type of incident becomes more frequent (it's inevitable you will hit that point some time). Suppose you have this average in a graph going up and up you can then accurately (somewhat accurately) predict when the average is going to hit 600 -since the trend is what you're graphing here.
For those of you interested in doing this type of trending on their own environment (and let me know how it works out for you); here's the Munin plugin.