• 2 September 2016Today's maintenance & perfomance update

    Today's maintenance update went well... but not without hurdles. The results are not as significant as I hoped to be perfectly honest, but overall the return time of Study pages is definitely better.

    Before the database update I routinely saw queries on the Study pages that took 1.5 seconds, so adding the 300 ms of response time a page would be returned in two seconds, sometimes more! (300 ms is an average for me in Belgium, with the servers located in Texas; USA).

    After the update it looks like most pages return within one second. The queries unfortunately can still vary wildly but overall it seems we have at least a 50% improvement in response time.

    Read on for the gory details!

    In addition to that the caching of shared stories has been improved. Now it truly saves the most expensive queries, and the cache has been made to last much longer. Since we invalidate (ie. refresh) the cached page whenever a public story is edited or voted, I made the cache to last for a month (as of writing). There are ups and downs to that. Study pages corresponding to the beginning of the RTK index are likely to see more activity with shared stories and votes, and hence the cache will not be as efficient for them. Time will tell !

    Keep in mind that for a few days, most pages will still take the extra time to be generated because they are not cached yet. However as more users browse the pages, and more pages are cached you should see an improvement.

    The main improvement likely comes from the redesign of the indexes, and getting rid of the "temporary table" problem. And who knows, it may be that on peak times the changes I made are even more beneficial since that would be when bottlenecks happen.

    I'm sorry the update took so long (about 3 hours). As soon as I took the site down, I found out that the MySQL server does not allow partitions! (unlike my developmental version). This was a further performance improvement I had to undo in the code and database schema.

    Some numbers: there are 361871 shared stories as of writing. There are 6209558 stories stored altogether, equivalent to a 1.1 GB MySQL table.

    There are still other interesting ideas I'm brainstorming. One approach for example, to make up for the lack of partitions, would be to move the public stories into a new table. Whenever a story is made public it could be copied there, and that means MySQL would deal with a table that would be less than 100 MB instead of 1.1 GB, with all the public stories addressed by a SQL query clustered together.

By Month