AddThis Social Bookmark Button

Print

How to Optimize Rank Data in MySQL
Pages: 1, 2, 3, 4, 5, 6, 7

The profiling shows the naïve strategy works well only if the leaderboard page being generated is close to the front--that is, if people are interested in the highest-ranking gamers. If your application follows that pattern, you should optimize for it, but it's not always the case. Assume the gaming site doesn't follow this pattern, because all the low-ranked gamers are vain and want to check their rankings all the time. Now you have to optimize for the pages buried low in the ranks.

In any case, the profiling results are encouraging that I'm going in a good direction with the ranking columns, but benchmarking is crucial.

Benchmark Results

I ran benchmarks to show how the queries perform. The benchmarks randomly choose and request leaderboard pages with the same queries I've shown you. I'm including the benchmarking scripts in the code you can download from this article.

I benchmarked against the data I gave you, with 100,000 rows in the score tables. This load is entirely CPU-bound on my single-disk, single-CPU home computer.

I used innotop to watch which queries ran slowly. It was clearly the COUNT() and LIMIT with OFFSET queries.

To simulate concurrent updates, I added threads that choose a random score and change it. In designs #2 and #3, this means not only changing the score, but updating the rank column, too. I did not batch updates to ranks; I simply updated the affected open-ended range every time, so there's still a lot of room for optimization here.

I averaged across five runs for each level of concurrency, from one to twenty concurrent threads. Each thread generated 100 leaderboards. Here are the benchmark results, in leaderboards per second by number of concurrent requests:

Read Threads Design #1 Design #2 Design #3
Reads Only 1 Writer 5 Writers Reads Only 1 Writer 5 Writers Reads Only 1 Writer 5 Writers
1 24.14 16.99 10.20 95.66 52.23 40.00 3071.69 1614.22 1690.79
2 13.94 12.85 11.25 49.83 39.51 24.90 1567.32 999.32 898.81
3 9.73 9.18 8.00 34.48 26.45 23.31 1052.53 647.87 639.45
4 8.21 7.52 6.07 24.45 24.37 15.93 783.22 468.53 549.41
5 6.59 5.77 5.05 20.61 16.17 13.14 622.64 410.76 410.57
6 5.77 5.07 4.22 16.14 16.68 10.32 510.60 416.45 375.16
7 4.78 4.33 3.76 14.45 15.06 7.82 431.09 373.96 230.78
8 4.39 3.76 3.33 12.05 12.17 6.41 382.65 344.19 248.74
9 4.00 3.30 2.96 11.04 11.18 5.15 344.93 278.55 188.06
10 3.49 3.02 2.69 9.56 9.51 4.82 307.32 231.17 164.88
11 3.14 2.70 2.46 8.69 9.51 4.00 273.30 227.49 186.22
12 2.90 2.50 2.25 8.39 7.30 3.62 247.84 215.50 134.52
13 2.37 2.33 2.10 7.40 6.58 3.50 226.10 200.11 186.03
14 2.16 2.16 1.96 6.62 5.96 3.08 216.78 182.34 178.42
15 1.91 2.01 1.85 7.35 5.16 2.96 195.52 185.21 156.92
16 1.74 1.88 1.71 6.70 4.75 2.90 180.02 165.87 151.03
17 1.66 1.75 1.61 5.91 4.40 2.53 181.98 130.18 135.61
18 1.51 1.66 1.53 5.08 3.97 2.44 157.87 147.98 117.41
19 1.42 1.57 1.45 4.94 3.72 2.33 149.26 147.36 106.36
20 1.32 1.50 1.38 5.09 3.52 2.18 140.16 134.02 114.53

Look how much better design #3 performs! This vindicates my hunch that the key to good performance is getting rid of queries that scan lots of rows.

Pages: 1, 2, 3, 4, 5, 6, 7

Next Pagearrow




-->