November 23, 2014

Read Buffers, mmap, malloc and MySQL Performance

Monty Taylor posted interesting investigation of the fact read_buffer_size variable affects connection speed.

This is not something you would expect right ? me too. Not only global user data is expected to be cached on startup but even if it is not why would you do full table scan to fetch single user information ? Something is fishy here and perhaps bug needs to be filed either resulting in fixes or docummentation updates.

The great thing you see from this post is – o_profile (often together with strace) are great tools for MySQL Performance analyses. Too many people end tuning on plans and indexes while there well could be some internal gotchas.

The next Monty comes up with 2 posts on mmap allocation.

Indeed allocating with mmap is much slower than allocating from memory memory pool which process already “owns” and in real life it can get much slower than Monty results show.

Mmap allocation process can be split to two parts – first is finding the free region of required size, which can get expensive if virtual memory gets fragmented. A while ago I measured over 10 times slow down over time as memory becomes fragmented with 50.000+ of memory areas mapped.

As the gap in the process address space is found kernel has to setup the page tables. The larger the buffer is allocated the more pages needs to be set up so the slower it gets. This part is indeed proportional to number of pages being allocated.

On anonimous mmap (which is used for allocations) physical pages are typically not allocated – this only happens first time you access the page and can be responsible for further penalty – especially if you use the buffer only once.

What can be done to speed things up here ?
Of course you can tune you malloc settings so more buffers are allocated from the pools rather than mmap. This though will at least increase memory consumptions as deallocating of these large buffers will not instantly return data to the OS. Plus I’m not sure how much luck you will have with malloc algorithms if they are forced to deal with very large blocks.

The second possibility is to use large pages. These already can be used for Innodb Buffer Pool and Key Buffer but could be used for other large allocations.

Finally this is yet another reasons internal resource allocation manager makes sense. If you would keep say 20*2MB read buffers in the pool you could just grab one having allocation very quick.

Also MySQL should get smarter in terms of which buffer size should be used, even for sake of speeding up the reads and more optimal resource allocation – if you have small table or LIMIT 10 in full table scan query you well may not need large buffer.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Sergei Golubchik says:

    I didn’t see any memory allocations of read_buffer_size happening on connect time
    (with breakpoints on all these places and connecting/disconnecting a few times)

  2. peter says:

    Thanks Sergey,

    I would expect this not to be that simple.

    What is about inserts ? Can it get write buffers allocated in some conditions, like inserting blobs etc.

  3. Sergei Golubchik says:

    Insert – yes, the buffer of that size is used for bulk inserts. INSERT … SELECT for example would probably do it. But not simple single-value inserts. In my tests multi-value inserts didn’t trigger such an allocation either, but I could believe they would under different circumstances.

    Although I didn’t try blobs, I don’t think read_buffer_size would be allocated for single value inserts with blobs.

  4. peter says:

    Interesting. So if we’re having multi value inserts (not insert select) – which buffer is allocated for writes ?
    Or are writes to MYD file performed row by row buffered by OS only ?

    In any case I do not insist on something and I would not be surprised if there is something unusual in Monty’s case
    and it would be great to get the repeatable bug for it

  5. Apachez says:

    By the way, how does other DBMS handle memory allocation?

    For example postgre, do they even have this many memory options available to screw things up?

Speak Your Mind

*