WiredTiger B-Tree versus WiredTiger In-Memory: Q & A

WiredTigerIn this blog, I will provide answers to the Q & A for the WiredTiger B-Tree versus WiredTiger In-Memory webinar.

First, I want to thank everybody for attending the October, 13 webinar. The recording and slides for the webinar are available here. Below is the list of questions that I wasn’t able to fully answer during the webinar, with responses:

Q: Does the In-Memory storage engine have an oplog? Do we need more RAM if the oplog is set to be bigger?
Q: So we turn off the oplog?
Q: How is data replicated without oplog? Do you confound it with journaling?

A: Percona Memory Engine for MongoDB can be started with or without oplog, depending on whether it started as part of a replica set or standalone (you cannot explicitly turn oplog on or off). But if created, oplog will be stored in memory as well. You can still control its size with the option --oplogSize .

The recovery log (journal) is disabled for the Percona Memory Engine.

Q: After a crash of the In-Memory storage engine, does it need a complete initial sync? Means, cloning all databases?

A: Yes.

Q: WiredTiger reserves 50% of RAM for de-compression. Is this also true for the In-Memory engine?

A: Where did you find this information? Please point to its location in the docs in the comments section below. I asked Percona developers to confirm or deny this for the Percona Memory Engine, and this was their answer:

WT decompresses data block-wise, and each block is of some reasonable size (usual numbers are couple of Megs, let’s say). Decompressor knows the size of uncompressed data by reading this info from compressed block (this info is stored during compression). It creates an extra buffer of uncompressed block size, decompresses data into this buffer, then uses that decompressed buffer and frees the initial one. So there’s no reserve of memory for either compression or decompression, and no docs stating that.

Please note that this comment applies only to block compression, which is only used during disk I/O when WiredTiger reads and writes blocks, thus not available for Percona Memory Engine.

Q: There is no compression of data in this engine?

A: The Percona Memory Engine uses only prefix compression for indexes. Theoretically, it can use other types of compression: dictionary and Huffman (but they both disabled in MongoDB).

Q: With all the data in memory, is there much benefit to having indexes on the data?

A: Yes, because with index access you will read less data. While reading from memory is much faster than from disk, it is faster to read just few rows from memory instead of scanning millions.

Q: Our db is 70g. Will we need 70g memory to use Percona In-Memory?
Q: How much memory should be allocated for 70g db size?

A: What storage engine do you use? How do you calculate size? If this is WiredTiger and you count the space it allocates, answer is “yes, you need 70G RAM to use Percona Memory Engine.”

Q: What is the difference in size of data between WiredTiger on disks versus WiredTiger In-Memory?

A: There is no difference: the size is same. Please note that WiredTiger (on which the Percona Memory Engine is based) itself can additionally allocate up to 50% of the amount specified in the --inMemorySize option. You can check db.serverStatus().inMemory.cache to find out how much of the specified memory is used for storing your data. "bytes currently in the cache"  shows the total number of bytes occupied by the physical representation of all MongoDB’s databases, and "maximum bytes configured"  shows what is passed in option --inMemorySize. The difference between the two can be used to calculate the amount of memory in bytes available.

Q: What is the way to convert data from disk to In-Memory? Using mongodump and rebuild the indexes?

A: Yes

Q: An enhancement request is to enable regular and In-Memory engines on the same MongoDB instance.

A: This is a MongoDB limitation, but noted and reported for Percona at https://jira.percona.com/browse/PSMDB-88.

Share this post

Comments (2)

  • Mark Garrett Reply

    Is there a suggested way to do authentication for inMemory instances since the system.users collection dies on restart?

    December 5, 2016 at 6:13 pm
    • Sveta Smirnova Reply

      Hi Mark,

      you can only workaround this limitation. Possible solutions are:

      – Use SSH tunnel for all connections to mongod instance
      – Populate system.users collection on startup using custom script
      – More safe modification of the previous suggestion: use LDAP authentication and populate system.users collection with LDAP users on startup

      December 7, 2016 at 2:50 pm

Leave a Reply