4. Collection and Index Options¶
4.1. Collection Options¶
In Percona TokuMX, all collection data is stored in Fractal Tree indexes. Documents are stored in a primaryKey
whose index is, by default, {_id: 1}
and the value stored is the BSON document. Secondary indexes store the user-specified key and use the primaryKey
to reference the full document.
The syntax for creating a new collection is unchanged:
db.createCollection('foo')
db.createCollection()
will create a collection using all of our default values. However, collections and indexes in Percona TokuMX support several new options to control storage on disk.
These options can be mixed in the options BSON, for example:
db.createCollection('foo', {compression: 'quicklz',
readPageSize: '16KB',
primaryKey: {ts: 1, _id: 1}})
In other drivers, these options can be set by adding parameters to the create command, for example in Java:
DBObject cmd = new BasicDBObject();
cmd.put("create", "foo");
cmd.put("compression", "quicklz");
cmd.put("pageSize", 16*1024);
DBObject primaryKeyObj = new BasicDBObject();
primaryKeyObj.put("ts", 1);
primaryKeyObj.put("_id", 1);
cmd.put("primaryKey", primaryKeyObj);
CommandResult result = db.command(cmd);
All collection options can be specified at create time, and index-type options affect the primaryKey
index. Those same index-type options can also be used on secondary indexes when they are created, see Index Options.
-
option
primaryKey
¶ Default Value: {_id: 1}
Supported since 1.4.0
The primary key used to store documents, and used by secondary indexes to reference those documents. This is always a clustering
index. See What’s new in TokuMX 1.4, Part 1: Primary keys for more information.
Setting the primary key can be useful when you know you want a clustering index but don’t want to pay for the storage of an additional clustering index on _id
. You may also want to set the primary key if you want to use a partitioned
collection.
The primary key must end in {_id: 1}
.
Example:
db.createCollection('foo', {primaryKey: {ts: 1, _id: 1}})
-
option
partitioned
¶ Default Value: false
Supported since 1.5.0
Specifies that this collection will be partitioned, according to the primaryKey
. See Partitioned Collections for more details.
Example:
db.createCollection('foo', {partitioned: true
primaryKey: {ts: 1, _id: 1}})
-
variable
compression
¶
Specifies the index option compression
for the primaryKey
.
Example:
db.createCollection('foo', {compression: 'quicklz'})
-
variable
pageSize
¶
Specifies the index option pageSize
for the primaryKey
.
Example:
db.createCollection('foo', {pageSize: '8MB'})
-
variable
readPageSize
¶
Specifies the index option readPageSize
for the primaryKey
.
Example:
db.createCollection('foo', {readPageSize: '64KB'})
-
option
fanout
¶
Specifies the index option fanout for the primaryKey
.
Example:
db.createCollection('foo', {fanout: 64})
4.2. Index Options¶
Collection indexes are also stored in Fractal Tree indexes. Secondary indexes store the user-specified key and use the primaryKey
to reference the full document.
The syntax for creating a new index is unchanged:
db.foo.ensureIndex({x: 1})
db.collection.ensureIndex()
will create an index using all of our default values. Indexes in Percona TokuMX support several new options to control storage on disk.
These options can be mixed in the options BSON, for example:
db.foo.ensureIndex({x: 1}, {compression: 'quicklz',
readPageSize: '16KB'})
To control the options for the primaryKey
index, specify the below options as Collection Options.
-
option
clustering
¶ Default Value: false
If true, denotes that this index will be clustering.
Secondary indexes in basic MongoDB store the indexed fields and a pointer to the document. When a query is run using the secondary index, MongoDB uses the secondary index to find the pointer, or pointers, then uses those pointers in the secondary index to lookup the document from the heap-based data store.
MongoDB supports “covered” indexes. A covered index includes fields at the beginning of the index key to support lookups and range scans; the remainder of the key is defined for the values that need to be retrieved as part of the lookup. For example, an index on {x: 1, a: 1}
“covers” the query db.foo.find({x: 30}, {x: 1, a: 1})
. Adding new fields to documents means that they are no longer covered by existing indexes, so you’ll need to drop and recreate them or build new indexes entirely, if you want to cover added fields.
Secondary indexes in Percona TokuMX store the indexed fields and a copy of the primaryKey
, which is used—instead of the pointer in basic MongoDB to look up the full document. Percona TokuMX also supports this same covering technique.
In addition, Percona TokuMX offers clustering indexes. A clustering index, rather than storing a reference to the primaryKey
, instead stores another copy of the full document. This saves the I/O required to find the full document in the primaryKey
index. Essentially, a clustering index “covers” all queries that use that index.
Important
Clustering indexes require storing a full additional copy of the document itself, which is generally not a problem when documents are compressible. Clustering indexes must also be maintained for any update on the collection, not just updates that affect the index’s key. However, the I/O savings for queries that can use a clustering index is dramatic.
Example:
db.foo.ensureIndex({x: 1}, {clustering: true})
-
option
compression
Default Value: zlib
Calues: zlib
,quicklz
,lzma
,none
.
The compression method used on fractal tree nodes on disk.
Compression is generally a tradeoff between CPU cost and size on disk. Some compressors use more CPU to get a better compression factor. Others sacrifice compression factor to get faster compression and decompression speed.
The default, zlib
, is a balanced compression algorithm good for nearly all workloads. The lzma
compression algorithm is very expensive but can get a better compression ratio for most data, and is therefore better suited to archival applications. The quicklz
compression algorithm is typically faster than zlib
but doesn’t achieve as good a compression ratio for most data.
Example:
db.createCollection('foo', {compression: 'quicklz'})
-
option
pageSize
Default Value: ‘4MB’
The block size used to write out fractal tree nodes to disk, before compression.
Page size represents the size of the nodes in the Fractal Tree index (both internal nodes and leaf nodes). Our internal nodes store the pivots for the node, and a message buffer for each path down the tree, for fanout buffers per node.
Pages can be larger than this defined size if a document is larger than the size, this will not cause an error.
Since all nodes are compressed before writing they are generally much smaller when written to disk.
Example:
db.foo.ensureIndex({x: 1}, {pageSize: '8MB'})
-
option
readPageSize
Default Value: ‘64KB’
The block size used to read portions of a fractal tree node from disk for query, before compression.
Read page size represents the smallest portion of a leaf node that can be read from disk. A leaf node in a Fractal Tree index is actually a set of pivots and a series of “basement nodes”, each of size at most readPageSize
, when uncompressed. These “basement nodes” can be read in and cached in-memory independently of other basement nodes.
Example:
db.foo.ensureIndex({x: 1}, {readPageSize: '64KB'})
-
option
fanout
Default Value: 16 Minimum: 4
Supported since: 1.4.0
The maximum fanout of the fractal tree. A higher fanout favors read performance, a lower fanout favors write performance.
Example:
db.foo.ensureIndex({x: 1}, {fanout: 64})
4.3. Modifying Index Options¶
Index options (compression
, pageSize
, etc.) can be changed after the index is created.
This modifies the header so that all tree nodes written out after this point will use the new options; nodes that aren’t changed won’t see the new options until they are. This makes the modification instantaneous, but means the effect will be delayed. You can later force all nodes to be rewritten by optimizing the indexes with reIndex
.
The interface for modifying index options is db.collection.reIndex(index, options)
.
If index
is not present, all indexes for that collection are affected. If index
is present, it can be an index name as a string (e.g.``’a_1_b_1’) or a key pattern as an object (e.g.``{a: 1, b: 1}
), or the string '*'
to indicate all indexes.
If options
is not present or is the empty object {}
, reIndex
runs a “hot optimize” on the specified index(es). This causes all internal nodes to be flushed, and causes all tree nodes to be rewritten. If options
is a non-empty object, it may have the fields compression
, pageSize
, readPageSize
, and fanout
, and it alters those attributes instead of running an optimize.
Examples: Optimize all indexes:
db.collection.reIndex()
// or
db.collection.reIndex('*')
Optimize just the _id
index:
db.collection.reIndex('_id_')
// or
db.collection.reIndex({_id: 1})
Change the compression method of all indexes to lzma
:
db.collection.reIndex('*', {compression: 'lzma'})
Change the compression method of the all but the _id index to ‘quicklz’ and the _id index to ‘lzma’, and force just the _id index to be converted to ‘lzma’ immediately:
db.collection.reIndex('*', {compression: 'quicklz'})
db.collection.reIndex({_id: 1}, {compression: 'lzma'})
db.collection.reIndex({_id: 1})
The reIndex
command can be used directly by drivers, by running it as a normal command on the database containing the target collection, with a command object of the form
{reIndex: "collection_name", [index: [indexName|keyPattern|"*"]], [options: obj]}
where index
and options
are optional parameters, and if present, have the same meaning as above.
4.4. Caveats¶
When creating a unique index, it is possible to add the option dropDups
. This is an arbitrarily destructive operation, so it was not implemented in Percona TokuMX. Even if it were implemented, there is no way for Percona TokuMX to be sure that it dropped the same documents that MongoDB would have dropped.
Therefore, Percona TokuMX ignores the dropDups
option. If there are duplicate entries while building a unique index, the index build will fail.
Contact Us
For free technical help, visit the Percona Community Forum.To report bugs or submit feature requests, open a JIRA ticket.
For paid support and managed or professional services, contact Percona Sales.