TokuDB and PerconaFT database file management part 1 of 2

In this blog post, we’ll look at TokuDB and PerconaFT database file management.

The TokuDB/PerconaFT file set consists of many different files that all serve various purposes. These blog posts list the different types of TokuDB and PerconaFT files, explain their purpose, and show where they are located and how to move them around.

Peter Zaitsev blogged on the same topic a few years ago. By the time you read back through Peter’s post and reach the end of this series, you should have some ideas to help you to manage your data set more efficiently.

TokuDB and PerconaFT files and file types:

  • tokudb.environment
    • This file is the root of the PerconaFT file set and contains various bits of metadata about the system, such as creation times, current file format versions, etc.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • tokudb.rollback
    • Every transaction within PerconaFT maintains its own transaction rollback log. These logs are stored together within a single PerconaFT dictionary file and take up space within the PerconaFT cachetable (just like any other PerconaFT dictionary).
    • The transaction rollback logs will “undo” any changes made by a transaction if the transaction is explicitly rolled back, or rolled back via recovery as a result of an uncommitted transaction when a crash occurs.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • tokudb.directory
    • PerconaFT maintains a mapping of a dictionary name (example: sbtest.sbtest1.main) to an internal file name (example: _sbtest_sbtest1_main_xx_x_xx.tokudb). This mapping is stored within this single PerconaFT dictionary file and takes up space within the PerconaFT cachetable just like any other PerconaFT dictionary.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • Dictionary files
    • TokuDB dictionary (data) files store actual user data. For each MySQL table there will be:
      • One “status” dictionary that contains metadata about the table.
      • One “main” dictionary that stores the full primary key (an imaginary key is used if one was not explicitly specified) and full row data.
      • One “key” dictionary for each additional key/index on the table.
    • These are typically named: _<database>_<table>_<key>_<internal_txn_id>.tokudb
    • PerconaFT creates/expects these files in the directory specified by tokudb_data_dir if set, otherwise the MySQL datadir is used.
  • Recovery log files
    • The PerconaFT recovery log records every operation that modifies a PerconaFT dictionary. Periodically, the system will take a snapshot of the system called a checkpoint. This checkpoint ensures that the modifications recorded within the PerconaFT recovery logs have been applied to the appropriate dictionary files up to a known point in time and synced to disk.
    • These files follow a rolling naming convention of the form: log<log_file_number>.tokulog<log_file_format_version>
    • PerconaFT creates/expects these files in the directory specified by tokudb_log_dir if set, otherwise the MySQL datadir is used.
    • PerconaFT does not track which log files should or shouldn’t be present. Upon startup, it discovers the logs in the log dir and replays them in order. If the wrong logs are present, the recovery aborts and possibly damages the dictionaries.
  • Temporary files
    • PerconaFT might need to create some temporary files in order to perform some operations. When the bulk loader is active, these temporary files might grow to be quite large.
    • As different operations start and finish, the files will come and go.
    • There are no temporary files left behind upon a clean shutdown.
    • PerconaFT creates/expects these files in the directory specified by tokudb_tmp_dir if set. If not, the tokudb_data_dir is used if set, otherwise the MySQL datadir is used.
  • Lock files
    • PerconaFT uses lock files to prevent multiple processes from accessing/writing to the files in the assorted PerconaFT functionality areas. Each lock file will be in the same directory as the file(s) that it is protecting. These empty files are only used as semaphores across processes. They are safe to delete/ignore as long as no server instances are currently running and using the data set.
    • __tokudb_lock_dont_delete_me_environment
    • __tokudb_lock_dont_delete_me_recovery
    • __tokudb_lock_dont_delete_me_logs
    • __tokudb_lock_dont_delete_me_data
    • __tokudb_lock_dont_delete_me_temp
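
The directory fallback rules in the list above can be summarized in a short Python sketch. The function name resolve_tokudb_dirs and the keys of the returned dictionary are hypothetical and chosen only for illustration; the three tokudb_*_dir settings and the MySQL datadir are the only inputs taken from the text. The sketch does not read a live server configuration, it only restates the lookup order.

```python
def resolve_tokudb_dirs(datadir, tokudb_data_dir=None,
                        tokudb_log_dir=None, tokudb_tmp_dir=None):
    """Return the directory each group of files is expected in.

    Hypothetical helper: it mirrors the fallback order described in
    the list above, treating an unset option as None or "".
    """
    # Dictionary files: tokudb_data_dir if set, otherwise datadir.
    data_dir = tokudb_data_dir or datadir
    # Recovery logs: tokudb_log_dir if set, otherwise datadir.
    log_dir = tokudb_log_dir or datadir
    # Temporary files: tokudb_tmp_dir, then tokudb_data_dir, then datadir.
    tmp_dir = tokudb_tmp_dir or tokudb_data_dir or datadir
    return {
        # tokudb.environment and tokudb.rollback always live in datadir.
        "environment": datadir,
        "dictionaries": data_dir,
        "recovery_logs": log_dir,
        "temporary": tmp_dir,
    }


if __name__ == "__main__":
    # Example paths are made up for illustration.
    dirs = resolve_tokudb_dirs("/var/lib/mysql", tokudb_data_dir="/data/tokudb")
    for group, path in dirs.items():
        print(f"{group}: {path}")
```

Note that with only tokudb_data_dir set, temporary files follow the dictionary files while the recovery logs stay in the datadir.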

PerconaFT is extremely pedantic about validating its data set. If a file goes missing or cannot be found, or seems to contain nonsensical data, it will assert, abort or fail to start. It does this not to annoy you, but to try to protect you from doing any further damage to your data.
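
Because a missing or misplaced file can keep the server from starting, it can help to take an inventory of a directory before touching anything. The following Python sketch groups file names using only the naming conventions listed above. The function names classify and inventory are hypothetical. The sketch also assumes the dictionary-name mapping file is called tokudb.directory, and that anything it does not recognize (including temporary files, whose naming is not described here) belongs under "other". It inspects names only and never opens or modifies a file.

```python
import re
from collections import Counter

# Files expected in the MySQL datadir. "tokudb.directory" is assumed
# to be the name of the dictionary-name mapping file.
SYSTEM_FILES = {"tokudb.environment", "tokudb.rollback", "tokudb.directory"}

# Prefix shared by all lock files listed above.
LOCK_PREFIX = "__tokudb_lock_dont_delete_me_"

# log<log_file_number>.tokulog<log_file_format_version>
RECOVERY_LOG = re.compile(r"^log(\d+)\.tokulog(\d+)$")


def classify(filename):
    """Return a coarse file type based on the name alone."""
    if filename in SYSTEM_FILES:
        return "system"
    if filename.startswith(LOCK_PREFIX):
        return "lock"
    if RECOVERY_LOG.match(filename):
        return "recovery_log"
    # Dictionary files are typically named _<database>_<table>_..._.tokudb
    if filename.startswith("_") and filename.endswith(".tokudb"):
        return "dictionary"
    return "other"


def inventory(filenames):
    """Count how many files of each type appear in the given names."""
    return Counter(classify(name) for name in filenames)


if __name__ == "__main__":
    import os
    import sys

    # Pass a directory on the command line; defaults to the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for kind, count in sorted(inventory(os.listdir(target)).items()):
        print(f"{kind}: {count}")
```

Comparing the counts before and after any file operation is a cheap way to notice that a log or dictionary file was left behind.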

Look out for part 2 of this series for information on how to move your log, dictionary, and temp files around correctly.

Comments (2)

  • Peter Zaitsev


    I wonder why are those lockfiles needed especially so many of them. Innodb seems to be able to achieve the same by simply setting the advisory lock on the file before doing anything with it and it seems to be much more clean and works well.

    September 27, 2016 at 10:03 pm
    • George O. Lorch III

      That is a good question Peter. This behavior predates the commit history available within the github repo which seems to begin around 2010. There is not much information provided in the commit history as to why these were needed to begin with, and why the split out to be so many. It is yet another great case to make sure you describe your changes in your commit messages rather than just referring to some outside system 😉

      September 27, 2016 at 10:39 pm
