TokuDB and PerconaFT database file management part 1 of 2

 | September 27, 2016 |  Posted In: TokuDB

In this blog post, we’ll look at TokuDB and PerconaFT database file management.

The TokuDB/PerconaFT file set consists of many different files that serve different purposes. These blog posts list the different types of TokuDB and PerconaFT files, explain their purpose, show where they are located, and describe how to move them around.

Peter Zaitsev blogged on the same topic a few years ago. By the time you read back through Peter’s post and reach the end of this series, you should have some ideas to help you to manage your data set more efficiently.

TokuDB and PerconaFT files and file types:

  • tokudb.environment
    • This file is the root of the PerconaFT file set and contains various bits of metadata about the system, such as creation times, current file format versions, etc.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • tokudb.rollback
    • Every transaction within PerconaFT maintains its own transaction rollback log. These logs are stored together within a single PerconaFT dictionary file and take up space within the PerconaFT cachetable (just like any other PerconaFT dictionary).
    • The transaction rollback logs will “undo” any changes made by a transaction if the transaction is explicitly rolled back, or rolled back via recovery as a result of an uncommitted transaction when a crash occurs.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • tokudb.directory
    • PerconaFT maintains a mapping of a dictionary name (example: sbtest.sbtest1.main) to an internal file name (example: _sbtest_sbtest1_main_xx_x_xx.tokudb). This mapping is stored within this single PerconaFT dictionary file and takes up space within the PerconaFT cachetable just like any other PerconaFT dictionary.
    • PerconaFT will create/expect this file in the directory specified by the MySQL datadir.
  • Dictionary files
    • TokuDB dictionary (data) files store actual user data. For each MySQL table there will be:
      • One “status” dictionary that contains metadata about the table.
      • One “main” dictionary that stores the full primary key (an imaginary key is used if one was not explicitly specified) and full row data.
      • One “key” dictionary for each additional key/index on the table.
    • These are typically named: _<database>_<table>_<key>_<internal_txn_id>.tokudb
    • PerconaFT creates/expects these files in the directory specified by tokudb_data_dir if set, otherwise the MySQL datadir is used.
  • Recovery log files
    • The PerconaFT recovery log records every operation that modifies a PerconaFT dictionary. Periodically, the system will take a snapshot of the system called a checkpoint. This checkpoint ensures that the modifications recorded within the PerconaFT recovery logs have been applied to the appropriate dictionary files up to a known point in time and synced to disk.
    • These files use a rolling naming convention: log<log_file_number>.tokulog<log_file_format_version>
    • PerconaFT creates/expects these files in the directory specified by tokudb_log_dir if set, otherwise the MySQL datadir is used.
    • PerconaFT does not track which log files should or shouldn’t be present. On startup, it discovers the logs in the log directory and replays them in order. If the wrong logs are present, recovery aborts and may damage the dictionaries.
  • Temporary files
    • PerconaFT might need to create some temporary files in order to perform some operations. When the bulk loader is active, these temporary files might grow to be quite large.
    • As different operations start and finish, the files will come and go.
    • No temporary files are left behind after a clean shutdown.
    • PerconaFT creates/expects these files in the directory specified by tokudb_tmp_dir if set. If not, the tokudb_data_dir is used if set, otherwise the MySQL datadir is used.
  • Lock files
    • PerconaFT uses lock files to prevent multiple processes from accessing/writing to the files in the assorted PerconaFT functionality areas. Each lock file will be in the same directory as the file(s) that it is protecting. These empty files are only used as semaphores across processes. They are safe to delete/ignore as long as no server instances are currently running and using the data set.
    • __tokudb_lock_dont_delete_me_environment
    • __tokudb_lock_dont_delete_me_recovery
    • __tokudb_lock_dont_delete_me_logs
    • __tokudb_lock_dont_delete_me_data
    • __tokudb_lock_dont_delete_me_temp
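
The location rules and naming conventions in the list above can be sketched in a few lines of Python. This is only an illustration, not PerconaFT code: the function names resolve_dirs and classify are hypothetical, the assumption that the log file number and format version are purely numeric is mine, and anything that does not match a known pattern (including temporary files, whose names are not described here) is simply reported as "other".

```python
import re
from pathlib import PurePath

# Names taken from the list above.
ROOT_FILES = {"tokudb.environment", "tokudb.rollback", "tokudb.directory"}
LOCK_PREFIX = "__tokudb_lock_dont_delete_me_"
# Assumption: <log_file_number> and <log_file_format_version> are digits only.
LOG_PATTERN = re.compile(r"^log(\d+)\.tokulog(\d+)$")


def resolve_dirs(datadir, tokudb_data_dir=None, tokudb_log_dir=None,
                 tokudb_tmp_dir=None):
    """Return the directory in which each file type is expected.

    Mirrors the fallback rules described above: unset (None or empty)
    settings fall back to the next location in the chain.
    """
    data_dir = tokudb_data_dir or datadir
    return {
        "root": datadir,                          # environment, rollback, directory
        "dictionary": data_dir,                   # tokudb_data_dir, else datadir
        "recovery_log": tokudb_log_dir or datadir,
        "temporary": tokudb_tmp_dir or data_dir,  # tmp, else data, else datadir
    }


def classify(file_name):
    """Guess the file type from the file name alone."""
    name = PurePath(file_name).name
    if name in ROOT_FILES:
        return "root"
    if name.startswith(LOCK_PREFIX):
        return "lock"
    if LOG_PATTERN.match(name):
        return "recovery_log"
    if name.endswith(".tokudb"):
        return "dictionary"
    return "other"


if __name__ == "__main__":
    print(resolve_dirs("/var/lib/mysql", tokudb_data_dir="/data/tokudb"))
    for example in ("tokudb.directory", "_sbtest_sbtest1_main_xx_x_xx.tokudb"):
        print(example, "->", classify(example))
```

Classifying by name only is a deliberate simplification: it says nothing about whether a file is valid or still referenced by tokudb.directory.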

PerconaFT is extremely pedantic about validating its data set. If a file is missing or cannot be found, or seems to contain nonsensical data, it will assert, abort, or fail to start. It does this not to annoy you, but to protect you from doing any further damage to your data.
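
Because a missing file can stop the server from starting, it can be useful to take a quick inventory before touching anything. The sketch below is a hypothetical helper (the name preflight is mine, and this is not something PerconaFT ships): it only checks that the three root files exist in the given datadir and lists any lock files found there. It does not validate file contents, and it only looks in the one directory you pass in.

```python
from pathlib import Path

# File names taken from the descriptions earlier in this post.
ROOT_FILES = ("tokudb.environment", "tokudb.rollback", "tokudb.directory")
LOCK_PREFIX = "__tokudb_lock_dont_delete_me_"


def preflight(datadir):
    """Report missing root files and any lock files present in datadir."""
    base = Path(datadir)
    missing = [name for name in ROOT_FILES if not (base / name).is_file()]
    locks = sorted(p.name for p in base.glob(LOCK_PREFIX + "*"))
    return {"missing_root_files": missing, "lock_files": locks}


if __name__ == "__main__":
    import sys

    # Usage (path is an example): python preflight.py /var/lib/mysql
    report = preflight(sys.argv[1] if len(sys.argv) > 1 else ".")
    print("missing root files:", report["missing_root_files"] or "none")
    print("lock files present:", report["lock_files"] or "none")
```

Lock files showing up in the report are not a problem in themselves. As noted above, they are only safe to remove when no server instance is running against the data set.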

Look out for part 2 of this series for information on how to move your log, dictionary, and temp files around correctly.

George O. Lorch III

George joined the Percona development team in April 2012. George has over 20 years of experience in software support, development, architecture and project management. Prior to joining Percona, George was focused on Windows based enterprise application server development and network protocol classification and optimization with heavy doses of database schema design, architecture and tuning.

2 Comments

  • George,

    I wonder why those lock files are needed, and especially why there are so many of them. InnoDB seems to achieve the same thing by simply setting an advisory lock on the file before doing anything with it, which seems much cleaner and works well.

    • That is a good question, Peter. This behavior predates the commit history available in the GitHub repo, which seems to begin around 2010. The commit history does not say much about why these were needed to begin with, or why they were split out into so many. It is yet another good reason to describe your changes in your commit messages rather than just referring to some outside system 😉
