No announcement yet.

Performance of Large SELECT

  • Filter
  • Time
  • Show
Clear All
new posts

  • Performance of Large SELECT

    I have a database (MySQL 5.1 at Ubuntu 10.10) with some 15000 tables each with ~1 000 000 rows on average. Each table has 6 DOUBLE columns. The storage engine is MyISAM. I have a C++ application that loads the data one table at a time and performs some calculations.
    The way I retrieve the data from the database is simply by: SELECT * FROM table ORDER BY timestamp; (timestamp is the first column (DOUBLE) marked as UNIQUE)
    By far most of the time is spent in loading and fetching. It takes ~15s to load and fetch all the rows in one table (tried with the native C API, C++ Connector and MySQL Query Browser).
    When I load the same dataset from disk (plain text file) using fstream the same operation takes only ~4s.

    Is it possible for MySQL to get anywhere near this value?


    P.S. No optimisations (server/tables) attempted so far.
    P.S. Tables.pdf attached to this post contains some statistics from PROCEDURE ANALYSE.

  • #2
    One observation: when I SELECT only one field from the table, DB and filesystem perform equally well (FS is ~0.5s faster), when two fields are SELECTED the time doubles etc. It almost looks line DB reads field-by-field...

    The text file is read line-by-line, each line is then split into fields and the fields are the pushed into vectors.

    The database version does something like this:

    MYSQL_ROW row;mysql_query(conn, "SELECT timestamp FROM raw.dv113 ORDER BY timestamp;");MYSQL_RES* rows1 = mysql_store_result(conn);while ((row = mysql_fetch_row(rows1))) {vector v;v.push_back(safeStrtod(row[0]));v.push_back(safeStrtod(row[1]));v.push_back(safeStrtod(row[2]));v.push_back(safeStrtod(row[3]));v.push_back(safeStrtod(row[4]));v.push_back(safeStrtod(row[5]));}mysql_free_result(rows1);