GET 24/7 LIVE HELP NOW

Announcement

Announcement Module
Collapse
No announcement yet.

Bad performance with simple join

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • Bad performance with simple join

    Hi all,

    I am having issues with the following query, Its very slow (about 3 minutes to complete for just 2 days).

    Could somebody point where is the problem and how I can re-write the query so its much faster?
    I can't see why its going so slow!!

    Note: All the tables queried have the same format, as all of them are statistics from different clusters for the same application.

    The server is a large server: Solaris 10, 8 AMD dual core CPU and 32 GB RAM. Two raid SCSI disks. But during the query I can see just 5% CPU usage!!


    explain extended
    SELECT
    UPPER(CAST(date_format(`cluster2-transactions_per_second`.Da te_Time, '%Y-%m-%d %H:%i') as datetime)) as "Date_Time",
    SUM(DISTINCT (`cluster2-transactions_per_second`.`mail_transactions` )) AS cluster2_mo_attempts,
    SUM(DISTINCT (`cluster3-transactions_per_second`.`mail_transactions` )) AS cluster3_mo_attempts,
    SUM(DISTINCT (`cluster4-transactions_per_second`.`mail_transactions` )) AS cluster4_mo_attempts,
    SUM(DISTINCT (`cluster5-transactions_per_second`.`mail_transactions` )) AS cluster5_mo_attempts,
    SUM(DISTINCT (`cluster6-transactions_per_second`.`mail_transactions` )) AS cluster6_mo_attempts,
    SUM(DISTINCT (`cluster7-transactions_per_second`.`mail_transactions` )) AS cluster7_mo_attempts
    FROM
    `cluster2-transactions_per_second` force index (date_index),
    `cluster3-transactions_per_second` force index (date_index),
    `cluster4-transactions_per_second` force index (date_index),
    `cluster5-transactions_per_second` force index (date_index),
    `cluster6-transactions_per_second` force index (date_index),
    `cluster7-transactions_per_second` force index (date_index)
    WHERE
    `cluster2-transactions_per_second`.Date_Time=`cluster3-trans actions_per_second`.Date_Time AND
    `cluster2-transactions_per_second`.Date_Time=`cluster4-trans actions_per_second`.Date_Time AND
    `cluster2-transactions_per_second`.Date_Time=`cluster5-trans actions_per_second`.Date_Time AND
    `cluster2-transactions_per_second`.Date_Time=`cluster6-trans actions_per_second`.Date_Time AND
    `cluster2-transactions_per_second`.Date_Time=`cluster7-trans actions_per_second`.Date_Time AND
    `cluster2-transactions_per_second`.Date_Time < '2008-10-01 23:55' AND
    `cluster2-transactions_per_second`.Date_Time > '2008-03-19 23:55'
    group by `cluster2-transactions_per_second`.Date_Time order by `cluster2-transactions_per_second`.Date_Time
    \G;



    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: cluster2-transactions_per_second
    type: range
    possible_keys: date_index
    key: date_index
    key_len: 22
    ref: NULL
    rows: 1993
    Extra: Using where
    *************************** 2. row ***************************
    id: 1
    select_type: SIMPLE
    table: cluster3-transactions_per_second
    type: ref
    possible_keys: date_index
    key: date_index
    key_len: 22
    ref: load_test_db.cluster2-transactions_per_second.Date_Time
    rows: 19
    Extra:
    *************************** 3. row ***************************
    id: 1
    select_type: SIMPLE
    table: cluster4-transactions_per_second
    type: ref
    possible_keys: date_index
    key: date_index
    key_len: 22
    ref: load_test_db.cluster2-transactions_per_second.Date_Time
    rows: 19
    Extra:
    *************************** 4. row ***************************
    id: 1
    select_type: SIMPLE
    table: cluster5-transactions_per_second
    type: ref
    possible_keys: date_index
    key: date_index
    key_len: 22
    ref: load_test_db.cluster4-transactions_per_second.Date_Time
    rows: 19
    Extra: Using where
    *************************** 5. row ***************************
    id: 1
    select_type: SIMPLE
    table: cluster6-transactions_per_second
    type: ref
    possible_keys: date_index
    key: date_index
    key_len: 22
    ref: load_test_db.cluster5-transactions_per_second.Date_Time
    rows: 19
    Extra: Using where
    *************************** 6. row ***************************
    id: 1
    select_type: SIMPLE
    table: cluster7-transactions_per_second
    type: ref
    possible_keys: date_index
    key: date_index
    key_len: 22
    ref: load_test_db.cluster2-transactions_per_second.Date_Time
    rows: 19
    Extra:
    6 rows in set, 1 warning (0.00 sec)



    mysql> describe `cluster2-transactions_per_second`;
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Date_Time | varchar(20) | NO | MUL | 0000-00-00 00:00:00 | |
    | HostName | varchar(255) | NO | MUL | | |
    | mail_transactions | double | YES | | NULL | |
    | CPU_usage | double | YES | | NULL | |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    4 rows in set (0.00 sec)



    mysql> describe `cluster3-transactions_per_second`;
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Date_Time | varchar(20) | NO | MUL | 0000-00-00 00:00:00 | |
    | HostName | varchar(255) | NO | MUL | | |
    | mail_transactions | double | YES | | NULL | |
    | CPU_usage | double | YES | | NULL | |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    4 rows in set (0.00 sec)

    mysql> describe `cluster4-transactions_per_second`;
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Date_Time | varchar(20) | NO | MUL | 0000-00-00 00:00:00 | |
    | HostName | varchar(255) | NO | MUL | | |
    | mail_transactions | double | YES | | NULL | |
    | CPU_usage | double | YES | | NULL | |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    4 rows in set (0.00 sec)

    mysql> describe `cluster5-transactions_per_second`;
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Date_Time | varchar(20) | NO | MUL | 0000-00-00 00:00:00 | |
    | HostName | varchar(255) | NO | MUL | | |
    | mail_transactions | double | YES | | NULL | |
    | CPU_usage | double | YES | | NULL | |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    4 rows in set (0.00 sec)

    mysql> describe `cluster6-transactions_per_second`;
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Date_Time | varchar(20) | NO | MUL | 0000-00-00 00:00:00 | |
    | HostName | varchar(255) | NO | MUL | | |
    | mail_transactions | double | YES | | NULL | |
    | CPU_usage | double | YES | | NULL | |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    4 rows in set (0.00 sec)

    mysql> describe `cluster7-transactions_per_second`;
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Field | Type | Null | Key | Default | Extra |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    | Date_Time | varchar(20) | NO | MUL | 0000-00-00 00:00:00 | |
    | HostName | varchar(255) | NO | MUL | | |
    | mail_transactions | double | YES | | NULL | |
    | CPU_usage | double | YES | | NULL | |
    +-------------------+--------------+------+-----+----------- ----------+-------+
    4 rows in set (0.00 sec)

    mysql> select count(*) from `cluster2-transactions_per_second`;
    +----------+
    | count(*) |
    +----------+
    | 183382 |
    +----------+


    mysql> select count(*) from `cluster3-transactions_per_second`;
    +----------+
    | count(*) |
    +----------+
    | 183382 |
    +----------+

    mysql> show create table `cluster2-transactions_per_second`;
    +-----------------------------------------------------+----- ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ----------------------------------------------------------+
    | Table | Create Table |
    +-----------------------------------------------------+----- ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ----------------------------------------------------------+
    | cluster2-transactions_per_second | CREATE TABLE `cluster2-transactions_per_second`(
    `Date_Time` varchar(20) NOT NULL default '0000-00-00 00:00:00',
    `HostName` varchar(255) NOT NULL default '',
    `mail_transactions` double default NULL,
    `CPU_usage` double default NULL,
    KEY `date_index` (`Date_Time`),
    KEY `hostname_index` (`HostName`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
    +-----------------------------------------------------+----- ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------------------ ----------------------------------------------------------+
    1 row in set (0.00 sec)

    mysql> show variables;
    +---------------------------------+------------------------- ---------------+
    | Variable_name | Value |
    +---------------------------------+------------------------- ---------------+
    | auto_increment_increment | 1 |
    | auto_increment_offset | 1 |
    | automatic_sp_privileges | ON |
    | back_log | 50 |
    | basedir | /opt/mysql/mysql/ |
    | binlog_cache_size | 2097152 |
    | bulk_insert_buffer_size | 134217728 |
    | character_set_client | latin1 |
    | character_set_connection | latin1 |
    | character_set_database | latin1 |
    | character_set_filesystem | binary |
    | character_set_results | latin1 |
    | character_set_server | latin1 |
    | character_set_system | utf8 |
    | character_sets_dir | /opt/mysql/mysql/share/mysql/charsets/ |
    | collation_connection | latin1_swedish_ci |
    | collation_database | latin1_swedish_ci |
    | collation_server | latin1_swedish_ci |
    | completion_type | 0 |
    | concurrent_insert | 1 |
    | connect_timeout | 30 |
    | datadir | /var/lib/mysql/ |
    | date_format | %Y-%m-%d |
    | datetime_format | %Y-%m-%d %H:%i:%s |
    | default_week_format | 0 |
    | delay_key_write | ON |
    | delayed_insert_limit | 100 |
    | delayed_insert_timeout | 300 |
    | delayed_queue_size | 1000 |
    | div_precision_increment | 4 |
    | keep_files_on_create | OFF |
    | engine_condition_pushdown | OFF |
    | expire_logs_days | 0 |
    | flush | OFF |
    | flush_time | 0 |
    | ft_boolean_syntax | + -><()~*:""&| |
    | ft_max_word_len | 84 |
    | ft_min_word_len | 4 |
    | ft_query_expansion_limit | 20 |
    | ft_stopword_file | (built-in) |
    | group_concat_max_len | 1024 |
    | have_archive | YES |
    | have_bdb | NO |
    | have_blackhole_engine | YES |
    | have_compress | YES |
    | have_crypt | YES |
    | have_csv | YES |
    | have_dynamic_loading | YES |
    | have_example_engine | YES |
    | have_federated_engine | YES |
    | have_geometry | YES |
    | have_innodb | DISABLED |
    | have_isam | NO |
    | have_merge_engine | YES |
    | have_ndbcluster | NO |
    | have_openssl | DISABLED |
    | have_ssl | DISABLED |
    | have_query_cache | YES |
    | have_raid | NO |
    | have_rtree_keys | YES |
    | have_symlink | YES |
    | hostname | mdbserver |
    | init_connect | |
    | init_file | |
    | init_slave | |
    | innodb_additional_mem_pool_size | 16777216 |
    | innodb_autoextend_increment | 8 |
    | innodb_buffer_pool_awe_mem_mb | 0 |
    | innodb_buffer_pool_size | 2147483648 |
    | innodb_checksums | ON |
    | innodb_commit_concurrency | 0 |
    | innodb_concurrency_tickets | 500 |
    | innodb_data_file_path | ibdata1:200M:autoextend |
    | innodb_data_home_dir | |
    | innodb_adaptive_hash_index | ON |
    | innodb_doublewrite | ON |
    | innodb_fast_shutdown | 1 |
    | innodb_file_io_threads | 4 |
    | innodb_file_per_table | OFF |
    | innodb_flush_log_at_trx_commit | 1 |
    | innodb_flush_method | |
    | innodb_force_recovery | 0 |
    | innodb_lock_wait_timeout | 120 |
    | innodb_locks_unsafe_for_binlog | OFF |
    | innodb_log_arch_dir | |
    | innodb_log_archive | OFF |
    | innodb_log_buffer_size | 8388608 |
    | innodb_log_file_size | 268435456 |
    | innodb_log_files_in_group | 3 |
    | innodb_log_group_home_dir | ./ |
    | innodb_max_dirty_pages_pct | 90 |
    | innodb_max_purge_lag | 0 |
    | innodb_mirrored_log_groups | 1 |
    | innodb_open_files | 300 |
    | innodb_rollback_on_timeout | OFF |
    | innodb_support_xa | ON |
    | innodb_sync_spin_loops | 20 |
    | innodb_table_locks | ON |
    | innodb_thread_concurrency | 16 |
    | innodb_thread_sleep_delay | 10000 |
    | interactive_timeout | 28800 |
    | join_buffer_size | 1048571904 |
    | key_buffer_size | 314572800 |
    | key_cache_age_threshold | 300 |
    | key_cache_block_size | 1024 |
    | key_cache_division_limit | 100 |
    | language | /opt/mysql/mysql/share/mysql/english/ |
    | large_files_support | ON |
    | large_page_size | 0 |
    | large_pages | OFF |
    | lc_time_names | en_US |
    | license | GPL |
    | local_infile | ON |
    | locked_in_memory | OFF |
    | log | OFF |
    | log_bin | ON |
    | log_bin_trust_function_creators | OFF |
    | log_error | |
    | log_queries_not_using_indexes | OFF |
    | log_slave_updates | OFF |
    | log_slow_queries | ON |
    | log_warnings | 2 |
    | long_query_time | 2 |
    | low_priority_updates | OFF |
    | lower_case_file_system | OFF |
    | lower_case_table_names | 0 |
    | max_allowed_packet | 67107840 |
    | max_binlog_cache_size | 4294967295 |
    | max_binlog_size | 1073741824 |
    | max_connect_errors | 100000000 |
    | max_connections | 2000 |
    | max_delayed_threads | 20 |
    | max_error_count | 64 |
    | max_heap_table_size | 314572800 |
    | max_insert_delayed_threads | 20 |
    | max_join_size | 18446744073709551615 |
    | max_length_for_sort_data | 1024 |
    | max_prepared_stmt_count | 16382 |
    | max_relay_log_size | 0 |
    | max_seeks_for_key | 4294967295 |
    | max_sort_length | 1024 |
    | max_sp_recursion_depth | 0 |
    | max_tmp_tables | 32 |
    | max_user_connections | 0 |
    | max_write_lock_count | 4294967295 |
    | multi_range_count | 256 |
    | myisam_data_pointer_size | 6 |
    | myisam_max_sort_file_size | 10737418240 |
    | myisam_recover_options | DEFAULT |
    | myisam_repair_threads | 1 |
    | myisam_sort_buffer_size | 104857600 |
    | myisam_stats_method | nulls_unequal |
    | net_buffer_length | 16384 |
    | net_read_timeout | 30 |
    | net_retry_count | 10 |
    | net_write_timeout | 60 |
    | new | OFF |
    | old_passwords | OFF |
    | open_files_limit | 26106 |
    | optimizer_prune_level | 1 |
    | optimizer_search_depth | 62 |
    | pid_file | /var/lib/mysql/mdbserver.pid |
    | port | 3306 |
    | preload_buffer_size | 32768 |
    | protocol_version | 10 |
    | query_alloc_block_size | 8192 |
    | query_cache_limit | 2097152 |
    | query_cache_min_res_unit | 4096 |
    | query_cache_size | 134217728 |
    | query_cache_type | ON |
    | query_cache_wlock_invalidate | OFF |
    | query_prealloc_size | 8192 |
    | range_alloc_block_size | 2048 |
    | read_buffer_size | 524283904 |
    | read_only | OFF |
    | read_rnd_buffer_size | 209711104 |
    | relay_log | |
    | relay_log_index | |
    | relay_log_info_file | relay-log.info |
    | relay_log_purge | ON |
    | relay_log_space_limit | 0 |
    | rpl_recovery_rank | 0 |
    | secure_auth | OFF |
    | secure_file_priv | |
    | server_id | 1 |
    | skip_external_locking | ON |
    | skip_networking | OFF |
    | skip_show_database | OFF |
    | slave_compressed_protocol | OFF |
    | slave_load_tmpdir | /tmp/ |
    | slave_net_timeout | 3600 |
    | slave_skip_errors | OFF |
    | slave_transaction_retries | 10 |
    | slow_launch_time | 2 |
    | socket | /tmp/mysql.sock |
    | sort_buffer_size | 629145592 |
    | sql_big_selects | ON |
    | sql_mode | |
    | sql_notes | ON |
    | sql_warnings | OFF |
    | ssl_ca | |
    | ssl_capath | |
    | ssl_cert | |
    | ssl_cipher | |
    | ssl_key | |
    | storage_engine | MyISAM |
    | sync_binlog | 0 |
    | sync_frm | ON |
    | system_time_zone | GMT |
    | table_cache | 12048 |
    | table_lock_wait_timeout | 50 |
    | table_type | MyISAM |
    | thread_cache_size | 8 |
    | thread_concurrency | 16 |
    | thread_stack | 196608 |
    | time_format | %H:%i:%s |
    | time_zone | SYSTEM |
    | timed_mutexes | OFF |
    | tmp_table_size | 1048576000 |
    | tmpdir | /tmp |
    | transaction_alloc_block_size | 8192 |
    | transaction_prealloc_size | 4096 |
    | tx_isolation | REPEATABLE-READ |
    | updatable_views_with_limit | YES |
    | version | 5.0.52-enterprise-gpl-log |
    | version_comment | MySQL Enterprise Server (GPL) |
    | version_compile_machine | i386 |
    | version_compile_os | pc-solaris2.10 |
    | wait_timeout | 28800 |
    +---------------------------------+------------------------- ---------------+

  • #2
    Quote:


    ... 8 AMD dual core CPU ...


    Does that appear to the OS as 16 CPU's?

    MySQL executes a query in only _one_ thread.
    Which means that if you have a CPU intensive query it will only be able to use 100/16=6.25 percent cpu usage on a 16 way machine.


    But regardless of that it seems like a long time for that query so here are some thoughts:
    1.
    Why is your "Date_Time" a VARCHAR(20) and not a real DATETIME?
    Generally you should try to avoid joining on large fields so just by changing this to a DATETIME you will instead join on a 8 byte field instead of a 22 byte field as you have today.
    The difference is almost 1/3 in index size.

    2.
    Why are you using SUM(DISTINCT(mail_transactions))?
    Since I don't really know your table structure I can't really understand. But the SUM(DISTINCT()) combination sounds a bit strange, if you had been using COUNT(DISTINCT()) I could have understood it better since you are then counting the distinct values. But the SUM means that you are summarizing on a filtered value where if two mail_transactions happen to have the same value you are filtering it away.
    Do you have duplicate entries in the tables that you want to get rid of?

    3.
    Just to rule this out, do you have any IO load or swapping taking place during the execution of this query?


    BTW:
    I applauding your excellent post, a wealth of information which makes trying to figure out your problem so much easier. )

    Comment


    • #3
      My concern is about variables,
      + join_buffer_size = 1048571904 ~ 1000M
      + key_buffer_size = 314572800 ~ 300M
      + read_buffer_size = 524283904 ~ 500M
      + read_rnd_buffer_size = 209711104 ~ 200M

      IMHO, join, read, and read_rnd buffer size is too large; besides that key_buffer_size is small comparing to the MyISAM tables size.

      Comment


      • #4
        Hi all,

        Thanks for the answers. You are correct, its only using one CPU as it runs the query on a single thread. (MySQL limitation)

        The goal of the query is to sum the amount of mail messages that we have per cluster in a single table.

        Each table have the statistical records from a mail cluster, these records are written every 15 minutes.

        For every round the application writes a row per node. The date-time varchar format is a given from the application, we can't change it.

        So for instance, for cluster2:

        mysql> describe `cluster2-transactions_per_second`;
        +-------------------+--------------+------+-----+----------- ----------+-------+
        | Field | Type | Null | Key | Default | Extra |
        +-------------------+--------------+------+-----+----------- ----------+-------+
        | Date_Time | varchar(20) | NO | MUL | 0000-00-00 00:00:00 | |
        | HostName | varchar(255) | NO | MUL | | |
        | mail_transactions | double | YES | | NULL | |
        | CPU_usage | double | YES | | NULL | |
        +-------------------+--------------+------+-----+----------- ----------+-------+
        4 rows in set (0.00 sec)

        we have :
        2008-03-20 18:30:00,cluster2_node1,52.902,14.2
        2008-03-20 18:30:00,cluster2_node2,53.312,15.4
        2008-03-20 18:30:00,cluster2_node3,55.872,15.1
        2008-03-20 18:30:00,cluster2_node4,57.311,15.0
        2008-03-20 18:30:00,cluster2_node5,44.333,14.9

        and

        2008-03-20 18:45:00,cluster2_node1,53.902,15.2
        2008-03-20 18:45:00,cluster2_node2,53.312,15.4
        2008-03-20 18:45:00,cluster2_node3,54.872,15.5
        2008-03-20 18:45:00,cluster2_node4,54.311,15.1
        2008-03-20 18:45:00,cluster2_node5,49.333,14.8

        So what we want to achieve is a result like, for instance:

        Date_Time,cluster2_mo_attempts,cluster3_mo_attempt s,cluster4 _mo_attempts,cluster5_mo_attempts,cluster6_mo_atte mpts
        2008-03-20 18:45:00,264.502,153.142,353.911,453.902,533.420

        My guess is that the query is doing a date comparison per row, then doing the distinc-sum. So if there are 5 rows per date-time entry and we have 5 tables is doing a permutation of 5 for every table and row to present.

        The ideal thing will be to do the sum-distinc per table and then do the comparison, so we have only one row per date-time and table to compare.

        Could somebody tell me how to achieve this? I can't think of any subquery that will solve it.

        Comment


        • #5
          concern is about variables,
          + join_buffer_size = 1048571904 ~ 1000M
          + key_buffer_size = 314572800 ~ 300M
          + read_buffer_size = 524283904 ~ 500M
          + read_rnd_buffer_size = 209711104 ~ 200M


          Again you are correct. For testing, I change these for to huge sizes to avoid use disk on the query to check if the I/0 was the issue.

          Comment


          • #6
            1.
            Your Date_Time string is in the right format.
            It's just that the column is defined in the wrong format in the table.
            A string like '2008-10-01 23:55:00' is the ISO standard format to represent a date and time.
            And this is the format that MySQL takes/returns when you insert/select times into/from a DATETIME column.
            So you can without any changes to the application make an alteration of the table so that this column is a DATETIME and the application will not know the difference.
            Plus you can get rid of the very ugly cast(date_format()) stuff that you have in this query on this column.

            2.
            Don't use double to store decimal values. Use the NUMERIC type instead.
            In your table both the CPU and nr of transactions are created as double and double is per definition an approximation of an value.
            For example in your case the CPU load always seems to be two digits and one decimal, so type should be NUMERIC(3,1). that means the largest value can be 99.9 and the smallest value can be -99.9.

            The same goes for the transactions only this values seems to be 2 digits and 3 decimals so NUMERIC(5,3) could seem appropriate for this (unless there is a chance that it races above 99.999 transactions then a NUMERIC(6,3) is more appropriate.
            Here's an example of how I defined the tables.

            CREATE TABLE `cluster2-transactions_per_second` ( `Date_Time` datetime NOT NULL DEFAULT '0000-00-00 00:00:00', `HostName` varchar(255) NOT NULL DEFAULT '', `mail_transactions` decimal(6,3) DEFAULT NULL, `CPU_usage` decimal(3,1) DEFAULT NULL, KEY `c2_ix_datetime_trans` (`Date_Time`,`mail_transactions`), KEY `hostname_index` (`HostName`)) ENGINE=MyISAM DEFAULT CHARSET=latin1


            3.
            I recommend avoid using a subtraction "-" sign in a table name.
            That is what is forcing you to use back-ticks around the table names.
            If you want separation for readability stick to the underscore sign.
            ( I still used it in the posts here on this post since you have it in your database, but you should avoid it).

            4.
            Now I understand why you had a SUM(DISTINCT...) but the underlying problem is a cross join. And that is solved by the query below in point 5.

            But the problem with your SUM(DISTINCT ...) is that if you have these rows in your table:

            2008-03-20 18:30:00,cluster2_node1,52.902,14.22008-03-20 18:30:00,cluster2_node2,53.312,15.42008-03-20 18:30:00,cluster2_node3,55.872,15.12008-03-20 18:30:00,cluster2_node4,53.312,15.02008-03-20 18:30:00,cluster2_node5,44.333,14.9

            Notice how row 2 and 4 just happen to have the same nr of transactions.
            Then the DISTINCT means that one of these rows will be filtered away and then you will only calculate the sum of the 4 remaining rows. And suddenly you have dropped 1/5 of the transactions.
            That is why I was questioning the SUM(DISTINCT ...) in the first place.

            5.
            I rewrote your query quite a lot because your query was performing a cross join since you had more than one row with the same Date_Time value per table.
            Your DISTINCTs was covering this up but it forced the DBMS to do an enormous amount of unnecessary work since the temporary table would be so huge.

            My query is utilizing sub-selects and aliases and dropped the DISTINCT:

            SELECT ctime.Date_Time as "Date_Time", ( select SUM(c2.`mail_transactions`) FROM `cluster2-transactions_per_second` c2 WHERE ctime.Date_Time = c2.Date_Time) AS cluster2_mo_attempts, (select SUM(c3.`mail_transactions`) FROM `cluster3-transactions_per_second` c3 WHERE ctime.Date_Time = c3.Date_Time) AS cluster3_mo_attempts, (select SUM(c4.`mail_transactions`) FROM `cluster4-transactions_per_second` c4 WHERE ctime.Date_Time = c4.Date_Time) AS cluster4_mo_attempts, (select SUM(c5.`mail_transactions`) FROM `cluster5-transactions_per_second` c5 WHERE ctime.Date_Time = c5.Date_Time) AS cluster5_mo_attempts, (select SUM(c6.`mail_transactions`) FROM `cluster6-transactions_per_second` c6 WHERE ctime.Date_Time = c6.Date_Time) AS cluster6_mo_attempts, (select SUM(c7.`mail_transactions`) FROM `cluster7-transactions_per_second` c7 WHERE ctime.Date_Time = c7.Date_Time) AS cluster7_mo_attemptsFROM (SELECT DISTINCT Date_Time FROM `cluster2-transactions_per_second` c WHERE c.Date_Time < '2008-10-01 23:55:00' AND c.Date_Time > '2008-03-19 23:55:00' ) ctimeORDER BY ctime.Date_Time;

            This query first creates a temp table ctime which only contains unique time values.
            Then it performs the subselects in the SELECT part matching the current unique time value and summarizes these 5 values.
            That way you will only have 5 records at a time for the subselects to summarize.

            6.
            To tweak that little extre of speed you can add these combined index on all tables so that MySQL doesn't have to read from the table at all to execute this query.

            ALTER TABLE `cluster2-transactions_per_second` ADD INDEX c2_ix_datetime_trans(Date_Time, mail_transactions);ALTER TABLE `cluster3-transactions_per_second` ADD INDEX c3_ix_datetime_trans(Date_Time, mail_transactions);ALTER TABLE `cluster4-transactions_per_second` ADD INDEX c4_ix_datetime_trans(Date_Time, mail_transactions);ALTER TABLE `cluster5-transactions_per_second` ADD INDEX c5_ix_datetime_trans(Date_Time, mail_transactions);ALTER TABLE `cluster6-transactions_per_second` ADD INDEX c6_ix_datetime_trans(Date_Time, mail_transactions);ALTER TABLE `cluster7-transactions_per_second` ADD INDEX c7_ix_datetime_trans(Date_Time, mail_transactions);


            It got to be a long post with a lot of pointers, but the main one is the different structure of the query so that it doesn't perform a cross join as it did before.
            The other ones are recommendations.

            Please let me know what kind of speed increase you got.

            Cheers!

            Comment


            • #7
              Hi sterin,

              First of all, thanks for your wide and clear response.

              The query you proposed improved the response time from minutes to few seconds!!!

              Thanks again, thats exactly the kind of feedback I was looking for.

              By Reading your query I can see what the optimizer and MySQL engine was performing wrong and how to work around this.

              Also the workload seems to spread better on the CPU.

              I would like to thank you once more for your response.

              Comment


              • #8
                My pleasure, nice to know that I have made a difference in the world. )

                Take care!

                Comment

                Working...
                X