Announcement

Announcement Module
Collapse
No announcement yet.

Left Join Not using index (or how to index this query)?

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • Left Join Not using index (or how to index this query)?

    I have the following query that is executing at a VERY slow rate. Usually takes about 6 seconds to return. I have tried several index strategies (to the best of my ability - which I admit is probably lacking) to no avail. I have copied the query below and the explain - any help in optimizing via an index(es) would be greatly appreciated.

    Please be aware that this is a query produced by a boxed application ( SugarCRM ) and I have little control over the way its written (its kind of ugly) unless I dig though the PHP code. I wanted to try an index optimization first if possible.


    SELECT cases.id, cases_cstm.*, cases.case_number, cases.name, accounts.name account_name1, cases.account_id, cases.priority, cases.status, cases.date_entered , cases.modified_user_id, assigned_user0.user_name modified_user_id, assigned_user1.user_name assigned_user_name, accounts.assigned_user_id account_name1_owner, 'Accounts' account_name1_mod, cases.assigned_user_id FROM cases left JOIN cases_cstm ON cases.id = cases_cstm.id_c left JOIN accounts accounts ON accounts.id= cases.account_id AND accounts.deleted=0 AND accounts.deleted=0 left JOIN users assigned_user0 ON assigned_user0.id=cases.modified_user_id left JOIN users assigned_user1 ON assigned_user1.id=cases.assigned_user_id where (1) AND cases.deleted=0 ORDER BY cases.case_number ASC LIMIT 0,21;


    The EXPLAIN:


    +----+-------------+----------------+--------+-------------------------------------------------+---------+---------+-----------------------------------+------+----------------------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+----------------+--------+-------------------------------------------------+---------+---------+-----------------------------------+------+----------------------------------------------+| 1 | SIMPLE | cases | ALL | NULL | NULL | NULL | NULL | 1495 | Using where; Using temporary; Using filesort || 1 | SIMPLE | cases_cstm | ALL | NULL | NULL | NULL | NULL | 1537 | || 1 | SIMPLE | accounts | eq_ref | PRIMARY,idx_accnt_id_del,idx_accnt_assigned_del | PRIMARY | 108 | infoathand.cases.account_id | 1 | || 1 | SIMPLE | assigned_user0 | eq_ref | PRIMARY | PRIMARY | 108 | infoathand.cases.modified_user_id | 1 | || 1 | SIMPLE | assigned_user1 | eq_ref | PRIMARY | PRIMARY | 108 | infoathand.cases.assigned_user_id | 1 | |+----+-------------+----------------+--------+-------------------------------------------------+---------+---------+-----------------------------------+------+----------------------------------------------+


    I cannot figure out how to get cases and case_cstm to use an index I setup. Strangely (or maybe not), if I change this query to use an INNER JOIN instead of a LEFT JOIN it executes in .5 sec instead of 6 secs with no change to the current indexes.

    Anyhow, any help is appreciated. I can post show index statements if that helps to see the keys of each table. I appreciate any help - I have been banging my head against a wall for the last day to figure out the MySQL optimizer.

    EDIT: jsut wanted to mention the MySQL version is 5.0.22 running on Ubuntu Dapper Server - using MyISAM

  • #2
    Another interesting twist - if I trim the above query down to one LEFT JOIN where the join is on 2 primary keys in the 2 tables cases and case_cstm, MySQL will NOT use the keys. Why?


    SELECT cases.id , cases_cstm.*, cases.case_number , cases.name , cases.priority , cases.status , cases.date_entered , cases.modified_user_id,cases.assigned_user_id FROM cases left JOIN cases_cstm ON cases.id = cases_cstm.id_c where cases.deleted=0


    EXPLAIN:


    +----+-------------+------------+------+---------------+------+---------+------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+------------+------+---------------+------+---------+------+------+-------------+| 1 | SIMPLE | cases | ALL | NULL | NULL | NULL | NULL | 1495 | Using where || 1 | SIMPLE | cases_cstm | ALL | NULL | NULL | NULL | NULL | 1537 | |+----+-------------+------------+------+---------------+------+---------+------+------+-------------+


    This query takes 5-6 seconds to complete. Change it to an INNER join (instead of LEFT) and its done in .07 sec. The INNER join EXPLAIN is below:


    +----+-------------+------------+--------+---------------+---------+---------+------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+------------+--------+---------------+---------+---------+------+------+-------------+| 1 | SIMPLE | cases_cstm | ALL | NULL | NULL | NULL | NULL | 1537 | || 1 | SIMPLE | cases | eq_ref | PRIMARY | PRIMARY | 108 | func | 1 | Using where |+----+-------------+------------+--------+---------------+---------+---------+------+------+-------------+


    Uses the key/index in this one?!

    I have a handful of queries like the one in the first post that are really beating the server into the ground. I would like to optimize using indexes (if possible). Problem is I can't guess what the optimizer will do. Any pointers are welcome... thanks.

    Comment


    • #3
      Go back to your first query and try adding an index on cases(id, case_number). Maybe that helps?

      Make sure any fields which are never null, are marked as NOT NULL.

      You could also try adding STRAIGHT_JOIN to the query to influence the execution planner. Sadly I dont recall the exact theory behind what it does, but I do know Ive had some very good results with it.
      Its often trial and error with MySQL

      SELECT STRAIGHT_JOIN field1, field2 FROM etc etc

      Comment


      • #4
        Thanks for the tips. Its really weird.

        Currently the id field has a primary key index on it and the number field is a regular index. I'll try the multi-column index you suggest and see what happens. The MySQL optimizer is hard to figure out! I'll see if a straight join helps - in a way I hope it doesn't because this query is generated by software so its going to be hard to force the STRAIGHT JOIN into the query (have to dig through PHP code ( ) I was hoping a few well placed indexes would get this sucker to speed up without rewriting the query.


        BTW SHOW INDEX gives me the below (cases table):


        +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+| cases | 0 | PRIMARY | 1 | id | A | 3061 | NULL | NULL | | BTREE | NULL || cases | 1 | case_number | 1 | case_number | A | NULL | NULL | NULL | | BTREE | NULL || cases | 1 | idx_case_name | 1 | name | A | NULL | NULL | NULL | YES | BTREE | NULL |+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

        Comment


        • #5
          carpii wrote on Wed, 22 August 2007 17:40

          Go back to your first query and try adding an index on cases(id, case_number). Maybe that helps?

          Make sure any fields which are never null, are marked as NOT NULL.

          You could also try adding STRAIGHT_JOIN to the query to influence the execution planner. Sadly I dont recall the exact theory behind what it does, but I do know Ive had some very good results with it.
          Its often trial and error with MySQL

          SELECT STRAIGHT_JOIN field1, field2 FROM etc etc






          No luck (

          I tried adding the cases(id, case_number) index and using the STRAIGHT_JOIN syntax - same results. Slow, slow, slow.

          Comment


          • #6
            what about an index just on cases.case_number ?
            Does the explain change at all?

            Whats the primary key on cases. Is it really 108 bytes?

            Comment


            • #7
              According to the SHOW INDEX above, I already have an index on cases.case_number. So it doesn't appear to do anything. How do I find the index length? SHOW INDEX doesn't tell me key length, I guess from the EXPLAIN in a previous post it states that the "key_len" is 108. IS there another command I can run to display this?

              Comment


              • #8
                Are you sure there is a primary key / index on cases_cstm.id_c?

                how about a simple


                EXPLAIN SELECT * FROM cases_cstm WHERE id_c = number


                to figure this out.

                Comment


                • #9
                  chriswest wrote on Thu, 23 August 2007 11:31

                  Are you sure there is a primary key / index on cases_cstm.id_c?

                  how about a simple


                  EXPLAIN SELECT * FROM cases_cstm WHERE id_c = number


                  to figure this out.




                  Here is the EXPLAIN:


                  mysql> explain select * from cases_cstm where id_c = 'f211ee71-2d3f-9db0-99d1-45e448a63c99';+----+-------------+------------+-------+---------------+---------+---------+-------+------+-------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+------------+-------+---------------+---------+---------+-------+------+-------+| 1 | SIMPLE | cases_cstm | const | PRIMARY | PRIMARY | 36 | const | 1 | |+----+-------------+------------+-------+---------------+---------+---------+-------+------+-------+1 row in set (0.03 sec)


                  Below are the table schema for cases and cases_cstm (SHOW CREATE TABLE results):

                  Cases table:

                  CREATE TABLE `cases` ( `id` char(36) NOT NULL, `case_number` int(11) NOT NULL auto_increment, `date_entered` datetime NOT NULL, `date_modified` datetime NOT NULL, `modified_user_id` char(36) NOT NULL, `assigned_user_id` char(36) default NULL, `created_by` char(36) default NULL, `effort_actual` double default NULL, `effort_actual_unit` varchar(20) default NULL, `travel_time` double default NULL, `travel_time_unit` varchar(20) default NULL, `arrival_time` varchar(30) default NULL, `cust_req_no` varchar(30) default NULL, `cust_contact_id` char(36) default NULL, `cust_phone_no` varchar(30) default NULL, `date_closed` date default NULL, `date_billed` date default NULL, `vendor_rma_no` varchar(30) default NULL, `vendor_svcreq_no` varchar(30) default NULL, `contract_id` char(36) default NULL, `asset_id` char(36) default NULL, `asset_serial_no` varchar(100) default NULL, `category` varchar(40) default NULL, `type` varchar(40) default NULL, `deleted` tinyint(1) NOT NULL default '0', `name` varchar(255) default NULL, `account_name` varchar(100) default NULL, `account_id` char(36) default NULL, `status` varchar(25) default NULL, `priority` varchar(25) default NULL, `description` text, `resolution` text, PRIMARY KEY (`id`), KEY `case_number` (`case_number`), KEY `idx_case_name` (`name`)) ENGINE=MyISAM DEFAULT CHARSET=utf8 |



                  Cases_cstm:


                  CREATE TABLE `cases_cstm` ( `id_c` char(36) NOT NULL, `mcs_steps_to_reproduce_c` text, `mcs_applications_multi_c` text NOT NULL, `mcs_supportcase_source_c` varchar(150) default NULL, `mcs_legacy_tt_number_c` int(11) default NULL, PRIMARY KEY (`id_c`)) ENGINE=MyISAM DEFAULT CHARSET=latin1 |


                  Also the SHOW INDEX from cases_cstm for completeness (the SHOW INDEX for cases is in the previous post):


                  +------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+| cases_cstm | 0 | PRIMARY | 1 | id_c | A | 3136 | NULL | NULL | | BTREE | NULL |+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+1 row in set (0.00 sec)

                  Comment


                  • #10
                    When you are using a LEFT JOIN you are forcing the DBMS to perform the join in order left to right.

                    When you are changing to use an INNER JOIN the optimizer can choose the join order freely and that is why your query is fast with an INNER JOIN since it chooses the right to left order instead since you have a condition on the right table.

                    The STRAIGHT JOIN syntax is just an INNER JOIN where you are forcing the join order to left to right.

                    But my question is if you have an index on cases_cstm.id_c?
                    Since the join order is cases->cases_cstm that is the index that you need.

                    Comment


                    • #11
                      Well I'm no expert on the optimizer's plans, but I suppose your key is just too long in order to be taken into account for the optimzer. Have you tried to force the use of an index?


                      left JOIN cases_cstm ON cases.id = cases_cstm.id_c FORCE INDEX (id_c)

                      Comment


                      • #12
                        sterin wrote on Thu, 23 August 2007 12:07

                        When you are using a LEFT JOIN you are forcing the DBMS to perform the join in order left to right.

                        When you are changing to use an INNER JOIN the optimizer can choose the join order freely and that is why your query is fast with an INNER JOIN since it chooses the right to left order instead since you have a condition on the right table.

                        The STRAIGHT JOIN syntax is just an INNER JOIN where you are forcing the join order to left to right.

                        But my question is if you have an index on cases_cstm.id_c?
                        Since the join order is cases->cases_cstm that is the index that you need.




                        OK - thanks for the great explanation. That makes sense. To answer your question, there is a primary key index on cases_cstm.id_c as shown in the post above yours. I might have posted it at the same time you posted your response...

                        Comment


                        • #13
                          chriswest wrote on Thu, 23 August 2007 12:13

                          Well I'm no expert on the optimizer's plans, but I suppose your key is just too long in order to be taken into account for the optimzer. Have you tried to force the use of an index?


                          left JOIN cases_cstm ON cases.id = cases_cstm.id_c FORCE INDEX (id_c)




                          I did the following (added the FORCE INDEX for the PRIMARY key index in cases.cases_cstm):


                          SELECT cases.id, cases_cstm.*, cases.case_number , cases.name , accounts.name account_name1, cases.account_id , cases.priority , cases.status , cases.date_entered , cases.modified_user_id , assigned_user0.user_name modified_user_id , assigned_user1.user_name assigned_user_name , accounts.assigned_user_id account_name1_owner , 'Accounts' account_name1_mod , cases.assigned_user_id FROM cases left JOIN cases_cstm FORCE INDEX (PRIMARY) ON cases.id = cases_cstm.id_c left JOIN accounts accounts ON accounts.id= cases.account_id AND accounts.deleted=0 AND accounts.deleted=0 left JOIN users assigned_user0 ON assigned_user0.id=cases.modified_user_id left JOIN users assigned_user1 ON assigned_user1.id=cases.assigned_user_id where (1) AND cases.deleted=0 ORDER BY cases.case_number ASC LIMIT 0,21;


                          The EXPLAIN:


                          +----+-------------+----------------+--------+-------------------------------------------------+---------+---------+-----------------------------------+------+----------------------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+----------------+--------+-------------------------------------------------+---------+---------+-----------------------------------+------+----------------------------------------------+| 1 | SIMPLE | cases | ALL | NULL | NULL | NULL | NULL | 3087 | Using where; Using temporary; Using filesort || 1 | SIMPLE | cases_cstm | ALL | NULL | NULL | NULL | NULL | 3139 | || 1 | SIMPLE | accounts | eq_ref | PRIMARY,idx_accnt_id_del,idx_accnt_assigned_del | PRIMARY | 108 | infoathand.cases.account_id | 1 | || 1 | SIMPLE | assigned_user0 | eq_ref | PRIMARY | PRIMARY | 108 | infoathand.cases.modified_user_id | 1 | || 1 | SIMPLE | assigned_user1 | eq_ref | PRIMARY | PRIMARY | 108 | infoathand.cases.assigned_user_id | 1 | |+----+-------------+----------------+--------+-------------------------------------------------+---------+---------+-----------------------------------+------+----------------------------------------------+



                          As you can see no change... (

                          Comment


                          • #14
                            I just noticed something:


                            CREATE TABLE `cases` ( ...) ENGINE=MyISAM DEFAULT CHARSET=utf8 |


                            and


                            CREATE TABLE `cases_cstm` ( ... ) ENGINE=MyISAM DEFAULT CHARSET=latin1



                            you have different charsets for both tables - and you are joining on char columns: make those two charsets identical )

                            both utf-8 or both latin1

                            Comment


                            • #15
                              chriswest wrote on Thu, 23 August 2007 12:33

                              I just noticed something:


                              CREATE TABLE `cases` ( ...) ENGINE=MyISAM DEFAULT CHARSET=utf8 |


                              and


                              CREATE TABLE `cases_cstm` ( ... ) ENGINE=MyISAM DEFAULT CHARSET=latin1



                              you have different charsets for both tables - and you are joining on char columns: make those two charsets identical )

                              both utf-8 or both latin1



                              LOL - I think that may be the solution! Didnt notice that at all!

                              Is it OK to just change the charset/collation on that column only? Just to be safe since for the rest of the table char columns. I was thinking of issuing this:


                              ALTER TABLE `cases_cstm` MODIFY COLUMN `id_c` CHAR(36) COLLATE utf8_general_ci NOT NULL

                              Comment

                              Working...
                              X