GET 24/7 LIVE HELP NOW

Announcement

Announcement Module
Collapse
No announcement yet.

baffled by LEFT JOIN versus INNER JOIN speed

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • baffled by LEFT JOIN versus INNER JOIN speed

    I've got a two-table setup (well, actually I have 7 tables, with 5 more Authorfunction* tables like the one below, but these 5 don't make much difference in the problem):


    CREATE TABLE `authorfunctionA` (
    `ID` mediumint(9) NOT NULL default '0',
    `value` varchar(255) NOT NULL default '',
    KEY `ID` (`ID`),
    KEY `value` (`value`),
    FULLTEXT KEY `val` (`value`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;



    and


    CREATE TABLE `stories_full` (
    `ID` mediumint(9) NOT NULL default '0',
    `title` varchar(255) NOT NULL default '',
    `displaytitle` varchar(255) NOT NULL default '',
    `story` mediumtext NOT NULL,
    PRIMARY KEY (`ID`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;


    the first table has 165.000 records, the 2nd table has 40.000 records. If I run the query below, mysql needs a full 2 seconds to compute :

    SELECT stories_full.* FROM stories_full LEFT JOIN authorfunctionA AS field_0 ON field_0.id=stories_full.id WHERE (MATCH(field_0.value) AGAINST ('+Roc*' IN BOOLEAN MODE))

    now if I run the same query with a normal join :

    SELECT stories_full.* FROM stories_full JOIN authorfunctionA AS field_0 ON field_0.id=stories_full.id WHERE (MATCH(field_0.value) AGAINST ('+Roc*' IN BOOLEAN MODE)), the result is done in 0.002 seconds

    (this query is still relatively simple : in real life, i'm running 4 joins, and the queries takes up to 15 seconds sometimes)

    here are the explains for the 2 queries :

    +----+-------------+--------------+------+---------------+------+---------+--------------------------+--------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+--------------+------+---------------+------+---------+--------------------------+--------+-------------+| 1 | SIMPLE | stories_full | ALL | NULL | NULL | NULL | NULL | 164406 | | | 1 | SIMPLE | field_0 | ref | ID | ID | 3 | ipbfull2.stories_full.ID | 1 | Using where | +----+-------------+--------------+------+---------------+------+---------+--------------------------+--------+-------------+

    and

    +----+-------------+--------------+----------+---------------+---------+---------+---------------------+------+-------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+--------------+----------+---------------+---------+---------+---------------------+------+-------------+| 1 | SIMPLE | field_0 | fulltext | ID,val | val | 0 | | 1 | Using where | | 1 | SIMPLE | stories_full | eq_ref | PRIMARY | PRIMARY | 3 | ipbfull2.field_0.ID | 1 | | +----+-------------+--------------+----------+---------------+---------+---------+---------------------+------+-------------+



    so apparently, the LEFT JOIN is simply not using the index !!!!!

    I've tested this on mysql 4.1.14 and 5.0.45 with the same results

    can anyone shed some light on this matter ? I'm totally in the dark on how to optimize this (

    maaaaaany many thanks in advance !

    ah, a PS : i realised that you might think the query execution order could be in play : running a LEFT join, followed by an inner join might have some query cached. So I ran a dozen more queries in mixed order (first the inner join, then the left join), and every time, the left join was substantially slower

  • #2
    I think that the issue at play here is a misunderstanding of how a LEFT JOIN works.

    An INNER JOIN selects only those records which are common to both tables. Thus you notice that the number of records operated on is only 1, and both indexes are used.

    A LEFT JOIN selects all records from the left side, and whatever records from the right side that match up, filling in NULL values for the right side where there's no match. That's why it did a full scan of the stories_full table, but used the index on the field_0 table.

    Unless there's a real need for a LEFT JOIN, don't use it. In this case, you really don't need it, since you're only returning values from the stories_full table, and only using the other table to narrow down the results you're going to return.

    -Doug

    EDIT: Corrected FULL to INNER JOIN.

    Comment

    Working...
    X