optimization for inner join.. order by.. limit with large order by data

  • Filter
  • Time
  • Show
Clear All
new posts

  • optimization for inner join.. order by.. limit with large order by data

    Trying to optimize the following query in MySQL 5.0.45

    SELECT * FROM artdistinct
    INNER JOIN artartstyles USING (pid)
    WHERE CategoryID = '7778'
    ORDER BY ArtistLN,ArtistFN
    LIMIT 90, 45;

    The artdistinct and artartstyles tables each have about 150K records.

    This takes about 60 seconds to execute. The order by uses two columns that are each a varchar(50). If I change the order by column to pID, the select takes about 3 seconds to execute. Any suggestions on optimization for this? Thanks.

    CREATE TABLE `artdistinct` (
    `pID` int(11) NOT NULL,
    `PublisherCode` char(3) NOT NULL DEFAULT '',
    `ItemNo` varchar(50) NOT NULL,
    `artType` int(11) DEFAULT NULL,
    `IsAvailable` int(11) DEFAULT NULL,
    `CanCanvas` int(11) DEFAULT NULL,
    `CanFrame` int(11) DEFAULT NULL,
    `ArtistLN` varchar(50) DEFAULT NULL,
    `ArtistFN` varchar(50) DEFAULT NULL,
    `Title` varchar(200) DEFAULT NULL,
    `PaperWidth` decimal(10,3) DEFAULT NULL,
    `PaperHeight` decimal(10,3) DEFAULT NULL,
    `ImageWidth` decimal(10,3) DEFAULT NULL,
    `ImageHeight` decimal(10,3) DEFAULT NULL,
    `RetailPrice` decimal(10,2) DEFAULT NULL,
    `count` int(11) DEFAULT NULL,
    PRIMARY KEY (`pID`),
    KEY `NewIndex1` (`ArtistLN`,`ArtistFN`,`Title`),
    KEY `ArtistLN` (`ArtistLN`,`ArtistFN`)

    Index Information
    Indexes Columns Index_Type
    PRIMARY pID Unique
    NewIndex1 ArtistLN, ArtistFN, Title
    ArtistLN ArtistLN, ArtistFN

    CREATE TABLE `artartstyles` (
    `pID` int(11) NOT NULL,
    `CategoryID` int(11) NOT NULL,
    PRIMARY KEY (`pID`,`CategoryID`)

    Index Information
    Indexes Columns Index_Type
    PRIMARY pID, CategoryID Unique

    SHOW PROFILE Result (ordered by duration)

    state duration (summed) in sec percentage
    Sending data 57.77979 99.99960
    freeing items 0.00007 0.00012
    starting 0.00005 0.00009
    init 0.00003 0.00005
    statistics 0.00002 0.00003
    end 0.00001 0.00002
    preparing 0.00001 0.00002
    optimizing 0.00001 0.00002
    Table lock 0.00001 0.00002
    cleaning up 0.00001 0.00002
    Opening tables 0.00001 0.00002
    Total 57.78002 100.00000

    Change Of STATUS VARIABLES Due To Execution Of Query

    variable value description
    Bytes_received 137 Bytes sent from the client to the server
    Bytes_sent 4469 Bytes sent from the server to the client
    Com_select 1 Number of SELECT statements that have been executed
    Handler_commit 1 Number of internal commit statements
    Handler_read_first 1 Number of times the first entry was read from an index. A high value indicates that full index scans were done
    Handler_read_key 152038 Number of requests to read a row based on a key
    Handler_read_next 152035 Number of index columns read with a range constraint or an index scan
    Innodb_buffer_pool_pages_data* 501 The number of pages containing data - both dirty or clean
    Innodb_buffer_pool_pages_misc# 10 The number of buffer pool pages allocated for administrative overhead
    Innodb_buffer_pool_read_ahead_rnd# 41 The number of random reads by Innodb. The query scanned a large portion of a table in random order
    Innodb_buffer_pool_read_ahead_seq# 969 The number of sequential reads by Innodb.
    Innodb_buffer_pool_read_requests# 898039 The number of logical read requests Innodb has done
    Innodb_buffer_pool_reads# 24807 The number of logical reads that were not satisfied from the buffer pool and were read from the disk
    Innodb_data_read# 586055680 The amount of data read by Innodb
    Innodb_data_reads# 35770 The total number of data reads by Innodb
    Innodb_rows_read# 152157 The number of rows read from Innodb tables
    Last_query_cost* 182953 The total cost of this query as computed by the query optimizer
    Questions 1 Number of statements executed by the server
    Select_scan 1 Number of full table scans of the first table in the query
    Slow_queries 1 This query took more than the value specified in long_query_time. It probably requires optimization
    Table_locks_immediate 2 The number of requests for table locks that could be granted immediately

    * Actual values after query execution (not changes due to query)
    # Global values (affected by operations by all clients connected)

    EXPLAIN Result

    id select_type table type possible_keys key key_len refrows filtered Extra
    1 SIMPLE artdistinct index PRIMARY ArtistLN 106 (NULL) 135 112201.48
    1 SIMPLE artartstyle seq_ref PRIMARY PRIMARY 8 qaaf.artdistinct.pID,const 1100.00 Using index

    EXPLAIN EXTENDED Information

    `qaaf`.`artdistinct`.`pID` AS `pID`,
    `qaaf`.`artdistinct`.`PublisherCode` AS `PublisherCode`,
    `qaaf`.`artdistinct`.`ItemNo` AS `ItemNo`,
    `qaaf`.`artdistinct`.`artType` AS `artType`,
    `qaaf`.`artdistinct`.`IsAvailable` AS `IsAvailable`,
    `qaaf`.`artdistinct`.`CanCanvas` AS `CanCanvas`,
    `qaaf`.`artdistinct`.`CanFrame` AS `CanFrame`,
    `qaaf`.`artdistinct`.`ArtistLN` AS `ArtistLN`,
    `qaaf`.`artdistinct`.`ArtistFN` AS `ArtistFN`,
    `qaaf`.`artdistinct`.`Title` AS `Title`,
    `qaaf`.`artdistinct`.`PaperWidth` AS `PaperWidth`,
    `qaaf`.`artdistinct`.`PaperHeight` AS `PaperHeight`,
    `qaaf`.`artdistinct`.`ImageWidth` AS `ImageWidth`,
    `qaaf`.`artdistinct`.`ImageHeight` AS `ImageHeight`,
    `qaaf`.`artdistinct`.`RetailPrice` AS `RetailPrice`,
    `qaaf`.`artdistinct`.`count` AS `count`,
    `qaaf`.`artartstyles`.`CategoryID` AS `CategoryID`
    from `qaaf`.`artdistinct`
    join `qaaf`.`artartstyles`
    where ((`qaaf`.`artartstyles`.`pID` = `qaaf`.`artdistinct`.`pID`)
    and (`qaaf`.`artartstyles`.`CategoryID` = '7778'))
    order by `qaaf`.`artdistinct`.`ArtistLN`,`qaaf`.`artdistinc t`.`Artist FN`
    limit 90,45

  • #2
    How many rows match the following condition?

    WHERE CategoryID = '7778'


    • #3
      only 122, but the result range of rows that this query can be used for range from 2 to 11178.

      I just reran the query using the largest Category and suddenly some of my queries are about 1 second each. Others are 3 to 6 seconds and some occasionally are still about 60 seconds but most are acceptable. I'm really confused now. I have run this many times now with no consistency.

      Yesterday while I was trying to figure this out I added an index for ArtistLN, ArtistFN to the artdistinct table but the performance didn't change. Since yesterday, the only other change I have made is to reboot.

      I don't know if this makes a difference, but the tables that are being used in the query are largely static. They only change when I make an occasional update to the tables when new art is available. During normal use they don't change.


      • #4
        I am thinking you need different indexes. The output you pasted is hard for me to read because some numbers are joined together in EXPLAIN, but I think you might try an index on (CategoryID,ArtistLN,ArtistFN).

        This general type of query looks like a suboptimal design anyway, and you might consider other ways. Looks like a typical pagination query -- search our blog for "pagination" for some ideas.


        • #5
          It is a pagination query so I will take a look your blog for some ideas.

          I thought about using an index for (CategoryID,ArtistLN,ArtistFN but since CategoryID is in a different table than ArtistLN, ArtistFN, I didn't think that this could be done on a join. Am I wrong on this?

          I also thought about de-normalizing this into a single table but I in addition to this query for Art Style, I also have similar queries and paginations for Primary Color, Subjects and Artists so de-normalization is not an option.