Announcement

Announcement Module
Collapse
No announcement yet.

bad optimization or bad structure?

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • bad optimization or bad structure?

    Hello,

    I'm working a database that will contain a bunch of releases (movie-titles if you will) and each release (movie) will get a few tags depending on what type of movie it is, eg. english, pal, ntsc etc..

    When i list these it takes about 2s (100 rows) and as EXPLAIN has explained, it uses filesort and im guessing thats why its slow.

    I've tried to use indexes as far as i know but i haven't really understood how to use them just yet i guess.

    Anyway, here is the structure:


    CREATE TABLE `releases` ( `r_id` int(11) NOT NULL auto_increment, `rel_name` text NOT NULL, `rel_title` text NOT NULL, `rel_date` int(11) NOT NULL default '0', `add_date` int(11) NOT NULL default '0', `imdb` text NOT NULL, `group_id` int(11) NOT NULL default '0', `section_id` int(11) NOT NULL default '0', `add_nickhost` text NOT NULL, PRIMARY KEY (`r_id`), KEY `section_id` (`section_id`), KEY `group_id` (`group_id`), KEY `rel_date` (`rel_date`)) ENGINE=MyISAM DEFAULT CHARSET=latin1;CREATE TABLE `sections` ( `s_id` tinyint(4) NOT NULL auto_increment, `section` text NOT NULL, PRIMARY KEY (`s_id`), KEY `section` (`section`(8))) ENGINE=MyISAM DEFAULT CHARSET=latin1;CREATE TABLE `sections` ( `s_id` tinyint(4) NOT NULL auto_increment, `section` text NOT NULL, PRIMARY KEY (`s_id`), KEY `section` (`section`(8))) ENGINE=MyISAM DEFAULT CHARSET=latin1;CREATE TABLE `groups` ( `g_id` int(11) NOT NULL auto_increment, `group_name` text NOT NULL, PRIMARY KEY (`g_id`)) ENGINE=MyISAM DEFAULT CHARSET=latin1;


    releases has about 80000 rows at this point and tags about 10000.

    query i use is:


    SELECT r_id, rel_date, rel_name, rel_title, group_name , imdb, section, GROUP_CONCAT( DISTINCT tag SEPARATOR ' ' ) as tags FROM tags INNER JOIN releases ON releases.r_id = tags.rel_id INNER JOIN groups ON releases.group_id = groups.g_id INNER JOIN sections ON releases.section_id = sections.s_id WHERE section = 'dvdr' GROUP BY rel_id ORDER BY rel_date DESC LIMIT 0,100


    I'm pretty much stuck here and i hope you can at least point me in the right direction.

    Thanks.

    Best Regards, Patrik.

  • #2
    You have not posted EXPLAIN for the query so it is hard to tell exactly.

    I think it is analyzes much more than 100 rows, looking at amount of joins you have and 100 rows may apply just to one of the tables.

    This query is complex and I would not do such queries in interractive application.

    Comment


    • #3
      oh, sorry. didn't think of EXPLAIN.

      here it is:


      +----+-------------+----------+--------+-----------------------------+---------+---------+-----------------------------+--------+---------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+----------+--------+-----------------------------+---------+---------+-----------------------------+--------+---------------------------------+| 1 | SIMPLE | tags | ALL | NULL | NULL | NULL | NULL | 105208 | Using temporary; Using filesort || 1 | SIMPLE | releases | eq_ref | PRIMARY,section_id,group_id | PRIMARY | 4 | dupenew.tags.rel_id | 1 | || 1 | SIMPLE | groups | eq_ref | PRIMARY | PRIMARY | 4 | dupenew.releases.group_id | 1 | || 1 | SIMPLE | sections | eq_ref | PRIMARY,section | PRIMARY | 1 | dupenew.releases.section_id | 1 | Using where |+----+-------------+----------+--------+-----------------------------+---------+---------+-----------------------------+--------+---------------------------------+


      As you said it doesn't just check 100 rows.

      I don't know what you mean with an interactive application but its for a web page, but still I'd like it to be fast and i need all the info this query returns.

      Is it possible to change the structure of the database to be able to fetch it faster? but still use relations.

      Best Regards, Patrik

      Comment


      • #4
        I'd suggest you to ether

        - pre-generate results (ie have cron job which updates data each 5 minutes)

        - think about data structure which can be updated in real time but provide this info quickly.

        Comment


        • #5
          I guess I will overlook the structure then, cause to pre-generate is not an option. since its also possible to search and exclude some types of rows (thats what 'tags' are for), i don't want this to be much slower then the "default view".

          thanks for looking over this for me.

          Best Regards, Patrik

          Comment


          • #6
            Well if you can't pre-generate you will need to build it so you can retrieve quickly the info you need building dynamic summary tables.

            It can be often done with tags too.

            If it needs to be fully dynamic you can try denormalizing it and then making parallel on number of CPUs

            Comment


            • #7
              what do you mean by dynamic summary tables?

              I'm trying to test other structures at the time. I tried removing the groups tables (one less JOIN). The thing that slows it all down is still the tags part though.
              I need to get all tags for each row in on the result, i first had it looked up while printing the result but that took, as you can imagine, too much time, cause it prints different links depending on what tags there are.

              If I'm not mistaken though, indexes can't be used if there are different ORDER BY and GROUP BY expressions (source), and I need to use GROUP BY to use GROUP_CONCAT and ORDER BY to get the newest first.

              Would really suck not being able to use a normalized structure, since it uses a denormalized at the time.

              Best Regards, Patrik

              Comment

              Working...
              X