Announcement

Announcement Module
Collapse
No announcement yet.

What makes mysql create temporary tables on disk?

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • What makes mysql create temporary tables on disk?

    I've long suspected that the 'tmp_table_size' setting in mysql does not work, and temporary tables that are clearly going to fit in memory are created on disk anyway. I've just discovered a simple test case:


    SET GLOBAL tmp_table_size=33554432;CREATE TABLE test ( id int(11) NOT NULL, name varchar(512) NOT NULL, description varchar(513) NOT NULL) ENGINE=MEMORY DEFAULT CHARSET=latin1;INSERT INTO test VALUES (1,'a','b');show status like 'cre%';(SELECT name FROM test WHERE id=1)UNION (SELECT name FROM test WHERE id=2);show status like 'cre%';(SELECT description FROM test WHERE id=1)UNION (SELECT description FROM test WHERE id=2);show status like 'cre%';DROP TABLE test;


    The first of these queries that returns the letter 'a' from a varchar(512) column creates the temporary table in RAM, but the second that selects the letter 'b' from a varchar(513) column creates the temporary table on disk. It isn't a limitation of memory tables, as the original table itself is created with that strorage engine, so why does the 1 byte larger column size make mysql think the results of this query won't fit in 32Mb, despite the source table only having 1 record?

  • #2
    We have been recently struggling with disk temp tables as well, and I came across MySQL 5.1 Reference Manual :: 6 Optimization :: 6.5 Optimizing the MySQL Server :: 6.5.8 How MySQL Uses Internal Temporary Tables which lists this related cause for tmp tables going to disk:

    Quote:

    Presence of any column in a GROUP BY or DISTINCT clause larger than 512 bytes


    Apparently, unions will do it too. I think this manual page could use a great deal more work, particularly in the "why" department, but I'll settle for more on the "what". Maybe a comment there on this point would help. Is there anything else you have found related to tmp tables that is not covered on that page?

    Comment


    • #3
      I reported it as a bug, and they said they will update that manual page to include UNIONs, but they were not forthcoming with an answer to "why"

      Also, the manual is inaccurate - it seems to be 512 characters rather than 512 bytes that is significant. If in my example you declare the character set to be utf-8, then both columns can potentially be more than 512 bytes long, but it is still only the 513 character column that generates a disk table.

      Comment


      • #4
        Very interesting. But isn't UTF-8 still 8 bits per char if you stick to an ASCIIish character set? I.e., I think it only used more than 1 byte for chars outside a certain character set which includes the English stuff, no?

        Comment


        • #5
          The point is it "can" be more than 512 bytes. In the example I gave, the actual content of the table is 'a' in one column and 'b' in the other, so mysql must be basing its decision on the defined column width, not the actual content. Therefore the potential that a 512 character field can be more than 512 bytes means it isn't the byte count that is significant, or there is a bug.

          As an aside, virtually all English text that has passed through a word processor such as Open Office Writer or MS Word contains multi-byte utf-8 characters for smart quotes and lengthened dashes and slanting apostrophe's.

          Comment


          • #6
            Ah, yes, light dawns. All excellent points. Have you tested any queries where the actually content was greater than 512 bytes? If this is just some performance based metric limitation that some one chose, then it is likely to run just fine. But if it is a limitation of the underlying temp table storage engine or something, as with text and blob fields, then it could have some serious consequences. As you say, "or there is a bug," and such a test could help determine it.

            If you have not done such tests, I will, as we are about to switch to UTF8 all over our DB as part of an i18n effort and we have a good deal of DISTINCTs flying around. Let me know if you have and/or I'll let you know what I find.

            Comment


            • #7
              all English text that has passed through a word processor for smart quotes and lengthened dashes and slanting apostrophe's...


              __________________
              Writing jobs

              Comment

              Working...
              X