GET 24/7 LIVE HELP NOW

Announcement

Announcement Module
Collapse
No announcement yet.

varchar index killing csv import

Page Title Module
Move Remove Collapse
X
Conversation Detail Module
Collapse
  • Filter
  • Time
  • Show
Clear All
new posts

  • varchar index killing csv import

    I have a large csv file of about 2 million domain names from urlblacklist.com

    I have an app that queries the table a lot and in order to lookup the domain names fast they need to be indexed. The table is innodb that has columns for domain and categoryID, the combination of these two are the unique key.

    When I try to load 2 million records into the table (load infile) it gets to about 1.5 million then starts to have a real hard time. I let it run for 12 hours but it didnt finish so I gave up.

    My guess is that this is because it is having to check for a unique key on every insert, so as the number of records grows I am put in a worse and worse situation.

    Is there something I can do to make this work? Is mysql just not the right tool? The oracle buffs out there are saying I am just using the wrong platform, is there something I can do to reasonably use mysql for this application?

  • #2
    You're right.

    During Load Data Infile MySQL checks UNIQUE indexes at once while other indexes are built after load is done.

    To optimize it you can ether:

    - have your key_buffer large enough for index fits well into it.
    - load URLs sorted by URL in this case unique checks will be fast.

    Comment


    • #3
      ievolve wrote on Sat, 24 February 2007 19:48

      Is mysql just not the right tool? The oracle buffs out there are saying I am just using the wrong platform, is there something I can do to reasonably use mysql for this application?


      I think InnoDB is wrong tool for this. As soon as your active data set size outgrows buffer pool, you start getting very serious performance problems.

      Comment


      • #4
        If you data set outgrowth caches you'll have the problems in any case. Innodb is similar to MyISAM in this case, it just can be much larger because indexes are not packed.

        There is no silver bullet out where and each choice has benefits and drawbacks.

        In this case the concern was not performance but the fact alerts are dropped because of MyISAM table locks.

        Comment


        • #5
          Peter wrote on Sun, 25 February 2007 11:16


          In this case the concern was not performance but the fact alerts are dropped because of MyISAM table locks.


          Eh? Did I miss something?

          Comment


          • #6
            I see. I replied to the different thread,

            because you did so. I did not advice to use Innodb in this case, I did that to the other guy

            In this case my.cnf looks like first thing to fix.

            Comment


            • #7
              Now I am confused

              Am I ok to use innoDB or are you saying I should maybe switch to myISAM? The database table will not be written to all that much but will be being queried contstantly with lots of traffic.

              Also, what would be a good keybuffer size to start with?

              Comment


              • #8
                MyISAM is fine. You can even pack it after you load it.

                key_buffer_size should be the lowest of 4000MB and 30% of your memory size if you're using only MyISAM table and have a lot of data.

                Comment

                Working...
                X