

How to import large datasets


  • How to import large datasets

    Hi,
    I'm very new to MySQL performance and have some questions regarding a project I'm working on.

    I guess before I say anything, the project is an e-commerce store.

    I have two tables right now I'm concerned about:
    1. Product Information - 1 record per product; contains the name of the product, description, etc.
    There will be roughly 3 million records in here (MyISAM at the moment)

    2. Inventory Table - quantity information for current products at different warehouses (InnoDB)
    There will be roughly 1.5 million records in here


    I receive a ~1GB flat file (~3.5 million records) from my distributor every 3 days which contains all the product information, and another ~200MB flat file from my distributor every day that contains the entire inventory (~1.5 million records) at the distributor's warehouses.

    These flat files aren't merely updates, though; they contain all the data over again, just with modifications or new records. So I'm struggling with the best way to import this data quickly, and so the import doesn't affect the front-end shoppers...

    I first tried to tackle this problem by just writing scripts that performed LOAD DATA INFILE ... REPLACE. It took about 22 minutes to import 3.3 million records into my products table, but my table was locked the entire time. I tried adding the CONCURRENT option but it was still locked from read queries (I was researching and there was a bug report saying CONCURRENT was broken in 5.0.19-5.0.30?). I guess I'll just make sure the server I put this on doesn't have that bug, or should I be doing this whole thing a different way?
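    For reference, this is roughly the kind of statement I mean; the file path and column list are just placeholders for my real feed layout:

    ```sql
    -- Bulk-load the product feed, replacing rows whose primary/unique key
    -- already exists. CONCURRENT asks MyISAM to allow concurrent reads
    -- during the load (subject to the version bug mentioned above).
    LOAD DATA CONCURRENT INFILE '/data/feeds/products.csv'
    REPLACE INTO TABLE product
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    (product_id, name, description, price);
    ```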


    As for the inventory table, I also need to do something similar, but as far as I understand, because this table is InnoDB, the CONCURRENT flag won't help and the entire table will be read-locked while I run the LOAD DATA command. What's the best way to replace/add records to the inventory table (InnoDB) without locking the entire table while I'm importing it? Do I just do it in a batch SQL file? Should I do it in small bursts? Are there any tricks to doing these mass inserts or updates?
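    One idea I had, in case it helps frame the question (table and column names here are made up): load the feed into a staging table first, then apply it to the live table in small key-range batches so no single statement holds locks for long.

    ```sql
    -- Step 1: bulk-load the daily feed into a staging table nobody reads.
    LOAD DATA INFILE '/data/feeds/inventory.csv'
    INTO TABLE inventory_stage
    FIELDS TERMINATED BY ','
    (product_id, warehouse_id, quantity);

    -- Step 2: apply one small batch to the live table; repeat with the
    -- next key range until the whole staging table has been applied.
    INSERT INTO inventory (product_id, warehouse_id, quantity)
    SELECT product_id, warehouse_id, quantity
    FROM inventory_stage
    WHERE product_id BETWEEN 1 AND 10000
    ON DUPLICATE KEY UPDATE quantity = VALUES(quantity);
    ```

    Is something along those lines a reasonable approach, or is there a better trick?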

    Thank you guys so much for your help. Hopefully the above is not too vague.

    Cheers,
    Last edited by TomD; 06-23-2014, 09:08 AM. Reason: Removed several spam links from the end of the message

  • #2
    I receive a ~1gb flat file (~3.5m records)
    ~200mb from my distributor every day
    ------------------
    What are these files?
    Are they a SQL dump? XLS? MyISAM files?
    ------------------

    I'm not completely sure, but the workaround I can briefly see is:

    - restore the data given from others into non-original tables,
    let's say, into product_copy and inventory_copy tables

    - then use pt-table-sync to sync the [old] product and [new] product_copy tables so they will have the same data
    more about pt-table-sync:
    http://www.percona.com/doc/percona-t...able-sync.html

    you can check the "lock" option there:
    http://www.percona.com/doc/percona-t...ble-sync--lock

    it will help you control how the table is locked during the sync
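    A sketch of how pt-table-sync could be invoked for this (host, database, and table names are placeholders for your setup):

    ```shell
    # Dry run first: print the statements pt-table-sync would execute to
    # make the live table match the freshly loaded copy (source DSN first).
    pt-table-sync --print h=localhost,D=shop,t=product_copy h=localhost,D=shop,t=product

    # When the output looks right, actually apply the changes:
    pt-table-sync --execute h=localhost,D=shop,t=product_copy h=localhost,D=shop,t=product
    ```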

    Another possible way is:

    - restore the data given from others into non-original tables,
    let's say, into product_copy and inventory_copy tables

    - stop writes, rename table, start writes
    RENAME TABLE product_copy TO product
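    For example, since RENAME TABLE in MySQL is atomic, both renames can go in one statement so readers never see a missing table (table names as above):

    ```sql
    -- Atomically swap the freshly loaded copy into place.
    RENAME TABLE product TO product_old, product_copy TO product;

    -- Once you're satisfied the swap worked, drop the old data:
    DROP TABLE product_old;
    ```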



    • #3
      Did you try to do that via phpMyAdmin?
      It works in all my cases!
      Let me know if it doesn't work, and I will get you another alternative!
      ______________________
      Regards:-
      Sam Cox (MySQL Expert)

