processing large result sets...

  • #1

    Hello,

    I have a large table with about half a million (potentially a few million) rows - I need to make a snapshot of the data at a specific moment and process it for export from the database (into an XML file, but that's not that important)...

    I have some ideas on how to process them now, but each has its drawbacks (I'm using PHP to process the data, if that is of any importance)...

    * one could just make a 'SELECT *' on this table and then process it one row at a time at the application level - the problem: the data might grow so large that the result set does not fit in the available RAM...

    * one could 'SELECT id FROM ...' and then process it one id at a time, fetching rows and then processing them (rough sketch after this list)... but that would require me to first make a copy of the table, so it doesn't change during the whole operation... and, theoretically, it doesn't stop the script from taking too much memory if there are enough rows (maybe not at a few million, but it's certainly possible)...

    * I was also thinking in the direction of processing it part by part using 'SELECT ... LIMIT ...', which might (I'm not sure) allow me to process arbitrarily large result sets, but I don't think it's a wise idea from a performance point of view... of course, I would still need a copy first in this case...
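
    Roughly what I mean by the second idea, as a sketch (connection details and table names here are placeholders, assuming the copy is called snapshot_copy):

    <?php
    // Sketch of the "ids first" idea: read only the ids (cheap per row),
    // then fetch and export the full rows one at a time.
    $db = new mysqli('localhost', 'user', 'pass', 'mydb');

    $ids = array();
    $res = $db->query('SELECT id FROM snapshot_copy');
    while ($row = $res->fetch_row()) {
        $ids[] = (int) $row[0];
    }
    $res->free();

    foreach ($ids as $id) {
        $res = $db->query('SELECT * FROM snapshot_copy WHERE id = ' . $id);
        $row = $res->fetch_assoc();
        // ... write $row out to the XML file here ...
        $res->free();
    }

    The $ids array is the part that could still eat too much memory if there are enough rows, which is what I meant above.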

    Any other ideas? Maybe some obvious way I'm somehow missing? Some standard way of processing large amounts of data from a MySQL database?

    Thanks in advance...

  • #2
    How often do you need to do such database exports? If not very often, then if MySQL is on an LVM partition, you can take an LVM snapshot, mount the snapshot, copy the data to another server, start MySQL with the data directory from this snapshot, and export it to an XML file.



  • #3
    debug wrote on Mon, 28 April 2008 13:47

    How often do you need to do such database exports? If not very often, then if MySQL is on an LVM partition, you can take an LVM snapshot, mount the snapshot, copy the data to another server, start MySQL with the data directory from this snapshot, and export it to an XML file.


    Thanks for your answer :), the exports will need to be done once or twice a day...

    As for the LVM option, I'd have to think about that - the database is now on a regular filesystem, so that would require some shuffling about... But this solves only the "snapshot" part of the problem - the other part of my question remains unanswered: what is the best way to process and export a large result set (possibly exceeding available memory) from a database to an XML file (or some other format)?



  • #4
    sentinel wrote on Tue, 29 April 2008 12:35

    What is the best way to process and export a large result set (possibly exceeding available memory) from a database to an XML file (or some other format)?


    Can anyone help here?

    How does one process such large result sets without allocating enormous amounts of RAM? Some kind of cursor? LIMIT ... OFFSET ...? Getting only the ids and then processing the relevant rows (which should work if the data set is not really, really large)? Any other solution?
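
    For example, would an unbuffered query - just a sketch of what I mean by a cursor-like approach, with made-up connection details and names - keep memory flat?

    <?php
    // Sketch: an unbuffered query via mysqli. With MYSQLI_USE_RESULT
    // the client does not buffer the whole result set; fetch_assoc()
    // pulls rows from the server one at a time, so memory stays flat.
    $db  = new mysqli('localhost', 'user', 'pass', 'mydb');
    $res = $db->query('SELECT id, payload FROM snapshot_copy', MYSQLI_USE_RESULT);

    while ($row = $res->fetch_assoc()) {
        // ... write one XML element per row here ...
    }
    $res->free();
    $db->close();

    (As far as I understand, the connection can't run another query until all rows of an unbuffered result have been read or freed - is that a problem here?)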

    best regards,
    sentinel.



  • #5
    The way we do it here is to first do a:

    SELECT COUNT(*) FROM mytable


    Suppose you get 50000 back. Split that up into pages of a size you are sure will fit in memory, and request each of those pages with a separate call using LIMIT in your query.
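
    Roughly like this sketch (connection details, names, and the page size are placeholders - tune the page size to your row width):

    <?php
    // Sketch of the paging approach: count the rows first, then
    // export in fixed-size chunks using LIMIT offset, size.
    $db       = new mysqli('localhost', 'user', 'pass', 'mydb');
    $pageSize = 10000; // pick something you know fits in memory

    $res = $db->query('SELECT COUNT(*) FROM mytable');
    list($total) = $res->fetch_row();
    $res->free();

    $out = fopen('export.xml', 'w');
    fwrite($out, "<rows>\n");
    for ($offset = 0; $offset < $total; $offset += $pageSize) {
        $res = $db->query("SELECT id, payload FROM mytable ORDER BY id LIMIT $offset, $pageSize");
        while ($row = $res->fetch_assoc()) {
            fwrite($out, '  <row id="' . $row['id'] . '"/>' . "\n");
        }
        $res->free();
    }
    fwrite($out, "</rows>\n");
    fclose($out);

    One thing to watch: keep a stable ORDER BY on the paged query, otherwise the pages can overlap or skip rows between calls.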

    As a side effect, you might not get a consistent export if records are being added while the export is running, but I don't know of any way to prevent that.

    regards,

    Wim
