Selecting where id in (list of 1000 ids)

  • Filter
  • Time
  • Show
Clear All
new posts

  • #16

    But IN(...) are not bad because MySQL can use an index to solve IN()'s.

    I'll keep that in mind, although virtually all (if not all) of my OR statements aren't indexed (not index worthy; executes the OR after the returning list has already been shaved down - i'm assuming! Hopefully they're not hurting me more than I think!!).


    Reading in all ID's in a PHP array is not usually the answer because reading the values from the DB into a PHP array also takes time and CPU (sometimes a lot)

    I agree. But it's fast. It's a resort to take if you have a lot of RAM and it's not done too routinely and there's seemingly no other way out. If your SQL query's taking 14 seconds, that too is slowing things down (unless there's a better way besides both methods).


    For web servers then yes this is often true but on a lot of other servers it is not true and writes can be very large part of the load on the machine and on these server every extra write is bad.

    I actually may have miscommunicated. I wasn't talking about adding extra rows, but an extra column - set to an ID of a cooresponding table that extends it. When both tables are huge, a left-join was killing me. So I added that column to have the ID of the other one (both having IDs of each). But those IDs don't change -- they're both auto-incremented of each other. In essence, when that extended or child table adds a row, spawning from it's parent, the parent get's the child's ID too -- otherwise it's set at 0, default originally. So I can search WHERE child_id=0, and I get the ones w/o a child, instead of a left join as I was told before.

    I didn't mean it as a once-in-a-while write -- just an extra write after inserting a new child row -- and there can only be one child for the parent.


    Since if most of your records in the master table don't have a corresponding value in the child table it is a waste of space and speed to add data that doesn't even need to be there

    I hope I clarified what I was doing above... again, the master table has a corresponding value in the child table too. When the child table's created, it has the parent ID put in. THEN, the "extra write" is to put that new child's ID in the parent table. The once-in-a-while is due to the fact that a larger pct. of activity when dealing with these tables, pound-for-pound, isn't begetting a new child, it's dealing with an already established parent-child situation.

    Make sense? Would it make more sense to just have both in one table? The reason I split them was because the child table has heavier data in it and weighs more, and about 20% of the master table doesn't have a child.


    The rule for DBMS optimization is to always think about how you can as fast as possible throw away surplus data so you don't need to shuffle it around in any way

    If this helps -- in my example, it's something neat where the PHP script doesn't have to sort, scan, or organize anything for a match when comparing against the huge result set. Initially, I showed him that he'd make a quick "copy" of his old array by setting the 'key' as the value itself. As far as processing power is concerned, it doesn't take much, as it just takes in the ID from the result set and sees if the $array[$id] == $idFromDB; if so, it found a match without having to run any search whatsoever. But YES, it takes a lot of ram! Depending on what extra junk he's bringing in from the table, I may advise against it. If he's bringing in a lot of varchars, datetimes, etc., and not just mainly a few integer column values, then I'd say he better have a lot of RAM to pull in all that junk into memory (because a lot will be ignored and just sitting in memory). ) Of course, right after, he can delete the array carried in from the DB, so it'd only be temporary. And of course, depends how often that's run (routinely).

    My main goal is to learn more about MySQL and speed up my DB. It's not running slow; in fact, I'm shocked that it's running fast on a 512MB DualCore system w/ both Apache & Mysql. Not a million users -- but a business site system doing all kinds of data gathering and writes and it's highly conditional stuff -- which I need to know more about separating data into different tables, knowing more about how Mysql uses indexes, etc., that doesn't follow the work-flow data activity of say, a forum, which one can quick grab a book on and know the game-plan easily.


    • #17
      I don't think mysql optimizes the table at IN (at least in Mysql 5.1) I also get very slow queryes when using this approach..

      However I found a workaround.... Create a temporary table that holds your id's and then use a JOIN statement... this works pretty fast.