This is an easy problem to solve, but seems to be harder to do 'nicely', we cant seem to find a nice solution yet.
The Problem
Select a small number of random rows from those recently added to a table. (this is for a photo upload site, and want to showcase a selection of recently submitted photos)
Our Solution
We maintain a second 'recent' table, which holds just the id's for the most recent 250 images (maintained by a cronjob - quite well we think).
This is the select statement we come up with so far to pick some recent images:
(I havnt included the schema but it should be clear on the columns, gridimage_id is a primary key of both tables)
This of results in a full table scan (almost) of recent_gridimage but it should never contain more than 250 rows anyway.
Other attempts
1) A quick select count(*) and then limit 486643,5; but of course that is very slow too.
2)
select * from recent_gridimage inner join gridimage_search using (gridimage_id) order by rand() limit 5
3) Both of these do quite well (the second slightly better)
select * from gridimage where gridimage_id between 486643 and 486647 limit 5select * from gridimage where gridimage_id in (486643,486644,486645,486646,486647) limit 5
the problem beeing that not all ids are used, (about 1%), so could jsut select 6 to be safe, I don't think it critical that we would occaisaonly only get 4.
.... so any neat tricks to tackle this?
(the last one actully sounds quite good, and only thought of as writing this, but I'll post anyway incase it helps others!)
The Problem
Select a small number of random rows from those recently added to a table. (this is for a photo upload site, and want to showcase a selection of recently submitted photos)
Our Solution
We maintain a second 'recent' table, which holds just the id's for the most recent 250 images (maintained by a cronjob - quite well we think).
This is the select statement we come up with so far to pick some recent images:
(I havnt included the schema but it should be clear on the columns, gridimage_id is a primary key of both tables)
This of results in a full table scan (almost) of recent_gridimage but it should never contain more than 250 rows anyway.
Other attempts
1) A quick select count(*) and then limit 486643,5; but of course that is very slow too.
2)
select * from recent_gridimage inner join gridimage_search using (gridimage_id) order by rand() limit 5
3) Both of these do quite well (the second slightly better)
select * from gridimage where gridimage_id between 486643 and 486647 limit 5select * from gridimage where gridimage_id in (486643,486644,486645,486646,486647) limit 5
the problem beeing that not all ids are used, (about 1%), so could jsut select 6 to be safe, I don't think it critical that we would occaisaonly only get 4.
.... so any neat tricks to tackle this?
(the last one actully sounds quite good, and only thought of as writing this, but I'll post anyway incase it helps others!)
Comment