Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

COUNT(*) for Innodb Tables

December 1, 2006

Author

Peter Zaitsev

Insight for Developers

Share this Post:

I guess note number one about MyISAM to Innodb migration is warning what Innodb is very slow in COUNT(*) queries. The part which I often however see omitted is fact it only applies to COUNT(*) queries without WHERE clause.

So if you have query like SELECT COUNT(*) FROM USER It will be much faster for MyISAM (MEMORY and some others) tables because they would simply read number of rows in the table from stored value. Innodb will however need to perform full table scan or full index scan because it does not have such counter, it also can’t be solved by simple singe counter for Innodb tables as different transactions may see different number of rows in the table.

If you have query like SELECT COUNT(*) FROM IMAGE WHERE USER_ID=5 this query will be executed same way both for MyISAM and Innodb tables by performing index rage scan. This can be faster or slower both for MyISAM and Innodb depending on various conditions.

In real applications there are much more queries of second type rather than first type so it is typically not as bad problem as it may look. Most typically count of rows is needed by admin tools which may show it in table statistics, it may also be used in application stats to show something like “We have 123.345 users which have uploaded 1.344.656 images” but these are normally easy to remove.

So remember Innodb is not slow for ALL COUNT(*) queries but only for very specific case of COUNT(*) query without WHERE clause. It does not mean I would not like to see it fixed though, it is pretty annoying.

0 0 votes

Article Rating

29 Comments

Oldest

Newest Most Voted

wesley

19 years ago

You say count(*), would count(id) result in a table scan (MyISAM) since it doesn’t know if id is NULL or not? I’m not even sure if it counts NULL or not, I guess i should check it out.

Author

Peter Zaitsev

19 years ago

It depends on now field is defined if it is NOT NULL it will read the counter. If it can be NULL it will perform full table scan

Martijn Tonies

19 years ago

Wesley,

COUNT ( column ) counts all non-NULL occurences (or should 😉 …

COUNT ( * ) counts all rows.

Kevin Burton

19 years ago

This just needs to be fixed. I forgot about this when I migrated a couple of table sot MyISAM. We were computing COUNT in order to graph DB stats. Bad idea….

The issue is that most people don’t need a 100% accurate count. They just need lazy data. INNODB should add the ability to get this even if it’s not 100% correct.

Author

Peter Zaitsev

19 years ago

Kevin,

I guess having counter such as MyISAM which would count “dirty” number of uncommitted rows could be possible but this would require some special option to enable as you can’t return wrong (approximate) result for query in default mode.

It is quite interesting where other transactional storage engines would stay on this.

Sean

19 years ago

I agree with Kevin – unless you’re doing some sort of life or death calculation, it never needs to be totally accurate. Retarded decision on MySQL’s part…

Author

Peter Zaitsev

19 years ago

Sean,

The question in this case is semantics. SELECT COUNT(*) must return accurate value by standards, There is no way it would be made to return wrong data by default as this may break some application. So we’re speaking about option which will need to be specified such as inaccurate_count=1 ? Still not everyone will be able to use it because single instance can be shared by multiple applications. Having Some form of SQL FLAG, such as SELECT APPROXIMATE COUNT(*) would be better but will require application changes and at the same time one could use info from SHOW TABLE STATUS.

John

19 years ago

I am seeing a query take up to 2 seconds (viewing the listing of members in my site) because at the top there’s a count. What steps can I take to speed this up? I’ve thought about doing a quick count total cache stored in a separate field, but that won’t work either, because depending on the search criteria, the count number can change at any time.

kiran

18 years ago

How to find the top values of table row?

John Swapceinski

18 years ago

If you want a quick and dirty count of the number of rows in a InnoDB table, this seems to work:

explain select count(*) from Table;

The result is fast and will give you a “rows” count that should be within 10% of the number of rows in the Table. So it seems the row count is being cached SOMEWHERE (I don’t know where). BTW, I am using MySQL v5.0.45.

Kishore

18 years ago

Dear Friends,
I am getting headache with MySql. I am new to .net and MYSQl.
At present lots of Issues occured in MySql …but the problem is slow..
“Whenever We are executing the Query it is taking lot’z of time to execute. Our tables contains nearly 3000 records…”
How can I optimize the Stored procedures..
(or)
How can I change this Sp into faster manner…
“Thanks In advance”

Ries van Twisk

18 years ago

to Kishore.

I you need to read upon SQL, usage of indexes etc. 3000 records is nothing for any DB and shouldn’t be slow in any case.
Most likely your DB is not properly normalized and you don’t have proper indexes.

Ries

Author

Peter Zaitsev

18 years ago

Ries, Kishore

Indeed 3000 records is very small unless these are 10MB faxes stored directly in the database 🙂 Wrong indexing or query which MySQL can’t optimize well is likely reason. If you report the EXPLAIN on our forums we might be able to help.

Fekke

17 years ago

3,000 records may be a problem. If you have three tables with 1,000 records each and you do a join involving the three tables with no indexes at all then you would have 1,000,000,000 records full scan to fetch the results.

So the optimization you acquire by creating the proper indexes not only relies on how many records your tables have but also what your queries are.

nop

17 years ago

ready for a headache? try “select count($field) from $table;” and compare the time it takes to “select count($field) from $table use index($field);”
($field is the primary key (int autoincrement) in $table which was converted from myisam and has around 900k rows in it).

for me, the standard count without use index takes around 8 seconds – the count with forced index takes below 0.1 seconds.

any thoughts on why this is so?

nop

17 years ago

correction: replace “8 seconds” with “5 minutes, 8 seconds”.

Wallace

17 years ago

Why the performance gap?

Running this slow query took 138 seconds (2 min 18.09 sec)
SELECT COUNT(*) FROM dbmail_messageblks;
+———-+
| COUNT(*) |
+———-+
| 262788 |
+———-+
1 row in set (2 min 18.09 sec)

After optimizing the SQL, it took 0.27 seconds.
SELECT COUNT(*) FROM dbmail_messageblks use index(physmessage_id_index);
+———-+
| COUNT(*) |
+———-+
| 262796 |
+———-+
1 row in set (0.27 sec)

> SHOW CREATE TABLE dbmail_messageblksG
*************************** 1. row ***************************
Table: dbmail_messageblks
Create Table: CREATE TABLE `dbmail_messageblks` (
`messageblk_idnr` bigint(21) NOT NULL auto_increment,
`physmessage_id` bigint(21) NOT NULL default ‘0’,
`messageblk` longblob NOT NULL,
`blocksize` bigint(21) NOT NULL default ‘0’,
`is_header` tinyint(1) NOT NULL default ‘0’,
PRIMARY KEY (`messageblk_idnr`),
KEY `physmessage_id_index` (`physmessage_id`),
KEY `physmessage_id_is_header_index` (`physmessage_id`,`is_header`),
CONSTRAINT `dbmail_messageblks_ibfk_1` FOREIGN KEY (`physmessage_id`) REFERENCES `dbmail_physmessage` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=602519 DEFAULT CHARSET=utf8

> EXPLAIN SELECT COUNT(*) FROM dbmail_messageblksG
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dbmail_messageblks
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 1930308
Extra: Using index

> EXPLAIN SELECT COUNT(*) FROM dbmail_messageblks use index(physmessage_id_index)G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dbmail_messageblks
type: index
possible_keys: NULL
key: physmessage_id_index
key_len: 8
ref: NULL
rows: 1930310
Extra: Using index

16 years ago

If your table has a primary key, you can also get the count from
show index from , the cardinality of the key will give you count of the rows in the table.

Author

Peter Zaitsev

16 years ago

SC,

Note these stats are approximate.

Peter

16 years ago

‘@john – pretty clever. I haven’t done any testing to see how close this number stays, or how slow it is to update, but it seems to be a great solution if you just need an approximation. Thanks!

Josh Q

14 years ago

I’ve tried the following metrics on one of our production MySQL servers and found some interesting results related to InnoDB count().

We’re using XtraDB(InnoDB by Percona) tables (version 5.5.14-rel20.5-log), this is a web-mail system that stores all messages in a table, so it is fairly good sized.

I did a simple count(*) on the messages table:

mysql> select count(*) from messages;
+———-+
| count(*) |
+———-+
| 2849204 |
+———-+
1 row in set (5.82 sec)

I did this a dozen times and it came back at around 6 seconds each time.

Then I counted the primary key:
mysql> select count(id) from messages;
+———–+
| count(id) |
+———–+
| 2849205 |
+———–+
1 row in set (6.05 sec)

Basically the same results, even after a dozen iterations. (this is a production system so the count will increase).

Then I counted a column that is a key from another table, not null :

mysql> desc messages;
+—————+————–+——+—–+———+—————-+
| Field | Type | Null | Key | Default | Extra |
+—————+————–+——+—–+———+—————-+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| id_acct | int(11) | NO | MUL | 0 | |

mysql> select count(id_acct) from awm_messages;
+—————-+
| count(id_acct) |
+—————-+
| 2849213 |
+—————-+
1 row in set (3.40 sec)

Each time, right around 3.5 seconds. I’m not sure if this is just more efficient for InnoDB/XtraDB, or if it’s due to the difference in field type (bigint vs int).

Dave

14 years ago

I seek a related clarification:

What about the folk developer wisdom that “select count(1) …” is better than “select count(*) …” – does that apply?
To Innodb? To MyISAM?
Or is it actually only dependent on indexes being set?
Or on the right index being invoked in the “select count () … ” statement?

Leonid

14 years ago

I am sorry if this question is rather stupid (or more).
If it’s all about having the WHERE, why not just use AND WHERE 1?

Andrea

14 years ago

Hi guys, this blog is very useful, thanks and keep up the good work.
My 2 cents: I had an autoincrement coloumn, so the last inserted value is (starting from 0) the exact number of rows. So making (on InnoDB table):

SELECT MAX(id) FROM my_table;

You obviously obtain the same result as COUNT(*) for that table.

The explain result in this case was :

id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE NULL NULL NULL NULL NULL NULL NULL NULL Select tables optimized away

So the best way as it was MyIsam table

MyXAMoP

14 years ago

you can get approximate number of rows as fast in Innodb as in myisam tables using “SHOW TABLE STATUS” command
it will varry from real number to small degree, but this is perfectly fine for statistics on a page:

mysql> show table status like ‘customers’
*************************** 1. row ***************************
Name: customers
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 11861
Avg_row_length: 222
Data_length: 2637824
Max_data_length: 0
Index_length: 5718016
Data_free: 5219811328
Auto_increment: 56416
Create_time: 2012-07-19 07:14:09
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment: customer table
1 row in set (0.00 sec)

mysql> select count(*) from customers
*************************** 1. row ***************************
count(*): 11649
1 row in set (0.03 sec)

Meena

13 years ago

Table row count is not accurate from information_Schema.tables & table status…
How can I get accurate row count of all tables in a database?

zhuguowei

10 years ago

Hi, I used mysql version : 5.6.21-1~dotdeb.1-log, and have a big table: op_log(ENGINE=InnoDB),
and I found count(*) has better performance than count(id), but I do not know why is so.
please see http://stackoverflow.com/questions/35653049/why-mysql-count-has-better-performance-than-countid?noredirect=1#comment58986427_35653049

Sidharth Singla

9 years ago

Why count value is not stored as counter for Innodb as is done in MyISAM ?

SomeName

6 years ago

14 years later and COUNT(*) queries still take hours. This seems like such a simple optimization. Store the row count, have a per-transaction dirty counter.

These queries take so long there must be a bug. In my case I have a secondary index on a SMALLINT column and EXPLAIN shows that index is being used for the COUNT(*) query. The index is roughly 5GB in size. How the hell does it take hours to read through 5GB of data? The entire table is only 60GB. In the time the COUNT(*) query takes to complete I could have read all 60GB of data byte-by-byte many times over.

It just makes me do a face palm at MySQL every time I have to do one of these queries.