Feature Idea: Finding columns which query needs to access

In query examinations it is often interesting which columns query needs to access to provide result set as it gives you ideas if you can use covering indexes to speed things up or even cache some data by denormalizing tables. So far it has to be done manually – look at SELECT clause, WHERE clause, ORDER BY GROUP BY and HAVING to sum it up, not to mention more complex questions of subselects.

It would be great tool for MySQL Performance Optimization if MySQL Server could show this table and I know it already has this data as it is used by the optimizer.

It would be also good if such tool would show mapping of WHERE clauses to the tables as well as other clauses such as GROUP BY and ORDER BY. For complex queries especially if they do not refer to the columns with table prefix it may take a while to figure our where these columns come from and so how they can be optimized.

16 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Xaprb

19 years ago

That’s a great idea! I vote yes.

Sergey Petrunia

19 years ago

Peter, re the mapping of WHERE/GROUP/ORDER BY clauses to the tables, doesn’t EXPLAIN EXTENDED do that when it produces a warning with rewritten query text? Yes, it’s not readable but it is there. So what is needed is to just write a pretty-printing tool?

Sergey Petrunia

19 years ago

Re the used columns, where do you think this info could be shown? One more column in the output of “EXPLAIN WITH USED_COLUMN_LISTS SELECT …”, with list of used columns? That’s the easiest way, but I’m afraid EXPLAIN output will become human-unreadable. Another option I can think of is to make EXPLAIN WITH USED_COLUMN_LISTS produce a temporary table with (table_name, field_name, is_used) columns. Is that better?

Peter Zaitsev

Author

19 years ago

Thank you Sergey,

Yes I guess it can be used together with some parser to provide some readable output. Being readable is important here as you already can manually dig such info it just takes a lot of effort.

It would be also nice to be able to see which where clauses are resolved doing index lookups and which are post-filtering.

Roland Bouman

19 years ago

I would very much like a pseudo table in the information_schema like this:

information_schema.parse_tree(
query_text text — query text for which the parse tree is given
, id int unsigned — id for this node
, parent_id int unsigned — id of the parent node (implements adjacency list)
, position int unsigned — position of this node within the sibling nodes
, lft int unsigned — nested set implementation
, rgt int unsigned — nested set implementation
, node_type enum(‘identifier’,’constant’,…) — reference to syntax production rule
, node_text text — substring of query_Text that is parsed
)

THe usage would be like this:

select *
from information_schema.parse_tree
where query_text = ‘select col1,col2 from aTable where col3 = ”bla”’
;

What would happen is that the parse_tree table would parse the query_text expression, and generate rows for all the nodes in the parse tree.

The nice thing about this table is that you could use for many more things, such as
-discover dependencies between views and tables
-discover dependencies between procedures and other procedures

and…and..

Anyway, what do you think?

Roland Bouman

19 years ago

Extra note:
People I have told about this idea usually first say: “Nah this ugly, does not make sense, why pass the statement as a string, you should have a SHOW command rather than this, etc.”

Well, first: it makes as much sense as EXPLAIN. There too, you feed an expression, and you get a resultset back.
second: you cannot join against a SHOW command. Joining is good, you can use it then to parse the code from existing VIEW and PROCEDURE objects via the corresponding tables in he information schema.
third, it makes sense to pass it as a string. That way you can join to VIEW_DEFINITION and ROUTINE_DEFINITION etc columns in the information_schema VIEWS and ROUTINES columns

More purposes: refactoring code (renaming identifiers etc) but then the parse_tree table would also need the exact string position info.

Peter Zaitsev

Author

19 years ago

Roland,

I think it would be good, however I’m not sure passing query text is good idea. Might be EXPLAIN can simply feel these tables ready to use or something similar.

The problem with such solution is – it will be unlikely implemented any time soon as it not that trivial.

The other problem it may be a bit complicated to use – some people hate SHOW statements but they are much easier to use compared to information schema 🙂

Peter Zaitsev

Author

19 years ago

Sergey,

You can run EXPLAIN … \G in which case number of columns is not the problem any more.

Adding two columns will be great, first listing columns which are used from the table and second listing where/on clauses applying to this table and how they are resolved, something like

SELECT count(*) FROM A,B WHERE A.ID=B.ID and A.C>5 and B.D>6;

A:
Used_Columns: ID,C
Checks: C>5(range)

B:
Used_Columns: ID,D
Checks: ID=A.ID(ref), D>6(row)

This would allow to very easy to see which of the checks are not using index which are great candidates for optimization.

Apachez

19 years ago

Does it have to be an extension to EXPLAIN?

A standalone script (say written in perl or so) would be for me more interresting. This way you could give the script various queries and it will spit out suggestions for indexes needed. Of course in the longer run it would be interresting to have it also available like EXPLAIN EXTENDED or something…

Peter Zaitsev

Author

19 years ago

To write a script you need infrastructure on MySQL Server side still.

You can’t really write full query parser or reliably check which clauses are resolved which way.

I do not really care if it is part of explain or available some other way.

Philip Stoev

19 years ago

Hello, the MySQL parser has been yanked out and is now available as a separate Perl CPAN module, DBIx::MyParse. The current downloadable version has considerable limitations, however I am working on a new one that should parse any and all SELECTS that MySQL accepts. Once I get there, I will try to provide code that dumps the list of columns.

Peter Zaitsev

Author

19 years ago

Thanks Philip,

This would be cool. You can’t get parser to be equivalent to MySQL parser in all cases as there are different versions and even different MySQL Server settings may change things. But it should be good enough for most cases.

Andrisi

19 years ago

Good idea. Does not have to be human-readable. Would be useful in many client applications, that do complex user-defined queries, or ones doing editing with auto-build forms.

Jason

16 years ago

So yeah this is really old but I stumbled upon it looking for the same thing and ended up building myself a php script to tease the info out of mysql. Not sure if this will work in every scenario but if the column isn’t a possible key you probably don’t need to index it anyway. So here it is, pretty self explanatory but I added comments in anyway. This will only work on a single table, so no joins, and it’s not prettied up but you can modify it for your needs.

Tada! Works for me anyway.

Jason

16 years ago

wtf? Didn’t go through. Code tags here? Trying again.

Tada! Works for me anyway.

Jason

16 years ago

Hah! Just getting ugly now. Last try:

$con = mysql_connect(‘ip’, ‘user’, ‘pass’);
mysql_select_db(“db”, $con);

// Fill in your table name and the query you want to check.
$table = “yourtable”;
$checkquery= “yourquery”;

// We should create a tiny database since it’s much faster and we don’t really need the data.
$sql=”create table test like $table”;
$result = mysql_query($sql);
$sql=”insert into test select * from $table limit 1″;
$result = mysql_query($sql);

// Here we get all of the columns from the table and create an index on each one.
$sql=”show columns from test”;
$result = mysql_query($sql);
while($row = mysql_fetch_array($result))
{
$field=$row[‘Field’];
$sql2=”create index $field on test($field);”;
$result2 = mysql_query($sql2);
}

// Now we run explain to see what columns (indexes) MySQL could possibly use. Here’s the trick that should get you a column list from the query.
$sql=”explain $checkquery”;
$result= mysql_query($sql);
while($row = mysql_fetch_array($result))
{
echo “Columns: ” . $row[‘possible_keys’] . ” are used for this query”;
}

// Clean up the mess.
$sql=”drop table test;”;
$result = mysql_query($sql);

MySQL 5.7
Support

Compare Percona to Leading Database Solutions

Software
Downloads

Valkey Contribution

Product Documentation

Resource Hub

Why Percona for MongoDB?

Why Percona for PostgreSQL?

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Feature Idea: Finding columns which query needs to access

Related Blog Articles

RECOMMENDED ARTICLES

Building a Multi-Cloud Strategy: Cut Costs, Improve Resilience, and Avoid Lock-In

Kubernetes Operators Compared: The Key to Scalable, Cost-Efficient Databases

Kubernetes Multi-Cloud Architecture: Building Portable Databases Without Lock-In

MOST POPULAR ARTICLES

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL Performance Tuning: Maximizing Database Efficiency and Speed

The Ultimate Guide to Open Source Databases

MySQL 5.7 Support

Compare Percona to Leading Database Solutions

Software Downloads

Valkey Contribution

Product Documentation

Resource Hub

Why Percona for MongoDB?

Why Percona for PostgreSQL?

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Feature Idea: Finding columns which query needs to access

About the Author

Share This Post!

Stay up to date with the Percona Blog

Related Blog Articles

RECOMMENDED ARTICLES

Building a Multi-Cloud Strategy: Cut Costs, Improve Resilience, and Avoid Lock-In

Kubernetes Operators Compared: The Key to Scalable, Cost-Efficient Databases

Kubernetes Multi-Cloud Architecture: Building Portable Databases Without Lock-In

MOST POPULAR ARTICLES

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL Performance Tuning: Maximizing Database Efficiency and Speed

The Ultimate Guide to Open Source Databases

MySQL 5.7
Support

Software
Downloads