Monday, November 21, 2011

Postgresql - Explain Analyze

I was researching for a way of telling how long a query will take.
This command displays the execution plan that the PostgreSQL planner generates for the supplied statement. The execution plan shows how the table(s) referenced by the statement will be scanned — by plain sequential scan, index scan, etc. — and if multiple tables are referenced, what join algorithms will be used to bring together the required rows from each input table.

The most critical part of the display is the estimated statement execution cost, which is the planner's guess at how long it will take to run the statement (measured in units of disk page fetches). Actually two numbers are shown: the start-up time before the first row can be returned, and the total time to return all the rows. For most queries the total time is what matters, but in contexts such as a subquery in EXISTS, the planner will choose the smallest start-up time instead of the smallest total time (since the executor will stop after getting one row, anyway). Also, if you limit the number of rows to return with a LIMIT clause, the planner makes an appropriate interpolation between the endpoint costs to estimate which plan is really the cheapest.

The ANALYZE option causes the statement to be actually executed, not only planned. The total elapsed time expended within each plan node (in milliseconds) and total number of rows it actually returned are added to the display. This is useful for seeing whether the planner's estimates are close to reality.

One thing worth having in mind is that the analyzing imposes overhead over the sql statement:

In order to measure the run-time cost of each node in the execution plan, the current implementation of EXPLAIN ANALYZE can add considerable profiling overhead to query execution. As a result, running EXPLAIN ANALYZE on a query can sometimes take significantly longer than executing the query normally. The amount of overhead depends on the nature of the query.

Source:
http://www.postgresql.org/docs/9.0/static/sql-explain.html

Useful resources for further reading:
http://wiki.postgresql.org/wiki/Using_EXPLAIN

http://stackoverflow.com/questions/3160426/statistics-on-query-time-postgresql

Explain Analyze relies on planner statistics, so it's important to have up to date statistics to have the planner making good decisions about how to execute a sql command.
http://wiki.postgresql.org/wiki/Planner_Statistics

No comments:

Post a Comment