Discussion:
what is the cause that scan type is showing as 'seq scan' after indexing
(too old to reply)
Joseph Lemm
2004-01-04 05:30:20 UTC
Permalink
Hi All,
Before indexing query plan was showing cost as 40.00, after indexing query
plan again showing as 'seq scan' and cost as 3060.55.
The field which i indexed is primary key to this table.
May i know
1) what is the cause that scan type is showing as 'seq scan' after indexing
also
2) why it is showing cost as high value compare to previous.
You trimmed out the other parts of the EXPLAIN, so I'm just guessing,
but that cost seems suspiciously round: I'm guessing that you haven't
run VACUUM ANALYZE at all. One thing indexing does is update the 'number
of tuples' statistic. See the archives for why sequential scans still
show up (short answer: index scans aren't free, so at some point, it's
cheaper to scan the entire table than to scan both the index and the
subset of the table returned)
OK, so then what is the explanation for this:

Table "public.post"
Column | Type | Modifiers
--------+-----------------------------+-----------
id | integer |
author | character varying(80) |
text | text |
hidden | boolean |
date | timestamp without time zone |
host | character varying(80) |
Indexes: idx_post_id unique btree (id),
post_author_index btree (author)


VACUUM ANALYZE;
VACUUM

EXPLAIN ANALYZE select host from post where author='George';
QUERY PLAN

--------------------------------------------------------------------------------------------------------
Seq Scan on post (cost=0.00..2869.30 rows=1768 width=27) (actual
time=0.23..520.65 rows=1774 loops=1)
Filter: (author = 'George'::character varying)
Total runtime: 525.77 msec
(3 rows)

So the optimizer decided it's less costly to do a sequential scan here than use
the index, right?


Now:

SET ENABLE_SEQSCAN=OFF;

EXPLAIN ANALYZE select host from post where author='George';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Index Scan using post_author_index on post (cost=0.00..5253.63 rows=1768
width=27) (actual time=28.92..210.25 rows=1774 loops=1)
Index Cond: (author = 'George'::character varying)
Total runtime: 215.00 msec
(3 rows)


So if I force an index scan, I get much better performance (215 vs 525 msec).
Does this mean that the optimizer screwed up when it recommended a sequential
scan?

Thanks.


=====
J.

__________________________________
Do you Yahoo!?
Find out what made the Top Yahoo! Searches of 2003
http://search.yahoo.com/top2003

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org
Gaetano Mendola
2004-01-04 12:02:17 UTC
Permalink
Post by Joseph Lemm
Hi All,
Before indexing query plan was showing cost as 40.00, after indexing query
plan again showing as 'seq scan' and cost as 3060.55.
The field which i indexed is primary key to this table.
May i know
1) what is the cause that scan type is showing as 'seq scan' after indexing
also
2) why it is showing cost as high value compare to previous.
You trimmed out the other parts of the EXPLAIN, so I'm just guessing,
but that cost seems suspiciously round: I'm guessing that you haven't
run VACUUM ANALYZE at all. One thing indexing does is update the 'number
of tuples' statistic. See the archives for why sequential scans still
show up (short answer: index scans aren't free, so at some point, it's
cheaper to scan the entire table than to scan both the index and the
subset of the table returned)
Table "public.post"
Column | Type | Modifiers
--------+-----------------------------+-----------
id | integer |
author | character varying(80) |
text | text |
hidden | boolean |
date | timestamp without time zone |
host | character varying(80) |
Indexes: idx_post_id unique btree (id),
post_author_index btree (author)
VACUUM ANALYZE;
VACUUM
EXPLAIN ANALYZE select host from post where author='George';
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Seq Scan on post (cost=0.00..2869.30 rows=1768 width=27) (actual
time=0.23..520.65 rows=1774 loops=1)
Filter: (author = 'George'::character varying)
Total runtime: 525.77 msec
(3 rows)
So the optimizer decided it's less costly to do a sequential scan here than use
the index, right?
SET ENABLE_SEQSCAN=OFF;
EXPLAIN ANALYZE select host from post where author='George';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Index Scan using post_author_index on post (cost=0.00..5253.63 rows=1768
width=27) (actual time=28.92..210.25 rows=1774 loops=1)
Index Cond: (author = 'George'::character varying)
Total runtime: 215.00 msec
(3 rows)
So if I force an index scan, I get much better performance (215 vs 525 msec).
Does this mean that the optimizer screwed up when it recommended a sequential
scan?
No this mean that you are instructing your optimizer in a wrong way.


Show us your configuration file and in particular these parameters:

effective_cache_size
random_page_cost
cpu_tuple_cost
cpu_index_tuple_cost
cpu_operator_cost

I use these value, that are good enough for a medium HW:

effective_cache_size = 20000
random_page_cost = 2.5
cpu_tuple_cost = 0.005
cpu_index_tuple_cost = 0.0005
cpu_operator_cost = 0.0025


Regards
Gaetano Mendola


















---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
Joseph Lemm
2004-01-05 13:42:32 UTC
Permalink
Gaetano, thanks.

My db has only one table (about 29,000 records), so I thought leaving
postgreqsql.conf at its defaults would be OK: the params you mention are
commented out, so they must be at their defaults, tho I can't tell what the
defaults are.
Are there any docs that talk specificially about how to set these params and
what the defaults are (the official docs don't say much)?

Thanks.
Post by Joseph Lemm
Post by Joseph Lemm
Hi All,
Before indexing query plan was showing cost as 40.00, after indexing query
plan again showing as 'seq scan' and cost as 3060.55.
The field which i indexed is primary key to this table.
May i know
1) what is the cause that scan type is showing as 'seq scan' after indexing
also
2) why it is showing cost as high value compare to previous.
You trimmed out the other parts of the EXPLAIN, so I'm just guessing,
but that cost seems suspiciously round: I'm guessing that you haven't
run VACUUM ANALYZE at all. One thing indexing does is update the 'number
of tuples' statistic. See the archives for why sequential scans still
show up (short answer: index scans aren't free, so at some point, it's
cheaper to scan the entire table than to scan both the index and the
subset of the table returned)
Table "public.post"
Column | Type | Modifiers
--------+-----------------------------+-----------
id | integer |
author | character varying(80) |
text | text |
hidden | boolean |
date | timestamp without time zone |
host | character varying(80) |
Indexes: idx_post_id unique btree (id),
post_author_index btree (author)
VACUUM ANALYZE;
VACUUM
EXPLAIN ANALYZE select host from post where author='George';
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Post by Joseph Lemm
Post by Joseph Lemm
Seq Scan on post (cost=0.00..2869.30 rows=1768 width=27) (actual
time=0.23..520.65 rows=1774 loops=1)
Filter: (author = 'George'::character varying)
Total runtime: 525.77 msec
(3 rows)
So the optimizer decided it's less costly to do a sequential scan here than
use
Post by Joseph Lemm
the index, right?
SET ENABLE_SEQSCAN=OFF;
EXPLAIN ANALYZE select host from post where author='George';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Post by Joseph Lemm
Post by Joseph Lemm
Index Scan using post_author_index on post (cost=0.00..5253.63 rows=1768
width=27) (actual time=28.92..210.25 rows=1774 loops=1)
Index Cond: (author = 'George'::character varying)
Total runtime: 215.00 msec
(3 rows)
So if I force an index scan, I get much better performance (215 vs 525
msec).
Post by Joseph Lemm
Does this mean that the optimizer screwed up when it recommended a
sequential
Post by Joseph Lemm
scan?
No this mean that you are instructing your optimizer in a wrong way.
effective_cache_size
random_page_cost
cpu_tuple_cost
cpu_index_tuple_cost
cpu_operator_cost
effective_cache_size = 20000
random_page_cost = 2.5
cpu_tuple_cost = 0.005
cpu_index_tuple_cost = 0.0005
cpu_operator_cost = 0.0025
Regards
Gaetano Mendola
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
=====
J.

__________________________________
Do you Yahoo!?
Find out what made the Top Yahoo! Searches of 2003
http://search.yahoo.com/top2003

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
Gaetano Mendola
2004-01-05 14:26:04 UTC
Permalink
Post by Joseph Lemm
Gaetano, thanks.
My db has only one table (about 29,000 records), so I thought leaving
postgreqsql.conf at its defaults would be OK: the params you mention are
commented out, so they must be at their defaults, tho I can't tell what the
defaults are.
The values on the line commented are the default values.
Post by Joseph Lemm
Are there any docs that talk specificially about how to set these params and
what the defaults are (the official docs don't say much)?
Take a look at performance NG, at least is what I use to do.



Regards
Gaetano Mendola









---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org
Glenn Wiorek
2004-01-05 14:35:19 UTC
Permalink
Also take a look at Annotated postgresql.conf and Global User Configuration
(GUC) Guide

http://www.varlena.com/GeneralBits/Tidbits/annotated_conf_e.html

----- Original Message -----
From: "Gaetano Mendola" <***@bigfoot.com>
To: <pgsql-***@postgresql.org>
Cc: "Joseph Lemm" <***@yahoo.com>
Sent: Monday, January 05, 2004 8:26 AM
Subject: Re: [ADMIN] what is the cause that scan type is showing as 'seq
scan' after
Post by Gaetano Mendola
Post by Joseph Lemm
Gaetano, thanks.
My db has only one table (about 29,000 records), so I thought leaving
postgreqsql.conf at its defaults would be OK: the params you mention are
commented out, so they must be at their defaults, tho I can't tell what the
defaults are.
The values on the line commented are the default values.
Post by Joseph Lemm
Are there any docs that talk specificially about how to set these params and
what the defaults are (the official docs don't say much)?
Take a look at performance NG, at least is what I use to do.
Regards
Gaetano Mendola
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Loading...