gotchasqlMinor
What is the difference between select from function and select function
Viewed 0 times
thewhatfunctiondifferencebetweenandselectfrom
Problem
I am doing a test with generate_series function on Postgres v.9.6 on Ubuntu 16.04. So, I get a curiosity about loops (loops=100000) when calling that function as below.
However, If I change the query. The loop and result will change.
Here's script to create table
EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT a FROM generate_series(1,10));
Aggregate (cost=627262.00..627262.01 rows=1 width=8) (actual time=60.534..60.534 rows=1 loops=1)
-> Seq Scan on t5 (cost=0.00..627137.00 rows=50000 width=0) (actual time=0.048..50.267 rows=100000 loops=1)
Filter: (SubPlan 1)
SubPlan 1
-> Function Scan on generate_series (cost=0.00..10.00 rows=1000 width=4) (actual time=0.000..0.000 rows=1 loops=100000)
Planning time: 0.161 ms
Execution time: 60.599 msHowever, If I change the query. The loop and result will change.
EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT generate_series(1,10) a);
Aggregate (cost=1488.51..1488.52 rows=1 width=8) (actual time=0.050..0.050 rows=1 loops=1)
-> Nested Loop (cost=17.80..1363.51 rows=50000 width=0) (actual time=0.025..0.046 rows=10 loops=1)
-> HashAggregate (cost=17.51..19.51 rows=200 width=4) (actual time=0.012..0.014 rows=10 loops=1)
Group Key: generate_series(1, 10)
-> Result (cost=0.00..5.01 rows=1000 width=4) (actual time=0.004..0.007 rows=10 loops=1)
-> Index Only Scan using t5_pkey on t5 (cost=0.29..6.71 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=10)
Index Cond: (id = (generate_series(1, 10)))
Heap Fetches: 10
Planning time: 0.124 ms
Execution time: 0.086 msHere's script to create table
CREATE TABLE t5 ( id serial primary key, a int, b text);
INSERT INTO t5 (a,b) SELECT a, 'test_' || a FROM generate_series(1, 100000) a;Solution
It's not the
And
will return the same plan. As will this,
However, this won't.
Notice: you forget to alias the generate_series to
FROM that's causing the different query plans. It's the bad aliasing.EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT generate_series(1,10));And
EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT generate_series FROM generate_series(1,10));will return the same plan. As will this,
EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT x FROM generate_series(1,10) AS gs(x));However, this won't.
EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT a FROM generate_series(1,10));Notice: you forget to alias the generate_series to
a Now you're essentially running a correlated subquery. It's not just more work, but it's returning the entire table and not just 10 rows.Code Snippets
EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT generate_series(1,10));EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT generate_series FROM generate_series(1,10));EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT x FROM generate_series(1,10) AS gs(x));EXPLAIN ANALYZE
SELECT count(*)
FROM t5
WHERE id in (SELECT a FROM generate_series(1,10));Context
StackExchange Database Administrators Q#176519, answer score: 3
Revisions (0)
No revisions yet.