HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Postgresql constraints SARG

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlconstraintssarg

Problem

Are SQL Server style SARGs relevant to PostgreSQL queries in general and more specifically on table constraints?

I have a parent table with many inherited children, partitioned by a date column.
Would it be better to change the constraints from

CHECK (date_part('year', end_of_dt)::integer = 2016)


to

CHECK (end_of_dt between '2016-01-01 0:00:00' and '2016-12-31 23:59:59.999')

Solution

We can easily test this.. but first, stop using date_part. Instead, use extract. It's the same thing with slightly different syntax, but it's standardized.

Testing it

Let's create a table with all the second TIMESTAMPS for the year.

CREATE TEMPORARY TABLE foo AS (
  SELECT to_timestamp(x) AS ts
  FROM generate_series(
    extract(EPOCH FROM '2016-01-01'::timestamp)::int,
    extract(EPOCH FROM '2017-01-01'::timestamp)::int
  ) AS t(x)
);


Then we can implement your check in a WHERE condition to get a basic idea of speed. BTW, there are a ton more seconds in a year than I first though. There are 31,622,401 seconds in a year. It takes 1093 MB to load this sample data.

took, 12.6, 10.8, 11.0 seconds.

SELECT count(*) FROM foo
WHERE extract(YEAR FROM ts) = 2016;


took, 6.1, 6.3, 6.2 seconds

SELECT * FROM foo
WHERE ts BETWEEN '2016-01-01 0:00:00' AND '2016-12-31 23:59:59.999';


Conclusion

Yes, the second is roughly 2x as fast in three trial runs. For the purposes of creating a check constraint, you should use the latter.

Addendum

PostgreSQL has the ability to add functional indexes, however an index on date_part() is not possible

CREATE INDEX onyear ON foo (date_part('year',ts));
ERROR:  functions in index expression must be marked IMMUTABLE


I assume this is because the result of datetime functions comes back in the client's timezone. Either way, this makes it even more advantageous to use the timestamp ranges. One easy to get more speed if need be would be with materialized views though..

CREATE MATERIALIZED VIEW matviewfoo AS
SELECT
  extract(YEAR FROM ts)::smallint AS year,
  ts
FROM foo;
CREATE INDEX yearidx ON matviewfoo (year);
VACUUM FULL matviewfoo;


Now you can

SELECT count(*) FROM matviewfoo WHERE year = 2016;


And, you're down to 4.1 seconds

Code Snippets

CREATE TEMPORARY TABLE foo AS (
  SELECT to_timestamp(x) AS ts
  FROM generate_series(
    extract(EPOCH FROM '2016-01-01'::timestamp)::int,
    extract(EPOCH FROM '2017-01-01'::timestamp)::int
  ) AS t(x)
);
SELECT count(*) FROM foo
WHERE extract(YEAR FROM ts) = 2016;
SELECT * FROM foo
WHERE ts BETWEEN '2016-01-01 0:00:00' AND '2016-12-31 23:59:59.999';
CREATE INDEX onyear ON foo (date_part('year',ts));
ERROR:  functions in index expression must be marked IMMUTABLE
CREATE MATERIALIZED VIEW matviewfoo AS
SELECT
  extract(YEAR FROM ts)::smallint AS year,
  ts
FROM foo;
CREATE INDEX yearidx ON matviewfoo (year);
VACUUM FULL matviewfoo;

Context

StackExchange Database Administrators Q#160424, answer score: 3

Revisions (0)

No revisions yet.