patternsqlMinor
Fastest way to get the length of the intersection of two text arrays in PostgreSQL
Viewed 0 times
postgresqlthearrayslengthtextwaytwogetfastestintersection
Problem
This is what I am currently using:
Is there a faster way?
CREATE FUNCTION array_intersect(a1 anyarray, a2 anyarray)
RETURNS anyarray AS
$
SELECT ARRAY(
SELECT unnest($1)
INTERSECT SELECT unnest($2)
ORDER BY 1
);
$ LANGUAGE sql
IMMUTABLE STRICT;
--get the length:
select array_length ( array_intersect(array[...], array[...]), 1);Is there a faster way?
Solution
-
You say you're looking for "Jaccard Similarity between two arrays"
You may consider MadLib which provides this as
Jaccard distance between two varchar vectors treated as sets.
-
You may want to try it without
-
You may want to use
-
As Craig Ringer said in the comments you may want out
To give a more accurate answer here,
You say you're looking for "Jaccard Similarity between two arrays"
You may consider MadLib which provides this as
dist_jaccard()Jaccard distance between two varchar vectors treated as sets.
-
You may want to try it without
STRICT. That may be slowing it down.-
You may want to use
INTERSECT ALL Especially if you know you don't have dupes in $1.-
As Craig Ringer said in the comments you may want out
intarray and consider using &.To give a more accurate answer here,
- What are the datatypes?
- How many elements are we talking about on each array?
- How much exclusion does the
INTERSECTprovide?
Context
StackExchange Database Administrators Q#58329, answer score: 2
Revisions (0)
No revisions yet.