patternsqlMinor
Full text search on two tsvector columns
Viewed 0 times
fullsearchcolumnstexttwotsvector
Problem
I'm trying to do a full text search across two columns of two separate tables in PostgreSQL 9.4.2. The tables and columns are:
Where there is a many-to-many relationship between the two. I think I've got most of the way following this blog post. However, I am struggling to formulate the query across the two columns. This query is returning results:
But I think the query suffers many of the same deficiencies from a previous question I asked here.
I am also trying to return the following columns in the query:
I am pretty new to full text search - and SQL in general.
Let me know if you need any more information.
- article.article_title
- keyword.keyword
Where there is a many-to-many relationship between the two. I think I've got most of the way following this blog post. However, I am struggling to formulate the query across the two columns. This query is returning results:
SELECT distinct title, pub_date, web_id, city_points, city_name FROM (
SELECT article.article_title as title,
article.pub_date as pub_date,
article.web_id as web_id,
string_agg(st_asText(city.geom), ', ') as city_points,
string_agg(city.city, ', ') as city_name,
article.title_tsv ||
keyword.keyword_tsv
as document
from article, article_keywords, keyword, article_cities, city
where article.id = article_keywords.article_id
and keyword.id = article_keywords.keyword_id
and article.id = article_cities.article_id
and city.id = article_cities.city_id
GROUP BY article.id, keyword.id) p_search
WHERE p_search.document @@ to_tsquery('putin');But I think the query suffers many of the same deficiencies from a previous question I asked here.
I am also trying to return the following columns in the query:
- article.pub_date
- article.web_id
- city.city_points
- city.city_name
I am pretty new to full text search - and SQL in general.
Let me know if you need any more information.
Solution
In parts similar to my last answer:
What's new here: we are only interested in the existence of a matching keyword, we are not even displaying them. (You may want to rethink that: shouldn't it become obvious from the result why the row was found?)
So I am using an
Once again we aggregate cities per
And once again use
Be aware that a list of comma-separated tables combined with
SELECT a.id, a.article_title AS title, a.pub_date, a.web_id
c.city_points, c.city_names
FROM article a
LEFT JOIN (
SELECT ac.article_id AS id
, string_agg(st_asText(c.geom), ', ') AS city_points
, string_agg(c.city, ', ') AS city_names
FROM article_cities ac
JOIN city c ON c.id = ac.city_id
GROUP BY 1
) c USING (id)
WHERE a.title_tsv @@ to_tsquery('putin')
OR EXISTS (
SELECT 1
FROM article_keywords ak
JOIN keyword k ON k.id = ak.keyword_id
WHERE ak.article_id = a.id
AND k.keyword_tsv @@ to_tsquery('putin')
);What's new here: we are only interested in the existence of a matching keyword, we are not even displaying them. (You may want to rethink that: shouldn't it become obvious from the result why the row was found?)
So I am using an
EXISTS semi-join as alternative WHERE clause. Related:- Count where previous record exists in either of two tables
Once again we aggregate cities per
article_id before joining to article, so we need no aggregate or DISTINCT in the outer SELECT since we do not multiply rows in the join.And once again use
LEFT [OUTER] JOIN to include articles that are not linked to any cities at all.Be aware that a list of comma-separated tables combined with
WHERE conditions to join them is effectively the same as [INNER] JOIN on a join condition. Basic in the manual here.Code Snippets
SELECT a.id, a.article_title AS title, a.pub_date, a.web_id
c.city_points, c.city_names
FROM article a
LEFT JOIN (
SELECT ac.article_id AS id
, string_agg(st_asText(c.geom), ', ') AS city_points
, string_agg(c.city, ', ') AS city_names
FROM article_cities ac
JOIN city c ON c.id = ac.city_id
GROUP BY 1
) c USING (id)
WHERE a.title_tsv @@ to_tsquery('putin')
OR EXISTS (
SELECT 1
FROM article_keywords ak
JOIN keyword k ON k.id = ak.keyword_id
WHERE ak.article_id = a.id
AND k.keyword_tsv @@ to_tsquery('putin')
);Context
StackExchange Database Administrators Q#108005, answer score: 2
Revisions (0)
No revisions yet.