patternsqlMinor
Geographic Synonyms
Viewed 0 times
geographicsynonymsstackoverflow
Problem
Let's say I have a table representing principal country divisions (ex States):
I want users to be easily able to find New York via synonyms such as 'New York', 'NY', or 'New York State'.
So I have a synonyms table:
What is an efficient and easy way to query this and return ONE record for New York?
In particular, they should be able to find the result for the default name 'New York' OR any synonym:
I guess I would start with something like this:
Can I do this only using a view, or should I use a function?
create table principal_country_divisions (
id int primary key,
name text not null,
country_code char(2)
);
insert into principal_country_divisions values (1, 'New York', 'US');I want users to be easily able to find New York via synonyms such as 'New York', 'NY', or 'New York State'.
So I have a synonyms table:
create table synonyms (
syn text,
name text,
primary key (syn, name)
);
insert into synonyms values
('NY', 'New York'),
('New York State', 'New York');What is an efficient and easy way to query this and return ONE record for New York?
In particular, they should be able to find the result for the default name 'New York' OR any synonym:
select * from principal_country_divisions where name = 'NY';
result: {1, 'New York', 'US'}I guess I would start with something like this:
select
id,
name,
country_code
from principal_country_divisions a
where name = 'NY'
or exists (select 1 from synonyms where name = a.name and syn = 'NY')Can I do this only using a view, or should I use a function?
Solution
First of all, you have an integer primary key on
Be sure to add an index on
If you'd want to match patterns, not whole strings, the job would become more complex.
Next, how can you be sure to
return ONE record for New York?
Obviously,
As you commented, a
In case of multiple finds, you can chose what to pick by adding more
principal_country_divisions. Use it. More efficient than joining via name for multiple reasons (storage size, index size, faster integer arithmetic, no collations involved, fixed length).create table principal_country_divisions (
country_id int primary key
,name text not null
,country_code char(2)
);
create table synonyms (
country_id int REFERENCES principal_country_divisions (country_id)
,syn text
,primary key (syn, country_id)
);syn needs to the the first column of the index (pk), you had that right already. The accompanying index automatically covers equality tests on synonyms.syn.Be sure to add an index on
principal_country_divisions.name:CREATE INDEX foo ON principal_country_divisions (name);If you'd want to match patterns, not whole strings, the job would become more complex.
Next, how can you be sure to
return ONE record for New York?
Obviously,
name and syn can be the same. There is no unique constraint over both columns and there isn't even one on syn alone. Otherwise your EXISTS query is a good approach - usually fast. You'd just have to avoid multiple rows. The added benefit of EXISTS would be to eliminate duplicates from synonyms alone, but that's ruled out by the pk. This may be faster for the case:SELECT DISTINCT ON (1)
a.country_id, a.name, a.country_code
FROM principal_country_divisions a
LEFT JOIN synonyms s USING (country_id)
WHERE a.name = 'NY'
OR s.syn = 'NY'
-- ORDER BY 1, As you commented, a
LEFT JOIN is in order to preserve finds in name.In case of multiple finds, you can chose what to pick by adding more
ORDER BY expressions. Leading columns have to agree with DISTINCT ON, though. Details in this related answer on SO.Code Snippets
create table principal_country_divisions (
country_id int primary key
,name text not null
,country_code char(2)
);
create table synonyms (
country_id int REFERENCES principal_country_divisions (country_id)
,syn text
,primary key (syn, country_id)
);CREATE INDEX foo ON principal_country_divisions (name);SELECT DISTINCT ON (1)
a.country_id, a.name, a.country_code
FROM principal_country_divisions a
LEFT JOIN synonyms s USING (country_id)
WHERE a.name = 'NY'
OR s.syn = 'NY'
-- ORDER BY 1, <more expressions to pick from peers>Context
StackExchange Database Administrators Q#55142, answer score: 4
Revisions (0)
No revisions yet.