patternsqlMinor
Equality query on NVARCHAR column yields multiple results in SQL Server 2012
Viewed 0 times
equalitynvarchar2012columnsqlqueryyieldsmultipleserverresults
Problem
I am in the process of moving a pet project from PostgreSQL (9.2.2) to SQL Server (2012 Standard).
I've noticed an interesting phenomenon when querying unicode words. Given the definition:
and the data:
a query for a particular word will return near matches. For example:
returns:
http://sqlfiddle.com/#!6/1ab66/1
However, the same pattern in PostgreSQL only returns the exact match. How can I get SQL Server to do the same?
(PostgreSQL fiddle link): http://sqlfiddle.com/#!12/c57a6/1
I have the distinct feeling that I'm missing something, but I'm not quite able to figure out what it is.
The database collation is
I've noticed an interesting phenomenon when querying unicode words. Given the definition:
CREATE TABLE [word](
[id] [int] IDENTITY(0,1) NOT NULL,
[value] [nvarchar](255) NULL
);and the data:
insert into word (value) values (N'ῥύπῳ');
insert into word (value) values (N'ἀπὸ');
insert into word (value) values (N'ἀπό');
insert into word (value) values (N'ἐπὶ');
insert into word (value) values (N'ἐπί');
insert into word (value) values (N'ὑπὸ');
insert into word (value) values (N'ὑπό');
insert into word (value) values (N'πίῃ');
insert into word (value) values (N'λόγους');
insert into word (value) values (N'λόγχῃ');
insert into word (value) values (N'λόγων');
insert into word (value) values (N'ἀλόης');a query for a particular word will return near matches. For example:
select * from word where value = N'ἀπὸ'returns:
id value
102137 ῥύπῳ
102141 ἀπὸ
102142 ἀπό
102143 ἐπὶ
102144 ἐπί
102145 ὑπὸ
102146 ὑπό
102147 πίῃhttp://sqlfiddle.com/#!6/1ab66/1
However, the same pattern in PostgreSQL only returns the exact match. How can I get SQL Server to do the same?
(PostgreSQL fiddle link): http://sqlfiddle.com/#!12/c57a6/1
I have the distinct feeling that I'm missing something, but I'm not quite able to figure out what it is.
The database collation is
SQL_Latin1_General_CP1_CI_AS (which is also the server's collation) on a local installation.Solution
The collation determines the comparison semantics.
If I try
It only returns
Changing the suffix to
On my install I have tried every collation and
From a quick look through this last group looks to include all the
Script to try this yourself
If I try
CREATE TABLE [word](
[id] [int] IDENTITY(0,1) NOT NULL,
[value] [nvarchar](255) COLLATE Latin1_General_100_CI_AS NULL
);It only returns
ἀπὸ. Changing the suffix to
AI for accent insensitive returns ἀπό also.On my install I have tried every collation and
1526 return 1 (presumably AS and BIN collations), 1264 return 2 rows (presumably AI) and 1095 return 8. From a quick look through this last group looks to include all the
SQL collations and 90 collations whereas all the 100 ones are in the first 2 groups so I presume this is some issue that has been fixed in the 2008 batch of collations. (See What's New in SQL Server 2008 Collations)Script to try this yourself
DECLARE @Results TABLE
(
Count INT,
Collation SYSNAME
)
SET NOCOUNT ON;
DECLARE @N SYSNAME;
DECLARE @C1 AS CURSOR;
SET @C1 = CURSOR FAST_FORWARD FOR
SELECT name
FROM sys.fn_helpcollations();
OPEN @C1;
FETCH NEXT FROM @C1 INTO @N ;
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT @Results
EXEC('SELECT COUNT(*), ''' + @N + ''' from word where value = N''ἀπὸ'' COLLATE ' + @N)
FETCH NEXT FROM @C1 INTO @N ;
END
SELECT *
FROM @Results
ORDER BY Count DESCCode Snippets
CREATE TABLE [word](
[id] [int] IDENTITY(0,1) NOT NULL,
[value] [nvarchar](255) COLLATE Latin1_General_100_CI_AS NULL
);DECLARE @Results TABLE
(
Count INT,
Collation SYSNAME
)
SET NOCOUNT ON;
DECLARE @N SYSNAME;
DECLARE @C1 AS CURSOR;
SET @C1 = CURSOR FAST_FORWARD FOR
SELECT name
FROM sys.fn_helpcollations();
OPEN @C1;
FETCH NEXT FROM @C1 INTO @N ;
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT @Results
EXEC('SELECT COUNT(*), ''' + @N + ''' from word where value = N''ἀπὸ'' COLLATE ' + @N)
FETCH NEXT FROM @C1 INTO @N ;
END
SELECT *
FROM @Results
ORDER BY Count DESCContext
StackExchange Database Administrators Q#31394, answer score: 8
Revisions (0)
No revisions yet.