patternsqlMinor
SQL Server: Collation of the resource database
Viewed 0 times
theresourcesqldatabasecollationserver
Problem
I'm trying to use stopwords so that some words are skipped by the Full-Text Indexer. I can add those stopwords in the table sys.fulltext_stopwords.
When trying to get a list of stopwords and system-stopwords in SQL Server 2012 an error encountered. I'm executing the following (simplified) query:
The error message I get is:
"Cannot resolve the collation conflict between "Latin1_General_CI_AS" and "Latin1_General_BIN" in the UNION operation."
The database collation for the first SELECT statement in my query is Latin1_General_CI_AS. The same is true for the master, model, msdb and tempdb databases.
Where does the Latin1_General_BIN collation come from? It looks like the sys.fulltext_system_stopwords table has a different collation, but why?
EDIT:
I can 'solve' my error by using COLLATE in my query, like this:
I see that the system-stopwords are stored in the resource database, which can explain the difference in collation. The next question would be: why is the collation from the resource database different that the default?
When trying to get a list of stopwords and system-stopwords in SQL Server 2012 an error encountered. I'm executing the following (simplified) query:
SELECT sys.fulltext_stopwords.stopword
FROM sys.fulltext_stopwords
UNION
SELECT sys.fulltext_system_stopwords.stopword
FROM sys.fulltext_system_stopwords;The error message I get is:
"Cannot resolve the collation conflict between "Latin1_General_CI_AS" and "Latin1_General_BIN" in the UNION operation."
The database collation for the first SELECT statement in my query is Latin1_General_CI_AS. The same is true for the master, model, msdb and tempdb databases.
Where does the Latin1_General_BIN collation come from? It looks like the sys.fulltext_system_stopwords table has a different collation, but why?
EDIT:
I can 'solve' my error by using COLLATE in my query, like this:
SELECT sys.fulltext_stopwords.stopword COLLATE DATABASE_DEFAULT
FROM sys.fulltext_stopwords
UNION
SELECT sys.fulltext_system_stopwords.stopword COLLATE DATABASE_DEFAULT
FROM sys.fulltext_system_stopwordsI see that the system-stopwords are stored in the resource database, which can explain the difference in collation. The next question would be: why is the collation from the resource database different that the default?
Solution
The database Collation for the first SELECT statement in my query is Latin1_General_CI_AS.
Well, not exactly. There are a few problems with this statement:
-
The Database's default Collation only matters in a query when using string literals, variables, and return values from UDFs. AND, that default Collation only matters if there is no column or
-
SELECT statements / queries, as a whole, do not use Collations. Collation is assigned per each string field, and it can be different for each field in a query.
-
The column in the first / top query,
I see that the system-stopwords are stored in the resource database, which can explain the difference in collation.
A column (or expression) coming from a different Database does not necessarily explain a difference in Collation. As stated above, Collation is set per each field of a query, whether that field comes from a column in a table or is an expression. Collation is usually derived naturally from Collation Precedence: Column Collation overrides literals and variables, and the
The main point here, however, is that if the
I can 'solve' my error by using
Yes, the
For example, if I run the query with no
Msg 468, Level 16, State 9, Line 4
Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "Latin1_General_BIN" in the UNION operation.
So, assuming that the conflict is in the second (i.e. bottom) SELECT statement, I could fix it by applying the current DB's default Collation:
And that works. BUT, what if we try putting
That also works. And in fact, both of the following also work:
Also, rather than use the
Where does the Latin1_General_BIN collation come from? It looks like the sys.fulltext_system_stopwords table has a different collation, but why?
That Collation comes from an unlikely source. Let's look at the system catalog views in the query:
The results indicate that this is a View (as expected) and that the
The results indicate that this is a View (as expected) and that the
We can now dig a little deeper into the definition of each of those system catalog views:
Returns (simplified):
And then:
The results indicate that this is a system table, and that the
Next we can move on to the other system catalog view:
```
EXEC sys.sp_helptext
Well, not exactly. There are a few problems with this statement:
-
The Database's default Collation only matters in a query when using string literals, variables, and return values from UDFs. AND, that default Collation only matters if there is no column or
COLLATE keyword being used.-
SELECT statements / queries, as a whole, do not use Collations. Collation is assigned per each string field, and it can be different for each field in a query.
-
The column in the first / top query,
stopword, is not using the Latin1_General_CI_AS Collation (more on this in a moment).I see that the system-stopwords are stored in the resource database, which can explain the difference in collation.
A column (or expression) coming from a different Database does not necessarily explain a difference in Collation. As stated above, Collation is set per each field of a query, whether that field comes from a column in a table or is an expression. Collation is usually derived naturally from Collation Precedence: Column Collation overrides literals and variables, and the
COLLATE keyword overrides both. When there is a conflict, then you need the COLLATE keyword.The main point here, however, is that if the
stopword column in sys.fulltext_system_stopwords comes from a column in a Table in the Resource Database (i.e. mssqlsystemresource), OR if it comes from an expression in a View in the Resource Database that has its Collation set via the COLLATE keyword, then the default Collation of the Resource Database is irrelevant.I can 'solve' my error by using
COLLATE in my queryYes, the
COLLATE keyword is the way to go. However, to fix this conflict, you only need to specify the COLLATE keyword in one query, though it does not matter which query.For example, if I run the query with no
COLLATE keyword, I get:SELECT ftsw.stopword
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword
FROM sys.fulltext_system_stopwords ftssw;Msg 468, Level 16, State 9, Line 4
Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "Latin1_General_BIN" in the UNION operation.
So, assuming that the conflict is in the second (i.e. bottom) SELECT statement, I could fix it by applying the current DB's default Collation:
SELECT ftsw.stopword
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword COLLATE SQL_Latin1_General_CP1_CI_AS
FROM sys.fulltext_system_stopwords ftssw;And that works. BUT, what if we try putting
COLLATE on the first / top SELECT:SELECT ftsw.stopword COLLATE SQL_Latin1_General_CP1_CI_AS
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword
FROM sys.fulltext_system_stopwords ftssw;That also works. And in fact, both of the following also work:
SELECT ftsw.stopword COLLATE Hebrew_100_CI_AS
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword
FROM sys.fulltext_system_stopwords ftssw;
-- and:
SELECT ftsw.stopword
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword COLLATE Hebrew_100_CI_AS
FROM sys.fulltext_system_stopwords ftssw;Also, rather than use the
DATABASE_DEFAULT option, which here equates to Latin1_General_CI_AS, I would use Latin1_General_BIN (or better yet: Latin1_General_100_BIN2, which is newer and better) in this particular case because it will ensure that different strings that can be normalized into the same string via the "distinct" behavior of the UNION (without the ALL) and the case-insensitivity of the Latin1_General_CI_AS Collation show up as different rows.Where does the Latin1_General_BIN collation come from? It looks like the sys.fulltext_system_stopwords table has a different collation, but why?
That Collation comes from an unlikely source. Let's look at the system catalog views in the query:
EXEC sys.sp_help N'sys.fulltext_stopwords';The results indicate that this is a View (as expected) and that the
stopword column has a Collation of Latin1_General_BIN (not expected). But wait, if the Latin1_General_BIN Collation is coming from sys.fulltext_stopwords, then what about sys.fulltext_system_stopwords and where is the other Collation coming from? Let's look:EXEC sys.sp_help N'sys.fulltext_system_stopwords';The results indicate that this is a View (as expected) and that the
stopword column has a Collation of SQL_Latin1_General_CP1_CI_AS (not expected).We can now dig a little deeper into the definition of each of those system catalog views:
EXEC sys.sp_helptext N'sys.fulltext_stopwords';Returns (simplified):
SELECT
fts.stopword,
FROM sys.sysftstops ftsAnd then:
EXEC sys.sp_help N'sys.sysftstops';The results indicate that this is a system table, and that the
stopword column indeed has a Collation of Latin1_General_BIN.Next we can move on to the other system catalog view:
```
EXEC sys.sp_helptext
Code Snippets
SELECT ftsw.stopword
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword
FROM sys.fulltext_system_stopwords ftssw;SELECT ftsw.stopword
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword COLLATE SQL_Latin1_General_CP1_CI_AS
FROM sys.fulltext_system_stopwords ftssw;SELECT ftsw.stopword COLLATE SQL_Latin1_General_CP1_CI_AS
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword
FROM sys.fulltext_system_stopwords ftssw;SELECT ftsw.stopword COLLATE Hebrew_100_CI_AS
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword
FROM sys.fulltext_system_stopwords ftssw;
-- and:
SELECT ftsw.stopword
FROM sys.fulltext_stopwords ftsw
UNION
SELECT ftssw.stopword COLLATE Hebrew_100_CI_AS
FROM sys.fulltext_system_stopwords ftssw;EXEC sys.sp_help N'sys.fulltext_stopwords';Context
StackExchange Database Administrators Q#29917, answer score: 6
Revisions (0)
No revisions yet.