What characters are word breakers in English for SQL Server 2005 and 2008 R2?

Submitted by: @import:stackexchange-dba

Problem

I can find which DLL supports the English word breaker by using sp_help_fulltext_system_components, but I have not been able to find an actual list of the word-breaking characters for English (like blank, ., %, etc.).

Anyone know of a source for this info?

Solution

This isn't an official list, but you can loop through the character codes and test each one with sys.dm_fts_parser, like so:


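declare @i integer
declare @cnt integer
set @i=0
-- char() accepts codes 0 through 255
while @i<256
begin
  set @cnt=0
  -- 1033 = US English LCID, 0 = system stoplist, 0 = accent-insensitive
  select @cnt=count(*) from sys.dm_fts_parser('"word1'+char(@i)+'word2"', 1033, 0, 0)
  -- more than one row means the character split the string into separate words
  if @cnt>1
  begin
    print 'this char - '+CASE WHEN @i > 31 THEN char(@i) ELSE '' END+' - char('+convert(varchar(3),@i)+') is a word breaker'
  end
  set @i=@i+1
end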


The output is a list of the characters that sys.dm_fts_parser treats as word breakers. (sys.dm_fts_parser returns a row for every 'word' found in the input, so if it returns more than one row, the character acted as a word breaker.)
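To see what the row count is based on, a single probe can be run by hand. This is only a sketch: the comma is an arbitrary test character, and the selected columns are just the ones that seem most useful here.

```sql
-- One probe string, parsed with the US English word breaker (LCID 1033),
-- system stoplist (0), accent-insensitive (0).
SELECT occurrence, display_term, special_term
FROM sys.dm_fts_parser('"word1,word2"', 1033, 0, 0);
```

Each row returned is one term the parser produced from the string; the loop above only counts those rows.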

This could be expanded to extended/non-English character sets by using nchar() rather than char() (with a bigger upper bound for @i), and by changing parameter 2 (the lcid) in the call to sys.dm_fts_parser.
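A rough sketch of that variation follows. The LCID (1036, French) and the upper bound (591, i.e. U+024F) are illustrative assumptions rather than values from the answer, and the TRY/CATCH is an added precaution in case a character such as the double quote makes the query string invalid.

```sql
-- Sketch only: @lcid and @max are illustrative values, not from the original answer.
DECLARE @i int, @cnt int, @max int, @lcid int;
DECLARE @q nvarchar(100);
SET @i = 1;        -- NCHAR(0) is skipped as a precaution
SET @max = 591;    -- U+024F, an arbitrary cut-off
SET @lcid = 1036;  -- French; available LCIDs are listed in sys.fulltext_languages

WHILE @i <= @max
BEGIN
    -- Build the probe string with a Unicode character between two words
    SET @q = N'"word1' + NCHAR(@i) + N'word2"';
    BEGIN TRY
        SELECT @cnt = COUNT(*) FROM sys.dm_fts_parser(@q, @lcid, 0, 0);
        IF @cnt > 1
            PRINT 'nchar(' + CONVERT(varchar(10), @i) + ') is a word breaker';
    END TRY
    BEGIN CATCH
        -- Report characters that could not be tested instead of aborting the loop
        PRINT 'nchar(' + CONVERT(varchar(10), @i) + ') could not be tested: ' + ERROR_MESSAGE();
    END CATCH
    SET @i = @i + 1;
END
```

The character itself is not printed here, only its code, since PRINT output may not render every Unicode character reliably in all clients.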

Context

StackExchange Database Administrators Q#25823, answer score: 12