patternsqlMinor
Relationships between like matching and comparison
Viewed 0 times
likecomparisonbetweenandrelationshipsmatching
Problem
Naively I assumed that the relationship between a
and that those regions are contiguous and in that order.
Also, all rows having t
You can see this in this fiddle.
My question is what other assumptions here may be wrong. Does the result of an anchored like even always give a contiguous slice of the index or can there be gaps?1 Do I at least still have those four contiguous regions?
The underlying problem I'm working on is a generic database UI that shows index slices for a given search term. It's beneficial from a user experience point of view to not have the user select whether they want to make a prefix, equality or greater-than search, but to simply start at the equality (in the case of ascending seeking), following the index and then marking the two further boundaries "no longer equal from here" and "not even a prefix anymore here".
Only character types have this issue as it comes from the complications of collations.
This question is tagged SQL Server, but I feel if there's a RDMS where this isn't true that it's probably a deeper issue I need to know about - so feel free to comme
like 'foo%' and a >= 'foo' is that the latter matches the former and some additional rows that come after in the index order. More generally, I thought that for a given text t an index on a character column splits into four regions, being- the entries before (
- the entries equal to (=
)t,
- the entries greater than (>
)tthat still havetas a prefix (like concat(t, '%')) and
- the rest,
and that those regions are contiguous and in that order.
Also, all rows having t
as a prefix (like concat(t, '%')) form a contiguous region in the index as well and that this region contains region 2 at its start (meaning it's exactly the regions 2 and 3 together).
It may not be that simple though. For example, in SQL Server the word Häuser compares equal to Haeuser in the collation German_PhoneBook_CI_AI and both are >= 'Ha'.
However, only Haeuser matches like 'Ha%'`, so the additional assumptions about prefixes don't entirely hold.You can see this in this fiddle.
My question is what other assumptions here may be wrong. Does the result of an anchored like even always give a contiguous slice of the index or can there be gaps?1 Do I at least still have those four contiguous regions?
The underlying problem I'm working on is a generic database UI that shows index slices for a given search term. It's beneficial from a user experience point of view to not have the user select whether they want to make a prefix, equality or greater-than search, but to simply start at the equality (in the case of ascending seeking), following the index and then marking the two further boundaries "no longer equal from here" and "not even a prefix anymore here".
Only character types have this issue as it comes from the complications of collations.
This question is tagged SQL Server, but I feel if there's a RDMS where this isn't true that it's probably a deeper issue I need to know about - so feel free to comme
Solution
For example, in SQL Server the word Häuser comes before Haeuser in the collation German_PhoneBook_CI_AI
No it doesn't. They are equal.
returns
false
but
And so
It's possible that SQL Server will not be able to completely evaluate the
No it doesn't. They are equal.
select case when N'Häuser' < N'Haeuser' Collate German_PhoneBook_CI_AI
then 'true' else 'false' endreturns
false
N'Hä' = N'Hae' Collate German_PhoneBook_CI_AIbut
N'Ha' < N'Hä' Collate German_PhoneBook_CI_AIAnd so
N'Häuser' is not like 'Ha%' but is like N'Hae%' and like N'Hä%'.It's possible that SQL Server will not be able to completely evaluate the
like predicate in an index seek+ordered scan, instead having to seek+scan a larger range and apply the like operator to the rows in the larger range. See Dynamic Seeks and Hidden Implicit ConversionsCode Snippets
select case when N'Häuser' < N'Haeuser' Collate German_PhoneBook_CI_AI
then 'true' else 'false' endN'Hä' = N'Hae' Collate German_PhoneBook_CI_AIN'Ha' < N'Hä' Collate German_PhoneBook_CI_AIContext
StackExchange Database Administrators Q#316446, answer score: 7
Revisions (0)
No revisions yet.