patternsqlModerate
Alternative to MakeValid() for spatial data in SQL Server 2016
Viewed 0 times
alternativesqlforservermakevaliddataspatial2016
Problem
I have a very large table of geography
The problem: SQL Server has tighter requirements for a valid
For example:
Validating a very simple
Executing the
Unfortunately the
Any ideas?
My actual data contains hundreds/thousands of points.
LINESTRING data that I'm moving from Oracle to SQL Server. There are a number of evaluations that are executed against this data in Oracle, and they will need to be executed against the data in SQL Server, too.The problem: SQL Server has tighter requirements for a valid
LINESTRING than Oracle; "The LineString instance cannot overlap itself over an interval of two or more consecutive points". It just so happens that a percentage of our LINESTRINGs do not meet that criterion, which means that the functions we need to evaluate the data fail. I need to adjust the data so that it can be successfully validated in SQL Server.For example:
Validating a very simple
LINESTRING that doubles back on itself:select geography::STGeomFromText(
'LINESTRING (0 0 1, 0 1 2, 0 -1 3)',4326).IsValidDetailed()24413: Not valid because of two overlapping edges in curve (1).Executing the
MakeValid function against it:select geography::STGeomFromText(
'LINESTRING (0 0 1, 0 1 2, 0 -1 3)',4326).MakeValid().STAsText()LINESTRING (0 -0.999999999999867, 0 0, 0 0.999999999999867)Unfortunately the
MakeValid function changes the order of the points and removes the third dimension, which makes it unusable for us. I'm looking for another approach that solves this problem without reordering or removing the 3rd dimension.Any ideas?
My actual data contains hundreds/thousands of points.
Solution
Let me caveat that I am playing with spatial data in SQL server for the first time (so you probably already know this first part), but it took me a while to figure out that SQL Server isn't treating (x y z) coordinates as true 3D values, it is treating them as (latitude longitude) with an optional "elevation" value, Z, which is ignored by validation and other functions.
Evidence:
Your first example seemed weird to me because (0 0 1), (0 1 2), and (0 -1 3) are not collinear in 3D space (I'm a mathematician, so I was thinking in those terms).
To prove it, just swap the X and Z, and it validates:
This actually makes sense if we think of these as regions or paths traced on the surface of our globe, instead of points in mathematical 3D space.
The second part of your issue is that Z (and M) point values are not preserved by SQL through functions:
Z-coordinates are not used in any calculations made by the library and are not carried through any library calculations.
This is unfortunately by design. This was reported to Microsoft in 2010, the request was closed as "Won't Fix". You might find that discussion relevant, their reasoning is:
Assigning Z and M is ambiguous, because MakeValid splits and merges spatial elements. Points often gets created, removed or moved during this process. Therefore MakeValid (and other constructions) drops Z and M values.
For example:
Values Z and M are ambiguous for point (0 0). We decided to drop Z and M completely instead of returning half-correct result.
You can assign them later on if you know exactly how. Alternatively you can change the way you generate your objects to be valid on input, or keep two versions of your objects, one that is valid and another one that preserves all your features. If you explain your scenario better and what you do with the objects maybe we could be able to give you additional workarounds.
In addition, as you've already seen,
One idea I came across was to store them as a MULTIPOINT object instead:
The problem is when your linestring actually retraces a continuous section of line between two points that was previously traced by the line. By definition, if you're retracing existing points, then the linestring is no longer the simplest geometry that can represent this pointset, and MakeValid() will give you a multilinestring instead (and lose your Z/M values).
Unfortunately, if you're working with GPS data or similar then it's quite likely that you might have retraced your path at some point in the route, so linestrings are not always that useful in these scenarios :( Arguably, such data should be stored as a multipoint anyway since your data represents the discrete location of an object sampled at regular points in time.
In your case it validates just fine:
If you absolutely need to maintain these as LINESTRINGS, then you will have to write your own version of
I'm still working on some code, but give a look at some of the starting ideas here:
EDIT Ok, a few things I found while testing:
I have no idea if ther
Evidence:
select geography::STGeomFromText('LINESTRING (0 0 1, 0 1 2, 0 -1 3)', 4326)
.IsValidDetailed()
24413: Not valid because of two overlapping edges in curve (1).Your first example seemed weird to me because (0 0 1), (0 1 2), and (0 -1 3) are not collinear in 3D space (I'm a mathematician, so I was thinking in those terms).
IsValidDetailed (and MakeValid) is treating these as (0 0), (0 1), and (0, -1), which does make an overlapping line.To prove it, just swap the X and Z, and it validates:
select geography::STGeomFromText('LINESTRING (1 0 0, 2 1 0, 3 -1 0)', 4326)
.IsValidDetailed()
24400: ValidThis actually makes sense if we think of these as regions or paths traced on the surface of our globe, instead of points in mathematical 3D space.
The second part of your issue is that Z (and M) point values are not preserved by SQL through functions:
Z-coordinates are not used in any calculations made by the library and are not carried through any library calculations.
This is unfortunately by design. This was reported to Microsoft in 2010, the request was closed as "Won't Fix". You might find that discussion relevant, their reasoning is:
Assigning Z and M is ambiguous, because MakeValid splits and merges spatial elements. Points often gets created, removed or moved during this process. Therefore MakeValid (and other constructions) drops Z and M values.
For example:
DECLARE @a geometry = geometry::Parse('POINT(0 0 2 2)');
DECLARE @b geometry = geometry::Parse('POINT(0 0 1 1)');
SELECT @a.STUnion(@b).AsTextZM()Values Z and M are ambiguous for point (0 0). We decided to drop Z and M completely instead of returning half-correct result.
You can assign them later on if you know exactly how. Alternatively you can change the way you generate your objects to be valid on input, or keep two versions of your objects, one that is valid and another one that preserves all your features. If you explain your scenario better and what you do with the objects maybe we could be able to give you additional workarounds.
In addition, as you've already seen,
MakeValid can also do other unexpected things, like change the order of points, return a MULTILINESTRING, or even return a POINT object. One idea I came across was to store them as a MULTIPOINT object instead:
The problem is when your linestring actually retraces a continuous section of line between two points that was previously traced by the line. By definition, if you're retracing existing points, then the linestring is no longer the simplest geometry that can represent this pointset, and MakeValid() will give you a multilinestring instead (and lose your Z/M values).
Unfortunately, if you're working with GPS data or similar then it's quite likely that you might have retraced your path at some point in the route, so linestrings are not always that useful in these scenarios :( Arguably, such data should be stored as a multipoint anyway since your data represents the discrete location of an object sampled at regular points in time.
In your case it validates just fine:
select geometry::STGeomFromText('MULTIPOINT (0 0 1, 0 1 2, 0 -1 3)',4326)
.IsValidDetailed()
24400: ValidIf you absolutely need to maintain these as LINESTRINGS, then you will have to write your own version of
MakeValid that slightly adjusts some of the source X or Y points by some tiny value, while still preserving Z (and doesn't do other crazy things like convert it into other object types).I'm still working on some code, but give a look at some of the starting ideas here:
- Converting linestring to points
- Rob Farley Blog: Converting points to a path
- Create geography polyline from points in T-SQL
EDIT Ok, a few things I found while testing:
- If the geometry object is invalid, you just can't do much with it. You can't read the
STGeometryType, you can't get theSTNumPointsor useSTPointNto iterate through them. If you can't useMakeValid, you're basically stuck with operating on the text representation of the geographic object.
- Using
STAsText()will return the text representation of even an invalid object, but doesn't return Z or M values. Instead, we wantAsTextZM()orToString().
- You can't create a function that calls
RAND()(functions need to be deterministic), so I just made it nudge by successively larger and larger values. I really have no idea what the precision of your data is, or how tolerant it is of small changes, so use or modify this function at your own discretion.
I have no idea if ther
Code Snippets
select geography::STGeomFromText('LINESTRING (0 0 1, 0 1 2, 0 -1 3)', 4326)
.IsValidDetailed()
24413: Not valid because of two overlapping edges in curve (1).select geography::STGeomFromText('LINESTRING (1 0 0, 2 1 0, 3 -1 0)', 4326)
.IsValidDetailed()
24400: ValidDECLARE @a geometry = geometry::Parse('POINT(0 0 2 2)');
DECLARE @b geometry = geometry::Parse('POINT(0 0 1 1)');
SELECT @a.STUnion(@b).AsTextZM()select geometry::STGeomFromText('MULTIPOINT (0 0 1, 0 1 2, 0 -1 3)',4326)
.IsValidDetailed()
24400: ValidCREATE FUNCTION dbo.FixBadLineString (@input geography) RETURNS geography
AS BEGIN
DECLARE @output geography
IF @input.STIsValid() = 1 --send valid objects back as-is
SET @output = @input;
ELSE IF LEFT(@input.IsValidDetailed(),6) = '24413:'
--"Not valid because of two overlapping edges in curve"
BEGIN
--make a new MultiPoint object from the LineString text
DECLARE @mp geography = geography::STGeomFromText(
REPLACE(@input.AsTextZM(), 'LINESTRING', 'MULTIPOINT'), 4326);
DECLARE @newText nvarchar(max); --to build output
DECLARE @point int
DECLARE @tinynum float = 0;
SET @output = @input;
--keep going until it validates
WHILE @output.STIsValid() = 0
BEGIN
SET @newText = 'LINESTRING (';
SET @point = 1
SET @tinynum = @tinynum + 0.00000001
--Loop through the points, add a bit and append to the new string
WHILE @point <= @mp.STNumPoints()
BEGIN
SET @newText = @newText + convert(varchar(50),
@mp.STPointN(@point).Long + @tinynum) + ' ';
SET @newText = @newText + convert(varchar(50),
@mp.STPointN(@point).Lat - @tinynum) + ' ';
SET @newText = @newText + convert(varchar(50),
@mp.STPointN(@point).Z) + ', ';
SET @tinynum = @tinynum * -2
SET @point = @point + 1
END
--close the parens and make the new LineString object
SET @newText = LEFT(@newText, LEN(@newText) - 1) + ')'
SET @output = geography::STGeomFromText(@newText, 4326);
END; --this will loop if it is still invalid
RETURN @output;
END;
--Any other unhandled error, just send back NULL
ELSE SET @output = NULL;
RETURN @output;
ENDContext
StackExchange Database Administrators Q#172853, answer score: 13
Revisions (0)
No revisions yet.