patternsqlMinor
TSQL (2014) - Importing XML with accented and punctuation Characters
Viewed 0 times
accentedtsqlpunctuationwithxmlcharactersandimporting2014
Problem
Using this solution from Forrest:
I receive an error like
XML parsing: line 162, character 34, illegal xml character
when need to parse fields containing accented characters (àèòìùé) or °
I tried converting
XML parsing: line 162, character 41, well formed check: undeclared entity
while if I convert to
The problem is I cannot HTMLEncode the XML File since I expect it will convert also all the "" of the XML Structure, and probably also something else.
Can suggest the best solution to HTMLEncode the XML before parsing?
DECLARE @XML xml =
'
334
17.10
marc58
Marc Juppé
Rue garçonneé III° arrondissement
Paris
FR
333333333
22222
4.50
3664
47
MCPU DDA010
mmx
1
6.2
3665
45
MCPU DFZ42
mmy
2
3.2
'
SELECT
x.value('./ItemID[1]','int') AS ItemID,
x.value('./Store[1]','int') AS Store,
x.value('./Title[1]','nvarchar(100)') AS Title,
x.value('./SKU[1]','nvarchar(100)') AS SKU,
x.value('../Quantity[1]','int') AS Qty,
x.value('../Price[1]','decimal(11,2)') AS Price,
x.query('//OrderID[1]').value('.','int') AS OrderID,
x.query('//AmountPaid[1]').value('.','decimal(11,2)') AS AmountPaid,
x.query('//UserID[1]').value('.','nvarchar(100)') AS UserID,
x.query('//ShippingCosts[1]').value('.','decimal(11,2)') AS ShippingCosts
FROM @XML.nodes('//Item/Details') i(x)I receive an error like
XML parsing: line 162, character 34, illegal xml character
when need to parse fields containing accented characters (àèòìùé) or °
I tried converting
é int è but get this error:XML parsing: line 162, character 41, well formed check: undeclared entity
while if I convert to
è it works.The problem is I cannot HTMLEncode the XML File since I expect it will convert also all the "" of the XML Structure, and probably also something else.
Can suggest the best solution to HTMLEncode the XML before parsing?
Solution
The XML you have is invalid in UTF-8 encoding. The accented characters needs to be encoded. For instance
Here is a shorter version that also fails.
Msg 9420, Level 16, State 1, Line 1 XML parsing: line 1, character 39,
illegal xml character
Should be like this instead if you use UTF-8.
You need to trace the source of the data backwards to the producer to see where you end up with invalid XML.
° should be encoded as °.Here is a shorter version that also fails.
declare @X xml = '°';
select @X.value('text()[1]', 'nchar(1)');Msg 9420, Level 16, State 1, Line 1 XML parsing: line 1, character 39,
illegal xml character
Should be like this instead if you use UTF-8.
declare @X xml = '°';
select @X.value('text()[1]', 'nchar(1)');You need to trace the source of the data backwards to the producer to see where you end up with invalid XML.
Code Snippets
declare @X xml = '<?xml version="1.0" encoding="UTF-8"?>°';
select @X.value('text()[1]', 'nchar(1)');declare @X xml = '<?xml version="1.0" encoding="UTF-8"?>°';
select @X.value('text()[1]', 'nchar(1)');Context
StackExchange Database Administrators Q#187492, answer score: 5
Revisions (0)
No revisions yet.