patternsqlMinor
Validating an HTML input string
Viewed 0 times
inputstringhtmlvalidating
Problem
A partial solution is to get all the
Output: Valid HTML
Input:
Output: Invalid HTML
Input:
Output: Valid HTML (Marking it Valid is OK for my use case)
But hey isn't that a valid HTML? It could be
Notes (VARCHAR) where there is an imbalance in `. But how can I write a query for that?
For example:
Input: Hello WorldOutput: Valid HTML
Input:
Output: Invalid HTML
Input:
Output: Valid HTML (Marking it Valid is OK for my use case)
But hey isn't that a valid HTML? It could be
Are there any better ways to check for "invalid" HTML where "invalid" is defined as HTML content with ` at the end.Solution
If all you want to do is confirm that the input string consists of a sequence of `
The input string is valid if every element passes the tests above.
Example
String splitter
I am using the
If the input string might be longer than 4000 characters, the (less efficient)
Any efficient string splitter (T-SQL or otherwise) that can return empty elements correctly with a sequence number will work.
The example below shows the sort of output
The first parameter is the string to split. The second parameter is the delimiter. The third parameter (SplitOption) is set to 1, which means Keep Empty Elements.
Output:
╔══════════╦══════════╗
║ SplitNum ║ SplitVal ║
╠══════════╬══════════╣
║ 1 ║ ║
║ 2 ║ a> ║
║ 3 ║ b> ║
║ 4 ║ c> ║
╚══════════╩══════════╝
Table and test data
Solution
Output
elements:
- Split the string on the
- The first element must be blank
- Subsequent elements must contain exactly one
>character
The input string is valid if every element passes the tests above.
Example
String splitter
I am using the
SQL#.String_Split4k streaming table-valued function from the free edition of the SQLCLR SQLsharp library by Solomon Rutzky.If the input string might be longer than 4000 characters, the (less efficient)
SQL#.String_Split function would be used instead (it handles strings up to 2GB).Any efficient string splitter (T-SQL or otherwise) that can return empty elements correctly with a sequence number will work.
The example below shows the sort of output
SQL#.String_Split4k returns.The first parameter is the string to split. The second parameter is the delimiter. The third parameter (SplitOption) is set to 1, which means Keep Empty Elements.
SELECT
SS.SplitNum,
SS.SplitVal
FROM SQL#.String_Split4k(N'', N'<', 1) AS SS;Output:
╔══════════╦══════════╗
║ SplitNum ║ SplitVal ║
╠══════════╬══════════╣
║ 1 ║ ║
║ 2 ║ a> ║
║ 3 ║ b> ║
║ 4 ║ c> ║
╚══════════╩══════════╝
Table and test data
CREATE TABLE #T
(
row_id integer IDENTITY PRIMARY KEY,
string varchar(50) NOT NULL
);
INSERT #T
(string)
VALUES
('Hello World'),
(''),
('');Solution
SELECT
T.row_id,
string = (SELECT TOP (1) T2.string FROM #T AS T2 WHERE T2.row_id = T.row_id),
is_valid =
MIN
(
CASE
-- First element must be blank
WHEN SS.SplitNum = 1 AND SS.SplitVal = N'' THEN 1
WHEN SS.SplitNum = 1 THEN 0
-- Other elements must contain exactly one >
WHEN SS.SplitVal NOT LIKE N'%>%' THEN 0
WHEN SS.SplitVal LIKE N'%>%>%' THEN 0
-- Otherwise valid
ELSE 1
END
)
FROM #T AS T
CROSS APPLY SQL#.String_Split4k(T.string, '<', 1) AS SS
GROUP BY T.row_id
ORDER BY T.row_id;Output
╔════════╦════════════════════════════╦══════════╗
║ row_id ║ string ║ is_valid ║
╠════════╬════════════════════════════╬══════════╣
║ 1 ║ Hello World ║ 1 ║
║ 2 ║ ║ 0 ║
║ 3 ║ ║ 1 ║
╚════════╩════════════════════════════╩══════════╝
Code Snippets
SELECT
SS.SplitNum,
SS.SplitVal
FROM SQL#.String_Split4k(N'<a><b><c>', N'<', 1) AS SS;CREATE TABLE #T
(
row_id integer IDENTITY PRIMARY KEY,
string varchar(50) NOT NULL
);
INSERT #T
(string)
VALUES
('<html>Hello World</html>'),
('<html><Hello World</html>'),
('<html><Hello World></html>');SELECT
T.row_id,
string = (SELECT TOP (1) T2.string FROM #T AS T2 WHERE T2.row_id = T.row_id),
is_valid =
MIN
(
CASE
-- First element must be blank
WHEN SS.SplitNum = 1 AND SS.SplitVal = N'' THEN 1
WHEN SS.SplitNum = 1 THEN 0
-- Other elements must contain exactly one >
WHEN SS.SplitVal NOT LIKE N'%>%' THEN 0
WHEN SS.SplitVal LIKE N'%>%>%' THEN 0
-- Otherwise valid
ELSE 1
END
)
FROM #T AS T
CROSS APPLY SQL#.String_Split4k(T.string, '<', 1) AS SS
GROUP BY T.row_id
ORDER BY T.row_id;Context
StackExchange Database Administrators Q#158924, answer score: 8
Revisions (0)
No revisions yet.