patternsqlModerate
Why is SQL Server spliting it's (JSON) response into multiple rows?
Viewed 0 times
rowswhyserversqlresponseintosplitingmultiplejson
Problem
I am trying to build a query that results in a JSON object generated by SQL Server. I've found that I can use subqueries to populate a field (in this case the questions field) with a JSON string that that contains a list of questions.
Below is the query:
```
SELECT
quizzes.id AS 'id',
quizzes.name AS 'name',
quizzes.description AS 'description',
quizzes.instructions AS 'instructions',
author.id AS 'author.id',
author.midas AS 'author.midas',
author.first_name AS 'author.first_name',
author.last_name AS 'author.last_name',
author.email AS 'author.email',
author.tel AS 'author.tel',
author.department_name AS 'author.department_name',
author.created_at AS 'author.created_at',
author.last_updated AS 'author.last_updated',
course.id AS 'course.id',
course.name AS 'course.name',
course.description AS 'course.description',
course.crn AS 'course.crn',
instructor.id AS 'course.instructor.id',
instructor.midas AS 'course.instructor.midas',
instructor.first_name AS 'course.instructor.first_name',
instructor.last_name AS 'course.instructor.last_name',
instructor.email AS 'course.instructor.email',
instructor.tel AS 'course.instructor.tel',
instructor.department_name AS 'course.instructor.department_name',
instructor.created_at AS 'course.instructor.created_at',
instructor.last_updated AS 'course.instructor.last_updated',
course.created_at AS 'course.created_at',
course.last_updated AS 'course.last_updated',
(
SELECT
questions.id AS 'id',
questions.text AS 'text',
question_types.id AS 'type.id',
question_types.name AS 'type.name',
question_types.created_at AS 'type.created_at',
question_types.description AS 'type.description',
question_author.id AS 'author.id',
question_author.midas AS 'author.midas',
question_author.first_name AS 'author.
Below is the query:
```
SELECT
quizzes.id AS 'id',
quizzes.name AS 'name',
quizzes.description AS 'description',
quizzes.instructions AS 'instructions',
author.id AS 'author.id',
author.midas AS 'author.midas',
author.first_name AS 'author.first_name',
author.last_name AS 'author.last_name',
author.email AS 'author.email',
author.tel AS 'author.tel',
author.department_name AS 'author.department_name',
author.created_at AS 'author.created_at',
author.last_updated AS 'author.last_updated',
course.id AS 'course.id',
course.name AS 'course.name',
course.description AS 'course.description',
course.crn AS 'course.crn',
instructor.id AS 'course.instructor.id',
instructor.midas AS 'course.instructor.midas',
instructor.first_name AS 'course.instructor.first_name',
instructor.last_name AS 'course.instructor.last_name',
instructor.email AS 'course.instructor.email',
instructor.tel AS 'course.instructor.tel',
instructor.department_name AS 'course.instructor.department_name',
instructor.created_at AS 'course.instructor.created_at',
instructor.last_updated AS 'course.instructor.last_updated',
course.created_at AS 'course.created_at',
course.last_updated AS 'course.last_updated',
(
SELECT
questions.id AS 'id',
questions.text AS 'text',
question_types.id AS 'type.id',
question_types.name AS 'type.name',
question_types.created_at AS 'type.created_at',
question_types.description AS 'type.description',
question_author.id AS 'author.id',
question_author.midas AS 'author.midas',
question_author.first_name AS 'author.
Solution
After further research, I have discovered from this StackOverflow post that SQL Server breaks
Sql Server splits result of FOR JSON query into ~2KB chunk, so you should either concatenate fragments like on the MSDN page or you can stream results into some output stream.
This means that only ~2000 characters can be sent per chunk.
UPDATE:
With the help of Max Vernon and Andriy M we have found a fairly simple solution to this problem.
The chats that lead up to this can be found here.
The above query returns two columns.
Ideally, you would replace the query between the two parentheses with your own query.
Why does this work?
According to Microsoft's Documentation
SQL Server uses a predefined column name for this rowset with one column of type NTEXT - “XML_F52E2B61-18A1-11d1-B105-00805F49916B” – to indicate chunked XML rowset in UTF-16 encoding. This requires special handling of the XML chunk rowset by the APIs to expose it as a single XML instance on the client side. In ADO.Net, one needs to use ExecuteXmlReader, and in ADO/OLEDB one should use the ICommandStream interface.
(While it is true that the above refers to XML, the same is true for JSON as well.)
The reason that JSON and XML responses are returned in chunks, to begin with, are for performance reasons:
For maximum XML [JSON] publishing performance FOR XML [JSON] does steaming XML formatting of the resulting rowset and directly sends its output to the server side TDS code in small chunks without buffering whole XML in the server space. The chunk size is 2033 UCS-2 characters. Thus, XML larger than 2033 UCS-2 characters is sent to the client side in multiple rows each containing a chunk of the XML.
The above solution circumvents this by setting the result of
It should be noted that some database clients (specifically SQL Server Management Studio) are capable of "rebuilding" the chunked response, but if you are working with PHP (PDO) or operating on a Free Trial client (such as TablePlus for Mac) you will see the raw, chunked response.
Performance
As far as performance goes, I have not done any extensive testing, but I can provide the following data from the limited testing I have done:
Using a MacBook Pro running the latest OS, I have found that on average the following query
Will process in about 23,300μs
While the query
Spent on average 137.8μs processing.
This seems to directly conflict with what the documentation says, so I am unsure how trustworthy these results are. However, it may be worth testing this yourself.
FOR JSON queries into "~2kb chunks".Sql Server splits result of FOR JSON query into ~2KB chunk, so you should either concatenate fragments like on the MSDN page or you can stream results into some output stream.
This means that only ~2000 characters can be sent per chunk.
UPDATE:
With the help of Max Vernon and Andriy M we have found a fairly simple solution to this problem.
DECLARE @json nvarchar(max);
;WITH src (n) AS
(
SELECT n = sc1.name
FROM sys.syscolumns sc1
FOR JSON AUTO
)
SELECT @json = src.n
FROM src
SELECT @json, LEN(@json);The chats that lead up to this can be found here.
The above query returns two columns.
- The fully assembled JSON string
- The length of that string
Ideally, you would replace the query between the two parentheses with your own query.
Why does this work?
According to Microsoft's Documentation
SQL Server uses a predefined column name for this rowset with one column of type NTEXT - “XML_F52E2B61-18A1-11d1-B105-00805F49916B” – to indicate chunked XML rowset in UTF-16 encoding. This requires special handling of the XML chunk rowset by the APIs to expose it as a single XML instance on the client side. In ADO.Net, one needs to use ExecuteXmlReader, and in ADO/OLEDB one should use the ICommandStream interface.
(While it is true that the above refers to XML, the same is true for JSON as well.)
The reason that JSON and XML responses are returned in chunks, to begin with, are for performance reasons:
For maximum XML [JSON] publishing performance FOR XML [JSON] does steaming XML formatting of the resulting rowset and directly sends its output to the server side TDS code in small chunks without buffering whole XML in the server space. The chunk size is 2033 UCS-2 characters. Thus, XML larger than 2033 UCS-2 characters is sent to the client side in multiple rows each containing a chunk of the XML.
The above solution circumvents this by setting the result of
FOR JSON to a variable first, then sending the value of the variable, which results in SQL Server returning the response to the client as one row.It should be noted that some database clients (specifically SQL Server Management Studio) are capable of "rebuilding" the chunked response, but if you are working with PHP (PDO) or operating on a Free Trial client (such as TablePlus for Mac) you will see the raw, chunked response.
Performance
As far as performance goes, I have not done any extensive testing, but I can provide the following data from the limited testing I have done:
Using a MacBook Pro running the latest OS, I have found that on average the following query
SELECT n = sc1.name
FROM sys.syscolumns sc1
FOR JSON AUTOWill process in about 23,300μs
While the query
DECLARE
@json nvarchar (max);
;WITH src (n) AS
(
SELECT n = sc1.name
FROM sys.syscolumns sc1
FOR JSON AUTO
)
SELECT @json = src.n
FROM src
SELECT @json, LEN(@json);Spent on average 137.8μs processing.
This seems to directly conflict with what the documentation says, so I am unsure how trustworthy these results are. However, it may be worth testing this yourself.
Code Snippets
DECLARE @json nvarchar(max);
;WITH src (n) AS
(
SELECT n = sc1.name
FROM sys.syscolumns sc1
FOR JSON AUTO
)
SELECT @json = src.n
FROM src
SELECT @json, LEN(@json);SELECT n = sc1.name
FROM sys.syscolumns sc1
FOR JSON AUTODECLARE
@json nvarchar (max);
;WITH src (n) AS
(
SELECT n = sc1.name
FROM sys.syscolumns sc1
FOR JSON AUTO
)
SELECT @json = src.n
FROM src
SELECT @json, LEN(@json);Context
StackExchange Database Administrators Q#279613, answer score: 13
Revisions (0)
No revisions yet.