HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Why is SQL Server spliting it's (JSON) response into multiple rows?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rowswhyserversqlresponseintosplitingmultiplejson

Problem

I am trying to build a query that results in a JSON object generated by SQL Server. I've found that I can use subqueries to populate a field (in this case the questions field) with a JSON string that that contains a list of questions.

Below is the query:

```
SELECT
quizzes.id AS 'id',
quizzes.name AS 'name',
quizzes.description AS 'description',
quizzes.instructions AS 'instructions',
author.id AS 'author.id',
author.midas AS 'author.midas',
author.first_name AS 'author.first_name',
author.last_name AS 'author.last_name',
author.email AS 'author.email',
author.tel AS 'author.tel',
author.department_name AS 'author.department_name',
author.created_at AS 'author.created_at',
author.last_updated AS 'author.last_updated',
course.id AS 'course.id',
course.name AS 'course.name',
course.description AS 'course.description',
course.crn AS 'course.crn',
instructor.id AS 'course.instructor.id',
instructor.midas AS 'course.instructor.midas',
instructor.first_name AS 'course.instructor.first_name',
instructor.last_name AS 'course.instructor.last_name',
instructor.email AS 'course.instructor.email',
instructor.tel AS 'course.instructor.tel',
instructor.department_name AS 'course.instructor.department_name',
instructor.created_at AS 'course.instructor.created_at',
instructor.last_updated AS 'course.instructor.last_updated',
course.created_at AS 'course.created_at',
course.last_updated AS 'course.last_updated',


(
SELECT
questions.id AS 'id',
questions.text AS 'text',
question_types.id AS 'type.id',
question_types.name AS 'type.name',
question_types.created_at AS 'type.created_at',
question_types.description AS 'type.description',
question_author.id AS 'author.id',
question_author.midas AS 'author.midas',
question_author.first_name AS 'author.

Solution

After further research, I have discovered from this StackOverflow post that SQL Server breaks FOR JSON queries into "~2kb chunks".

Sql Server splits result of FOR JSON query into ~2KB chunk, so you should either concatenate fragments like on the MSDN page or you can stream results into some output stream.

This means that only ~2000 characters can be sent per chunk.

UPDATE:

With the help of Max Vernon and Andriy M we have found a fairly simple solution to this problem.

DECLARE @json nvarchar(max);

;WITH src (n) AS
(
    SELECT n = sc1.name
    FROM sys.syscolumns sc1
    FOR JSON AUTO
)
SELECT @json = src.n
FROM src

SELECT @json, LEN(@json);


The chats that lead up to this can be found here.

The above query returns two columns.

  • The fully assembled JSON string



  • The length of that string



Ideally, you would replace the query between the two parentheses with your own query.
Why does this work?

According to Microsoft's Documentation

SQL Server uses a predefined column name for this rowset with one column of type NTEXT - “XML_F52E2B61-18A1-11d1-B105-00805F49916B” – to indicate chunked XML rowset in UTF-16 encoding. This requires special handling of the XML chunk rowset by the APIs to expose it as a single XML instance on the client side. In ADO.Net, one needs to use ExecuteXmlReader, and in ADO/OLEDB one should use the ICommandStream interface.

(While it is true that the above refers to XML, the same is true for JSON as well.)

The reason that JSON and XML responses are returned in chunks, to begin with, are for performance reasons:

For maximum XML [JSON] publishing performance FOR XML [JSON] does steaming XML formatting of the resulting rowset and directly sends its output to the server side TDS code in small chunks without buffering whole XML in the server space. The chunk size is 2033 UCS-2 characters. Thus, XML larger than 2033 UCS-2 characters is sent to the client side in multiple rows each containing a chunk of the XML.

The above solution circumvents this by setting the result of FOR JSON to a variable first, then sending the value of the variable, which results in SQL Server returning the response to the client as one row.

It should be noted that some database clients (specifically SQL Server Management Studio) are capable of "rebuilding" the chunked response, but if you are working with PHP (PDO) or operating on a Free Trial client (such as TablePlus for Mac) you will see the raw, chunked response.

Performance

As far as performance goes, I have not done any extensive testing, but I can provide the following data from the limited testing I have done:

Using a MacBook Pro running the latest OS, I have found that on average the following query

SELECT n = sc1.name
    FROM sys.syscolumns sc1
    FOR JSON AUTO


Will process in about 23,300μs

While the query

DECLARE
    @json nvarchar (max);

;WITH src (n) AS
(
    SELECT n = sc1.name
    FROM sys.syscolumns sc1
    FOR JSON AUTO
)
SELECT @json = src.n
FROM src

SELECT @json, LEN(@json);


Spent on average 137.8μs processing.

This seems to directly conflict with what the documentation says, so I am unsure how trustworthy these results are. However, it may be worth testing this yourself.

Code Snippets

DECLARE @json nvarchar(max);

;WITH src (n) AS
(
    SELECT n = sc1.name
    FROM sys.syscolumns sc1
    FOR JSON AUTO
)
SELECT @json = src.n
FROM src

SELECT @json, LEN(@json);
SELECT n = sc1.name
    FROM sys.syscolumns sc1
    FOR JSON AUTO
DECLARE
    @json nvarchar (max);

;WITH src (n) AS
(
    SELECT n = sc1.name
    FROM sys.syscolumns sc1
    FOR JSON AUTO
)
SELECT @json = src.n
FROM src

SELECT @json, LEN(@json);

Context

StackExchange Database Administrators Q#279613, answer score: 13

Revisions (0)

No revisions yet.