HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlModerate

FORMAT returns large row size and data size

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
formatsizelargereturnsandrowdata

Problem

I am surprise with one of my findings that using a FORMAT () does have very big impact on the row size and data size. It is almost 250x more of the size of not applying FORMAT ().

My question is:

1) Why does using FORMAT() have such big impact on the size? To me, is just 1.23 vs $1.23 difference, which is probably 1 character difference. And does it matter with such huge size?

2) Why do we still be encouraged to use FORMAT() in SQL server instead of using string concatenation as below. Since below is only 2X data size, and using format returns 250x data size. OR is that data size is not a critical measurement?

SELECT '

3) Does having data size of 464MB means i will be returning 464MB of data to client?

========================================================

Below is my findings with AdventureWorks2012 database.

SELECT UnitPrice FROM Sales.SalesOrderDetail;


Actual Number of Rows: 121317

Estimated Number of Rows: 121317

Estimated Row Size: 15B

Estimated Data Size: 1777KB

SELECT '

Actual Number of Rows: 121317

Estimated Row Size: 26B

Estimated Data Size: 3060KB

SELECT FORMAT(UnitPrice, 'c') FROM Sales.SalesOrderDetail;


Actual Number of Rows: 121317

Estimated Row Size: 4011B

Estimated Data Size: 464MB + CONVERT(varchar(10), UnitPrice) FROM Sales.SalesOrderDetail;


3) Does having data size of 464MB means i will be returning 464MB of data to client?

========================================================

Below is my findings with AdventureWorks2012 database.

%%CODEBLOCK_1%%

Actual Number of Rows: 121317

Estimated Number of Rows: 121317

Estimated Row Size: 15B

Estimated Data Size: 1777KB

%%CODEBLOCK_2%%

Actual Number of Rows: 121317

Estimated Row Size: 26B

Estimated Data Size: 3060KB

%%CODEBLOCK_3%%

Actual Number of Rows: 121317

Estimated Row Size: 4011B

Estimated Data Size: 464MB + CONVERT(varchar (10), UnitPrice) FROM Sales.SalesOrderDetail;


Actual Number of Rows: 121317

Estimated Row Size: 26B

Estimated Data Size: 3060KB

%%CODEBLOCK_3%%

Actual Number of Rows: 121317

Estimated Row Size: 4011B

Estimated Data Size: 464MB + CONVERT(varchar(10), UnitPrice) FROM Sales.SalesOrderDetail;

3) Does having data size of 464MB means i will be returning 464MB of data to client?

========================================================

Below is my findings with AdventureWorks2012 database.

%%CODEBLOCK_1%%

Actual Number of Rows: 121317

Estimated Number of Rows: 121317

Estimated Row Size: 15B

Estimated Data Size: 1777KB

%%CODEBLOCK_2%%

Actual Number of Rows: 121317

Estimated Row Size: 26B

Estimated Data Size: 3060KB

%%CODEBLOCK_3%%

Actual Number of Rows: 121317

Estimated Row Size: 4011B

Estimated Data Size: 464MB

Solution

FORMAT() has an (admittedly undocumented) output of nvarchar(4000), at least in the cases of converting ints and dates to strings. The documentation simply says...


The length of the return value is determined by the format.

But then doesn't explain or provide any examples. You can see what I'm describing, though, with:

SELECT TOP (1) object_id, x = FORMAT(object_id, 'en-us') 
  INTO #blat FROM sys.all_objects;

EXEC tempdb.sys.sp_help N'#blat';


Result is that x is an nvarchar with a length of 8,000 (this is the number of bytes, not the number of characters).

Estimated row size is based on an assumption that variable width values will be half-populated. So, it expects 2,000 characters (4,000 bytes) on each row (even if the particular parameters you supply can't possibly result in that many characters). I demonstrate this (but not with FORMAT() specifically) in another answer, Would using varchar(5000) be bad compared to varchar(255)?

This is one reason I prefer to use CONVERT() and TRY_CONVERT() equivalents instead of FORMAT(), in spite of its syntactic sugar. At least with those you can convert to a defined width instead of relying on it "being determined by the format." Which may or may help estimated size, depending on the query. Another example that demonstrates the benefit here (even though it requires uglier code):

DECLARE @m float = 32.74532323;

SELECT 
    a = @m, 
    b = FORMAT(@m, 'c'), 
    c = '

Results:

a    float
b    nvarchar(4000)
c    varchar(13)


Another reason I prefer to use CONVERT() and TRY_CONVERT() is that FORMAT() sucks from a performance perspective (see FORMAT() is nice and all, but…).

Also please don't ever use variable-width types like varchar without also specifying a length. + CONVERT(varchar(12), CONVERT(decimal(8,2),@m)) INTO #splunge FROM sys.all_objects; EXEC tempdb.sys.sp_help N'#splunge';


Results:

%%CODEBLOCK_2%%

Another reason I prefer to use CONVERT() and TRY_CONVERT() is that FORMAT() sucks from a performance perspective (see FORMAT() is nice and all, but…).

Also please don't ever use variable-width types like varchar without also specifying a length.

Code Snippets

SELECT TOP (1) object_id, x = FORMAT(object_id, 'en-us') 
  INTO #blat FROM sys.all_objects;

EXEC tempdb.sys.sp_help N'#blat';
DECLARE @m float = 32.74532323;

SELECT 
    a = @m, 
    b = FORMAT(@m, 'c'), 
    c = '$' + CONVERT(varchar(12), CONVERT(decimal(8,2),@m))
 INTO #splunge 
 FROM sys.all_objects;

EXEC tempdb.sys.sp_help N'#splunge';
a    float
b    nvarchar(4000)
c    varchar(13)

Context

StackExchange Database Administrators Q#246975, answer score: 11

Revisions (0)

No revisions yet.