HiveBrain v1.2.0
Get Started
← Back to all entries
gotchasqlMinor

Converting varbinary data to varchar in SQLServer leads to unexpected results

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
varbinarysqlservervarcharunexpectedresultsconvertingdataleads

Problem

I need to fix some data conversion issue in SQL server 2008. I got some requirement change on storing the data. Without much care I converted existing varbinary data to varchar using CONVERT(NVARCHAR(max), @bytearraydata, 1).

The same conversion in C# is done using Encoding.Default.GetString and Encoding.Default.GetBytes methods. Encoding.Default.GetBytes(string) gets back the bytearray as it was earlier. But when I try to get back the byte array of string which I converted using CONVERT() gives me a wrong result.

My work is to fetch byte array stored as string the database and convert it to byte array and finally render the content as PDF. Data going through encoding mechanism (while saving and while fetching) works fine for me. But when I try to fetch the data which was converted using CONVERT it is failing to generate PDF.

How can I resolve this problem?

Summary:

Byte array column has been changed to string.

Existing data conversion done using this function:

Convert(NVARCHAR(MAX), @bytearraydata, 1)


In the application byte array conversion is done using Encoding.Default.GetString(bytearraydata)

Are Encoding and CONVERT not compatible?

Solution

There are a few of confusions going on in the question, leading to the unexpected results:

-
The terms VARCHAR and NVARCHAR are being used interchangeably (or so it seems), yet they are quite different. NVARCHAR is a 16-bit encoding – UTF-16 LE (Little Endian) to be exact – and this does not change. VARCHAR is an 8-bit encoding, and the specific 8-bit encoding being used is determined by the Code Page associated with the Collation of the column (we will ignore VARCHAR data in string literals and variables for now since this question is about data stored in a table). If you want to know what Code Page is associated with a particular Collation, you can use the COLLATIONPROPERTY built-in functions:

SELECT COLLATIONPROPERTY(N'Latin1_General_100_CI_AS_SC', 'CodePage') AS [CodePage];
-- 1252


-
When converting between VARBINARY and either VARCHAR or NVARCHAR you need to be careful to be consistent with that string datatype. You can't convert from VARCHAR to VARBINARY and then take that same VARBINARY and convert it to NVARCHAR.

-
The Encoding class in .NET represents a particular encoding of text, whether it be 7-bit, 8-bit, 16-bit, 32-bit, or variable (such as UTF-8). In order to get "expected" results, you need to create an encoding that matches what you need to convert to or from, with respect to the byte[] representation. Strings in .NET are always UTF-16 LE (same as NVARCHAR ), and this is what the Unicode encoding in .NET refers to. The byte[] representation of an encoding will be of whatever encoding was created, but the string representation will always be UTF-16 LE. So, which encoding to create depends on what type of data you are dealing with:

  • NVARCHAR: Use Encoding.Unicode



  • VARCHAR: Determine the Code Page of the Collation via COLLATIONPROPERTY(N'collation_name', 'CodePage') and then use that int value in Encoding.GetEncoding(CodePageIntValue).



-
When using the CONVERT built-in function, be careful which "style" number you are using. For example:

SELECT CONVERT(VARBINARY(50), N'bob');
-- 0x62006F006200


Now take that returned VARBINARY value and convert it back into NVARCHAR, using "style" values of 0 (default) and 1 (which is what your CONVERT function is using in the question):

SELECT CONVERT(NVARCHAR(MAX), 0x62006F006200, 0) AS [Style_0],
CONVERT(NVARCHAR(MAX), 0x62006F006200, 1) AS [Style_1];


Returns:

Style_0 Style_1
bob 0x62006F006200


So, if the following statement from the question is true:


In the application byte array conversion is done using Encoding.Default.GetString(bytearraydata)

then that would equate to using VARCHAR instead of NVARCHAR, and a "style" value of 0 (or nothing) instead of 1:

CONVERT(VARCHAR(MAX), 0x62006F006200)

Context

StackExchange Database Administrators Q#56266, answer score: 4

Revisions (0)

No revisions yet.