patternsqlMajor
EF Code First uses nvarchar(max) for all strings. Will this hurt query performance?
Viewed 0 times
thisnvarcharallqueryusesfirstmaxforwillperformance
Problem
I have some databases created using Entity Framework Code First; the apps are working and in general I'm pretty happy with what Code First lets me do. I am a programmer first, and a DBA second, by necessity. I am reading about DataAttributes to further describe in C# what I want the database to do; and my question is: what penalty will I be eating by having these
There are several columns in this particular table; in C# they are defined as such:
I expect to query and/or sort based on Name, Source, Generated, and Written. I expect Name and Source to be in the 0-50 character length, occasionally up to 150. I expect this table to start pretty small (1m rows). Obviously message could be small or large, and will probably not be queried against.
What I want to know, is there a performance hit for my Name and Source columns being defined as
nvarchar(max) strings in my table (see example below)?There are several columns in this particular table; in C# they are defined as such:
[Key]
[DatabaseGeneratedAttribute(DatabaseGeneratedOption.Identity)]
public int ID { get; set; }
public string Name { get; set; }
public string Message { get; set; }
public string Source { get; set; }
public DateTime Generated { get; set; }
public DateTime Written { get; set; }I expect to query and/or sort based on Name, Source, Generated, and Written. I expect Name and Source to be in the 0-50 character length, occasionally up to 150. I expect this table to start pretty small (1m rows). Obviously message could be small or large, and will probably not be queried against.
What I want to know, is there a performance hit for my Name and Source columns being defined as
nvarchar(max) when I never expect them to be larger than 150 characters?Solution
Larger nvarchar (max) data items (over 8000 bytes or so) will spill over into text storage and require additional I/O. Smaller items will be stored in-row. There are options that control this behaviour - see this MSDN article for more details.
If stored in-row there is no significant I/O performance overhead; there may be additional CPU overhead on processing the data type but this is likely to be minor.
However, leaving nvarchar (max) columns lying around the database where they are not needed is rather poor form. It does have some performance overhead and often data sizes are quite helpful in understanding a data table - for example, a varchar column 50 or 100 chars wide is likely to be a description or a free-text field where one that's (say) 10-20 chars ling is likely to be a code. You would be surprised how much meaning that one often has to infer from a database through assumptions like this.
Working in data warehousing, as often as not on poorly supported or documented legacy systems, having a database schema that's easy to understand is quite valuable. If you think of the database as the application's legacy, try to be nice to the people who are going to inherit it from you.
If stored in-row there is no significant I/O performance overhead; there may be additional CPU overhead on processing the data type but this is likely to be minor.
However, leaving nvarchar (max) columns lying around the database where they are not needed is rather poor form. It does have some performance overhead and often data sizes are quite helpful in understanding a data table - for example, a varchar column 50 or 100 chars wide is likely to be a description or a free-text field where one that's (say) 10-20 chars ling is likely to be a code. You would be surprised how much meaning that one often has to infer from a database through assumptions like this.
Working in data warehousing, as often as not on poorly supported or documented legacy systems, having a database schema that's easy to understand is quite valuable. If you think of the database as the application's legacy, try to be nice to the people who are going to inherit it from you.
Context
StackExchange Database Administrators Q#48408, answer score: 27
Revisions (0)
No revisions yet.