pattern | csharp | Moderate
Serializing objects to delimited files
delimited, files, objects, serializing
Problem
For a new project I'm going to need to be able to serialize arbitrary types to TSV or CSV files, so I wrote a class which can be used to serialize any object to a TSV, CSV or any other _SV file you can think of. (You could literally serialize objects to files with the letter "B" or the word "Rawr" as the column or row delimiter.)
It's pretty simple. It starts with a `DelimitedColumnAttribute`:
```
/// <summary>
/// Represents a column which can be used in a <see cref="DelimitedSerializer"/>.
/// </summary>
[AttributeUsage(AttributeTargets.Property)]
public class DelimitedColumnAttribute : Attribute
{
    /// <summary>
    /// The name of the column.
    /// </summary>
    public string Name { get; set; }

    /// <summary>
    /// The order the column should appear in.
    /// </summary>
    public int Order { get; set; }
}
```
Then there's a serializer:
```
/// <summary>
/// Represents a serializer that will serialize arbitrary objects to files with specific row and column separators.
/// </summary>
public class DelimitedSerializer
{
    /// <summary>
    /// The string to be used to separate columns.
    /// </summary>
    public string ColumnDelimiter { get; set; }

    /// <summary>
    /// The string to be used to separate rows.
    /// </summary>
    public string RowDelimiter { get; set; }

    /// <summary>
    /// Serializes an object to a delimited file. Throws an exception if any of the property names, column names, or values contain either the <see cref="ColumnDelimiter"/> or the <see cref="RowDelimiter"/>.
    /// </summary>
    /// <typeparam name="T">The type of the object to serialize.</typeparam>
    /// <param name="items">A list of the items to serialize.</param>
    /// <returns>The serialized string.</returns>
    public string Serialize<T>(List<T> items)
    {
        if (string.IsNullOrEmpty(ColumnDelimiter))
        {
            throw new ArgumentException($"The property '{nameof(ColumnDelimiter)}' cannot be null or an empty string.");
        }
        if (string.IsNullOrEmpty(RowDelimiter))
        {
            throw new ArgumentException($"The property '{nameof(RowDelimiter)}' cannot be null or an empty string.");
        }
        var result = new ExtendedStringBuilder();
        var properties = typeof(T).GetProperties()
```
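To make the role of the attribute concrete, here is a minimal sketch of how a model might be decorated and how the declared columns could be read back through reflection. The `Person` class, its properties, and the `ColumnDemo` helper are hypothetical names used only for illustration; they do not appear in the original post. The attribute is repeated so the sketch compiles on its own.

```csharp
using System;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Property)]
public class DelimitedColumnAttribute : Attribute
{
    public string Name { get; set; }
    public int Order { get; set; }
}

// Hypothetical model: only decorated properties are meant to become columns.
public class Person
{
    [DelimitedColumn(Name = "Full Name", Order = 1)]
    public string Name { get; set; }

    [DelimitedColumn(Name = "Age", Order = 2)]
    public int Age { get; set; }

    // No attribute here, so a serializer that filters on the attribute would skip it.
    public string Notes { get; set; }
}

public static class ColumnDemo
{
    // Returns the declared column names, sorted by their declared Order.
    public static string[] ColumnNames()
    {
        return typeof(Person).GetProperties()
            .Select(p => p.GetCustomAttribute<DelimitedColumnAttribute>())
            .Where(a => a != null)
            .OrderBy(a => a.Order)
            .Select(a => a.Name)
            .ToArray();
    }
}
```

Under these assumptions, `ColumnDemo.ColumnNames()` yields the two decorated columns in `Order` sequence and leaves out `Notes`.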
Solution
- `readonly` will not make the members of your static serializers read-only. You cannot reassign another serializer to replace it, but its members can still be modified. Since you are already using C# 6 features (`nameof`, string interpolation), you can use a get-only property that returns a new instance:
  ```
  public static DelimitedSerializer TsvSerializer => new DelimitedSerializer { ColumnDelimiter = "\t", RowDelimiter = Environment.NewLine };
  ```
- `properties` can be optimized as such:
  ```
  var properties = typeof(T).GetProperties()
      .Select((PropertyInfo p) => new
      {
          // caching the result, so you don't have to look it up repeatedly
          Attribute = p.GetCustomAttribute<DelimitedColumnAttribute>(),
          Info = p,
      })
      .Where(x => x.Attribute != null)
      // ?. is not needed here, but it makes testing easier with an anonymous class
      .OrderBy(x => x.Attribute?.Order)
      // ThenBy keeps Order as the primary key; a second OrderBy would re-sort by the later key instead
      .ThenBy(x => x.Attribute?.Name)
      .ThenBy(x => x.Info.Name)
      // properties are used multiple times, so you want to avoid deferred execution here
      .ToList();
  ```
- `properties` is never materialized. LINQ uses deferred execution, meaning that the query is not run ahead of time, but only when it is iterated. This means that every time you loop through `properties` via `foreach`, the query above is executed again: once for the header, and once for every single row. So, materialize it with `ToList()`.
- What happens if a column is `null`? `NullReferenceException`!
  ```
  // NullReferenceException
  var value = property.GetValue(item).ToString();
  var value = property.Info.GetValue(item).ToString(); // (changed in previous bullet)
  // if the property is null, value will be null as well
  var value = property.Info.GetValue(item)?.ToString();
  // this also needs to be fixed
  if (value?.Contains(ColumnDelimiter) == true)
  ```
- The argument guards seem a little repetitive; we can put them into a function:
  ```
  Action<string, string> checkForInvalidCharacters = (name, value) =>
  {
      if (value == null) return;
      if (value.Contains(ColumnDelimiter))
      {
          throw new ArgumentException($"The {name} string '{value}' contains an invalid character: '{ColumnDelimiter}'.");
      }
      if (value.Contains(RowDelimiter))
      {
          throw new ArgumentException($"The {name} string '{value}' contains an invalid character: '{RowDelimiter}'.");
      }
  };
  ```
  So, we can use it like:
  ```
  foreach (var property in properties)
  {
      var name = property.Attribute?.Name ?? property.Info.Name;
      checkForInvalidCharacters("column name", name);
      // ...
  }
  foreach (var item in items)
  {
      var row = new ExtendedStringBuilder();
      foreach (var property in properties)
      {
          var value = property.Info.GetValue(item)?.ToString();
          checkForInvalidCharacters("property value", value);
          // ...
      }
      //...
  }
  ```
- Using `row.Length > 0` to decide whether to add a column delimiter is wrong. If the first few properties are null, you will have trouble deserializing the row later, as the columns will be shifted left by them. Take this example:
  ```
  // Yeah... I modified the function a bit to make testing easier...
  /* //.Where(x => x.Attribute != null)
      .OrderBy(x => x.Attribute?.Order)
      .ThenBy(x => x.Attribute?.Name) */
  DelimitedSerializer.CsvSerializer
      .Serialize(new[]
      {
          new { A = "QQ", B = "qwe", C = 1 },
          new { A = (string)null, B = (string)null, C = 2 },
          new { A = "asd", B = "cc", C = 3 }
      })
  ```
  Expected output:
  ```
  A,B,C
  QQ,qwe,1
  ,,2
  asd,cc,3
  ```
  Actual output:
  ```
  A,B,C
  QQ,qwe,1
  2
  asd,cc,3
  ```
  You can use a small trick here, knowing that `(string)null + (string)null == string.Empty`:
  ```
  string row = null;
  foreach (var property in properties)
  {
      var value = property.Info.GetValue(item)?.ToString();
      checkForInvalidCharacters("property value", value);
      if (row != null)
          row += ColumnDelimiter;
      row += value;
  }
  ```
  Or, you can use `string.Join`:
  ```
  result += string.Join(ColumnDelimiter, properties
      .Select(x =>
      {
          var name = x.Attribute?.Name ?? x.Info.Name;
          checkForInvalidCharacters("column name", name);
          return name;
      }));
  ```

Full code:
```
/// <summary>
/// Represents a serializer that will serialize arbitrary objects to files with specific row and column separators.
/// </summary>
public class DelimitedSerializer
{
    /// <summary>
    /// The string to be used to separate columns.
    /// </summary>
    public string ColumnDelimiter { get; set; }

    /// <summary>
    /// The string to be used to separate rows.
    /// </summary>
    public string RowDelimiter { get; set; }

    /// <summary>
    /// Serializes an object to a delimited file. Throws an exception if any of the property names, column names, or values contain either the <see cref="ColumnDelimiter"/> or the <see cref="RowDelimiter"/>.
    /// </summary>
    /// <typeparam name="T">The type of the object to serialize.</typeparam>
    /// <param name="items">A list of the items to serialize.</param>
    /// <returns>The serialized string.</returns>
    public string Serialize<T>(List<T> items)
    {
        if (string.IsNullOrEmpty(ColumnDelimiter))
        {
            throw new ArgumentException($"The property '{nameof(ColumnDelim
```
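Because the full listing is cut off in the source, the following is a hedged sketch of how the review points could fit together, not the answerer's actual final code. It assumes `string.Join` in place of `ExtendedStringBuilder` (whose definition is not shown), accepts `IEnumerable<T>` rather than `List<T>`, and writes no trailing row delimiter. The `validate` helper, the `Row` class, and the `Demo` class are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Property)]
public class DelimitedColumnAttribute : Attribute
{
    public string Name { get; set; }
    public int Order { get; set; }
}

public class DelimitedSerializer
{
    public string ColumnDelimiter { get; set; }
    public string RowDelimiter { get; set; }

    public string Serialize<T>(IEnumerable<T> items)
    {
        if (string.IsNullOrEmpty(ColumnDelimiter))
            throw new ArgumentException($"The property '{nameof(ColumnDelimiter)}' cannot be null or an empty string.");
        if (string.IsNullOrEmpty(RowDelimiter))
            throw new ArgumentException($"The property '{nameof(RowDelimiter)}' cannot be null or an empty string.");

        // Look up each attribute once, then materialize so the query is not re-run per row.
        var properties = typeof(T).GetProperties()
            .Select(p => new { Attribute = p.GetCustomAttribute<DelimitedColumnAttribute>(), Info = p })
            .Where(x => x.Attribute != null)
            .OrderBy(x => x.Attribute.Order)
            .ThenBy(x => x.Attribute.Name)
            .ThenBy(x => x.Info.Name)
            .ToList();

        // Hypothetical helper: maps null to an empty field and rejects text containing a delimiter.
        Func<string, string, string> validate = (kind, text) =>
        {
            if (text == null) return string.Empty;
            foreach (var delimiter in new[] { ColumnDelimiter, RowDelimiter })
            {
                if (text.Contains(delimiter))
                    throw new ArgumentException($"The {kind} string '{text}' contains an invalid character: '{delimiter}'.");
            }
            return text;
        };

        var header = string.Join(ColumnDelimiter,
            properties.Select(p => validate("column name", p.Attribute.Name ?? p.Info.Name)));

        // string.Join emits a delimiter between every pair of fields, so null values keep their column.
        var rows = items.Select(item => string.Join(ColumnDelimiter,
            properties.Select(p => validate("property value", p.Info.GetValue(item)?.ToString()))));

        return string.Join(RowDelimiter, new[] { header }.Concat(rows));
    }
}

// Hypothetical model mirroring the anonymous objects used in the review's example.
public class Row
{
    [DelimitedColumn(Order = 0)] public string A { get; set; }
    [DelimitedColumn(Order = 1)] public string B { get; set; }
    [DelimitedColumn(Order = 2)] public int C { get; set; }
}

public static class Demo
{
    public static string Run()
    {
        var serializer = new DelimitedSerializer { ColumnDelimiter = ",", RowDelimiter = "\n" };
        return serializer.Serialize(new List<Row>
        {
            new Row { A = "QQ", B = "qwe", C = 1 },
            new Row { A = null, B = null, C = 2 },
            new Row { A = "asd", B = "cc", C = 3 },
        });
    }
}
```

With these assumptions, `Demo.Run()` produces the four lines `A,B,C`, `QQ,qwe,1`, `,,2` and `asd,cc,3`, matching the expected output from the review: the row with two null values keeps its empty leading fields instead of collapsing to `2`.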
Code Snippets
```
public static DelimitedSerializer TsvSerializer => new DelimitedSerializer { ColumnDelimiter = "\t", RowDelimiter = Environment.NewLine };
```
```
var properties = typeof(T).GetProperties()
    .Select((PropertyInfo p) => new
    {
        // caching the result, so you don't have to look it up repeatedly
        Attribute = p.GetCustomAttribute<DelimitedColumnAttribute>(),
        Info = p,
    })
    .Where(x => x.Attribute != null)
    // ?. is not needed here, but it makes testing easier with an anonymous class
    .OrderBy(x => x.Attribute?.Order)
    // ThenBy keeps Order as the primary key; a second OrderBy would re-sort by the later key instead
    .ThenBy(x => x.Attribute?.Name)
    .ThenBy(x => x.Info.Name)
    // properties are used multiple times, so you want to avoid deferred execution here
    .ToList();
```
```
// NullReferenceException
var value = property.GetValue(item).ToString();
var value = property.Info.GetValue(item).ToString(); // (changed in previous bullet)
// if the property is null, value will be null as well
var value = property.Info.GetValue(item)?.ToString();
// this also needs to be fixed
if (value?.Contains(ColumnDelimiter) == true)
```
```
Action<string, string> checkForInvalidCharacters = (name, value) =>
{
    if (value == null) return;
    if (value.Contains(ColumnDelimiter))
    {
        throw new ArgumentException($"The {name} string '{value}' contains an invalid character: '{ColumnDelimiter}'.");
    }
    if (value.Contains(RowDelimiter))
    {
        throw new ArgumentException($"The {name} string '{value}' contains an invalid character: '{RowDelimiter}'.");
    }
};
```
```
foreach (var property in properties)
{
    var name = property.Attribute?.Name ?? property.Info.Name;
    checkForInvalidCharacters("column name", name);
    // ...
}
foreach (var item in items)
{
    var row = new ExtendedStringBuilder();
    foreach (var property in properties)
    {
        var value = property.Info.GetValue(item)?.ToString();
        checkForInvalidCharacters("property value", value);
        // ...
    }
    //...
}
```
Context
StackExchange Code Review Q#128539, answer score: 10