Performance of object-to-string conversion

Submitted by: @import:stackexchange-codereview
Tags: conversion, object, string, performance

Problem

Using an OR/M, I map a large number of database rows (~300k) to an array of objects. Some properties of these objects are marked with a special attribute, [Signed]. Row by row, I concatenate the values of those properties into a string, calculate a salted MD5 hash of that string, and compare it to the hash I received from the database.

Everything worked fine for a long time, but my website has recently become quite slow, so I ran a profiler. It says the slowest method in the whole website is ValueToString. I have optimized it as much as I can. Is it possible to optimize it further? Note that I do not need an exact string representation, so feel free to change the formatters.

Another question: if I redesign everything to work with byte[] instead of string, will that speed up the process?

```
static string ValueToString(object value)
{
    if (value == null)
        return "";

    string value_string = value as string;
    if (value_string != null)
        return value_string;

    int value_int;
    if (value is Enum) {
        value_int = (int)value;
        if (value_int >= 0 && value_int
        // [The rest of this method and the declaration of the generic helper
        //  below are missing from the source; only its parameter list survives.]
        (object o, out T r)
{
    if (o is T) {
        r = (T)o;
        return true;
    }

    r = default(T);
    return false;
}

static readonly string[] hexStringTable = new string[]
{
    "00", "01", "02", "03", "04", "05", "06", "07", "08", "09", "0A", "0B", "0C", "0D", "0E", "0F",
    "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "1A", "1B", "1C", "1D", "1E", "1F",
    "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "2A", "2B", "2C", "2D", "2E", "2F",
    "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "3A", "3B", "3C", "3D", "3E", "3F",
    "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "4A", "4B", "4C", "4D", "4E", "4F",
    "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "5A", "5B", "5C", "5D", "5E", "5F",
    "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "6A", "6B", "6C", "6D", "6E", "6F",
    "70", "71", "72", "73", "74", "75", "76", "77", "78", "79
    // [The table is truncated here in the source.]
```

Solution

(I've made a few assumptions, so please correct me if I'm wrong.)

Let's suppose we have the following types:

```
public class Foo
{
    public int Id { get; set; }

    [Signed]
    public DateTime Timestamp { get; set; }

    [Signed]
    public string Name { get; set; }

    [Signed]
    public decimal Price { get; set; }

    [Signed]
    public SomeEnum Thing { get; set; }
}

public enum SomeEnum
{
    One = 1,
    Two,
    Three
}

public class SignedAttribute : Attribute
{
}
```


We want to hash the Signed properties of an instance of Foo (actually, many instances of Foo).

I assume you're currently getting these properties via reflection. Something like this:

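```
// Find the [Signed] properties once, in a stable (alphabetical) order.
var properties = typeof(Foo)
    .GetProperties()
    .Where(p => Attribute.IsDefined(p, typeof(SignedAttribute)))
    .OrderBy(p => p.Name)
    .ToArray();

// ...

// For each row: read every signed property via reflection and concatenate.
var sb = new StringBuilder();
foreach (var property in properties)
{
    sb.Append(ValueToString(property.GetValue(foo)));
}

// md5 is an MD5 instance, e.g. from MD5.Create().
var bytes = Encoding.UTF8.GetBytes(sb.ToString());
md5.TransformBlock(bytes, 0, bytes.Length, bytes, 0);
```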
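
For context, the per-row check described in the question (concatenate, salt, hash, compare) might look roughly like the sketch below. The class name RowSignature, its method names, and the way the salt is combined with the string are assumptions for illustration; the question does not show how the salt is applied.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the original code.
public static class RowSignature
{
    // Assumption: the salt is simply prepended to the concatenated values.
    public static byte[] Compute(string signature, string salt)
    {
        using (var md5 = MD5.Create())
        {
            return md5.ComputeHash(Encoding.UTF8.GetBytes(salt + signature));
        }
    }

    // True when the hash stored with the row matches the recomputed one.
    public static bool Matches(string signature, string salt, byte[] storedHash)
    {
        return storedHash != null
            && Compute(signature, salt).SequenceEqual(storedHash);
    }
}
```

In a loop over 300k rows it would probably be better to create the MD5 instance once and reuse it, rather than once per call as this sketch does.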


Let's set up a test to see how long this takes:

```
var foo = new Foo
{
    Id = 1,
    Name = "Thing",
    Price = 123.45m,
    Thing = SomeEnum.Two,
    Timestamp = new DateTime(2015, 04, 07)
};

byte[] hash;
var sw = Stopwatch.StartNew();
using (var md5 = MD5.Create())
{
    // Hash the same object 300,000 times to approximate the real workload.
    for (var i = 0; i < 300000; i++)
    {
        var sb = new StringBuilder();
        foreach (var property in properties)
        {
            sb.Append(ValueToString(property.GetValue(foo)));
        }

        var bytes = Encoding.UTF8.GetBytes(sb.ToString());
        md5.TransformBlock(bytes, 0, bytes.Length, bytes, 0);
    }

    md5.TransformFinalBlock(new byte[0], 0, 0);
    hash = md5.Hash;
}

sw.Stop();
```


Here ValueToString is the code as posted in your question. On my machine, I get a time of 0.8s.

How much better can we do? Suppose we could cheat and use the following method:

```
private static string FooToString(Foo foo)
{
    var sb = new StringBuilder();
    sb.Append(foo.Name);
    sb.Append(foo.Price.ToString(
        "0.############################",
        CultureInfo.InvariantCulture));
    sb.Append(((int) foo.Thing).ToString("X"));
    sb.Append(foo.Timestamp.Ticks.ToString("X"));
    return sb.ToString();
}
```


So now our loop looks like this:

```
for (var i = 0; i < 300000; i++)
{
    var value = FooToString(foo);
    var bytes = Encoding.UTF8.GetBytes(value);
    md5.TransformBlock(bytes, 0, bytes.Length, bytes, 0);
}
```


I get a time of 0.36s, which I'll take as the baseline.

Great, but let's assume we can't write FooToString. We can create an equivalent method dynamically.

We introduce some helper methods:

```
public static string ValueToString(string value)
{
    return value ?? string.Empty;
}

public static string ValueToString(Enum value)
{
    return Convert.ToInt32(value).ToString("X");
}

public static string ValueToString(decimal value)
{
    return value.ToString("0.############################", CultureInfo.InvariantCulture);
}

public static string ValueToString(DateTime value)
{
    return value.Ticks.ToString("X");
}
```
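
To make the chosen formats concrete, here is a small sketch that repeats the four helpers in a hypothetical Formatters class and prints what each returns for a sample value. The class names Formatters and FormattersDemo and the sample inputs are illustrative only.

```csharp
using System;
using System.Globalization;

public enum SomeEnum { One = 1, Two, Three }

// Hypothetical container for the helper overloads shown above.
public static class Formatters
{
    public static string ValueToString(string value)
    {
        return value ?? string.Empty;
    }

    public static string ValueToString(Enum value)
    {
        return Convert.ToInt32(value).ToString("X");
    }

    public static string ValueToString(decimal value)
    {
        return value.ToString("0.############################", CultureInfo.InvariantCulture);
    }

    public static string ValueToString(DateTime value)
    {
        return value.Ticks.ToString("X");
    }
}

public static class FormattersDemo
{
    public static void Run()
    {
        Console.WriteLine(Formatters.ValueToString((string)null).Length); // 0 (null becomes "")
        Console.WriteLine(Formatters.ValueToString(SomeEnum.Two));        // 2
        Console.WriteLine(Formatters.ValueToString(123.45m));             // 123.45
        Console.WriteLine(Formatters.ValueToString(1.50m));               // 1.5 (trailing zero dropped)
        Console.WriteLine(Formatters.ValueToString(new DateTime(255)));   // FF (ticks in hexadecimal)
    }
}
```

Note that the decimal format drops trailing zeros, so 1.50m and 1.5m produce the same string and therefore the same hash.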


And then create the method dynamically, using reflection to find the properties of Foo that have the Signed attribute.

```
var properties = typeof(Foo)
    .GetProperties()
    .Where(p => Attribute.IsDefined(p, typeof(SignedAttribute)))
    .OrderBy(p => p.Name)
    .ToArray();

var getSignatureMethod = new DynamicMethod(
    "GetSignature",
    typeof(string),
    new[] { typeof(Foo) },
    typeof(Foo).Module);

var generator = getSignatureMethod.GetILGenerator();

// var sb = new StringBuilder(); keep a reference to it on the evaluation stack.
generator.DeclareLocal(typeof(StringBuilder));
generator.Emit(OpCodes.Newobj, typeof(StringBuilder).GetConstructor(Type.EmptyTypes));
generator.Emit(OpCodes.Stloc_0);
generator.Emit(OpCodes.Ldloc_0);

var append = typeof(StringBuilder).GetMethod("Append", new[] { typeof(string) });
foreach (var property in properties)
{
    // Push the value of foo.<Property>.
    generator.Emit(OpCodes.Ldarg_0);
    generator.Emit(OpCodes.Callvirt, property.GetGetMethod());
    if (property.PropertyType.IsEnum)
    {
        // ValueToString(Enum) takes a reference type, so enum values must be boxed.
        generator.Emit(OpCodes.Box, property.PropertyType);
    }

    // Append returns the StringBuilder, which stays on the stack for the next property.
    generator.Emit(OpCodes.Call, typeof(Program).GetMethod("ValueToString", new[] { property.PropertyType }));
    generator.Emit(OpCodes.Callvirt, append);
}

// Discard the leftover StringBuilder reference and return sb.ToString().
generator.Emit(OpCodes.Pop);
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Callvirt, typeof(object).GetMethod("ToString", Type.EmptyTypes));
generator.Emit(OpCodes.Ret);

var getSignature = (Func<Foo, string>)
    getSignatureMethod.CreateDelegate(typeof(Func<Foo, string>));
```


(I haven't worked with dynamically generated code before, there might be some things that could be improved.)
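
One possible alternative to emitting IL by hand is to build the same method with expression trees (System.Linq.Expressions), which checks types when the tree is built. The sketch below is only an illustration under assumptions: the SignatureBuilder class and its Compile method are hypothetical names, it uses string.Concat instead of a StringBuilder, and I have not timed it against the IL version.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Linq.Expressions;

public class SignedAttribute : Attribute { }

public enum SomeEnum { One = 1, Two, Three }

public class Foo
{
    public int Id { get; set; }
    [Signed] public DateTime Timestamp { get; set; }
    [Signed] public string Name { get; set; }
    [Signed] public decimal Price { get; set; }
    [Signed] public SomeEnum Thing { get; set; }
}

// Hypothetical helper class; the names are not from the original answer.
public static class SignatureBuilder
{
    public static string ValueToString(string value) { return value ?? string.Empty; }
    public static string ValueToString(Enum value) { return Convert.ToInt32(value).ToString("X"); }
    public static string ValueToString(decimal value)
    {
        return value.ToString("0.############################", CultureInfo.InvariantCulture);
    }
    public static string ValueToString(DateTime value) { return value.Ticks.ToString("X"); }

    // Builds x => string.Concat(new[] { ValueToString(x.A), ValueToString(x.B), ... })
    // over the [Signed] properties of T, ordered by name, and compiles it once.
    public static Func<T, string> Compile<T>()
    {
        var arg = Expression.Parameter(typeof(T), "x");
        var parts = typeof(T)
            .GetProperties()
            .Where(p => Attribute.IsDefined(p, typeof(SignedAttribute)))
            .OrderBy(p => p.Name)
            .Select(p => ToStringCall(Expression.Property(arg, p)))
            .ToArray();

        var concat = typeof(string).GetMethod("Concat", new[] { typeof(string[]) });
        var body = Expression.Call(concat, Expression.NewArrayInit(typeof(string), parts));
        return Expression.Lambda<Func<T, string>>(body, arg).Compile();
    }

    private static Expression ToStringCall(Expression value)
    {
        // Enum values are boxed and routed to the ValueToString(Enum) overload.
        var parameterType = value.Type.IsEnum ? typeof(Enum) : value.Type;
        var method = typeof(SignatureBuilder).GetMethod("ValueToString", new[] { parameterType });
        if (method == null)
            throw new NotSupportedException("No ValueToString overload for " + value.Type);

        return Expression.Call(method, Expression.Convert(value, parameterType));
    }
}
```

Usage would be `var getSignature = SignatureBuilder.Compile<Foo>();` followed by `getSignature(foo)` in the loop. Compiling is a one-time cost, so the delegate should be created once per type and cached rather than rebuilt per row.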

Now our loop looks like this:

```
for (var i = 0; i < 300000; i++)
{
    var value = getSignature(foo);
    var bytes = Encoding.UTF8.GetBytes(value);
    md5.TransformBlock(bytes, 0, bytes.Length, bytes, 0);
}
```


And on my machine, it takes 0.39s, which i

Context

StackExchange Code Review Q#85956, answer score: 5
