HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

std::vector memory manipulation with serialization and deserialization

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
stdwithserializationmanipulationdeserializationmemoryandvector

Problem

This is my code and I would like to get it code reviewed. It is functional and behaves as expected.

I pass some basic types to the Serialize() function and then deserialize the output to get back the original value.

size_t returnSize(const char* s)
{
       string string(s);
       return string.size();
};
template
size_t returnSize(const T& t)
{
       return sizeof(t);
};

template
string Serialize(const T& t)
{
    T* pt = new T(t);
    vector CasttoChar;

    for (int i =0 ;i(pt)[i]);
    }
    delete pt;
    return  string( CasttoChar.begin(), CasttoChar.end() );

};

template<>
string Serialize(const string& t)
{
    return Serialize(t.c_str()); 
};

template
T DeSerialize(const string& cstr)
{
    const T* a = reinterpret_cast(cstr.c_str());
    return *a;
}

template<>
string DeSerialize(const string& cstr)
{
    return DeSerialize(cstr);
}

int _tmain(int argc, _TCHAR* argv[])
{
    int x = 97;
    string c = Serialize(x);
    cout (c) (g);
    cout (c2) (k.c_str());
    cout (c3) (ka);
    cout (c4)).c_str();

    system("PAUSE");
    return EXIT_SUCCESS;
}

Solution

First of all, fundamentally: std::string is a data type to hold text. Exclusively. It should not be used to hold binary data. Use a std::vector for that.

Secondly, you are using heap allocation without needing to:

T* pt = new T(t);


This makes no sense at all, and introduces the potential of a memory leak. You could simply make a copy – but you don’t even need to do that. Just work on the original data.

Next, about naming; it’s general convention in C++ to use underscore_separated_names rather than camelCase or PascalCase. This is only a convention but I’d follow it unless there’s a very compelling reason not to.

Also on the topic of naming: returnSize is redundant, return has no place in the name. Rename it to object_size or something along those lines. You are also inconsistent in your naming convention here.

To get the size of a zero-terminated string, use std::strlen, that’s more efficient than converting to a std::string.

Finally, the code simply doesn’t work. It creates a bit-by-bit shallow representation. That only works for simple composite objects, it no longer works for objects that use pointers or references internally – you’ve already noticed that, because otherwise you wouldn’t need to create a special chase for char and std::string. Incidentally, for the case of char you specialise returnSize but for the case of std::string you specialise Serialize and DeSerialize. That’s asymmetrical and means that you have several places that might require changing when new types are added. You should reduce this to a single point which requires specialisation.

Ignoring that, I’d rewrite the code as follows:

#include 
#include 
#include 
#include 

typedef unsigned char byte_t;
typedef std::vector buffer;

std::size_t object_size(const char* s) {
    return std::strlen(s);
};

template
std::size_t object_size(T const& obj) {
    return sizeof(obj);
};

template
buffer serialize(const T& obj) {
    std::size_t size = object_size(obj);
    buffer buf(size);

    byte_t const* obj_begin = reinterpret_cast(&obj);
    std::copy(obj_begin, obj_begin + size, buf.begin());

    return buf;
};

template<>
buffer serialize(std::string const& str) {
    return serialize(str.c_str());
};

template
T deserialize(buffer const& buf) {
    return *reinterpret_cast(&buf[0]);
}

template<>
std::string deserialize(buffer const& buf) {
    return deserialize(buf);
}

int main() {
    using std::cout;

    int x = 97;
    buffer c = serialize(x);
    cout (c) (c2) (c3) (c4) << "\n";
}


… but like I said, this solution doesn’t generalise and is pretty useless in practice. Have a look at Boost.Serialization for a proper implementation. Unfortunately this is quite a complex problem.

Code Snippets

T* pt = new T(t);
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

typedef unsigned char byte_t;
typedef std::vector<byte_t> buffer;

std::size_t object_size(const char* s) {
    return std::strlen(s);
};

template<typename T>
std::size_t object_size(T const& obj) {
    return sizeof(obj);
};

template<typename T>
buffer serialize(const T& obj) {
    std::size_t size = object_size(obj);
    buffer buf(size);

    byte_t const* obj_begin = reinterpret_cast<byte_t const*>(&obj);
    std::copy(obj_begin, obj_begin + size, buf.begin());

    return buf;
};

template<>
buffer serialize<std::string>(std::string const& str) {
    return serialize(str.c_str());
};

template<typename T>
T deserialize(buffer const& buf) {
    return *reinterpret_cast<const T*>(&buf[0]);
}

template<>
std::string deserialize<std::string>(buffer const& buf) {
    return deserialize<char*>(buf);
}

int main() {
    using std::cout;

    int x = 97;
    buffer c = serialize(x);
    cout << deserialize<int>(c) << "\n";

    char g = 'g';
    buffer c2 = serialize(g);
    cout << deserialize<char>(c2) << "\n";

    std::string k = "blabla";
    buffer c3 = serialize(k.c_str());
    cout << deserialize<char*>(c3) << "\n";

    std::string ka = "string";
    buffer c4 = serialize(ka);
    cout << deserialize<std::string>(c4) << "\n";
}

Context

StackExchange Code Review Q#20474, answer score: 8

Revisions (0)

No revisions yet.