HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

strncpy To strcpy Equivalence

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
strncpystrcpyequivalence

Problem

I have this ugly function, and I feel that the entire strncpy should just be an strcpy:

void PackData(char*& cursor, const std::string& data) {
    *(reinterpret_cast(cursor)) = static_cast(data.length() + 1);
    cursor += sizeof(int);
    // copy the text to the buffer
    ::strncpy(cursor, data.c_str(), data.size());
    cursor += (data.length() * sizeof(char));
    *(reinterpret_cast(cursor)) = 0;
    cursor += sizeof(char);
}


cursor is guaranteed to have enough room to contain all the data copied. And data only contains a '\0' character at termination.

I want to update this function to use strcpy, and to remove some of the ugly. Here's what I have:

void PackData(char*& cursor, const std::string& data) {
    const int size = data.size() + 1;

    std::copy_n(cursor, sizeof(int), reinterpret_cast(&size));
    cursor += sizeof(int);
    strcpy(cursor, data.c_str());
    cursor += size;
}


My code works fine, but I wanted to ask if anyone sees any misbehavior that I may have missed?

Solution

The new version of your function does not have any actual mistakes I could spot. It is good that you have replaced the assignment to *(reinterpret_cast(cursor)) with a call to std::copy_n because the old version (presumably) violated strict aliasing rules and therefore invoked undefined behavior. (Also think about alignment.) It also seemed wrong in your old function that you casted data.length() + 1 to short instead of int.

I have a few “soft” remarks, though.

Avoid output parameters

Taking a pointer by reference and then modifying does not appear pretty to me. Instead, I'd take the pointer by-value and return the number of bytes by which it was advanced.

std::size_t PackData(char * cursor, const std::string& data);


It looks better and allows for more flexible use.

Chose the appropriate library function

The string manipulation functions from the C standard library all operate on NUL terminated strings. That is, std::strcpy has to look at every byte of the string to determine whether it is the end of the string. You've mentioned that your string does not contain NUL bytes today. But what if it will do so in the future? Will you remember to update the function? And even if you are sure that NUL bytes will never be an issue, it is still less efficient than it could be. Since the std::string already knows its own length, you have a couple of options available.

  • std::memcpy – if you like doing things the C way



  • std::copy – if you like doing things the STL way



  • std::string::copy – if you like doing things the OO way



Note that none of these functions will NUL terminate the copied string, so you'll have to add the NUL byte yourself. Not hard to do.

cursor[size] = '\0';


Using std::memcpy could also help you with storing the length. Since its arguments are void pointers, there is no need to cast anything.

std::memcpy(cursor, &size, sizeof(size));


looks cleaner to me than

std::copy_n(cursor, sizeof(int), reinterpret_cast(&size));


Don't repeat yourself

Your code mentions the type of size three times. Once in the declaration – which is fine – and then two times in sizeof expressions. By using sizeof(size) instead of sizeof(int), you can eliminate this redundancy. If you ever change the data type, you only need to do so in one place and you can't get it inconsistent.

Verify your assumptions

The cast

const int size = data.size() + 1;


looks rather safe. But are you sure? Integer overflow is a nasty source of bugs and a potential entry door for crackers. I would recommend two things.

First, make the fact that you're narrowing explicit by spelling it out.

const auto size = static_cast(data.size() + 1);


Note that I can use auto now in order to follow the advice from the previous section.

Second, add defensive code (before the cast) to verify your assumptions.

assert(INT_MAX > data.size());


This is safe if your function is only called with trusted input. If attackers are of concern, assert is the wrong tool because it may be compiled out and you don't want your debug builds to be safe only to open security holes in your release builds. Therefore, an explicit check like

if (INT_MAX <= data.size())
  throw std::invalid_argument {"data too large"};


would be preferred in this case.

This code is safe as long as sizeof(int) <= sizeof(std::string::size_type) which I believe is true on almost any implementation. However, it might still trigger a compiler warning about comparing a signed and an unsigned integer. Actually, size could be unsigned anyway.

Alternatively, you could use a helper function checked_cast(data.size() + 1). I believe that the argument cannot overflow because a std::string always has to account for the possibility to NUL terminate its buffer.

Watch out for data leaks

I don't know about the environment in which your function is called but is the buffer that cursor points to aligned? If so, watch out for unused padding bytes after the end of the string and the next thing that is written into the buffer. They might leak information you'd rather not have anybody see.

Code Snippets

std::size_t PackData(char * cursor, const std::string& data);
cursor[size] = '\0';
std::memcpy(cursor, &size, sizeof(size));
std::copy_n(cursor, sizeof(int), reinterpret_cast<char*>(&size));
const int size = data.size() + 1;

Context

StackExchange Code Review Q#117494, answer score: 3

Revisions (0)

No revisions yet.