HiveBrain v1.2.0
Get Started
← Back to all entries
patterncppMinor

Filtering Streambuf

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
streambuffilteringstackoverflow

Problem

Today's code is a filtering streambuf (and associated stream type) to automate word-wrapped output in C++:

``
#include
#include
#include

class widthbuf: public std::streambuf {
public:
widthbuf(int w, std::streambuf* s)
: indent_width(0),
def_width(w),
width(w),
sbuf(s),
count(0)
{}

~widthbuf() { overflow('\n'); }

void set_indent(int w) {
if (w == 0) {
prefix.clear();
indent_width = 0;
width = def_width;
}
else {
indent_width += w;
prefix = string(indent_width, space);
width -= w;
}
}

private:

typedef std::basic_string string;

// This is basically a line-buffering stream buffer.
// The algorithm is:
// - Explicit end of line ("\r" or "\n"): we flush our buffer
// to the underlying stream's buffer, and set our record of
// the line length to 0.
// - An "alert" character: sent to the underlying stream
// without recording its length, since it doesn't normally
// affect the a appearance of the output.
// - tab: treated as occupying
tab_width characters, but is
// passed through undisturbed (but if we wanted to expand it
// to
tab_width` spaces, that would be pretty easy to do so
// you could adjust the tab-width if you wanted.
// - Everything else: really basic buffering with word wrapping.
// We try to add the character to the buffer, and if it exceeds
// our line width, we search for the last space/tab in the
// buffer and break the line there. If there is no space/tab,
// we break the line at the limit.
int_type overflow(int_type c) {
if (traits_type::eq_int_type(traits_type::eof(), c))
return traits_type::not_eof(c);
switch (c) {
case '\n':
case '\r':
{
buffer += c;
count = 0;

Solution

I looked over this code carefully and came up with a number of aspects that might be considered for future enhancements.

Write initializers in declaration order

Since initialization is done in declaration order rather than the order that initializers are written, it's helpful to readers of your code to make sure that initialization order matches declaration order. In the widthbuf constructor, the code is like this:

widthbuf(int w, std::streambuf* s)
    : indent_width(0), 
    def_width(w), 
    width(w), 
    sbuf(s), 
    count(0) 
{}


But because of their declaration order, width actually gets initialized before def_width and count before sbuf, so it would be one fewer opportunity for misunderstanding if these were arranged in declaration order.

Don't rely on promises not actually made

The constructor order comment also applies to the constructor of widthstream and the order of the underlying base class std::ostream and widthstream::buf but in this particular case it's not actually guaranteed to work:

widthstream(size_t width, std::ostream &os) 
    : buf(width, os.rdbuf()),
    std::ostream(&buf)
{}


It looks as though the intent was to initialize buf first and then use it to construct the underlying std::ostream but that's not actually what happens. In fact, the code is passing a reference to a not-yet-fully-constructed buf which is unusual but not technically illegal as long as the constructor receiving the reference does not do anything with it, such as dereference it, attempt to access member functions, etc. Because the underlying base class std::ostream makes no such guarantee not to dereference, this code is prone to failure.

This can be easily demonstrated by replacing std::ostream in widthstream with evil_ostream which attempts to get the locale of the std::streambuf that is the base class of widthbuf:

class evil_ostream : public std::ostream {
public:
    evil_ostream(std::streambuf *buf) : std::ostream(buf) { 
        std::cout getloc();   // kaboom!
    }
    virtual ~evil_ostream() {};
};


The line that attempts to fetch the locale causes a segfault because the underlying object is not actually completely constructed.

Handle wstring or state that you don't

The code does not seem to be happy with std::wstring instead of std::string as input. This isn't necessarily a problem, but it should be stated, at least in comments that wide characters are not handled by this code. Alternatively, adding support for wchar_t strings could probably be done with the use of templates.

Use all of the required #includes

The type std::basic_string is used but its declaration is in #include which is not actually in the list of includes.

Consider using facets

As mentioned by @LokiAstari in comments, it might be appropriate to consider the use of a facet for this purpose. The reasons one might do so would include the fact that formatting of text representation is a particular strength of the facet approach. However, there are also reasons one might not do so, including the fact that facets typically incorporate a cultural apect that isn't really appropos here and that facets tend to have complementary input and output aspects which also aren't really gemane to this application.

Code Snippets

widthbuf(int w, std::streambuf* s)
    : indent_width(0), 
    def_width(w), 
    width(w), 
    sbuf(s), 
    count(0) 
{}
widthstream(size_t width, std::ostream &os) 
    : buf(width, os.rdbuf()),
    std::ostream(&buf)
{}
class evil_ostream : public std::ostream {
public:
    evil_ostream(std::streambuf *buf) : std::ostream(buf) { 
        std::cout << "(ab)using reference to unconstructed buf\n"; 
        auto loc = buf->getloc();   // kaboom!
    }
    virtual ~evil_ostream() {};
};

Context

StackExchange Code Review Q#54845, answer score: 5

Revisions (0)

No revisions yet.