Processing C++ comments

Submitted by: @import:stackexchange-codereview

Problem

Here's the first functional version of my Python 2 script for processing comments in C++ source files. It's a personal project, and I expect to expand it later with more advanced options (mainly replacing comments with whitespace or marking their original positions in the comment-only output).

It's also intended as a learning exercise. I am self-taught in Python; my primary language is C++. So the core of my question is whether the code is "Pythonic" and, if not, how to improve it. I don't want to "write C++ with a different syntax"; I want to (learn to) write proper Python.

I will of course also welcome any other comments (general style, efficiency, safety).

```
#! /usr/bin/env python

# Copyright Petr Kmoch 2014

"""Script for processing comments and non-comment code in C++ files.

The goal of this script is to extract comments from C++ files and output only the comments, only the
non-comment code, or both.

For a quick usage summary, pass '-h' or '--help' as a command-line argument.
"""

import argparse
import os
import re
import sys

class Progress(object):
  """This class stores intermediary data when processing a single line of input.

  It is used as a means of communicating data between the processFile() function and State_*
  objects.

  It contains the following members:
    line: The tail part of the input line which has not been processed yet.
    finished: Boolean flag indicating that the entire line has been processed.
    stateChange:
      If not None, this member holds a callable which takes the state stack as argument
      and will modify it according to the results of processing the line.
    noncomments: String of non-comments extracted from the line during processing.
    comments: String of comments extracted from the line during processing.
  """

  def __init__(self, line):
    object.__init__(self)
    self.line = line
    self.resetProcessing()

  def resetProcessing(self):
    """Clear the results of previously processing a piece
```

Solution

A few brief comments from an initial read through:

- A lot of your method names are mixed case, but the Python convention is lowercase with underscores. The Python style guide, PEP 8, says:

> Function names should be lowercase, with words separated by underscores as necessary to improve readability.
>
> mixedCase is allowed only in contexts where that's already the prevailing style (e.g. threading.py), to retain backwards compatibility.

PEP 8 also recommends 4 spaces for indentation, rather than the 2 spaces you’ve used, but that’s not worth getting too worked up about.

- Why do the docstrings for extractNonComment and extractComment both say they append to noncomments, when this doesn’t seem to match what they’re actually doing?

- Within the extract method: if the initial bound of a string slice isn’t set, it defaults to 0, so you can replace val = self.line[0 : length] with val = self.line[:length].

Next, a slice bound of None is treated as if it had been omitted, so a slice from the start up to None returns the whole string. For example:

>>> my_string = "12345\n"
>>> my_string[:None]
'12345\n'


So you don’t need to check explicitly whether length is None: just set val = self.line[:length]. Then you can trim self.line by the length of val. Something like:

def extract(self, length=None):
    val = self.line[:length]
    self.line = self.line[len(val):]
    return val
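
The slicing behaviour described above can be checked with a small self-contained sketch. The pared-down `Progress` class here is only a stand-in that keeps the `line` attribute, not the full class from the question, and the sample input line is made up for illustration:

```python
class Progress(object):
    """Hypothetical, pared-down stand-in for the question's Progress class."""

    def __init__(self, line):
        # The part of the input line that has not been processed yet.
        self.line = line

    def extract(self, length=None):
        """Remove and return the first `length` characters of self.line.

        A bound of None is treated as an omitted bound, so calling
        extract() with no argument takes everything that is left.
        """
        val = self.line[:length]
        # Trim by what was actually taken, which may be shorter than
        # `length` if the line runs out first.
        self.line = self.line[len(val):]
        return val


progress = Progress("int x; // counter\n")  # made-up sample input
code = progress.extract(7)   # the first 7 characters: "int x; "
rest = progress.extract()    # no length given: the rest of the line
```

Trimming by `len(val)` rather than by `length` means the method also copes with a `length` that is larger than the remaining text: the slice simply stops at the end of the string and `self.line` becomes empty.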

Context

StackExchange Code Review Q#45484, answer score: 4
