Filtering file for command line tool
Problem
I've written a little command-line utility for filtering log files. It's like grep, except instead of operating on lines it operates on log4j-style messages, which may span multiple lines, with the first line always including the logging level (TRACE, DEBUG etc.).
Example usage on a file short.log with contents like this:
16:16:12 DEBUG - Something happened, here's a couple lines of info:
debug line
another debug line
16:16:14 - I'm being very verbose 'cause you've put me on TRACE
trace info
16:16:15 TRACE - single line trace
16:16:16 DEBUG - single line debug
logrep -f short.log DEBUG produces:
16:16:12 DEBUG - Something happened, here's a couple lines of info:
debug line
another debug line
16:16:16 DEBUG - single line debug
I think the main loop of the program could probably be simplified with some sort of parse and filter.
file = fileinput.input(options.file)
try:
    line = file.next()
    while True:
        if any(s in line for s in loglevels):
            if filter in line:
                sys.stdout.write(line)
                line = file.next()
                while not any(s in line for s in loglevels):
                    sys.stdout.write(line)
                    line = file.next()
                continue
        line = file.next()
except StopIteration:
    return
Solution
I see two problems with your loops.
In terms of style, calling file.next() and catching StopIteration is highly unconventional. The normal way to iterate is:

    for line in fileinput.input(options.file):
        …

In terms of functionality, I would personally consider the grepper to be buggy because it will fail to find a log message where the filter keyword that you are seeking appears on a continuation line.

To solve both problems, I would decompose the problem into two parts: reconstructing the logical messages (somewhat ugly) and searching (relatively straightforward).
import fileinput
import re
import sys

def log_messages(lines):
    """
    Given an iterator of log lines, generate pairs of
    (level, message), where message is a logical log message,
    possibly multi-line.
    """
    log_level_re = re.compile(r'\b(TRACE|DEBUG|WARN|ERROR|CRITICAL)\b')
    message = None
    for line in lines:
        match = log_level_re.search(line)
        if match:                       # First line
            if message is not None:
                yield level, message
            level, message = match.group(), line
        elif message is not None:       # Continuation line
            message += line
    if message is not None:             # End of file
        yield level, message

for level, message in log_messages(fileinput.input(options.file)):
    if filter in message:
        sys.stdout.write(message)

Note that I've used a regular expression to look for TRACE, DEBUG, etc. The \b anchors ensure that we don't mistake words like "INTRACELLULAR" for a TRACE message.
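To make the word-boundary point concrete, here is a minimal sketch comparing the regular expression with a plain substring test like the one in the question. The helper name find_level and the "INTRACELLULAR" sample line are made up for illustration; only the pattern itself comes from log_messages().

```python
import re

# Same pattern as in log_messages(): a level name delimited by word boundaries.
LOG_LEVEL_RE = re.compile(r'\b(TRACE|DEBUG|WARN|ERROR|CRITICAL)\b')

def find_level(line):
    """Return the first log level found in line as a whole word, or None.

    Hypothetical helper, used here only to demonstrate the pattern.
    """
    match = LOG_LEVEL_RE.search(line)
    return match.group() if match else None

# A header line taken from short.log, and an invented continuation line
# that merely contains the letters "TRACE" inside a longer word.
header = "16:16:15 TRACE - single line trace\n"
continuation = "measuring INTRACELLULAR calcium levels\n"

print(find_level(header))         # TRACE
print(find_level(continuation))   # None
print("TRACE" in continuation)    # True: a plain substring test is fooled
```

With the substring test, the second line would be treated as the start of a new TRACE message; with the \b anchors it is treated as a continuation line.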
Context
StackExchange Code Review Q#31768, answer score: 2