
Read firewall logs

Submitted by: @import:stackexchange-codereview

Problem

I am still a Python beginner and would appreciate some help with this code.

I am looking through some firewall log files, more specifically at all lines containing Deny. From those lines I extract the protocol, source IP, destination IP and destination port. The output is summarized and a hit counter is added (thanks memoselyk). Everything is working as intended, but some optimization is still needed.

What I am still trying to optimize:

- If I feed it a >3 GB log file it takes a LONG time. Of course it is a large file, but we are talking many hours.
- The output could use some tweaking, especially getting rid of the [, ] and ' characters and tab-separating the columns. I have been trying for some hours without success.
- I have to do some optimization with regard to ICMP traffic, but I will give this a go later. In short, my regex relies on the / character, which is not present in ICMP entries.

Any help would be appreciated.

Example log output:

```
Nov 9 00:36:10 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/43882 dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:10 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/38780 dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:11 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/8273 dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/23433 dst outside:2.2.2.22/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/25175 dst outside:2.2.2.24/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/15855 dst outside:2.2.2.26/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/24574 dst outside:2.2.2.27/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp
```

Solution

So you load an entire file (3+ GB), then collect all the "Deny" lines in a match list (another 3+ GB). That is a serious waste of RAM, and a huge burden on the cache and page tables. No wonder it runs so slowly.

Notice that lines have no context, and each one can be processed independently. Streaming is an obvious optimization:

for line in data:
    process_line(line)
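
A minimal sketch of what streaming could look like, assuming `data` is an open file (or any other iterable of lines). The `process_line` body, the `stream_log` helper and the file name are hypothetical and only illustrate that no list of lines is ever built:

```python
from collections import Counter

def process_line(line, counts):
    """Hypothetical per-line handler: tally lines that contain 'Deny'."""
    if 'Deny' in line:
        counts['deny'] += 1
    else:
        counts['other'] += 1

def stream_log(lines):
    """Consume an iterable of lines one at a time; no line is kept in memory."""
    counts = Counter()
    for line in lines:
        process_line(line, counts)
    return counts

# An open file is itself a lazy iterable of lines, for example:
#     with open('firewall.log') as data:   # hypothetical path
#         counts = stream_log(data)
sample = [
    'Nov 9 00:36:10 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/43882 '
    'dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]',
    'some other line',
]
print(stream_log(sample))
```

Memory use then depends on the size of the summary, not on the size of the log file.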


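Also notice that the line structure is very well-defined (that is, every field you are interested in sits at a fixed position once the line is split on whitespace), so regex is overkill:

for line in data:
    fields = line.split()
    # Skip blank or short lines and anything that is not a Deny entry.
    if len(fields) < 6 or fields[5] != 'Deny':
        continue
    ...  # pick the fields of interest out of `fields` here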
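
The split-based approach might be fleshed out as follows. This is a sketch that assumes every Deny line has the same layout as the sample lines in the question (protocol at index 6, source at index 8, destination at index 10). The helper names `parse_deny`, `summarize` and `format_report` are made up for illustration. `str.partition` is used so that an entry without a `/` simply yields an empty port instead of raising:

```python
from collections import Counter

def parse_deny(line):
    """Return (protocol, src_ip, dst_ip, dst_port) for a Deny line, else None."""
    fields = line.split()
    # The destination field is at index 10, so at least 11 fields are needed.
    if len(fields) < 11 or fields[5] != 'Deny':
        return None
    protocol = fields[6]
    # 'outside:1.1.1.1/43882' -> drop the interface name, then split address/port.
    src_ip, _, _src_port = fields[8].partition(':')[2].partition('/')
    dst_ip, _, dst_port = fields[10].partition(':')[2].partition('/')
    return protocol, src_ip, dst_ip, dst_port  # dst_port is '' if there is no '/'

def summarize(lines):
    """Count how often each (protocol, src, dst, port) combination occurs."""
    hits = Counter()
    for line in lines:
        record = parse_deny(line)
        if record is not None:
            hits[record] += 1
    return hits

def format_report(hits):
    """Tab-separated rows with the hit count last, most frequent first."""
    return ['\t'.join(record + (str(count),)) for record, count in hits.most_common()]

sample = [
    'Nov 9 00:36:10 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/43882 '
    'dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]',
    'Nov 9 00:36:10 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/38780 '
    'dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]',
    'Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/23433 '
    'dst outside:2.2.2.22/23 by access-group "outside-in" [0x0, 0x0]',
]
for row in format_report(summarize(sample)):
    print(row)
```

Joining the tuple with `'\t'` also takes care of the stray `[`, `]` and `'` characters, which appear when a list is printed directly.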


In any case, if you want to use regular expressions, compile them once, outside the loop.
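
For the regex route, a sketch along these lines compiles the pattern a single time and reuses it for every line. The pattern itself is an assumption derived from the sample lines, not the asker's original expression. The port parts are optional on the assumption, taken from the question, that ICMP entries carry no `/`:

```python
import re

# Compiled once at module level. The pattern is an assumption based on the
# sample log lines; both port parts are optional.
DENY_RE = re.compile(
    r'Deny (?P<proto>\S+) src [^:\s]+:(?P<src>[^/\s]+)(?:/\d+)? '
    r'dst [^:\s]+:(?P<dst>[^/\s]+)(?:/(?P<dport>\d+))?'
)

def match_deny(line):
    """Return a dict of the named groups, or None if the line does not match."""
    m = DENY_RE.search(line)
    return m.groupdict() if m else None

line = ('Nov 9 00:36:10 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/43882 '
        'dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]')
print(match_deny(line))
```

The `re` module caches recently used patterns, so the gain from compiling is mostly the saved lookup on every call; over millions of lines that can still add up.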

Context

StackExchange Code Review Q#110492, answer score: 5
