Read firewall logs
Problem
I am still a Python beginner and would appreciate some help with this code.
I am looking through some firewall log files, specifically at all lines with Deny in them. From those lines I extract the protocol, source IP, destination IP and destination port. The output is summarized and a hit counter is added (thanks memoselyk). Everything is working as intended, but some optimization is still needed.
What I am still trying to optimize:
- If I feed it a >3 GB log file, it takes a LONG time. Of course it is a large file, but we are talking many hours.
- The output needs some tweaking, especially getting rid of the [, ] and ' characters and tab-separating the output. I have been trying for some hours without success.
- I need to handle ICMP traffic better, but I will give this a go later. In short, my regex relies on the / character, which isn't present in ICMP traffic.
Any help would be appreciated.
Example log output:
```
Nov 9 00:36:10 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/43882 dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:10 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/38780 dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:11 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/8273 dst outside:2.2.2.2/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/23433 dst outside:2.2.2.22/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/25175 dst outside:2.2.2.24/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/15855 dst outside:2.2.2.26/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp src outside:1.1.1.1/24574 dst outside:2.2.2.27/23 by access-group "outside-in" [0x0, 0x0]
Nov 9 00:36:12 firewall %ASA-4-106023: Deny tcp
```
Solution
So you load an entire file (3+ GB), then collect all the "Deny" lines in a match list (another 3+ GB). That's a serious waste of RAM, and a huge burden on the cache and page tables. No wonder it runs so slowly.
Notice that lines have no context, and each one can be processed independently. Streaming is an obvious optimization:
```
for line in data:
    process_line(line)
```
Also notice that the line structure is very well-defined (that is, every field you are interested in sits at a fixed position), so a regex is overkill:
```
for line in data:
    fields = line.split()
    if fields[5] != 'Deny':
        continue
    ...
```
In any case, if you want to use regular expressions, compile them once, outside the loop.
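Putting the advice together, here is a minimal sketch, not the asker's original script. It assumes the field positions seen in the sample lines (index 5 is the action, 6 the protocol, 8 the source, 10 the destination). The names `parse_deny`, `parse_deny_regex`, `count_denies` and `format_report` are hypothetical helpers introduced only for illustration. A missing `/port` part is tolerated and yields an empty port string, which may help with the ICMP lines mentioned in the question, although their exact layout is not shown there.

```python
import re
from collections import Counter


def parse_deny(line):
    """Split-based parser: (protocol, src_ip, dst_ip, dst_port) or None."""
    fields = line.split()
    # Assumed layout from the sample log: 5 = action, 6 = protocol,
    # 8 = "iface:ip/port" source, 10 = "iface:ip/port" destination.
    if len(fields) < 11 or fields[5] != 'Deny':
        return None
    src_ip = fields[8].partition(':')[2].partition('/')[0]
    # partition() returns an empty port when there is no '/' at all.
    dst_ip, _, dst_port = fields[10].partition(':')[2].partition('/')
    return fields[6], src_ip, dst_ip, dst_port


# Regex alternative, compiled once at import time rather than per line.
DENY_RE = re.compile(
    r'Deny\s+(\S+)\s+src\s+[^:\s]+:([^/\s]+)(?:/\d+)?'
    r'\s+dst\s+[^:\s]+:([^/\s]+)(?:/(\d+))?'
)


def parse_deny_regex(line):
    """Same result as parse_deny, using the precompiled pattern."""
    match = DENY_RE.search(line)
    if match is None:
        return None
    protocol, src_ip, dst_ip, dst_port = match.groups()
    return protocol, src_ip, dst_ip, dst_port or ''


def count_denies(lines, parse=parse_deny):
    """Stream over lines one at a time and count identical tuples."""
    hits = Counter()
    for line in lines:
        key = parse(line)
        if key is not None:
            hits[key] += 1
    return hits


def format_report(hits):
    """One tab-separated row per tuple, most frequent first."""
    return ['\t'.join(key + (str(count),)) for key, count in hits.most_common()]
```

Passing an open file object to `count_denies` keeps only one line plus the counter in memory, since iterating a file yields lines lazily. Joining each tuple with `'\t'` is what removes the `[`, `]` and `'` characters that appear when a list is printed directly.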
Code Snippets
```
for line in data:
    process_line(line)
```
```
for line in data:
    fields = line.split()
    if fields[5] != 'Deny':
        continue
    ...
```
Context
StackExchange Code Review Q#110492, answer score: 5