
Best way to replace a beginning and end character in Perl using Regular Expression?


Problem

I'm wondering if there is a simpler regex I could use in my code to remove the first and last character of a line, perhaps by combining some of the regexes. In this instance, it's a comma at the beginning and end of each line. My output should be the fields separated into CSV format.

#!/usr/local/bin/perl

use strict;
use warnings;

parse_DPCRS();

sub parse_DPCRS {
    open ( FILEIN, 'txt_files/AKR_DPCRS.txt' );
    open ( FILEOUT, '>txt_files/AKR_DPCRS.csv' );
    while (<FILEIN>) {
        next if /^(\s)*$/;          #skip blank lines
        next if /^\>/;              #skip command lines that start with >
        next if /^\s+POINT\sCODE/;  #skip header
        next if /^\s+NODE\sNAME/;   #skip header
        next if /^\s+\=+/;          #skip header
        next if /^\s+CCS\sDPCRS/;   #skip pagination footer
        chomp;                      #remove trailing newline character
        s/\s+/,/g;                  #replace white space with a comma
        s/^,//;                     #remove leading comma
        s/,$//;                     #remove trailing comma
        my (
            $nodeName, $pointCodeDec ) =  split( "," );
        print FILEOUT ($nodeName . "," . $pointCodeDec . "\n");
        #print "$_\n";
    }
};
close (FILEOUT);
close (FILEIN);
exit;


Here's a slice of the text file I'm parsing:

```
>DISP CCS DPCRS ALL 0

POINT CODE POINT CODE TYPE OF ROUTESET NOTIFY NODE
NODE NAME DECIMAL HEX ROUTE MASTER SCCP LOCATION
=========== =========== ========== ======= ======== ====== ========
PBVJPRCO01T 1-1-1 010101 FULL PC 119 NO NON-ADJ
ROCHNYXA06T 1-6-1 010601 FULL PC 58 NO NON-ADJ
NYCNNYDRW17 1-6-2 010602 FULL PC 58 NO NON-ADJ
SYRCNYSW01T 1-6-3 010603 FULL PC 22 NO NON-ADJ
SYRCNYSWDS0 1-6-15 01060F FULL PC 58 NO NON-ADJ
ROCHNYFEDS0 1-6
```

Solution

Here is a single regex that removes commas at the beginning and at the end of a string:

$str =~ s/^,+|,+$//g;
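
To illustrate what the alternation does, here is a minimal sketch. The helper name `strip_commas` and the sample strings are hypothetical and not part of the original script. The `/g` flag matters here: without it the substitution stops after the first match and leaves the trailing commas in place.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading and trailing commas from a copy of the string.
sub strip_commas {
    my ($str) = @_;
    # ^,+ matches a run of commas at the start, ,+$ a run at the end;
    # /g lets both alternatives match within the same substitution.
    $str =~ s/^,+|,+$//g;
    return $str;
}

print strip_commas(',a,b,c,d,'), "\n";   # prints a,b,c,d
print strip_commas(',,a,b,,'), "\n";     # prints a,b
```

Because the pattern uses `,+`, it removes whole runs of commas at either end, which is slightly more than the single-comma `s/^,//` and `s/,$//` in the question.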


Here is a benchmark that compares this single regex with the two separate substitutions:

use Benchmark qw(:all);

my $str = q/,a,b,c,d,/;
my $count = -3;
cmpthese($count, {
        'two regex' => sub {
            $str =~ s/^,+//;
            $str =~ s/,+$//;
        },
        'one regex' => sub {
            $str =~ s/^,+|,+$//g;
        },
    });


The result:

               Rate one regex two regex
one regex  597559/s        --      -58%
two regex 1410348/s      136%        --


The two separate substitutions run more than twice as fast as the single regex that combines them (about 1.41 million versus 0.60 million iterations per second).
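
One caveat about the measurement: the benchmark modifies `$str` in place, so after the first iteration the string no longer has any leading or trailing commas and later iterations time the no-match case. A minimal sketch of a variant that works on a fresh copy each time could look like the following. The helper names `strip_two` and `strip_one` and the one-second run time are assumptions for illustration, and the numbers it prints may differ from those above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $input = q/,a,b,c,d,/;

# Hypothetical helpers: each works on a copy, so every call sees the commas.
sub strip_two {
    my ($str) = @_;
    $str =~ s/^,+//;    # leading commas
    $str =~ s/,+$//;    # trailing commas
    return $str;
}

sub strip_one {
    my ($str) = @_;
    $str =~ s/^,+|,+$//g;    # both ends in one substitution
    return $str;
}

# A negative count tells Benchmark to run each sub for at least
# that many CPU seconds instead of a fixed number of iterations.
cmpthese(-1, {
    'two regex' => sub { strip_two($input) },
    'one regex' => sub { strip_one($input) },
});
```

Both helpers include the cost of copying the argument, so the comparison between them stays fair even though the absolute rates are lower than for an in-place substitution.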


Context

StackExchange Code Review Q#5234, answer score: 2
