Regex to extract selective lines

Submitted by: @import:stackexchange-codereview

Problem

I have a list of error messages:

def errorMessages = ["Line : 1 Invoice does not foot Reported",
                     "Line : 2 Could not parse INVOICE_DATE value",
                     "Line 3 : Could not parse ADJUSTMENT_AMOUNT value",
                     "Line 4 : MATH ERROR",
                     "cl_id is a required field",
                     "File Error : The file does not contain delimiters",
                     "lf_name is a required field"]


I am trying to create a new list containing every message that does not match the regex "^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.+", plus any message that contains the text Invoice does not foot Reported.

My desired new list should be like this:

def headErrors = ["Line : 1 Invoice does not foot Reported",
                  "cl_id is a required field",
                  "File Error : The file does not contain delimiters",
                  "lf_name is a required field"]


This is what I am doing for now:

def regex = "^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.+"
def headErrors = []
errorMessages.each {
    if (it.contains('Invoice does not foot Reported'))
        headErrors.add(it)
    else if (!it.matches(regex))
        headErrors.add(it)
}


Is there a way it can be done just using regex instead of if else?

Solution

Use findAll() instead of each(). findAll() is a filter method available on any Collection, Iterable, or Map. It returns only the elements for which the Closure argument evaluates to true.

Here, you can keep each String that either contains the desired text or does not match the supplied regex. findAll() is a better fit than each() because it returns the filtered collection directly, whereas each() returns the original collection and leaves you to build the result list yourself.
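To illustrate the difference in return values, here is a minimal sketch on a throwaway list. The names `nums`, `fromEach`, and `evens` are made up for this example and are not part of the original question.

```groovy
// Hypothetical sample data, used only to compare each() and findAll().
def nums = [1, 2, 3, 4]

// each() ignores the closure's result and returns the list it was called on.
def fromEach = nums.each { it % 2 == 0 }
assert fromEach.is(nums)

// findAll() returns a new list holding only the elements
// for which the closure evaluated to true.
def evens = nums.findAll { it % 2 == 0 }
assert evens == [2, 4]

// The original list is left unchanged by both calls.
assert nums == [1, 2, 3, 4]
```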

def regex = "^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.+"

def headErrors = errorMessages.findAll { 
    it.contains('Invoice does not foot Reported') || !(it ==~ regex) 
}

assert headErrors ==  [
    'Line : 1 Invoice does not foot Reported', 
    'cl_id is a required field', 
    'File Error : The file does not contain delimiters', 
    'lf_name is a required field'
]
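
If the goal is to avoid the two-part condition entirely, one possible approach is to fold both checks into a single pattern: an alternation between the literal text and a negative lookahead built from the original regex. This is a sketch rather than part of the original answer, and the name `combined` is an assumption introduced here. It uses find() semantics rather than a full match, so the `.+` tail sits inside the lookahead.

```groovy
def errorMessages = ["Line : 1 Invoice does not foot Reported",
                     "Line : 2 Could not parse INVOICE_DATE value",
                     "Line 3 : Could not parse ADJUSTMENT_AMOUNT value",
                     "Line 4 : MATH ERROR",
                     "cl_id is a required field",
                     "File Error : The file does not contain delimiters",
                     "lf_name is a required field"]

// Hypothetical combined pattern (an assumption, not from the original answer):
//   alternative 1: the message contains the literal text, or
//   alternative 2: at the start of the string, the "Line ..." shape does NOT follow.
def combined = ~'Invoice does not foot Reported|^(?!Line\\s(?:\\d+\\s)?\\s*:\\s+(?:\\d+)?.+)'

// find() succeeds if either alternative matches anywhere it is allowed to.
def headErrors = errorMessages.findAll { combined.matcher(it).find() }

assert headErrors == [
    'Line : 1 Invoice does not foot Reported',
    'cl_id is a required field',
    'File Error : The file does not contain delimiters',
    'lf_name is a required field'
]
```

The two-condition closure above is arguably easier to read; the single pattern mainly helps when only one regex can be passed around, for example through configuration.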

Context

StackExchange Code Review Q#86159, answer score: 6
