Regular expression matching with string slice in Go
Problem
I have a slice of strings, and each string contains multiple key=value formatted messages. I want to pull all the keys out of the strings so I can collect them to use as the header for a CSV file. I do not know all potential key fields, so I have to use regular expression matching to find them.
Here is my code.
package main
import (
	"fmt"
	"regexp"
)
func GetKeys(logs []string) []string {
	// topMatches is the final array to be returned.
	// midMatches contains no duplicates, but the data is `key=`.
	// subMatches contains all initial matches.
	// initialRegex matches for anything that matches `key=`. this is because the matching patterns.
	// cleanRegex massages `key=` to `key`
	topMatches := []string{}
	midMatches := []string{}
	subMatches := []string{}
	initialRegex := regexp.MustCompile(`([a-zA-Z]{1,}\=)`)
	cleanRegex := regexp.MustCompile(`([a-zA-Z]{1,})`)
	// the nested loop for matches is because FindAllString
	// returns []string
	for _, i := range logs {
		matches := initialRegex.FindAllString(i, -1)
		for _, m := range matches {
			subMatches = append(subMatches, m)
		}
	}
	// remove duplicates.
	seen := map[string]string{}
	for _, x := range subMatches {
		if _, ok := seen[x]; !ok {
			midMatches = append(midMatches, x)
			seen[x] = x
		}
	}
	// this is where I remove the `=` character.
	for _, y := range midMatches {
		clean := cleanRegex.FindAllString(y, 1)
		topMatches = append(topMatches, clean[0])
	}
	return topMatches
}
func main() {
	y := []string{"key=value", "msg=payload", "test=yay", "msg=payload"}
	y = GetKeys(y)
	fmt.Println(y)
}
I think my code is inefficient because I cannot determine how to
Solution
You're not making good use of regular expressions. A single regex can do the job:
pattern := regexp.MustCompile(`([a-zA-Z]+)=`)
The parentheses (...) capture the part you are interested in.
You can use result := pattern.FindAllStringSubmatch(s, -1) to match a string against the regex pattern. The return value is a [][]string with one []string per match; in each []string, the 1st element is the entire matched string, and the 2nd, 3rd, ... elements hold the contents of the capture groups. In this example we have one capture group (...), so the key will be in item[1] of each []string slice.
Instead of a map[string]string for seen, a map[string]bool would be more efficient: only the presence of the key matters, so there is no need to store the string a second time as the value.
Putting it together:
func GetKeys(logs []string) []string {
	var keys []string
	pattern := regexp.MustCompile(`([a-zA-Z]+)=`)
	seen := make(map[string]bool)
	for _, log := range logs {
		result := pattern.FindAllStringSubmatch(log, -1)
		for _, item := range result {
			// item[0] is the whole match (`key=`); item[1] is the captured key.
			key := item[1]
			if !seen[key] {
				keys = append(keys, key)
				seen[key] = true
			}
		}
	}
	return keys
}
If the input strings are not guaranteed to match the pattern, you might want to add a guard statement inside the main for loop. It is optional, because FindAllStringSubmatch returns nil when nothing matches and ranging over a nil slice does nothing, but it makes the intent explicit. For example:
if len(result) == 0 {
	continue
}
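To make the shape of the FindAllStringSubmatch return value concrete, here is a minimal sketch. It assumes the same pattern as above; the helper name submatches and the sample strings are hypothetical and exist only for this demonstration.

```go
package main

import (
	"fmt"
	"regexp"
)

// keyPattern is the pattern from the answer: one capture group for the
// key, followed by a literal '='.
var keyPattern = regexp.MustCompile(`([a-zA-Z]+)=`)

// submatches is a hypothetical helper that only wraps the library call
// so its result can be inspected.
func submatches(s string) [][]string {
	return keyPattern.FindAllStringSubmatch(s, -1)
}

func main() {
	// Two key=value pairs in one string: one inner slice per match.
	for _, item := range submatches("msg=payload test=yay") {
		// item[0] is the whole match, item[1] is the capture group.
		fmt.Printf("full=%q key=%q\n", item[0], item[1])
	}

	// A string with no key=value pair: the result is nil, so a range
	// loop over it runs zero times.
	none := submatches("no pairs here")
	fmt.Println(none == nil, len(none))
}
```

Each inner slice has length 2 here because the pattern has exactly one capture group, which is why indexing item[1] inside the loop is safe.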
Context
StackExchange Code Review Q#121924, answer score: 6