Regex to find addresses and phone numbers
Problem
I am trying to optimize my Java code where I am parsing an address field.
Address fields have the format:
full_address;phone; full_address;phone; full_address;phone;
where
full_address = addresstype^street^city^state^zip
and where
street = street1;street2;street3;street4;
So my string is
final String string = "Billing^Tata;3001 Garden Parkway^^NJ^;100-00-0009;Home^Goggle;3341 Main Parkway^^NY^;;";
My object location stores each of the above attributes.
//regular expression to match the address type
Pattern newPattern = Pattern.compile("(([^\\^]*)\\^([^\\^]*)\\^([^\\^]*)\\^([^\\^]*)\\^([^;]*);([^;]*);)");
Matcher newMatcher = newPattern.matcher(addressLongText);
List discreteListOfLocations = new ArrayList();
MatchResult result = null;
while (newMatcher.find())
{
result = newMatcher.toMatchResult();
Location location = new Location();
location.setAddressTypeCdValue(result.group(2));
String[] str_arr = result.group(3).split(";");
if (str_arr.length > 0)
{
location.setStreetAddress1(str_arr[0]);
}
if (str_arr.length > 1)
{
location.setStreetAddress2(str_arr[1]);
}
if (str_arr.length > 2)
{
location.setStreetAddress3(str_arr[2]);
}
if (str_arr.length > 3)
{
location.setStreetAddress4(str_arr[3]);
}
location.setCity(result.group(4));
location.setState(result.group(5));
location.setZip(result.group(6));
discreteListOfLocations.add(location);
}
I am a bit confused how to optimize the regex so that it is easier for someone else to understand what my regex is doing. Any idea or suggestion will be helpful.
Solution
I'm not sure how string concatenation works in Java.
Below is your regex, formatted and commented (by RegexFormat 5).
This puts it in expanded mode. The good thing is that anybody can read it in
your source code for later reference.
The redundant outer group is dropped here, so the fields are groups 1 to 6
rather than 2 to 7.
Below are two versions. The first is a normal C++-style concatenation where
newlines (\n) are added. The second is a single quoted string where the newlines are natural.
The nice thing about doing this in your code is that you can always print it out
for debugging purposes. It prints in a nicely formatted way.
"(?x) \n"
" ( [^\\^]* ) # (1), Address type \n"
" \\^ \n"
" ( [^\\^]* ) # (2), street1;street2;street3;street4; \n"
" \\^ \n"
" ( [^\\^]* ) # (3), City \n"
" \\^ \n"
" ( [^\\^]* ) # (4), State \n"
" \\^ \n"
" ( [^;]* ) # (5), Zip \n"
" ; \n"
" ( [^;]* ) # (6), Phone \n"
" ; \n"
======================================
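In Java, adjacent string literals are not joined automatically as they are in C++, so the concatenated version above needs a `+` between the lines. The following is a minimal sketch of how that might look, run against the sample string from the question. The class name `AddressRegexDemo` and the helper `parse` are hypothetical names chosen for illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressRegexDemo {
    // (?x) turns on comments mode: whitespace in the pattern is ignored and
    // '#' starts a comment that runs to the end of the line, which is why
    // every line ends with an explicit "\n".
    static final Pattern ADDRESS = Pattern.compile(
          "(?x)                                              \n"
        + " ( [^\\^]* )   # (1) address type                 \n"
        + " \\^                                              \n"
        + " ( [^\\^]* )   # (2) street1;street2;street3;street4; \n"
        + " \\^                                              \n"
        + " ( [^\\^]* )   # (3) city                         \n"
        + " \\^                                              \n"
        + " ( [^\\^]* )   # (4) state                        \n"
        + " \\^                                              \n"
        + " ( [^;]* )     # (5) zip                          \n"
        + " ;                                                \n"
        + " ( [^;]* )     # (6) phone                        \n"
        + " ;                                                \n");

    // Hypothetical helper: returns one array per record, holding
    // type, street, city, state, zip and phone in that order.
    public static List<String[]> parse(String input) {
        List<String[]> records = new ArrayList<>();
        Matcher m = ADDRESS.matcher(input);
        while (m.find()) {
            String[] fields = new String[m.groupCount()];
            for (int i = 0; i < fields.length; i++) {
                fields[i] = m.group(i + 1);
            }
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) {
        String input = "Billing^Tata;3001 Garden Parkway^^NJ^;100-00-0009;"
                     + "Home^Goggle;3341 Main Parkway^^NY^;;";
        for (String[] record : parse(input)) {
            System.out.println(String.join(" | ", record));
        }
    }
}
```

Because the pattern is an ordinary string, `ADDRESS.pattern()` can be printed while debugging and comes out in the same commented layout.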
"(?x)
( [^\\^]* ) # (1), Address type
\\^
( [^\\^]* ) # (2), street1;street2;street3;street4;
\\^
( [^\\^]* ) # (3), City
\\^
( [^\\^]* ) # (4), State
\\^
( [^;]* ) # (5), Zip
;
( [^;]* ) # (6), Phone
;
"
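The closest Java equivalent of the second version is probably a text block, where the newlines are part of the literal. The sketch below assumes Java 15 or later and passes `Pattern.COMMENTS` as a flag instead of writing `(?x)` in the pattern. Backslashes still have to be doubled inside a text block. The class name `AddressRegexTextBlock` and the helper `streetLines` are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressRegexTextBlock {
    // Requires Java 15+ for the text block syntax.
    // Pattern.COMMENTS has the same effect as a leading (?x).
    static final Pattern ADDRESS = Pattern.compile("""
        ( [^\\^]* )   # (1) address type
        \\^
        ( [^\\^]* )   # (2) street1;street2;street3;street4;
        \\^
        ( [^\\^]* )   # (3) city
        \\^
        ( [^\\^]* )   # (4) state
        \\^
        ( [^;]* )     # (5) zip
        ;
        ( [^;]* )     # (6) phone
        ;
        """, Pattern.COMMENTS);

    // Hypothetical helper: returns the street lines of the first record
    // found in the input, or an empty array if nothing matches.
    public static String[] streetLines(String input) {
        Matcher m = ADDRESS.matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        return m.group(2).split(";");
    }

    public static void main(String[] args) {
        String record = "Billing^Tata;3001 Garden Parkway^^NJ^;100-00-0009;";
        for (String line : streetLines(record)) {
            System.out.println(line);
        }
    }
}
```

On Java versions without text blocks, the concatenated form is the fallback; both compile to the same expression.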
Context
StackExchange Code Review Q#69169, answer score: 2