patternjavaMinor
Parsing multiple line records using Java 8 Streams
Viewed 0 times
linerecordsjavaparsingusingmultiplestreams
Problem
There was a suggestion in this StackOverflow question to move this question here. I added some more information and restructured the question a bit.
I'm trying to parse the following file which contains information in the following format:
TABLE_NAME
VARIABLE_LIST_OF_COLUMNS
VARIABLE_NUMBER_OF_ROWS (Seperated by a tab seperator)
An example (using ',' as the seperator for the question; actual seperator is a tab):
STUDENTS
ID
NAME
1,Mike
2,Kimberly
The idea is to build a list of insert sql statements (context for the code snippet). So the output would be (ignore numeric/null values for now):
INSERT INTO STUDENTS (ID, NAME) VALUES ('1','Mike');
INSERT INTO STUDENTS (ID, NAME) VALUES ('2','Kimberly');
What I want to know is whether this kind of multiline parsing is at all possible using java 8 streams API? This is what I have at the moment:
```
public final class StatementGeneratorMain {
public static void main(final String[] args) throws Exception{
List fileNames = Arrays
.asList("STUDENTS.txt");
fileNames.stream()
.forEach(fileName -> {
String tableName;
List columnNames;
List dataRows;
try (BufferedReader br = getBufferedReader(fileName)) {
tableName = br.lines().findFirst().get();
} catch (Exception e) {
throw new RuntimeException(e);
}
try (BufferedReader br = getBufferedReader(fileName)) {
//skip the first line because its been processed.
columnNames = br.lines().skip(1).filter(v -> v.split("\t").length == 1).collect(toList());
} catch (Exception e) {
throw new RuntimeException(e);
}
try (BufferedReader br = getBufferedReader(fileName)) {
//skip the first line and the columns leng
I'm trying to parse the following file which contains information in the following format:
TABLE_NAME
VARIABLE_LIST_OF_COLUMNS
VARIABLE_NUMBER_OF_ROWS (Seperated by a tab seperator)
An example (using ',' as the seperator for the question; actual seperator is a tab):
STUDENTS
ID
NAME
1,Mike
2,Kimberly
The idea is to build a list of insert sql statements (context for the code snippet). So the output would be (ignore numeric/null values for now):
INSERT INTO STUDENTS (ID, NAME) VALUES ('1','Mike');
INSERT INTO STUDENTS (ID, NAME) VALUES ('2','Kimberly');
What I want to know is whether this kind of multiline parsing is at all possible using java 8 streams API? This is what I have at the moment:
```
public final class StatementGeneratorMain {
public static void main(final String[] args) throws Exception{
List fileNames = Arrays
.asList("STUDENTS.txt");
fileNames.stream()
.forEach(fileName -> {
String tableName;
List columnNames;
List dataRows;
try (BufferedReader br = getBufferedReader(fileName)) {
tableName = br.lines().findFirst().get();
} catch (Exception e) {
throw new RuntimeException(e);
}
try (BufferedReader br = getBufferedReader(fileName)) {
//skip the first line because its been processed.
columnNames = br.lines().skip(1).filter(v -> v.split("\t").length == 1).collect(toList());
} catch (Exception e) {
throw new RuntimeException(e);
}
try (BufferedReader br = getBufferedReader(fileName)) {
//skip the first line and the columns leng
Solution
Simply put, you only need a
In my sample implementation above, I've made the
SQL injection
The sample implementation provides a
The parameter list is driven by an
With that said, assuming you can absolutely trust your file-based input (i.e. sanitized inputs), we can then proceed with using
Here's what I used in my
And the sample output:
Consumer-implementing class that knows what lines to read as the table name, the columns and rows.public class StatementGenerator implements Consumer {
private static String START = " (";
private static String END = ") ";
private static String ROW_DELIMITER = "\n";
private String delimiter = "\t";
private String parameter = "%";
private ObjIntConsumer> parameterSupplier =
(list, i) -> { list.add(parameter); };
private String table = null;
private List columns = new ArrayList<>();
private List rows = new ArrayList<>();
public void setDelimiter(String delimiter) {
this.delimiter = delimiter;
}
public void setParameter(String parameter) {
this.parameter = parameter;
}
public void setParameterSupplier(ObjIntConsumer> parameterSupplier) {
this.parameterSupplier = parameterSupplier;
}
@Override
public void accept(String t) {
if (table == null) {
table = t;
return;
}
if (!t.contains(delimiter)) {
columns.add(t);
return;
}
rows.add(t);
}
public String getTableName() {
return table;
}
public List getColumns() {
return columns;
}
public List getRows() {
return rows;
}
public String getParameterizedStatement() {
StringBuilder result = new StringBuilder("INSERT INTO ");
result.append(constructSegment(getTableName(), getColumns())).append(
constructSegment("VALUES", IntStream.rangeClosed(1, getColumns().size())
.collect(ArrayList::new, parameterSupplier, List::addAll)));
return result.append(";").toString();
}
public List getRawStatements() {
String placeholderStatement = getParameterizedStatement()
.replaceAll(Pattern.quote(parameter), "%s");
return getRows().stream().map(r -> String.format(placeholderStatement,
Pattern.compile(delimiter).splitAsStream(r)
.map(v -> "'" + v + "'").toArray()))
.collect(Collectors.toList());
}
public String getFullStatement() {
return getRawStatements().stream().collect(Collectors.joining(ROW_DELIMITER));
}
@Override
public String toString() {
return getParameterizedStatement();
}
private static String constructSegment(String prefix, List list) {
return prefix + list.stream().collect(Collectors.joining(", ", START, END));
}
}In my sample implementation above, I've made the
delimiter configurable, but there's one other configuration I want to highlight - parameter.SQL injection
The sample implementation provides a
getParameterizedStatement() as the starting point, because by right you should let your database driver handle the escaping of quotes (and possibly other magic values). We certainly do not need Little Bobby Tables to hang around here. The parameter list is driven by an
ObjIntConsumer, because if I'm not mistaken, there are certain drivers allowing for an index-based parameter substitution. That will mean you can potentially override parameterSupplier with something like (list, i) -> { list.add("::" + i); }; (or whatever the placeholder format is).With that said, assuming you can absolutely trust your file-based input (i.e. sanitized inputs), we can then proceed with using
getRawStatements(), which performs a simple substitution on the parameter with our String.format()'s "%s" placeholder. This is more aligned with what you are asking for. getFullStatement() simply concatenates all the rows into a single String, if that is what you actually require.Here's what I used in my
main() code:public class StatementGeneratorMain {
public static void main(String[] args) throws IOException, URISyntaxException {
StatementGenerator generator = new StatementGenerator();
generator.setDelimiter(",");
try (Stream lines = Files.lines(Paths.get(
ClassLoader.getSystemResource("sqlRecords.txt").toURI()))) {
lines.forEach(generator);
}
System.out.println(generator.getTableName());
System.out.println(generator.getColumns());
System.out.println(generator.getParameterizedStatement());
generator.getRawStatements().forEach(System.out::println);
}
}And the sample output:
STUDENTS
[ID, NAME]
INSERT INTO STUDENTS (ID, NAME) VALUES (%, %) ;
INSERT INTO STUDENTS (ID, NAME) VALUES ('1', 'Mike') ;
INSERT INTO STUDENTS (ID, NAME) VALUES ('2', 'Kimberly') ;Code Snippets
public class StatementGenerator implements Consumer<String> {
private static String START = " (";
private static String END = ") ";
private static String ROW_DELIMITER = "\n";
private String delimiter = "\t";
private String parameter = "%";
private ObjIntConsumer<List<String>> parameterSupplier =
(list, i) -> { list.add(parameter); };
private String table = null;
private List<String> columns = new ArrayList<>();
private List<String> rows = new ArrayList<>();
public void setDelimiter(String delimiter) {
this.delimiter = delimiter;
}
public void setParameter(String parameter) {
this.parameter = parameter;
}
public void setParameterSupplier(ObjIntConsumer<List<String>> parameterSupplier) {
this.parameterSupplier = parameterSupplier;
}
@Override
public void accept(String t) {
if (table == null) {
table = t;
return;
}
if (!t.contains(delimiter)) {
columns.add(t);
return;
}
rows.add(t);
}
public String getTableName() {
return table;
}
public List<String> getColumns() {
return columns;
}
public List<String> getRows() {
return rows;
}
public String getParameterizedStatement() {
StringBuilder result = new StringBuilder("INSERT INTO ");
result.append(constructSegment(getTableName(), getColumns())).append(
constructSegment("VALUES", IntStream.rangeClosed(1, getColumns().size())
.collect(ArrayList::new, parameterSupplier, List::addAll)));
return result.append(";").toString();
}
public List<String> getRawStatements() {
String placeholderStatement = getParameterizedStatement()
.replaceAll(Pattern.quote(parameter), "%s");
return getRows().stream().map(r -> String.format(placeholderStatement,
Pattern.compile(delimiter).splitAsStream(r)
.map(v -> "'" + v + "'").toArray()))
.collect(Collectors.toList());
}
public String getFullStatement() {
return getRawStatements().stream().collect(Collectors.joining(ROW_DELIMITER));
}
@Override
public String toString() {
return getParameterizedStatement();
}
private static String constructSegment(String prefix, List<String> list) {
return prefix + list.stream().collect(Collectors.joining(", ", START, END));
}
}public class StatementGeneratorMain {
public static void main(String[] args) throws IOException, URISyntaxException {
StatementGenerator generator = new StatementGenerator();
generator.setDelimiter(",");
try (Stream<String> lines = Files.lines(Paths.get(
ClassLoader.getSystemResource("sqlRecords.txt").toURI()))) {
lines.forEach(generator);
}
System.out.println(generator.getTableName());
System.out.println(generator.getColumns());
System.out.println(generator.getParameterizedStatement());
generator.getRawStatements().forEach(System.out::println);
}
}STUDENTS
[ID, NAME]
INSERT INTO STUDENTS (ID, NAME) VALUES (%, %) ;
INSERT INTO STUDENTS (ID, NAME) VALUES ('1', 'Mike') ;
INSERT INTO STUDENTS (ID, NAME) VALUES ('2', 'Kimberly') ;Context
StackExchange Code Review Q#91896, answer score: 2
Revisions (0)
No revisions yet.