Java 8 stream collector for numbering vertices
Problem
It was suggested that I put the following snippet of code up for review, so here it is. A review of anything and everything is appreciated; I'm posting as much relevant code as possible:
```
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class VertexData {

    private Vector3f vertex;
    private Vector3f normal;

    public VertexData(final Vector3f vertex, final Vector3f normal) {
        this.vertex = vertex;
        this.normal = normal;
    }

    public Vector3f getVertex() {
        return vertex;
    }

    public void setVertex(final Vector3f vertex) {
        this.vertex = vertex;
    }

    public Vector3f getNormal() {
        return normal;
    }

    public void setNormal(final Vector3f normal) {
        this.normal = normal;
    }

    @Override
    public String toString() {
        return "VertexData(" + vertex + ", " + normal + ")";
    }

    @Override
    public int hashCode() {
        int hash = 7;
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null) {
            return false;
        }
        // the posted method was cut off here; completed with the standard pattern
        if (getClass() != obj.getClass()) {
            return false;
        }
        final VertexData other = (VertexData) obj;
        return Objects.equals(vertex, other.vertex)
                && Objects.equals(normal, other.normal);
    }
}

public abstract class Primitive {

    protected List<VertexData> vertexData;

    public List<VertexData> getVertexData() {
        return vertexData;
    }

    public static void calculateNormals(final List<Primitive> primitives) {
        primitives.stream()
                .flatMap(primitive -> primitive.getVertexData().stream())
                .collect(Collectors.groupingBy(VertexData::getVertex)) // Map<Vector3f, List<VertexData>>
                .entrySet().stream()
                .map(Map.Entry::getValue) // List<VertexData>
                .forEach(Primitive::calculateNormalsOfVertexData);
    }

    private static void calculateNormalsOfVertexData(final List<VertexData> vertexData) {
        Vector3f averageNormal = vertexData.stream()
                .map(VertexData::getNormal)
                .reduce(new Vector3f().zero(), (n1, n2) -> n1.add(n2))
                .scale(1f / vertexData.size());
        vertexData.forEach(vd -> vd.setNormal(averageNormal));
    }
}
```
Solution
long vs. int
In your map you are keyed off the Long value of the index, which is simply derived from the size of the map as you accumulate entries.
A Map can never hold more than Integer.MAX_VALUE entries, so you can never accumulate more key values than that. Why use a Long when an Integer will suffice?
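For reference, an Integer-keyed version of the indexing collector could be sketched as below. The post never shows the source of CustomCollectors.indexing(), so the class name, method name, and shape here are assumptions:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class CustomCollectors {

    // Sketch of an indexing collector keyed by Integer rather than Long.
    // Note: the combiner below has the same associativity problem discussed
    // later in this review, so it is only safe on a sequential stream.
    public static <T> Collector<T, ?, Map<Integer, T>> indexing() {
        return Collector.of(
                LinkedHashMap::new,
                (map, t) -> map.put(map.size() + 1, t),
                (left, right) -> {
                    final int size = left.size();
                    right.forEach((k, v) -> left.put(k + size, v));
                    return left;
                });
    }

    public static void main(String[] args) {
        Map<Integer, String> indexed = Stream.of("a", "b", "c")
                .collect(indexing());
        System.out.println(indexed); // {1=a, 2=b, 3=c}
    }
}
```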
Simplification
Consider this simplification, where the distinct phase is done in the intermediate Collector:
```
Map<Long, VertexData> vdMapping = primitives
        .stream()
        .flatMap(primitive -> primitive.getVertexData().stream()) // flatten to the vertices being numbered
        .collect(Collector.of(
                LinkedHashSet<VertexData>::new,
                (acc, t) -> acc.add(t),
                (left, right) -> { left.addAll(right); return left; }))
        .stream()
        .collect(CustomCollectors.indexing());
vdMapping.forEach((k, v) -> System.out.println("k = " + k + " / v = " + v));
```
Conclusion
In this particular case, I am not certain that the Java 8 streaming API is the right solution; you are creating much more uncertainty than needs to happen.
The right tool here is a LinkedHashSet data structure. Your Java 7 equivalent is a better solution: change the HashSet to a LinkedHashSet, and your results become deterministic. The code is simpler and more maintainable, and the parallel benefits of your stream are not achievable anyway, so the added overhead of creating intermediate maps, running complicated distinct() filters, and so on is just painful.
Just because it can be done with a stream does not mean a stream is the better solution.
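The Java 7-style alternative recommended above might be sketched like this; String stands in for VertexData to keep the sketch self-contained:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SequentialNumbering {

    // Deterministic numbering: LinkedHashSet de-duplicates while preserving
    // encounter order, then a plain loop assigns the indices.
    static Map<Integer, String> number(List<String> items) {
        Set<String> distinct = new LinkedHashSet<>(items);
        Map<Integer, String> numbered = new LinkedHashMap<>();
        int index = 1;
        for (String item : distinct) {
            numbered.put(index++, item);
        }
        return numbered;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("v1", "v2", "v1", "v3");
        System.out.println(number(input)); // {1=v1, 2=v2, 3=v3}
    }
}
```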
Collector associativity
One of the properties of a Collector is that it is supposed to be associative.
The associativity constraint says that splitting the computation must produce an equivalent result.
Your collector is not associative. Consider your implementation:
```
return Collector.of(
        HashMap::new,
        (map, t) -> map.put(Long.valueOf(map.size() + 1), t),
        (left, right) -> {
            final long size = left.size();
            right.forEach((k, v) -> left.put(k + size, v));
            return left;
        },
        Collector.Characteristics.CONCURRENT
);
```
If your collector is used on a parallel stream, there is no way for you to determine the order in which the combiner is called:
```
(left, right) -> {
    final long size = left.size();
    right.forEach((k, v) -> left.put(k + size, v));
    return left;
},
```
For example, suppose your stream is split in two, and the two parts are accumulated in two maps,
mapA and mapB. Your collector should produce the same result regardless of whether left == mapA && right == mapB or left == mapB && right == mapA. It does not: which keys the entries end up with depends on which partial map the combiner merges into. Your streams will produce non-deterministic values for your inputs because your Collector is non-deterministic.
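A small demonstration of the problem: merging the same two partial maps in the two possible orders gives different key assignments. The data here is made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class CombinerOrderDemo {

    // The combiner from the collector under review, extracted as a method.
    static Map<Long, String> merge(Map<Long, String> left, Map<Long, String> right) {
        final long size = left.size();
        right.forEach((k, v) -> left.put(k + size, v));
        return left;
    }

    public static void main(String[] args) {
        Map<Long, String> mapA = new HashMap<>();
        mapA.put(1L, "a");
        Map<Long, String> mapB = new HashMap<>();
        mapB.put(1L, "b");

        // left == mapA, right == mapB assigns key 2 to "b"...
        System.out.println(merge(new HashMap<>(mapA), new HashMap<>(mapB)));
        // ...but left == mapB, right == mapA assigns key 2 to "a".
        System.out.println(merge(new HashMap<>(mapB), new HashMap<>(mapA)));
    }
}
```

The two results disagree on every key, so the final numbering depends entirely on the scheduler's merge order.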
I am not certain how I would solve this problem; there has to be a way to 'tag' the data at the beginning of the stream so that it is labelled with a 'key' sooner, instead of arbitrarily and non-deterministically assigning one later.
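One way to realise that 'tag it sooner' idea is to pin down a deterministic order first and only then assign the keys. A sketch, again with String standing in for VertexData:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class EarlyTagging {

    // A sequential, order-preserving distinct() fixes the encounter order,
    // so the numbering afterwards is deterministic by construction.
    static Map<Integer, String> numberDistinct(List<String> items) {
        List<String> ordered = items.stream()
                .distinct()
                .collect(Collectors.toList());
        return IntStream.range(0, ordered.size())
                .boxed()
                .collect(Collectors.toMap(
                        i -> i + 1,        // 1-based index as the key
                        ordered::get,      // element at that position
                        (x, y) -> x,       // no duplicate keys can occur
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(numberDistinct(Arrays.asList("v1", "v2", "v1", "v3")));
        // prints {1=v1, 2=v2, 3=v3}
    }
}
```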
Conclusion
I believe the code produces results that are 'correct' in the context where you use it, but the results are non-deterministic, and that is going to be a problem in the future when things go wrong.
You need to do something to fix the non-determinism earlier in the stream:
```
Map<Long, VertexData> vdMapping = primitives.stream()
        .flatMap(primitive -> primitive.getVertexData().stream())
        .distinct()
        .collect(CustomCollectors.indexing());
vdMapping.forEach((k, v) -> System.out.println("k = " + k + " / v = " + v));
```
I believe the only decent solution would be to guarantee sequential processing until the data leaves the distinct() phase, and then add a map phase at that point which numbers the VertexData as it exits.
Context
StackExchange Code Review Q#43607, answer score: 4