HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Glue Shredded Pieces (CodeEval)

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
piecescodeevalshreddedglue

Problem

I have solved problem 185 on CodeEval, Glue Shredded Pieces.

The solution got a 100% score on all tests (edit: see comments), within the allotted time and memory constraints. I'm looking mostly looking for tips to reduce memory usage (and improving speed) but all comments are welcome.

Description


Imagine that you are taking part in the investigation of a criminal organization. Having determined its location, you came there and found out that the criminals had recently left that place. But it is not a dead end, because there are lots of shredded documents in the room. On investigating these documents, you came to conclusion that the criminals had not been very careful. Firstly, the papers are shredded horizontally, and you can read some pieces of text. Secondly, there are many copies of the same documents, and the pieces of text overlap each other.


For example, you can put pieces together and get the original text:

evil pl
 vil pla
  il plan




The answer is ‘evil plan’.


Your task is to print out the original text. Due to repetitions in the text, you will sometimes get identical pieces.

INPUT SAMPLE:


Your program should accept a path to a file as its first argument. Each line in the file is one test case with the pieces of shredded text separated by a pipe. Each test case starts and finishes with symbol '|'.


For example:

|deEva|lan t|to ha|evil |ankin|il-ev|o hac| to h|vil p|an to|The e|CodeE| evil|plan |hack |Eval |ack C|l ran|king.|l-evi|evil-|-evil|l pla|il pl| hack|al ra|vil-e|odeEv|he ev|n to |ck Co|eEval|nking| rank| Code|e evi|ranki|k Cod| plan|val r|



OUTPUT SAMPLE:


Print to stdout the original text for each test case.


For example:

The evil-evil plan to hack CodeEval ranking.



CONSTRAINTS:



  • For the text with the length t shredded into pieces with the length n, there are t - (n - 1) pieces of text in the input file. Each piece of text is shifted by one charact

Solution

Focusing on memory usage: forget all "one-of" stuff - that leaves vertices (use new ArrayList<>(ExpectedOutgoingEdges)) and suffix2next (use new ArrayList<>(ExpectedPrefixSuccessors)):

static final int
    ExpectedVertices = 123,
    ExpectedOutgoingEdges = 2,
    ExpectedPrefixSuccessors = ExpectedOutgoingEdges, //2,
    ExpectedCharacters = 28;
public static class Graph {
    private final List
        vertices = new ArrayList<>(ExpectedVertices);
    private final StringBuilder
        result = new StringBuilder(ExpectedVertices+ExpectedCharacters);
    private final Map> suffix2next;

/** alternative population of {@code suffix2next} */
    Graph(Stream pieces) {
        suffix2next = pieces
            .map(p -> new Vertex(p))
            .peek(v -> vertices.add(v))
            .collect(Collectors.groupingBy(v -> v.getPrefix(),
                () -> new HashMap>(Expectedvertices)
                , Collectors.toCollection(
                    () -> new ArrayList<>(ExpectedPrefixSuccessors))
                ));
    }
    Graph() {
        suffix2next = new java.util.HashMap<>(ExpectedVertices);
    }
    public void insert(Vertex aPiece) {
        vertices.add(aPiece);
    // Add this piece to the list of pieces that can follow pieces that
    // end with the beginning of this piece. Yeah that's a mouthful...
        suffix2next.computeIfAbsent(aPiece.getPrefix(),
            k -> new ArrayList(ExpectedPrefixSuccessors)).add(aPiece);
    }

    void showResult(PrintStream aOutput) {
        aOutput.println(result.reverse());
        aOutput.append(suffix2next.entrySet().stream()
            //.collect(Collectors.summarizingInt(e -> e.getValue().size()))
            .collect(Collectors.groupingBy(e -> e.getValue().size(), Collectors.counting()))
            .toString());
    }
    …
}
public static void main(String[] args) throws Exception {
    try (BufferedReader reader =
         new BufferedReader(new java.io.FileReader(args[0]))) {
        final Pattern bar = Pattern.compile("|", Pattern.LITERAL);
        for (String line ; null != (line = reader.readLine()) ; ) {
//            line = "|deEva|lan t|to ha|evil |ankin|il-ev|o hac| to h|vil p|an to|The e|CodeE| evil|plan |hack |Eval |ack C|l ran|king.|l-evi|evil-|-evil|l pla|il pl| hack|al ra|vil-e|odeEv|he ev|n to |ck Co|eEval|nking| rank| Code|e evi|ranki|k Cod| plan|val r|";
//            piece.splitAsStream(line.substring(line.indexOf('|')+1))
//              .forEach(p -> graph.insert(new Vertex(p)));
            Graph graph = new Graph(bar.splitAsStream(line.substring(line.indexOf('|')+1)));
            graph.solve(System.out);
        }
    }
}


(Using Map.computeIfAbsent, MultiMap doesn't hurt as much as it used to, even less with Collectors.groupingBy().)

Using outgoingEdges.trimToSize() probably doesn't reduce max. memory usage, trimming the elements of suffix2next.values() might help a bit.

You could use pieceSize and (line.lastIndexOf('|') - line.indexOf('|')) / pieceSize to improve upon the static ExpectedVertices and ExpectedCharacters.

Then, you can roll your own List extends AbstractList (drop ArrayList's size, have the element in a NonArrayList or an Object refer to the array, …) - if (List)Iterators don't kill you, subList() returning a view might.

Or use 3rd party collections - Goldman Sachs FixedSizeLists (&of(element)/with(element) look good, MutableMultimaps not bad.

Code Snippets

static final int
    ExpectedVertices = 123,
    ExpectedOutgoingEdges = 2,
    ExpectedPrefixSuccessors = ExpectedOutgoingEdges, //2,
    ExpectedCharacters = 28;
public static class Graph {
    private final List<Vertex>
        vertices = new ArrayList<>(ExpectedVertices);
    private final StringBuilder
        result = new StringBuilder(ExpectedVertices+ExpectedCharacters);
    private final Map<String, List<Vertex>> suffix2next;

/** alternative population of {@code suffix2next} */
    Graph(Stream<? extends CharSequence> pieces) {
        suffix2next = pieces
            .map(p -> new Vertex(p))
            .peek(v -> vertices.add(v))
            .collect(Collectors.groupingBy(v -> v.getPrefix(),
                () -> new HashMap<CharSequence,
                                  List<Vertex>>(Expectedvertices)
                , Collectors.toCollection(
                    () -> new ArrayList<>(ExpectedPrefixSuccessors))
                ));
    }
    Graph() {
        suffix2next = new java.util.HashMap<>(ExpectedVertices);
    }
    public void insert(Vertex aPiece) {
        vertices.add(aPiece);
    // Add this piece to the list of pieces that can follow pieces that
    // end with the beginning of this piece. Yeah that's a mouthful...
        suffix2next.computeIfAbsent(aPiece.getPrefix(),
            k -> new ArrayList<Vertex>(ExpectedPrefixSuccessors)).add(aPiece);
    }

    void showResult(PrintStream aOutput) {
        aOutput.println(result.reverse());
        aOutput.append(suffix2next.entrySet().stream()
            //.collect(Collectors.summarizingInt(e -> e.getValue().size()))
            .collect(Collectors.groupingBy(e -> e.getValue().size(), Collectors.counting()))
            .toString());
    }
    …
}
public static void main(String[] args) throws Exception {
    try (BufferedReader reader =
         new BufferedReader(new java.io.FileReader(args[0]))) {
        final Pattern bar = Pattern.compile("|", Pattern.LITERAL);
        for (String line ; null != (line = reader.readLine()) ; ) {
//            line = "|deEva|lan t|to ha|evil |ankin|il-ev|o hac| to h|vil p|an to|The e|CodeE| evil|plan |hack |Eval |ack C|l ran|king.|l-evi|evil-|-evil|l pla|il pl| hack|al ra|vil-e|odeEv|he ev|n to |ck Co|eEval|nking| rank| Code|e evi|ranki|k Cod| plan|val r|";
//            piece.splitAsStream(line.substring(line.indexOf('|')+1))
//              .forEach(p -> graph.insert(new Vertex(p)));
            Graph graph = new Graph(bar.splitAsStream(line.substring(line.indexOf('|')+1)));
            graph.solve(System.out);
        }
    }
}

Context

StackExchange Code Review Q#117010, answer score: 3

Revisions (0)

No revisions yet.