HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Build all possible phrase pairs from a sentence pair

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
allbuildpossiblephrasesentencefrompairspair

Problem

I have an array for an English sentence (enSentence) and for a German sentence (deSentence). Each value represents one word. I want to print every possible phrase pair (enPhrase and dePhrase), where no phrase contains more than maxhPhraseLength words. A phrase can consist of a single word, all words of the sentence, and everything inbetween (consecutive).

So for Hello world ! and Hallo Welt !, possible phrase pairs would be:

  • Hello - Hallo



  • Hello world - Hallo



  • Hello world ! - Hallo



  • world - Hallo



  • world ! - Hallo



  • ! - Hallo






  • Hello - Hallo Welt



  • Hello - Hallo Welt !



  • Hello - Welt






  • Hello world ! - Hallo Welt !



But phrases like Hello ! or world Hello are not possible.

This is my first try:

enLoop:
for(int en1 = 0; en1  maxPhraseLength) {
      continue enLoop;
    }

    deLoop:
    for(int de1 = 0; de1  maxPhraseLength) {
           continue deLoop;
         }

         // building the phrase pair
         enPhrase = "";
         dePhrase = "";
         for(int e = en1; e <= en2; e++) {
           enPhrase = enPhrase + " " + enSentence[e];
         }
         for(int d = de1; d <= de2; d++) {
           dePhrase = dePhrase + " " + deSentence[d];
         }
         enPhrase = enPhrase.trim();
         dePhrase = dePhrase.trim();
         System.out.println(enPhrase + " - " + dePhrase);

       }
     }

  }
}


I’m interested in any kind of feedback (I rarely create code from scratch).

I considered naming the variables language-neutral (in case I want to use the code for making phrase pairs in other languages, too), but couldn’t come up with something short and catchy (sourceSentence/source1 and targetSentence/target1 are longer, and I always have to remind myself what the source and what the target is, while it’s immediately clear to me if thinking about the terms en/English and de/German instead; but the difference doesn’t really matter in this case anywa

Solution

Repeating work

You should move all of the code that computes enPhrase to before the deLoop. There is no need to recompute enPhrase each time you modify de2.

Furthermore, when you compute dePhrase, you can do so by appending the next word deSentence[de2] to the current dePhrase. Right now you reconstruct all of dePhrase in the inner loop, making it an \$O(n)\$ operation when it could be an \$O(1)\$ operation.

Get rid of loop labels

You can get rid of the loop labels if you precalculate the proper end of the loop.

Use StringBuilder

It's more efficient to use a StringBuilder to concatenate strings rather than using the + operator.

Rewrite

Here is a rewrite showing the above changes:

StringBuilder enPhrase = new StringBuilder();
StringBuilder dePhrase = new StringBuilder();

for(int en1 = 0; en1 < enSentence.length; en1++) {
  int en2Max = Math.min(enSentence.length, en1+maxPhraseLength);
  for(int en2 = en1; en2 < en2Max; en2++) {
    if (en2 == en1) {
      enPhrase.setLength(0);
    } else {
      enPhrase.append(" ");
    }
    enPhrase.append(enSentence[en2]);
    for(int de1 = 0; de1 < deSentence.length; de1++) {
       int de2Max = Math.min(deSentence.length, de1+maxPhraseLength);
       for(int de2 = de1; de2 < de2Max; de2++) {
         if (de2 == de1) {
           dePhrase.setLength(0);
         } else {
           dePhrase.append(" ");
         }
         dePhrase.append(deSentence[de2]);

         System.out.print(enPhrase.toString());
         System.out.print(" - ");
         System.out.println(dePhrase.toString());
       }
     }

  }
}

Code Snippets

StringBuilder enPhrase = new StringBuilder();
StringBuilder dePhrase = new StringBuilder();

for(int en1 = 0; en1 < enSentence.length; en1++) {
  int en2Max = Math.min(enSentence.length, en1+maxPhraseLength);
  for(int en2 = en1; en2 < en2Max; en2++) {
    if (en2 == en1) {
      enPhrase.setLength(0);
    } else {
      enPhrase.append(" ");
    }
    enPhrase.append(enSentence[en2]);
    for(int de1 = 0; de1 < deSentence.length; de1++) {
       int de2Max = Math.min(deSentence.length, de1+maxPhraseLength);
       for(int de2 = de1; de2 < de2Max; de2++) {
         if (de2 == de1) {
           dePhrase.setLength(0);
         } else {
           dePhrase.append(" ");
         }
         dePhrase.append(deSentence[de2]);

         System.out.print(enPhrase.toString());
         System.out.print(" - ");
         System.out.println(dePhrase.toString());
       }
     }

  }
}

Context

StackExchange Code Review Q#117906, answer score: 3

Revisions (0)

No revisions yet.