Vertically placing the words in a string
Problem
Please suggest a better approach to solving the problem below:
Problem: Vertically arrange the words in a string and print them to the console.
Sample Input:
Hello Jack
Output:
H J
e a
l c
l k
o
My solution:
public static void toVerticalWords(String str) {
    // split the words by whitespace
    String[] strArr = str.split("\\s");
    int maxWordLen = 0;
    // get the longest word length
    for (String strTemp : strArr) {
        if (strTemp.length() > maxWordLen)
            maxWordLen = strTemp.length();
    }
    // make a matrix of the words with each character in an array block
    char[][] charArr = new char[strArr.length][maxWordLen];
    for (int i = 0; i < strArr.length; i++) {
        int j = 0;
        for (char ch : strArr[i].toCharArray()) {
            charArr[i][j] = ch;
            j++;
        }
    }
    // print the vertical word pattern, or transpose of above matrix (2D array)
    for (int j = 0; j < maxWordLen; j++) {
        for (int i = 0; i < strArr.length; i++) {
            if (i != 0)
                System.out.print(" ");
            System.out.print(charArr[i][j]);
        }
        System.out.println();
    }
}
Solution
I assume you are only trying to work with ASCII characters, but I would like to point out a few Unicode-related failure modes, which can be exhibited by the following code:
String[] funkyStrings = {
    "foo bar",
    "foo ba\u0308r",
    "\ud835\udcbb\u2134\u2134 \ud835\udcb7\ud835\udcb6\ud835\udcc7"
};
for (String str : funkyStrings) {
    System.out.println(str);
    toVerticalWords(str);
}
The first string produces ordinary output:
foo bar
f b
o a
o r
The second string contains a grapheme cluster made up of multiple code points: the letter a and a combining diaeresis ̈, which together form the glyph ä (← that is the precomposed form, showing how it should look, since not all fonts render combining characters correctly):
foo bär
f b
o a
o ̈
  r
The third string uses mathematical glyphs whose code points lie beyond U+FFFF. Java's char is a UTF-16 code unit, only 16 bits wide and far too small for these code points, so code that walks a string char by char cannot handle them properly. Such code points have to be written as a pair of surrogate halves; for example, U+1D4BB is encoded as d835 dcbb in UTF-16BE. This produces the following output:
𝒻ℴℴ 𝒷𝒶𝓇
? ?
? ?
ℴ ?
ℴ ?
  ?
  ?
Only the ℴ is displayed correctly, because it has a lower code point, U+2134. For the other characters, the surrogate halves are separated from each other, which produces invalid characters.
Not shown here: how fullwidth and halfwidth characters can mess up your vertical layout.
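To make the difference between UTF-16 code units and code points concrete, here is a small sketch using only standard `String` and `Character` methods. The class name `UnicodeUnits` and its helper methods are made up for illustration; they are not part of the question's code.

```java
final class UnicodeUnits {
    // Number of UTF-16 code units: this is what String.length() reports
    // and what a char-based loop iterates over.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points: a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Hypothetical helper: the UTF-16 code units of one code point,
    // rendered as space-separated hex.
    static String utf16Hex(int codePoint) {
        StringBuilder hex = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%04x", (int) unit));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String[] samples = {
            "bar",
            "ba\u0308r",
            "\ud835\udcb7\ud835\udcb6\ud835\udcc7"
        };
        for (String s : samples) {
            System.out.println(codeUnits(s) + " code units, "
                    + codePoints(s) + " code points");
        }
        System.out.println("U+1D4BB in UTF-16: " + utf16Hex(0x1D4BB));
    }
}
```

Note that neither count matches the number of glyphs a reader sees for the second sample: the combining diaeresis is a code point of its own, even though it is drawn on top of the a.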
The moral of the story: code points, UTF-16 code units, characters and glyphs are different things. Dealing with them correctly is excessively difficult, and I don't know about any suitable tools. If you are not prepared to deal correctly with any input you may encounter, then check that the input conforms to a subset you are comfortable with.
for (char c : str.toCharArray()) {
    if (c > 0x7F) {
        throw new IllegalArgumentException("The input string may only contain ASCII code points");
    }
}
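If rejecting non-ASCII input is too strict, one partial alternative is to transpose code points instead of chars, which at least keeps surrogate pairs together. The following is only a sketch under stated assumptions: the class name `VerticalWords` is made up, and the method returns the layout as a string instead of printing it, purely so the result is easy to inspect. It also pads short words with spaces and trims trailing padding, which the original code does not do. Combining marks still end up on a row of their own, and fullwidth characters are not addressed at all.

```java
final class VerticalWords {
    // Hypothetical code-point-based variant of toVerticalWords.
    // Returns one line per row, each terminated by '\n'.
    static String toVerticalWords(String str) {
        String[] words = str.trim().split("\\s+");

        // Decode each word into code points so a surrogate pair
        // occupies a single cell of the matrix.
        int[][] cells = new int[words.length][];
        int maxLen = 0;
        for (int i = 0; i < words.length; i++) {
            cells[i] = words[i].codePoints().toArray();
            maxLen = Math.max(maxLen, cells[i].length);
        }

        StringBuilder out = new StringBuilder();
        for (int row = 0; row < maxLen; row++) {
            StringBuilder line = new StringBuilder();
            for (int col = 0; col < words.length; col++) {
                if (col != 0) {
                    line.append(' ');
                }
                if (row < cells[col].length) {
                    line.appendCodePoint(cells[col][row]);
                } else {
                    line.append(' '); // pad words that have already ended
                }
            }
            // Drop trailing padding so rows do not end in spaces.
            int end = line.length();
            while (end > 0 && line.charAt(end - 1) == ' ') {
                end--;
            }
            out.append(line, 0, end).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(toVerticalWords("Hello Jack"));
        System.out.print(toVerticalWords(
                "\ud835\udcbb\u2134\u2134 \ud835\udcb7\ud835\udcb6\ud835\udcc7"));
    }
}
```

Whether the mathematical glyphs actually show up correctly still depends on the console's encoding and font, which this sketch cannot control.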
Context
StackExchange Code Review Q#46624, answer score: 11