Listing all the chars in a given UnicodeBlock

Submitted by: @import:stackexchange-codereview

Problem

I want to visually inspect all the characters that Java thinks are in any given UnicodeBlock. As far as I can tell, the following method does the job, but it sure feels like awful design.

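public static void displayUnicodeBlock(Character.UnicodeBlock block) {
    // A char is 16 bits wide, so there are 2^16 possible values: 0 .. 65535.
    int charCount = (int) Math.pow(2, 16);
    // Use < rather than <=: casting 65536 to char wraps to 0, which would test U+0000 twice.
    for (int cInt = 0; cInt < charCount; cInt++) {
        char c = (char) cInt;
        if (Character.UnicodeBlock.of(c) == block) {
            System.out.println("_" + c + "_");
        }
    }
}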


  • Would reflection be the ideal solution?
  • Can reflection actually show every character in a UnicodeBlock? How?
  • Is my solution stupid because it is inefficient?
  • Is my solution stupid because it is hardwired with 2^16?

Solution

It is truly unfortunate that Character.UnicodeBlock doesn't have a min() and max(). In fact, the way it is implemented in OpenJDK is as a private static blockStarts table, which is entirely unhelpful to you.

I don't have a better solution, but rather a complication. The Java char type only covers the Basic Multilingual Plane. To be thorough, you would need to cover the larger code points as well, using Character.UnicodeBlock.of(int). (For example, Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B starts at U+20000, which cannot be encoded in a single char, so iterating up to 2^16 would never find it.)
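As a rough sketch of that point (this is not the original poster's code, and the class and method names are made up for illustration), the loop can run over int code points from Character.MIN_CODE_POINT to Character.MAX_CODE_POINT and build each printable string with Character.toChars, which yields a surrogate pair for anything above the BMP:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the java.lang.Character calls are real JDK API.
class UnicodeBlockLister {

    /** Collects every code point, BMP or supplementary, that Java maps to the given block. */
    static List<Integer> codePointsIn(Character.UnicodeBlock block) {
        List<Integer> result = new ArrayList<>();
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            // of(int) accepts any valid code point and returns null if it belongs to no block.
            if (Character.UnicodeBlock.of(cp) == block) {
                result.add(cp);
            }
        }
        return result;
    }

    static void displayUnicodeBlock(Character.UnicodeBlock block) {
        for (int cp : codePointsIn(block)) {
            // toChars returns one char for BMP code points and a surrogate pair otherwise.
            System.out.println("_" + new String(Character.toChars(cp)) + "_");
        }
    }

    public static void main(String[] args) {
        // GOTHIC lies entirely above U+FFFF, so a char-based loop would print nothing for it.
        displayUnicodeBlock(Character.UnicodeBlock.GOTHIC);
    }
}
```

Whether the glyphs actually render depends on the console's encoding and font, so the output may show placeholders for supplementary characters.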

It appears that existing Unicode blocks start on a multiple of 16 and have sizes that are multiples of 16, so you could take advantage of that and test only every 16th code point instead of every single one.
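A minimal sketch of that shortcut, assuming the 16-alignment observation holds for every block the running JDK knows about (that is an assumption, not something the API guarantees). The class name, method name, and ROW constant are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; relies on the ASSUMPTION that blocks start and end on 16-code-point rows.
class UnicodeBlockRows {

    static final int ROW = 16;

    /** Returns the first code point of each 16-wide row that falls in the given block. */
    static List<Integer> rowStartsIn(Character.UnicodeBlock block) {
        List<Integer> rows = new ArrayList<>();
        // Only one lookup per row: 1/16th of the calls made by a code-point-by-code-point scan.
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp += ROW) {
            if (Character.UnicodeBlock.of(cp) == block) {
                rows.add(cp);
            }
        }
        return rows;
    }
}
```

Under that assumption, each returned value cp stands for the 16 code points cp through cp + 15, all of which belong to the block.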

Unicode blocks are contiguous. Therefore, a possible optimization is to stop early: once you have found a code point that is in the desired block, the first subsequent code point that is not in the block marks the end, and you're done.
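The early exit could look something like the following sketch, which stands in for the min() and max() that Character.UnicodeBlock lacks. The class and method names are invented for illustration, and the code leans on the contiguity claim above:

```java
// Hypothetical helper; assumes each block occupies one contiguous range of code points.
class UnicodeBlockRange {

    /** Returns {first, last} code points of the block, or null if no code point maps to it. */
    static int[] rangeOf(Character.UnicodeBlock block) {
        int first = -1;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            boolean inBlock = Character.UnicodeBlock.of(cp) == block;
            if (inBlock && first < 0) {
                first = cp;                         // entered the block
            } else if (!inBlock && first >= 0) {
                return new int[] {first, cp - 1};   // left the block: stop scanning
            }
        }
        // Either the block was never seen, or it runs to the very last code point.
        return first < 0 ? null : new int[] {first, Character.MAX_CODE_POINT};
    }
}
```

With the range in hand, printing the block is a plain loop from first to last, with no further lookups needed.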

Context

StackExchange Code Review Q#83314, answer score: 5
