HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

A system for determination of human language from input text:: robustification

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
humansystemtextlanguageinputdeterminationrobustificationforfrom

Problem

This program works, but it's pretty clunky and fragile. I want to make it more dynamic / solid.

For instance some of the functionality depends on the order I entered things into an ArrayList, that seems to me to be an example of a terribly odious smelly code.

```
//These structures hold the word list & corresponding flag indicator for each language
static HashMap de_map = new HashMap();
static HashMap fr_map = new HashMap();
static HashMap ru_map = new HashMap();
static HashMap eng_map = new HashMap();

public static void main( String[] args ) throws URISyntaxException, IOException
{
//put them all in a list for later processing
List> lang_maps = new ArrayList>();

lang_maps.add( de_map );
lang_maps.add( fr_map );
lang_maps.add( ru_map );
lang_maps.add( eng_map );

//this holds all the dictionary files, i.e. word lists garners from language folders
ArrayList dictionary_files = new ArrayList();

//THIS SEGMENT IS FOR DYNAMICALLY LOCATING THE DIRECTORY, SO THE PROGRAM WORKS "OUT OF THE BOX"
/*****/
File currentDir = new File( "." ); // Read current file location
//System.out.println(currentDir.getAbsolutePath());

File targetDir = null;
if (currentDir.isDirectory())
{
targetDir = new File( currentDir, "word_lists_1" ); // Construct the target directory file with the right parent directory
}
if ( targetDir != null && targetDir.exists() )
{
SearchDirectories.listDirectoryAndFiles( targetDir.toPath(), dictionary_files );
}
/*****/

//this populates word presence data structs for each language
for(Path dir : dictionary_files)
{

String word_holding_directory_path = dir.toString().toLowerCase()

Solution

You're right about the clunky and fragile.
A lot of improvements are possible.
Most notably,
many unnecessary elements can be rewritten simpler, shorter, cleaner.
Let's go from the top.

Prefer interface types instead of implementation

Instead of this:

static HashMap  de_map = new HashMap();


If you don't have a specific need for a HashMap, declare as a Map:

static Map  de_map = new HashMap();


Do this everywhere.

Map or Set

Maps with boolean as value type are always suspicious.
Very often they can be rewritten as Set<> instead.
We'll get to that too a bit later.

Use the diamond operator

Instead of:

static Map  fr_map = new HashMap();


Use the modern diamond operator:

static Map  fr_map = new HashMap<>();


This is supported as of Java 7, which is the lowest officially supported Java version. Consider migrating to it ASAP.

Another similar example, instead of this:

List> lang_maps = new ArrayList>();


Should be:

List> lang_maps = new ArrayList<>();


Don't throw exceptions in vain

The main method is declared to throw URISyntaxException,
but I don't see where that can come from.
Perhaps from SearchDirectories.listDirectoryAndFiles ?
I doubt you really need it.

Perhaps related to this (or perhaps not at all) is that SearchDirectories.listDirectoryAndFiles takes a Path as parameter.
I'm wondering if that's necessary at all.
Can't you rewrite it with a simple File ?

Unnecessary stuff

Instead of this:

File currentDir = new File( "." ); // Read current file location

    File targetDir = null;
    if (currentDir.isDirectory()) 
    {
        targetDir = new File( currentDir, "word_lists_1" ); // Construct the target directory file with the right parent directory
    }
    if ( targetDir != null && targetDir.exists() )
    {
        // ...
    }


This is exactly the same:

File currentDir = new File(".");

    targetDir = new File(currentDir, "word_lists_1"); 

    if (targetDir.exists()) {
        // ...
    }


Simpler isn't it? I also removed the redundant comments,
and changed the formatting to the style used by major IDEs.
Speaking of comments,
putting comments at the end of lines is not a great idea,
as it makes the lines longer,
and forces the reader to scroll horizontally to the far right.

if or if-else ?

About this code:

if (word_holding_directory_path.toLowerCase().contains("/de/")) {
                de_map.put(line, false);
            }
            if (word_holding_directory_path.toLowerCase().contains("/ru/")) {
                ru_map.put(line, false);
            }
            if (word_holding_directory_path.toLowerCase().contains("/fr/")) {
                fr_map.put(line, false);
            }
            if (word_holding_directory_path.toLowerCase().contains("/eng/")) {
                eng_map.put(line, false);
            }


Can you have paths like something/fr/.../eng/.../de/?
I suspect that a path will only contain either "fr", "eng", "de", and so on.
In which case these conditions should be chained with else-if.

Also,
instead of evaluating word_holding_directory_path.toLowerCase()
multiple times,
it would be better to evaluate it only once.

Even worse,
there's really no need to re-evaluate any of these conditions for each line of input.
This would be better:

Map map;
        if (word_holding_directory_path.toLowerCase().contains("/de/")) {
            map = de_map;
        } else if (word_holding_directory_path.toLowerCase().contains("/ru/")) {
            map = ru_map;
        } else if (word_holding_directory_path.toLowerCase().contains("/fr/")) {
            map = fr_map;
        } else if (word_holding_directory_path.toLowerCase().contains("/eng/")) {
            map = eng_map;
        } else {
            continue;
        }

        while ((line = br.readLine()) != null) {
            map.put(line, false);
        }


Unnecessary stuff 2

Instead of this:

ArrayList input_text = new ArrayList();
    String [] tokens = in.nextLine().split("\\s");

    for (int i = 0; i < tokens.length; i++)
    {
        input_text.add( tokens[i].toString() );
    }


This is exactly the same:

List input_text = Arrays.asList(in.nextLine().split("\\s"));


But actually,
you don't need a List at all later in your code.
So you could as well just replace the above with:

String[] input_text = in.nextLine().split("\\s");


And the rest of the code will work the same way.

Unnecessary stuff 3

The nested loop after the you get the user input can be simplified using enhanced for-each loops, dropping unnecessary toString calls,
and written better as:

```
for (Map working_lang_map : lang_maps) {
for (Map.Entry entry : working_lang_map.entrySet()) {
for (String word : input_text) {
if (entry.getKey().toLowerCase().trim().equals(word.toLowerCase().trim())) {
working_lang_map.put(entry.getKey(), true);

Code Snippets

static HashMap<String, Boolean>  de_map = new HashMap<String, Boolean>();
static Map<String, Boolean>  de_map = new HashMap<String, Boolean>();
static Map<String, Boolean>  fr_map = new HashMap<String, Boolean>();
static Map<String, Boolean>  fr_map = new HashMap<>();
List<HashMap<String, Boolean>> lang_maps = new ArrayList<HashMap<String, Boolean>>();

Context

StackExchange Code Review Q#90978, answer score: 3

Revisions (0)

No revisions yet.