patternjavaMinor
A system for determination of human language from input text:: robustification
Viewed 0 times
humansystemtextlanguageinputdeterminationrobustificationforfrom
Problem
This program works, but it's pretty clunky and fragile. I want to make it more dynamic / solid.
For instance some of the functionality depends on the order I entered things into an
```
//These structures hold the word list & corresponding flag indicator for each language
static HashMap de_map = new HashMap();
static HashMap fr_map = new HashMap();
static HashMap ru_map = new HashMap();
static HashMap eng_map = new HashMap();
public static void main( String[] args ) throws URISyntaxException, IOException
{
//put them all in a list for later processing
List> lang_maps = new ArrayList>();
lang_maps.add( de_map );
lang_maps.add( fr_map );
lang_maps.add( ru_map );
lang_maps.add( eng_map );
//this holds all the dictionary files, i.e. word lists garners from language folders
ArrayList dictionary_files = new ArrayList();
//THIS SEGMENT IS FOR DYNAMICALLY LOCATING THE DIRECTORY, SO THE PROGRAM WORKS "OUT OF THE BOX"
/*****/
File currentDir = new File( "." ); // Read current file location
//System.out.println(currentDir.getAbsolutePath());
File targetDir = null;
if (currentDir.isDirectory())
{
targetDir = new File( currentDir, "word_lists_1" ); // Construct the target directory file with the right parent directory
}
if ( targetDir != null && targetDir.exists() )
{
SearchDirectories.listDirectoryAndFiles( targetDir.toPath(), dictionary_files );
}
/*****/
//this populates word presence data structs for each language
for(Path dir : dictionary_files)
{
String word_holding_directory_path = dir.toString().toLowerCase()
For instance some of the functionality depends on the order I entered things into an
ArrayList, that seems to me to be an example of a terribly odious smelly code. ```
//These structures hold the word list & corresponding flag indicator for each language
static HashMap de_map = new HashMap();
static HashMap fr_map = new HashMap();
static HashMap ru_map = new HashMap();
static HashMap eng_map = new HashMap();
public static void main( String[] args ) throws URISyntaxException, IOException
{
//put them all in a list for later processing
List> lang_maps = new ArrayList>();
lang_maps.add( de_map );
lang_maps.add( fr_map );
lang_maps.add( ru_map );
lang_maps.add( eng_map );
//this holds all the dictionary files, i.e. word lists garners from language folders
ArrayList dictionary_files = new ArrayList();
//THIS SEGMENT IS FOR DYNAMICALLY LOCATING THE DIRECTORY, SO THE PROGRAM WORKS "OUT OF THE BOX"
/*****/
File currentDir = new File( "." ); // Read current file location
//System.out.println(currentDir.getAbsolutePath());
File targetDir = null;
if (currentDir.isDirectory())
{
targetDir = new File( currentDir, "word_lists_1" ); // Construct the target directory file with the right parent directory
}
if ( targetDir != null && targetDir.exists() )
{
SearchDirectories.listDirectoryAndFiles( targetDir.toPath(), dictionary_files );
}
/*****/
//this populates word presence data structs for each language
for(Path dir : dictionary_files)
{
String word_holding_directory_path = dir.toString().toLowerCase()
Solution
You're right about the clunky and fragile.
A lot of improvements are possible.
Most notably,
many unnecessary elements can be rewritten simpler, shorter, cleaner.
Let's go from the top.
Prefer interface types instead of implementation
Instead of this:
If you don't have a specific need for a
Do this everywhere.
Maps with boolean as value type are always suspicious.
Very often they can be rewritten as
We'll get to that too a bit later.
Use the diamond operator
Instead of:
Use the modern diamond operator:
This is supported as of Java 7, which is the lowest officially supported Java version. Consider migrating to it ASAP.
Another similar example, instead of this:
Should be:
Don't throw exceptions in vain
The
but I don't see where that can come from.
Perhaps from
I doubt you really need it.
Perhaps related to this (or perhaps not at all) is that
I'm wondering if that's necessary at all.
Can't you rewrite it with a simple
Unnecessary stuff
Instead of this:
This is exactly the same:
Simpler isn't it? I also removed the redundant comments,
and changed the formatting to the style used by major IDEs.
Speaking of comments,
putting comments at the end of lines is not a great idea,
as it makes the lines longer,
and forces the reader to scroll horizontally to the far right.
About this code:
Can you have paths like
I suspect that a path will only contain either "fr", "eng", "de", and so on.
In which case these conditions should be chained with
Also,
instead of evaluating
multiple times,
it would be better to evaluate it only once.
Even worse,
there's really no need to re-evaluate any of these conditions for each
This would be better:
Unnecessary stuff 2
Instead of this:
This is exactly the same:
But actually,
you don't need a
So you could as well just replace the above with:
And the rest of the code will work the same way.
Unnecessary stuff 3
The nested loop after the you get the user input can be simplified using enhanced for-each loops, dropping unnecessary
and written better as:
```
for (Map working_lang_map : lang_maps) {
for (Map.Entry entry : working_lang_map.entrySet()) {
for (String word : input_text) {
if (entry.getKey().toLowerCase().trim().equals(word.toLowerCase().trim())) {
working_lang_map.put(entry.getKey(), true);
A lot of improvements are possible.
Most notably,
many unnecessary elements can be rewritten simpler, shorter, cleaner.
Let's go from the top.
Prefer interface types instead of implementation
Instead of this:
static HashMap de_map = new HashMap();If you don't have a specific need for a
HashMap, declare as a Map:static Map de_map = new HashMap();Do this everywhere.
Map or SetMaps with boolean as value type are always suspicious.
Very often they can be rewritten as
Set<> instead.We'll get to that too a bit later.
Use the diamond operator
Instead of:
static Map fr_map = new HashMap();Use the modern diamond operator:
static Map fr_map = new HashMap<>();This is supported as of Java 7, which is the lowest officially supported Java version. Consider migrating to it ASAP.
Another similar example, instead of this:
List> lang_maps = new ArrayList>();Should be:
List> lang_maps = new ArrayList<>();Don't throw exceptions in vain
The
main method is declared to throw URISyntaxException,but I don't see where that can come from.
Perhaps from
SearchDirectories.listDirectoryAndFiles ?I doubt you really need it.
Perhaps related to this (or perhaps not at all) is that
SearchDirectories.listDirectoryAndFiles takes a Path as parameter.I'm wondering if that's necessary at all.
Can't you rewrite it with a simple
File ?Unnecessary stuff
Instead of this:
File currentDir = new File( "." ); // Read current file location
File targetDir = null;
if (currentDir.isDirectory())
{
targetDir = new File( currentDir, "word_lists_1" ); // Construct the target directory file with the right parent directory
}
if ( targetDir != null && targetDir.exists() )
{
// ...
}This is exactly the same:
File currentDir = new File(".");
targetDir = new File(currentDir, "word_lists_1");
if (targetDir.exists()) {
// ...
}Simpler isn't it? I also removed the redundant comments,
and changed the formatting to the style used by major IDEs.
Speaking of comments,
putting comments at the end of lines is not a great idea,
as it makes the lines longer,
and forces the reader to scroll horizontally to the far right.
if or if-else ?About this code:
if (word_holding_directory_path.toLowerCase().contains("/de/")) {
de_map.put(line, false);
}
if (word_holding_directory_path.toLowerCase().contains("/ru/")) {
ru_map.put(line, false);
}
if (word_holding_directory_path.toLowerCase().contains("/fr/")) {
fr_map.put(line, false);
}
if (word_holding_directory_path.toLowerCase().contains("/eng/")) {
eng_map.put(line, false);
}Can you have paths like
something/fr/.../eng/.../de/?I suspect that a path will only contain either "fr", "eng", "de", and so on.
In which case these conditions should be chained with
else-if.Also,
instead of evaluating
word_holding_directory_path.toLowerCase()multiple times,
it would be better to evaluate it only once.
Even worse,
there's really no need to re-evaluate any of these conditions for each
line of input.This would be better:
Map map;
if (word_holding_directory_path.toLowerCase().contains("/de/")) {
map = de_map;
} else if (word_holding_directory_path.toLowerCase().contains("/ru/")) {
map = ru_map;
} else if (word_holding_directory_path.toLowerCase().contains("/fr/")) {
map = fr_map;
} else if (word_holding_directory_path.toLowerCase().contains("/eng/")) {
map = eng_map;
} else {
continue;
}
while ((line = br.readLine()) != null) {
map.put(line, false);
}Unnecessary stuff 2
Instead of this:
ArrayList input_text = new ArrayList();
String [] tokens = in.nextLine().split("\\s");
for (int i = 0; i < tokens.length; i++)
{
input_text.add( tokens[i].toString() );
}This is exactly the same:
List input_text = Arrays.asList(in.nextLine().split("\\s"));But actually,
you don't need a
List at all later in your code.So you could as well just replace the above with:
String[] input_text = in.nextLine().split("\\s");And the rest of the code will work the same way.
Unnecessary stuff 3
The nested loop after the you get the user input can be simplified using enhanced for-each loops, dropping unnecessary
toString calls,and written better as:
```
for (Map working_lang_map : lang_maps) {
for (Map.Entry entry : working_lang_map.entrySet()) {
for (String word : input_text) {
if (entry.getKey().toLowerCase().trim().equals(word.toLowerCase().trim())) {
working_lang_map.put(entry.getKey(), true);
Code Snippets
static HashMap<String, Boolean> de_map = new HashMap<String, Boolean>();static Map<String, Boolean> de_map = new HashMap<String, Boolean>();static Map<String, Boolean> fr_map = new HashMap<String, Boolean>();static Map<String, Boolean> fr_map = new HashMap<>();List<HashMap<String, Boolean>> lang_maps = new ArrayList<HashMap<String, Boolean>>();Context
StackExchange Code Review Q#90978, answer score: 3
Revisions (0)
No revisions yet.