
Generate and store hypernyms for all words in a hashmap

Submitted by: @import:stackexchange-codereview

Problem

I have a system which reads in a clause in the form of a Prolog "fact", e.g. 'is'('a sentence', 'this'). I want to generalize this into higher-order classes and types, rather than just single words. As an initial step, one thing I'm going to try is to generate all the hypernyms associated with each word and store them, possibly alongside their root, in the hashmap where I'm currently storing sentences. What would be the best way to do that?
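One possible shape for such a store is sketched below. It assumes nothing about the real WordNet API: the `lookup` function is a hypothetical stand-in for whatever call returns the hypernyms of a word, and `HypernymIndex`, `hypernymsOf`, and `sharesHypernym` are illustrative names that do not exist in the code under review.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Caches the hypernyms of each root word in a hash map (illustrative sketch). */
final class HypernymIndex {
    // Hypothetical stand-in for the WordNet lookup; supplied by the caller.
    private final Function<String, Collection<String>> lookup;

    // Root word -> its hypernyms, filled lazily on first request.
    private final Map<String, Set<String>> hypernymsByRoot = new HashMap<>();

    HypernymIndex(Function<String, Collection<String>> lookup) {
        this.lookup = lookup;
    }

    /** Returns the stored hypernyms of a root word, looking them up only once. */
    Set<String> hypernymsOf(String root) {
        return hypernymsByRoot.computeIfAbsent(
                root, word -> new LinkedHashSet<>(lookup.apply(word)));
    }

    /** True if the two words have at least one hypernym in common. */
    boolean sharesHypernym(String first, String second) {
        Set<String> common = new HashSet<>(hypernymsOf(first));
        common.retainAll(hypernymsOf(second));
        return !common.isEmpty();
    }
}
```

Keeping the hypernyms in their own map, keyed by the root word, leaves the sentence map untouched, so either structure can change without affecting the other.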

Eventually the subjects and objects get cross-referenced to check for identical entities, and learned "rules" are then generated through inference. For example, 'contains'('vitamin c', 'oranges'). and 'prevents'('scurvy', 'vitamin c'). would yield the output "rule" 'prevents'('scurvy', 'oranges').
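The inference step in that example can be sketched in isolation as a join on the shared term. This is only a minimal illustration under assumed names: `Fact` and `RuleInference` are hypothetical and are not the `Sentence` and `Ontology` classes from the actual program, and the nested loop is a naive pairing rather than the index-based lookup the real code uses.

```java
import java.util.ArrayList;
import java.util.List;

/** A Prolog-style fact 'verb'('first', 'second'). (hypothetical helper type). */
final class Fact {
    final String verb;
    final String first;
    final String second;

    Fact(String verb, String first, String second) {
        this.verb = verb;
        this.first = first;
        this.second = second;
    }

    @Override
    public String toString() {
        return "'" + verb + "'('" + first + "', '" + second + "').";
    }
}

final class RuleInference {
    /**
     * Pairs every fact x with every fact y whose first argument equals
     * x's second argument, and emits 'x.verb'('x.first', 'y.second').
     */
    static List<Fact> infer(List<Fact> facts) {
        List<Fact> rules = new ArrayList<>();
        for (Fact x : facts) {
            for (Fact y : facts) {
                if (x != y && x.second.equals(y.first)) {
                    rules.add(new Fact(x.verb, x.first, y.second));
                }
            }
        }
        return rules;
    }
}
```

With the two facts from the example as input, the only pair that joins is 'prevents'('scurvy', 'vitamin c'). with 'contains'('vitamin c', 'oranges')., because 'vitamin c' is the second argument of one and the first argument of the other.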

(FYI) I've incorporated WordNet and also the Java WordNet Interface.

This is the main method:

```
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class lets_go
{
    public static void main(String[] args) throws IOException
    {
        MITJavaWordNetInterface wordnet_interface = new MITJavaWordNetInterface();

        Ontology ontology = new Ontology();
        BufferedReader br = new BufferedReader(new FileReader("file.txt"));
        // matches one Prolog-style fact per line: 'verb'('object', 'subject').
        Pattern p = Pattern.compile("'(.*?)'\\('(.*?)',\\s*'(.*?)'\\)\\.");
        String line;
        while ((line = br.readLine()) != null)
        {
            Matcher m = p.matcher(line);
            if( m.matches() )
            {
                String verb = m.group(1);
                String object = m.group(2);
                String subject = m.group(3);
                ontology.addSentence( new Sentence( verb, object, subject ) );
            }
        }

        for( String joint: ontology.getJoints() )
        {
            for( Integer subind: ontology.getSubjectIndices( joint ) )
            {
                Sentence xaS = ontology.getSentence( subind );

                for( Integer obind: ontology.getObjectIndices( joint ) )
                {

                    Sentence yOb
```

Solution

main

```
MITJavaWordNetInterface wordnet_interface = new MITJavaWordNetInterface();
```

You never actually use this. You could just as well embed it in the comment:

```
/**
 * This is an example call to the hypernym generating function
 *
 * MITJavaWordNetInterface wordnet_interface = new MITJavaWordNetInterface();
 * wordnet_interface.getHypernyms( "word" );
 **/
```


Now the comment contains both an example declaration and an example call.

You do a lot of work in the main method. Consider pushing some of that out into functions. E.g.

```
private static Ontology readFile(String filename) throws IOException
{
    Ontology ontology = new Ontology();

    BufferedReader br = new BufferedReader(new FileReader(filename));
    Pattern p = Pattern.compile("'(.*?)'\\('(.*?)',\\s*'(.*?)'\\)\\.");
    String line;
    while ((line = br.readLine()) != null)
    {
        Matcher m = p.matcher(line);
        if ( m.matches() )
        {
            String verb    = m.group(1);
            String object  = m.group(2);
            String subject = m.group(3);
            ontology.addSentence( new Sentence( verb, object, subject ) );
        }
    }

    return ontology;
}

private static void makeInferences(Ontology ontology) {
    for( String joint: ontology.getJoints() )
    {
        for( Integer subjectIndex: ontology.getSubjectIndices( joint ) )
        {
            Sentence xaS = ontology.getSentence( subjectIndex );

            for( Integer objectIndex: ontology.getObjectIndices( joint ) )
            {
                Sentence yOb = ontology.getSentence( objectIndex );

                Sentence s = new Sentence( xaS.getVerb(),
                                           xaS.getObject(),
                                           yOb.getSubject() );

                ontology.numberRules( s );
            }
        }
    }
}

public static void main(String[] args) throws IOException
{
    Ontology ontology = readFile("file.txt");
    makeInferences(ontology);

    /**
     * This is an example call to the hypernym generating function
     *
     * MITJavaWordNetInterface wordnet_interface = new MITJavaWordNetInterface();
     * wordnet_interface.getHypernyms( "word" );
     **/

    // this prints out each observed datum sentence once on the basis
    // of how often it was seen in our corpus
    ontology.ruleCount.entrySet().stream()
        .filter(e -> e.getValue() > 0 )
        .sorted(reverseOrder(Map.Entry.comparingByValue()))
        .forEach(e -> System.out.println(e.getKey() + " : " + e.getValue()));
}
```


I just made them members of lets_go (which I renamed to Main), but you could also put them in Ontology or a new class if you wanted. I would tend to think that makeInferences should be in Ontology while readFile should be in a separate class, but it really depends on how you use them. There's an argument that addSentence should handle making inferences, in which case makeInferences would be redundant.

I wrote out subjectIndex and objectIndex to make it clearer what each represents.

What are xaS and yOb? Longer, more descriptive names would make the logic easier to understand.

Notice how none of the variables defined in the first function are needed in the second (with the exception of ontology which is actually declared in main). This is a big part of why I felt that they should be in separate functions.

Now main knows that it needs to read input and make inferences, but it doesn't need to know how to do those things.
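One side effect of pulling the reading logic out is that the pattern it relies on can be tried against a single line without touching a file. A minimal sketch, assuming hypothetical names (`FactLineParser` and `parse` are not part of the reviewed code) and returning the three captured groups as a plain array:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses one line of the form 'verb'('object', 'subject'). (illustrative sketch). */
final class FactLineParser {
    // Same pattern as in readFile, compiled once and reused.
    private static final Pattern FACT =
            Pattern.compile("'(.*?)'\\('(.*?)',\\s*'(.*?)'\\)\\.");

    /** Returns {verb, object, subject}, or empty if the line is not a fact. */
    static Optional<String[]> parse(String line) {
        Matcher m = FACT.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2), m.group(3) });
    }
}
```

For instance, `FactLineParser.parse("'contains'('vitamin c', 'oranges').")` yields the groups contains, vitamin c, and oranges, while a line that does not have this shape yields an empty result instead of being silently skipped inside the loop.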

Ontology

```
/*
 * why do you call s.getSubject() if this is an integer
 * value? wouldn't that return a string and not an
 * integer? Oh, wait a minute... maybe this is a tricky form
 * of Hashmap key, whereby he renders the string as a number
 * and uses that as they key of the hashmap... clever. maybe.
 */
```


It seems that this is not code that you wrote, so it would be off-topic for review. As a general rule, you should answer questions like those posed here before sending the code out for review. In particular, if you don't like variable names in code that you maintain, then why not change them?

MITJavaWordNetInterface

Your indentation is inconsistent here. A more typical form:

```
// print out each hypernym's id and synonyms
List<IWord> words;
for ( ISynsetID sid : hypernyms ) {
    words = dict.getSynset( sid ).getWords();
    System.out.print( sid + " {");
    for ( Iterator<IWord> i = words.iterator(); i.hasNext(); ) {
        System.out.print( i.next().getLemma() );
        if ( i.hasNext() ) {
            System.out.print(", ");
        }
    }
    System.out.println("}");
}
```


It's much easier to see where each block begins this way.

Also note that I added curly brackets {} around the statement in the innermost if. Not only does that make it clearer where one piece begins and the other ends, but it avo

Context

StackExchange Code Review Q#78971, answer score: 3
