
Loopification of a highly procedural, though fully functional, multiclass perceptron

Submitted by: @import:stackexchange-codereview

Problem

I've implemented the multiclass perceptron in the one-vs.-all style.

I thought it through and tried to implement it in the most basic way. I think it's correct, though my f_measure is a bit low. I believe that is due in part to the fact that I haven't cleaned up my input: it's a feature vector of document words, and I left in all kinds of pronouns, 'the' and the like, which doubtless add to the confusion. The tokenization is also highly naive, just splitting on every " ". Anyway, that's neither here nor there.

I'd like to know whether someone here can devise a way to loopify this program, i.e. replace this highly repetitive code with elegant loops.

```
public class MulticlassPerceptron
{
static int MAX_ITER = 100;
static double LEARNING_RATE = 0.1;
static int theta = 0;

static final String LABEL_atheism = "atheism";
static final String LABEL_sports = "sports";
static final String LABEL_science = "science";
static final String LABEL_politics = "politics";

public static void perceptron( Table<int[], String, Integer> train_freq_count_against_globo_dict,
Table<int[], String, Integer> test_freq_count_against_globo_dict,
Set GLOBO_DICT )
{
int globo_dict_size = GLOBO_DICT.size();
int number_of_files__train = train_freq_count_against_globo_dict.size();

double[] weights__science = new double[ globo_dict_size + 1 ];//one for bias
double[] weights__sports = new double[ globo_dict_size + 1 ];//one for bias
double[] weights__politics = new double[ globo_dict_size + 1 ];//one for bias
double[] weights__atheism = new double[ globo_dict_size + 1 ];//one for bias

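// ... (the lines between "for (int i = 0; i <" and the loop header below were lost in extraction) ...
for ( Cell<int[], String, Integer> cell: train_freq_count_against_globo_dict.cellSet() )
{
int[] container_of_feature_vector = cell.getRowKey();
// ... (the rest of the original listing is truncated)
```
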
Solution

So, a few things.

  • The main advantage of an object-oriented language is its support for objects. You should try some!
  • Your naming is wildly against the Java conventions. Use camelCase, not snake_case. The double underscores are particularly noxious.
  • Variables should be private wherever possible, and final wherever possible.
  • Your spacing around generics and commas is wonky. They should look like this: for (Cell<int[], String, Integer> cell :
  • Likewise, parentheses don't have whitespace around them, except for the single space between a control-flow keyword and its opening parenthesis. if (foo) is correct, = ( foo * bar ) is not.
  • Most of your variable names are not at all descriptive. What's a 'globo'?
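
As a rough illustration of the naming, visibility, and spacing points above, here is a minimal sketch. The class `LabelWeights` and all of its members are hypothetical names chosen for the example; none of them come from the original program.

```java
/** Hypothetical example class; every name here is illustrative only. */
final class LabelWeights {

    // Constants: UPPER_SNAKE_CASE, private, static and final.
    private static final int BIAS_SLOTS = 1;

    // Fields: camelCase, private, and final wherever possible.
    private final String label;
    private final double[] weights;

    LabelWeights(final String label, final int dictionarySize) {
        this.label = label;
        this.weights = new double[dictionarySize + BIAS_SLOTS];
    }

    String label() {
        return this.label;
    }

    int size() {
        return this.weights.length;
    }

    double sum() {
        double total = 0;
        // One space after the keyword, none just inside the parentheses.
        for (final double weight : this.weights) {
            total += weight;
        }
        return total;
    }
}
```

Compare `weights__science` and `globo_dict_size + 1` in the question with `weights` and `dictionarySize + BIAS_SLOTS` here: the label lives in a field rather than in the variable name, and the magic `+ 1` gets a name.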



I made a start on an object that should remove a lot of duplication from your code. You'll need to modify your existing classes in order to use it correctly. Hopefully it will put you on the right track. You will need to confirm the logic, as I probably made at least one mistake. It is final because it should not be extended. It is package scope (not public) because I do not expect it will need to be used outside its package. Obviously you will want to rename it.

final class Foo {

    private static final int BIAS = 1;

    private final String name;
    private final int dictSize;
    private final double[] weights;
    private final int[] outputs;

    public Foo(
            final String name,
            final int dictSize,
            final int numOutputs,
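            final Table<int[], String, Integer> trainingTable) {

        this.name = name;

        this.dictSize = dictSize;

        this.weights = new double[dictSize + BIAS];
        for (int i = 0; i < this.weights.length; i++) {
            this.weights[i] = randomNumber(0, 1);
        }

        this.outputs = new int[numOutputs];
        this.initializeOutputs(trainingTable);
    }

    public void train(
            final int theta,
            final double learningRate,
            final double[][] trainingMatrix,
            final int maxIterations) {

        int iteration = 0;
        double globalError;
        do {
            iteration++;
            globalError = 0;

            //loop through all instances (complete one epoch)
            for (int p = 0; p < this.outputs.length; p++) {

                // calculate predicted class
                final int output =
                        this.calculateOutput(theta, trainingMatrix, p);

                // difference between predicted and actual class values
                final double localError = this.outputs[p] - output;

                //update weights and bias
                for (int i = 0; i < this.dictSize; i++) {
                    this.weights[i] += (learningRate * localError * trainingMatrix[p][i]);
                }
                this.weights[this.dictSize] += (learningRate * localError);

                //summation of squared error (error value for all instances)
                globalError += (localError * localError);
            }

            /* Root Mean Squared Error */
            System.out.println(this.name + " vs. All");
            System.out.println(
                    String.format("Iteration %d : RMSE = %f",
                            iteration, rootMeanSquaredError(globalError, trainingMatrix.length)));
        } while (globalError != 0 && iteration <= maxIterations);
    }

    private void initializeOutputs(final Table<int[], String, Integer> trainingTable) {
        int z = 0;

        for (final Cell<int[], String, Integer> cell: trainingTable.cellSet()) {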
            final int[] container_of_feature_vector = cell.getRowKey();

            final String columnKey = String.valueOf(cell.getColumnKey());
            this.outputs[z] = columnKey.equalsIgnoreCase(this.name) ? 0 : 1;

            z++;
        }
    }

    private static final double rootMeanSquaredError(
            final double error,
            final double trainingSize) {
        return Math.sqrt(error / trainingSize);
    }

    public int calculateOutput(
            final int theta,
            final double[][] trainingMatrix,
            final int p) {
        throw new UnsupportedOperationException();
    }

    private static double randomNumber(final int minValue, final int maxValue) {
        throw new UnsupportedOperationException();
    }
}
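
To show how one object per label lets the repeated per-label code collapse into loops, here is a self-contained sketch. It is not the original program: it skips Guava's `Table` and assumes the features already sit in a `double[][]` with labels in a parallel `String[]`. The names `BinaryPerceptron` and `OneVsAll`, the step activation in `calculateOutput`, the zero-initialised weights, and the choice of target 1 for the perceptron's own label are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical binary perceptron for one label (one-vs-all). */
final class BinaryPerceptron {

    private final String label;
    private final double[] weights; // the last slot holds the bias

    BinaryPerceptron(final String label, final int featureCount) {
        this.label = label;
        this.weights = new double[featureCount + 1]; // assumption: start at zero
    }

    String label() {
        return this.label;
    }

    /** Weighted sum of the features plus the bias. */
    double score(final double[] features) {
        double sum = this.weights[this.weights.length - 1];
        for (int i = 0; i < features.length; i++) {
            sum += this.weights[i] * features[i];
        }
        return sum;
    }

    /** Assumed step activation: 1 when the score reaches theta, else 0. */
    int calculateOutput(final double theta, final double[] features) {
        return score(features) >= theta ? 1 : 0;
    }

    void train(final double[][] rows, final String[] labels, final double theta,
            final double learningRate, final int maxIterations) {
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            int mistakes = 0;
            for (int p = 0; p < rows.length; p++) {
                // Assumption: target is 1 for this label, 0 for every other label.
                final int target = this.label.equals(labels[p]) ? 1 : 0;
                final int error = target - calculateOutput(theta, rows[p]);
                if (error != 0) {
                    mistakes++;
                    for (int i = 0; i < rows[p].length; i++) {
                        this.weights[i] += learningRate * error * rows[p][i];
                    }
                    this.weights[this.weights.length - 1] += learningRate * error;
                }
            }
            if (mistakes == 0) {
                return; // every training row is classified correctly
            }
        }
    }
}

/** Hypothetical wrapper: one perceptron per label, driven by loops. */
final class OneVsAll {

    private final List<BinaryPerceptron> perceptrons = new ArrayList<>();

    OneVsAll(final String[] labelNames, final int featureCount) {
        for (final String name : labelNames) {
            this.perceptrons.add(new BinaryPerceptron(name, featureCount));
        }
    }

    void train(final double[][] rows, final String[] labels, final double theta,
            final double learningRate, final int maxIterations) {
        for (final BinaryPerceptron perceptron : this.perceptrons) {
            perceptron.train(rows, labels, theta, learningRate, maxIterations);
        }
    }

    /** Picks the label whose perceptron gives the highest raw score. */
    String predict(final double[] features) {
        BinaryPerceptron best = this.perceptrons.get(0);
        for (final BinaryPerceptron candidate : this.perceptrons) {
            if (candidate.score(features) > best.score(features)) {
                best = candidate;
            }
        }
        return best.label();
    }
}
```

With this shape, adding a fifth label means adding one string to the array passed to `OneVsAll`, not another block of copied code. Something like `new OneVsAll(new String[] {"atheism", "sports", "science", "politics"}, dictSize)` followed by `train(...)` would replace the four hand-written weight arrays, assuming the `Table` has first been flattened into the row and label arrays.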

Code Snippets

final class Foo {

    private static final int BIAS = 1;

    private final String name;
    private final int dictSize;
    private final double[] weights;
    private final int[] outputs;

    public Foo(
            final String name,
            final int dictSize,
            final int numOutputs,
            final Table<int[], String, Integer> trainingTable) {

        this.name = name;

        this.dictSize = dictSize;

        this.weights = new double[dictSize + BIAS];
        for (int i = 0; i < this.weights.length; i++) {
            this.weights[i] = randomNumber(0, 1);
        }

        this.outputs = new int[numOutputs];
        this.initializeOutputs(trainingTable);
    }

    public void train(
            final int theta,
            final double learningRate,
            final double[][] trainingMatrix,
            final int maxIterations) {

        int iteration = 0;
        double globalError;
        do {
            iteration++;
            globalError = 0;

            //loop through all instances (complete one epoch)
            for (int p = 0; p < this.outputs.length; p++) {

                // calculate predicted class
                final int output =
                        this.calculateOutput(theta, trainingMatrix, p);

                // difference between predicted and actual class values
                final double localError = this.outputs[p] - output;

                //update weights and bias
                for (int i = 0; i < this.dictSize; i++) {
                    this.weights[i] += (learningRate * localError * trainingMatrix[p][i]);
                }
                this.weights[this.dictSize] += (learningRate * localError);

                //summation of squared error (error value for all instances)
                globalError += (localError * localError);
            }

            /* Root Mean Squared Error */
            System.out.println(this.name + " vs. All");
            System.out.println(
                    String.format("Iteration %d : RMSE = %f",
                            iteration, rootMeanSquaredError(globalError, trainingMatrix.length)));
        } while (globalError != 0 && iteration <= maxIterations);
    }

    private void initializeOutputs(final Table<int[], String, Integer> trainingTable) {
        int z = 0;

        for (final Cell<int[], String, Integer> cell: trainingTable.cellSet()) {
            final int[] container_of_feature_vector = cell.getRowKey();

            final String columnKey = String.valueOf(cell.getColumnKey());
            this.outputs[z] = columnKey.equalsIgnoreCase(this.name) ? 0 : 1;

            z++;
        }
    }

    private static final double rootMeanSquaredError(
            final double error,
            final double trainingSize) {
        return Math.sqrt(error / trainingSize);
    }

    public int calculateOutput(
            final int theta,
            final double[][] trainingMatrix,
            final int p) {
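        throw new UnsupportedOperationException();
    }

    private static double randomNumber(final int minValue, final int maxValue) {
        throw new UnsupportedOperationException();
    }
}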

Context

StackExchange Code Review Q#82809, answer score: 3
