Loading tab-separated tweet data into an array

Problem

I'm working on a school project and was wondering whether there is a better, faster, or more professional way to store this information. I'm also restricted to using only an array (don't ask me why; we are not allowed to use ArrayLists yet).

My Code:

public void loadTweets(String fileName){

    try {
        File file = new File(fileName);
        Scanner s = new Scanner(file);

        while(s.hasNextLine()){
            numberOfTweets++;
            s.nextLine();
        }
        tweets = new String[numberOfTweets];
        s.close();

        s = new Scanner(file);
        int counter = 0;
        while(s.hasNextLine()){
            String[] elements = s.nextLine().split("\t");
            tweets[counter] = elements[2];
            counter++;
        }
        s.close();
    } catch (IOException e) {
        e.printStackTrace();
    }   
}


File Example:

Fields are separated by tabs, in this order: user, date posted, tweet.

USER_989b85bb 2010-03-04T15:34:46 @USER_6921e61d can I be...
USER_989b85bb 2010-03-04T15:34:47 superstar
USER_a75657c2 2010-03-03T00:02:54 @USER_13e8a102 They reached a
USER_a75657c2 2010-03-07T21:45:48 So SunChips made a bag...
USER_ee551c6c 2010-03-07T15:40:27 drthema: Do something today that
USER_6c78461b 2010-03-03T05:13:34 @USER_a3d59856 yes, i watched...
USER_92b2293c 2010-03-04T14:00:11 RT @USER_5aac9e88: Let no 1 push u
USER_75c62ed9 2010-03-07T03:35:38 @USER_cb237f7f Congrats on...

Solution

The prohibition on ArrayList is unfortunate. One natural solution would be to use Files.readAllLines(), but that returns a List, which is probably off-limits to you. Likewise, Files.lines() produces a Stream, which would be even better and thus probably even more forbidden to you.

Your workaround is to open the file twice, which is definitely undesirable. (File I/O is considered "expensive".) If I had to make a recommendation based on arrays, I would suggest:

  • Use Files.readAllBytes() to slurp the entire file into a byte array.
  • Make a String from the byte array.
  • Use String.split() to form an array of lines.
  • For each line, retain only the third field.

My reasoning is that you eventually have to read the entire file anyway, so you might as well read it all at once, and only once. Once you have a string, you can take advantage of String.split().
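The four steps above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not a drop-in replacement: the class name TweetLoader is hypothetical, the method is static and returns the array instead of assigning the tweets and numberOfTweets fields, the file is assumed to be UTF-8, and a line with fewer than three fields is stored as an empty string (the original code would throw on such a line).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical holder class; the original method lives in a class not shown in the question.
class TweetLoader {

    // Reads the file exactly once and returns the third tab-separated field of every line.
    static String[] loadTweets(String fileName) throws IOException {
        // Step 1: slurp the entire file into a byte array.
        byte[] bytes = Files.readAllBytes(Paths.get(fileName));

        // Step 2: make a String from the bytes (UTF-8 is an assumption about the file).
        String content = new String(bytes, StandardCharsets.UTF_8);
        if (content.isEmpty()) {
            return new String[0]; // "".split(...) would yield one empty element
        }

        // Step 3: split into lines; \R matches any line break (Java 8+).
        String[] lines = content.split("\\R");

        // Step 4: for each line, retain only the third field.
        String[] tweets = new String[lines.length];
        for (int i = 0; i < lines.length; i++) {
            // Limit of 3 keeps any tabs inside the tweet text itself.
            String[] elements = lines[i].split("\t", 3);
            // Assumption: a malformed line becomes an empty tweet.
            tweets[i] = elements.length == 3 ? elements[2] : "";
        }
        return tweets;
    }
}
```

Because the number of lines is known as soon as the split is done, the array can be sized without a separate counting pass, and the length of the returned array doubles as the tweet count.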

I would also like to note that catching IOException just to print a stack trace is counterproductive: the method then returns normally, and the caller has no way of knowing that the tweets were never loaded. If you don't have a good way to handle an exception, just let it propagate by declaring public void loadTweets(…) throws IOException. That way, you're letting the caller know that something went wrong, which is exactly what exceptions are meant for.
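To illustrate the difference, here is a small sketch of propagation. The class PropagationDemo, the helper countBytes, and the default file name tweets.txt are all hypothetical names chosen for the example; the point is only that the method which does the I/O declares the exception, and the caller, which knows the context, decides what to do about it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical demo class, not part of the original code.
class PropagationDemo {

    // No try/catch here: any I/O failure propagates to the caller.
    static int countBytes(String fileName) throws IOException {
        return Files.readAllBytes(Paths.get(fileName)).length;
    }

    public static void main(String[] args) {
        // "tweets.txt" is an assumed default for illustration only.
        String fileName = args.length > 0 ? args[0] : "tweets.txt";
        try {
            System.out.println(countBytes(fileName) + " bytes read");
        } catch (IOException e) {
            // The caller can report the failure meaningfully instead of
            // silently continuing with data that was never loaded.
            System.err.println("Could not load " + fileName + ": " + e);
        }
    }
}
```

Here main is the one place that knows whether a missing file is fatal, so it is a reasonable place to catch the exception; lower-level methods simply declare it.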

Context

StackExchange Code Review Q#118843, answer score: 2
