Twitter Streaming Client - Round#2
Problem
I've been experimenting with the Twitter Streaming API and would like some critical feedback, specifically on code correctness, code smells, overall structure, and my usage of collections and queues.
The application leverages the Twitter Streaming API to identify the top trending hashtags for the supplied hashtag, or string.
Sample invocation:
java -jar ./target/lotus-1.0-SNAPSHOT-jar-with-dependencies.jar apple
Top 10 Hashtags
{#Apple=223, #iTunes=182, #iPhone=160, #Music=62, #Mac=59, #apple=43, #Apps=38, #Movies=25, #iTunesU=21, #Video=19}.
Total Tweets Processed: 1935
AbstractClient.java
```
package com.gmail.lifeofreilly.lotus;

/**
 * An Abstract client for retrieving messages that contain hashtags. Can be extended for target social network.
 */
public abstract class AbstractClient implements Runnable {
    private final String trackedTerm;
    private final MessageData messageData;

    public AbstractClient(final String trackedTerm, final MessageData messageData) {
        this.trackedTerm = trackedTerm;
        this.messageData = messageData;
    }

    public MessageData getMessageData() {
        return messageData;
    }

    public String getTrackedTerm() {
        return trackedTerm;
    }

    @Override
    public String toString() {
        return "AbstractClient{" +
                "trackedTerm='" + trackedTerm + '\'' +
                ", class=" + this.getClass() +
                '}';
    }
}
```
MessageData.java
```
package com.gmail.lifeofreilly.lotus;

import org.apache.log4j.Logger;
import com.google.common.collect.Multiset;
import com.google.common.collect.Multisets;
import com.google.common.collect.TreeMultiset;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/**
 * A blocking message queue and the hashtags extracted.
 */
public class MessageData {
    private final static Logger lo
```
Solution
This implementation looks better than the one in the original question. As I said in my previous answer, your design looks a bit too complex.
I'd go for a more reactive, event-based solution. I'd structure your client and your message processor in the following way.

    interface Client
    {
        void registerMessageProcessor(MessageProcessor messageProcessor);
    }

    interface MessageProcessor
    {
        void onNewMessages(Collection<Message> messages);
    }

The core idea is to have a client that gets data from the Twitter stream, lets one or more message processors subscribe for events, and notifies them when new messages are available.
These interfaces are completely agnostic with respect to what you need to do with your messages.
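To make the subscription idea concrete, here is a minimal sketch of a client that keeps a list of registered processors and pushes each batch of messages to them. It is only an illustration under stated assumptions: `Message`, `InMemoryClient` and `publish` are hypothetical names introduced here, and a real Twitter client would call `publish` from whatever thread reads the stream.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class NotifyingClientSketch {

    /** Hypothetical message type; the answer does not define Message. */
    static final class Message {
        final String text;

        Message(String text) {
            this.text = text;
        }
    }

    interface MessageProcessor {
        void onNewMessages(Collection<Message> messages);
    }

    interface Client {
        void registerMessageProcessor(MessageProcessor messageProcessor);
    }

    /** Hypothetical client that only knows how to notify its subscribers. */
    static final class InMemoryClient implements Client {
        // CopyOnWriteArrayList allows registration while another thread is publishing.
        private final List<MessageProcessor> processors = new CopyOnWriteArrayList<>();

        @Override
        public void registerMessageProcessor(MessageProcessor messageProcessor) {
            processors.add(messageProcessor);
        }

        /** Pushes one batch to every registered processor, on the caller's thread. */
        void publish(Collection<Message> batch) {
            for (MessageProcessor processor : processors) {
                processor.onNewMessages(batch);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryClient client = new InMemoryClient();
        client.registerMessageProcessor(
                messages -> messages.forEach(m -> System.out.println("got: " + m.text)));
        client.publish(Arrays.asList(new Message("Trying out #java"), new Message("no tags")));
    }
}
```

Because the client only depends on the `MessageProcessor` interface, adding a second consumer (logging, persistence, counting) is a matter of registering another processor rather than changing the client.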
Let's address the specific problem you want to tackle: processing the messages, analysing them and storing the results in a data structure. You only need to query the hashtags and you don't really care about the messages, so I'd avoid storing them. Let's use a simple hashtag counter. Again, I'm just showing the interface. To address possible threading issues, you should consider using a thread-safe multiset (for example, Guava's ConcurrentHashMultiset).

    interface HashTagCounter
    {
        void addHashTag(String hashTag);
        Map<String, Integer> topHashTags(int maxNumberOfHashTags);
    }

Your message processor's onNewMessages method will just need to find the hashtags in the messages and call addHashTag.
Finally, you need a separate Reporter thread that polls the HashTagCounter periodically to output the top hashtags. That should be simple enough, as it should just call topHashTags and nicely format the result.
In your main you have to wire up all your dependencies in the following way.

    public static void main(String[] args)
    {
        // HashTagCounter and MessageProcessor here stand for your concrete implementations of the interfaces above.
        Client client = new TwitterClient(/* the args you need */);
        HashTagCounter hashTagCounter = new HashTagCounter();
        Reporter reporter = new Reporter(hashTagCounter);
        String keyword = "Code review";
        MessageProcessor messageProcessor = new MessageProcessor(keyword, hashTagCounter);
        client.registerMessageProcessor(messageProcessor);
        client.start();
        reporter.start();
    }

Note that the MessageProcessor runs on the TwitterClient thread.
This should work fine in a basic use case. As they suggested in your previous question, if you are looking for high performance and need to address more complex scenarios, you should look into a Disruptor-like approach.
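As a rough illustration of the counter and of the extraction step, the sketch below backs `HashTagCounter` with a `ConcurrentHashMap` and pulls hashtags out of message text with a regular expression. The names `ConcurrentHashTagCounter` and `HashTagExtractor` are hypothetical, the extractor takes plain strings instead of a `Message` type to stay self-contained, and the pattern is a simplification that only matches ASCII word characters.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class HashTagCountingSketch {

    interface HashTagCounter {
        void addHashTag(String hashTag);

        Map<String, Integer> topHashTags(int maxNumberOfHashTags);
    }

    /** Hypothetical thread-safe counter backed by a ConcurrentHashMap. */
    static final class ConcurrentHashTagCounter implements HashTagCounter {
        private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

        @Override
        public void addHashTag(String hashTag) {
            counts.merge(hashTag, 1, Integer::sum); // atomic per-key increment
        }

        @Override
        public Map<String, Integer> topHashTags(int maxNumberOfHashTags) {
            // Sort by count, highest first; LinkedHashMap preserves that order in the result.
            return counts.entrySet().stream()
                    .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                    .limit(maxNumberOfHashTags)
                    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                            (a, b) -> a, LinkedHashMap::new));
        }
    }

    /** Hypothetical processor body: finds hashtags and feeds them to the counter. */
    static final class HashTagExtractor {
        // Simplified: '#' followed by ASCII letters, digits or underscores.
        private static final Pattern HASH_TAG = Pattern.compile("#\\w+");
        private final HashTagCounter counter;

        HashTagExtractor(HashTagCounter counter) {
            this.counter = counter;
        }

        void onNewMessages(Collection<String> messageTexts) {
            for (String text : messageTexts) {
                Matcher matcher = HASH_TAG.matcher(text);
                while (matcher.find()) {
                    counter.addHashTag(matcher.group());
                }
            }
        }
    }

    public static void main(String[] args) {
        HashTagCounter counter = new ConcurrentHashTagCounter();
        HashTagExtractor extractor = new HashTagExtractor(counter);
        extractor.onNewMessages(Arrays.asList(
                "New #iPhone and #Apple event", "#Apple stock is up", "no tags here"));
        System.out.println(counter.topHashTags(10));
    }
}
```

Hashtags are counted case-sensitively here, which matches the sample output in the question where #Apple and #apple appear separately. Since the question already uses Guava, a `ConcurrentHashMultiset` could play the same role as the map; the plain JDK version is used only to keep the sketch dependency-free.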
Code Snippets
    interface Client
    {
        void registerMessageProcessor(MessageProcessor messageProcessor);
    }

    interface MessageProcessor
    {
        void onNewMessages(Collection<Message> messages);
    }

    interface HashTagCounter
    {
        void addHashTag(String hashTag);
        Map<String, Integer> topHashTags(int maxNumberOfHashTags);
    }

    public static void main(String[] args)
    {
        // HashTagCounter and MessageProcessor here stand for your concrete implementations of the interfaces above.
        Client client = new TwitterClient(/* the args you need */);
        HashTagCounter hashTagCounter = new HashTagCounter();
        Reporter reporter = new Reporter(hashTagCounter);
        String keyword = "Code review";
        MessageProcessor messageProcessor = new MessageProcessor(keyword, hashTagCounter);
        client.registerMessageProcessor(messageProcessor);
        client.start();
        reporter.start();
    }

Context
StackExchange Code Review Q#52560, answer score: 6