Connecting Streams to Twitter

1) Create a Twitter account and attach an App to it.

2) Ensure that you have a valid Consumer Key/Secret and Access Token/Secret.

3) Launch InfoSphere Streams Studio in the VM

4) Select a workspace

5) Create a new SPL Application Project

6) Name the project. Note that this also creates a main composite, which serves as the entry point to the program.

7) Create a new SPL Namespace by right-clicking the project name -> New -> SPL Namespace (I’ve named mine Twitter Tally)

8) The new directory/namespace will be created under the Resources directory in the Project Explorer:

9) Create a new SPL Source File under that new directory

10) Name the main composite (I’ve called mine TwitterConnector) and keep the source type as SPL Source. Then right-click on the canvas and select Open with SPL Editor.

11) Define a new type:

type completeTwitterStream = rstring id_str, int64 id, rstring action, rstring action_date, rstring text ;

This is the format used to parse the Twitter JSON responses. As far as I know, text holds the actual message posted on Twitter and the rest is metadata about that tweet.

11) Since this composite is designed to be reused as much as possible, change the composite definition to public and add an output port to the composite itself, which I’ve named TwitterStreamData.

public composite TwitterConnector(output TwitterStreamData)

12) Now we are ready to add the connectors! Open with SPL Graphical Editor.

13) From the toolkit menu on the left, find and add HTTPGetStream and JSONToTuple to the composite:

14) Rename the operators accordingly and connect them up. Note that the @ indicates I’ve put a view on that operator (more on this later).

15) Right click and open with SPL Editor

16) For the HTTPGetStream operator, add the following parameters:

stream<rstring tweet_data> StuffTwitterSays = HTTPGetStream()
{
    param
        url : "https://stream.twitter.com/1.1/statuses/sample.json" ;
        dataAttributeName : "tweet_data" ;
        authenticationType : "oauth" ;
        authenticationFile :
            "/home/streamsadmin/workspace/TwitterStream/data/twitter.securities" ;
}

The url denotes the resource address to get the stream from (I am using the URL defined by the Twitter Dev page).

The dataAttributeName names the attribute of the output tuple in which the HTTP stream data is placed.

I have also specified an authenticationFile, which holds the consumer key/secret and access token/secret.

The format of the authentication file is as follows:
consumerKey=xxxxxxxxxxxxxxxxxxxxxxxxx
consumerSecret=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
accessToken=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
accessTokenSecret=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

17) Now edit the JSONToTuple operator:

stream<completeTwitterStream> TwitterStreamData = JSONToTuple(StuffTwitterSays)
{
    param
        jsonStringAttribute : "tweet_data" ;
}

The only parameter here names the attribute of the incoming tuple that carries the raw JSON string. The output of this operator is then connected to the output of the composite (note that the operator's output stream and the composite's output port share the same name, TwitterStreamData).
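
Putting the fragments from the steps above together, the finished source file might look roughly like the sketch below. A few things here are my assumptions rather than something shown in the steps: the namespace identifier TwitterTally (written without a space), the two use directives (they assume HTTPGetStream comes from the com.ibm.streamsx.inet toolkit and JSONToTuple from the com.ibm.streamsx.json toolkit, which may differ in your install), and the placement of the type definition at the top of the file.

```spl
// Assumed namespace identifier; adjust to the namespace created in step 7.
namespace TwitterTally ;

// Assumed toolkit namespaces; check the toolkits available in your Studio.
use com.ibm.streamsx.inet.http::HTTPGetStream ;
use com.ibm.streamsx.json::JSONToTuple ;

// Tuple schema the tweet JSON is parsed into (from step 11).
type completeTwitterStream = rstring id_str, int64 id, rstring action,
    rstring action_date, rstring text ;

// Public so other composites can reuse it; exposes one output port.
public composite TwitterConnector(output TwitterStreamData)
{
    graph
        // Reads the raw HTTP stream into the tweet_data attribute.
        stream<rstring tweet_data> StuffTwitterSays = HTTPGetStream()
        {
            param
                url : "https://stream.twitter.com/1.1/statuses/sample.json" ;
                dataAttributeName : "tweet_data" ;
                authenticationType : "oauth" ;
                authenticationFile :
                    "/home/streamsadmin/workspace/TwitterStream/data/twitter.securities" ;
        }

        // Parses each JSON string; the stream name matches the output port.
        stream<completeTwitterStream> TwitterStreamData =
            JSONToTuple(StuffTwitterSays)
        {
            param
                jsonStringAttribute : "tweet_data" ;
        }
}
```

To try the connector out, a main composite in the same namespace could invoke it and print the text of each tweet. The composite name TwitterMain and the Custom operator used as a printer are hypothetical, purely for illustration:

```spl
namespace TwitterTally ;

// Hypothetical entry point that consumes the reusable connector.
composite TwitterMain
{
    graph
        // Invoke the connector; it takes no inputs or parameters.
        stream<completeTwitterStream> Tweets = TwitterConnector()
        {
        }

        // Print the text attribute of every parsed tweet.
        () as TweetPrinter = Custom(Tweets)
        {
            logic
                onTuple Tweets :
                {
                    println(Tweets.text) ;
                }
        }
}
```

Keeping the HTTP and JSON operators inside TwitterConnector means any other application only needs the single invocation line to get a parsed tweet stream.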
