How is everyone’s feeling lately?

The question came to me after I received 10 assignments across the 5 subjects I am taking this semester. Classes only started last week, but I am already facing a lot of work, all of it due next week! I’m a bit worried about surviving this semester, and I wonder whether other students feel the same. Are they happy with this situation? Or do they feel negatively about it? Are they angry with the professors? I am curious, and determined to analyze their feelings through their most recent tweets on Twitter.

Since I had no prior experience with Python programming, I first tried to get familiar with this new environment by following the Python documentation (Chapters 3-7) and the NLTK book (Chapters 1-3). Following documentation without a particular goal is rather tedious, so I decided to jump straight to the Twitter data. Before I started, my classmate Sam suggested that I use IPython Notebook, a web-based interactive computational environment where you can combine code execution, text, mathematics, plots and rich media in a single document. There are several ways to install it; I chose to install IPython and its dependencies manually from the command line. When I ran IPython, it crashed! I tried hard to find the problem, but couldn’t. I reinstalled IPython using Anaconda, but it crashed again! After some investigation, I learned that this was apparently due to a socket issue. One magical command saved my life:

# ipython notebook --ip=127.0.0.1

I then started by collecting users’ timelines from Twitter. I used python-twitter, a Python wrapper library for the Twitter API. Twitter now requires API keys and access tokens to use its API; previously this wasn’t required. I simply created an application on the Twitter developer site and got the access tokens to put in my Python script. I successfully grabbed my own timeline, as well as tweets from friends. It looked like this (sorry, my tweets are mostly written in Bahasa):

py1
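For anyone who wants to try the same step, here is a minimal sketch of fetching a timeline with python-twitter. It is not the original script: the helper names `build_api` and `fetch_tweet_texts`, the credential strings and the screen name are placeholders made up for illustration. It assumes the python-twitter package is installed and that you have the four credentials from your own Twitter application.

```python
def build_api(consumer_key, consumer_secret, access_token_key, access_token_secret):
    """Create a python-twitter client from the four credentials of a Twitter app."""
    # Imported here so the rest of the sketch can be read and tested
    # even on a machine where python-twitter is not installed.
    import twitter

    return twitter.Api(
        consumer_key=consumer_key,
        consumer_secret=consumer_secret,
        access_token_key=access_token_key,
        access_token_secret=access_token_secret,
    )


def fetch_tweet_texts(api, screen_name, count=50):
    """Return the text of the most recent tweets on one user's timeline."""
    statuses = api.GetUserTimeline(screen_name=screen_name, count=count)
    return [status.text for status in statuses]


# Example usage (placeholder values, replace them with your own):
#
#   api = build_api("CONSUMER_KEY", "CONSUMER_SECRET",
#                   "ACCESS_TOKEN_KEY", "ACCESS_TOKEN_SECRET")
#   for text in fetch_tweet_texts(api, "some_screen_name", count=20):
#       print(text)
```

Keeping the client creation separate from the fetching makes it easy to reuse the same client for several accounts, for example my own timeline and my friends’ timelines.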

 

Finally, I defined a set of emotion dictionaries to categorize the words in the tweets. The word analysis relies on the NLTK library. I am now able to classify which tweets express happiness or sadness, and which don’t. The results are plotted as a simple visualization using matplotlib, a Python plotting library. I wish I could use D3 to make a beautiful visualization, but for now let me just use this simple one. The visualization of my tweets looks like this:

py2
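To give an idea of how dictionary-based classification and a simple chart can fit together, here is a rough sketch. It is an illustration under several assumptions, not the script from this project: the tiny `EMOTION_WORDS` lexicon (including the two emoticons), the function names, the tie-breaking rule and the choice of a bar chart are all made up for the example. A small regular expression stands in for the NLTK tokenizer so the sketch runs without extra downloads.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon; a real dictionary would hold hundreds of words.
EMOTION_WORDS = {
    "happy": {"happy", "glad", "great", "love", "excited", ":)"},
    "sad": {"sad", "tired", "worried", "cry", "lonely", ":("},
}

# Matches the emoticons :) and :( or a run of letters and apostrophes.
TOKEN_PATTERN = re.compile(r":[()]|[a-z']+")


def tokenize(text):
    """Lowercase the tweet and split it into words and simple emoticons."""
    return TOKEN_PATTERN.findall(text.lower())


def classify_tweet(text):
    """Return 'happy', 'sad', or 'neutral' for a single tweet."""
    tokens = tokenize(text)
    scores = {
        emotion: sum(token in words for token in tokens)
        for emotion, words in EMOTION_WORDS.items()
    }
    best = max(scores, key=scores.get)
    # No emotional word at all, or a tie between emotions: call it neutral.
    if scores[best] == 0 or list(scores.values()).count(scores[best]) > 1:
        return "neutral"
    return best


def count_emotions(tweets):
    """Count how many tweets fall into each category."""
    return Counter(classify_tweet(tweet) for tweet in tweets)


def emotion_shares(counts):
    """Convert raw counts into percentages of all tweets."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


def plot_emotions(counts, title="Emotions in recent tweets"):
    """Draw a bar chart of the percentage of tweets per category."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    shares = emotion_shares(counts)
    labels = list(shares)
    fig, ax = plt.subplots()
    ax.bar(labels, [shares[label] for label in labels])
    ax.set_ylabel("Share of tweets (%)")
    ax.set_title(title)
    return fig
```

With a list of tweet texts in hand, `plot_emotions(count_emotions(tweets))` returns a matplotlib figure that can be shown with `plt.show()` or saved with `fig.savefig(...)`.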

My sister said that she tweets when she is sad, and the script proves it! Here is the result of my sister’s tweets:

py3

I put hundreds of English words into the dictionary, so I doubt it can accurately analyze non-English tweets like my Bahasa ones. I therefore tried analyzing tweets from someone who actively tweets in English. Here is what the tweets of Danco (my professor) look like:

py4

Apparently the students in my lab don’t actively tweet, and most of them don’t even have a Twitter account, so I am not able to analyze their tweets. In my home country, Twitter is very popular. Jakarta is the most active Twitter city in the world, and my hometown Bandung is in 6th position, as mentioned here in a Forbes article.

This activity reminds me of three key points worth mentioning:

  • In this modern IT world, there are countless useful libraries that we can use for almost anything! We don’t need to build everything from scratch anymore.
  • In an IT research project, there are several ways to achieve our goals. Try to find the approach that saves time and resources.
  • Google offers a lot of resources to help; don’t hesitate to search when we get stuck in the middle.

Please find the script I made in this project here.

 

 

2 thoughts on “How is everyone’s feeling lately?”

  1. awesome work. welcome to grad study in the US! you will barely have a literally ‘free’ weekend :p

    so, how did you define the emotion in Bahasa? you know, our Bahasa language is one of the most abstract languages, and most Indonesian people use informal language, which might be harder to detect. Like, instead of writing “Senang” (happy), they write “seneennggggg”.

    glad you posted the project here. I’ll be happy to learn it! thanks

    • thanks for your comment, I’m enjoying my weekend in the lab, lol.

      that’s absolutely true for Bahasa, it’s rather difficult. In this project, the script analyzes feelings mostly from emoticons for non-English tweets. But this is just the first assignment I got from class (I only used simple NLP). I will try to improve it by applying insights from the lectures, maybe later after learning machine learning techniques.
