Your analytical research using Twitter


Besides computer science, I like psychological experiments, scientific analysis, and conclusions drawn from historical data. I think it’s absolutely delightful to learn new facts grounded in real events. Eventually, you may arrive at a well-reasoned conclusion. Does it make sense to buy this house? Is it a good idea to hold British pounds?

On pretty much any topic you can find a study where you can share your own opinion and experience, and where that contribution can in fact make a tremendous difference to all of humanity.

The Internet and social networks are certainly huge things that help us to evolve and further develop new key elements for a better society.

By connecting with each other, we reach the highest and loudest spot, where a message can indeed make a difference, because the neighbors can finally hear it through all the buzz. With all that being said, I want to show you an example of how we can use Twitter for our experiments and analysis.

Have you heard about Keyword Planner created by Google?

Get historical statistics and traffic forecasts. Use statistics like search volume to help you decide which keywords to use … to get an idea of how a list of keywords might perform for a given bid and budget.

Today we are going to do a similar task. We want to find out whether a particular term tends to be posted more frequently than another, based on recent data.

We plan to measure how many tweets containing each term are published per unit of time and then rank the terms by that rate.

Luckily, Twitter provides a very good set of query operators, which are sort of like a human-friendly SQL, if you will. They let us find pretty much anything that has been written and stored in the Twitter database, unless it has been removed or is older than a week.

Needless to say, the company invested lots of effort in this and made it available to all of us without exception. The operators let us find pictures and links, and they even give us the chance to filter tweets by their attitude.

Their positive or negative attitude, Carl!

Yes, I highlighted that with text selection. It opens up enormous possibilities for our research. Besides, from this data you can learn something about a specific subject or person, like you, me, Justin Bieber, or other celebrities such as Bill Gates or Barack Obama. Isn’t that brilliant?!
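To make the operators a bit more concrete, here is a minimal sketch of how such query strings can be assembled. The operators shown (`to:`, `filter:links`, `filter:images`, `:)` and `:(`) are the ones I know from Twitter’s standard search operators; the `build_query` helper and its parameter names are hypothetical and exist only for this illustration, they are not part of the Twitter gem.

```ruby
# Hypothetical helper (not part of the Twitter gem): assembles a search
# string from a term plus a few optional standard search operators.
def build_query(term, to: nil, only: nil, attitude: nil)
  parts = []
  parts << "to:#{to}" if to                # tweets sent to a given account
  parts << term
  parts << "filter:#{only}" if only        # e.g. :links or :images
  parts << ':)' if attitude == :positive   # positive attitude
  parts << ':(' if attitude == :negative   # negative attitude
  parts.join(' ')
end

puts build_query('#ruby', attitude: :positive)
puts build_query('#python', only: :links, attitude: :negative)
puts build_query('hello', to: 'twitter')
```

The resulting strings can be passed as the first argument to the search call of the REST client.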

Remember Erik and his Twitter gem from my article Implementing Twitter Bot using Ruby? I will certainly use it once again. Thanks to Erik and Peter Cooper’s RubyWeekly, that article gained some additional views, which at first motivated me to write another one right away. But then I decided to save it for a better occasion and to invest more time in making it more useful, or useful at all. So I am sorry that it took so long to release. It just wasn’t ready.

We are going to use the REST client here. In case you don’t know yet, we need the credentials we received when we created our first Twitter app.

require 'twitter'

@client = Twitter::REST::Client.new(
  consumer_key: 'YOUR_CONSUMER_KEY',
  consumer_secret: 'YOUR_CONSUMER_SECRET',
  access_token: 'YOUR_ACCESS_TOKEN',
  access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET'
)
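If you would rather not paste the keys into the source file, one option is to read them from the environment. This is only a sketch: the `twitter_credentials` helper and the four variable names are my own assumptions, so use whatever names you actually export.

```ruby
# Hypothetical helper: collects the four credentials from environment
# variables. The variable names below are assumptions, not a convention
# required by the Twitter gem.
def twitter_credentials(env = ENV)
  {
    consumer_key: env.fetch('TWITTER_CONSUMER_KEY'),
    consumer_secret: env.fetch('TWITTER_CONSUMER_SECRET'),
    access_token: env.fetch('TWITTER_ACCESS_TOKEN'),
    access_token_secret: env.fetch('TWITTER_ACCESS_TOKEN_SECRET')
  }
end

# Demonstration with a stand-in hash instead of the real environment.
fake_env = {
  'TWITTER_CONSUMER_KEY' => 'key',
  'TWITTER_CONSUMER_SECRET' => 'secret',
  'TWITTER_ACCESS_TOKEN' => 'token',
  'TWITTER_ACCESS_TOKEN_SECRET' => 'token-secret'
}
puts twitter_credentials(fake_env).keys.inspect
```

Because `fetch` raises a `KeyError` when a variable is missing, a forgotten key fails loudly at startup. As far as I know, the resulting hash can be handed straight to `Twitter::REST::Client.new`.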

Now we can search for the tweets that interest us the most. For example, if you are a Justin Bieber fanatic, like we all are, at least secretly, and if you “Love yourself” and feel like a true Biebs fan, you can list all the people who have begged the Canadian celebrity to marry them. I mean, theoretically. It’s possible using the operators from the documentation.

# Print the three most recent tweets addressed to @justinbieber that contain "marry me"
@client.search("to:justinbieber marry me", result_type: "recent").take(3).each do |tweet|
  puts tweet.text
end

I am not sure if this data brings any value to you. Nevertheless, we can find these posts, and we can draw conclusions from them.

Anyway, as I promised, we will do some simple math to tell whether particular hashtags or words are worth using. At least, we can start with that.

Alright. Let’s begin.

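# Average number of tweets per minute among the most recent matches.
def popularity(string)
  tweets = @client.search(string, result_type: 'recent').take(1_000)
  return 0.0 if tweets.size < 2 # not enough tweets to measure a time span

  elapsed = tweets.first.created_at - tweets.last.created_at # in seconds
  elapsed.zero? ? 0.0 : tweets.size / elapsed * 60
end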

A few lines of code and we are almost done. We search for the string, let’s say the #javascript hashtag, ask for the most recent results, and take up to 1,000 of them if that many are available.

Then we check how much time those tweets span. Technically, we subtract the created_at time of the oldest tweet from that of the newest one, divide the number of tweets by that difference in seconds, and multiply by 60. The result is the average number of tweets published per minute.
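To see the arithmetic without calling the API, here is a small worked sketch on made-up timestamps. The `tweets_per_minute` name and the sample data are assumptions for illustration; only the formula (count divided by the elapsed seconds, times 60) comes from the method above.

```ruby
# Tweets per minute for a list of Time objects ordered newest first.
def tweets_per_minute(timestamps)
  return 0.0 if timestamps.size < 2

  elapsed = timestamps.first - timestamps.last # Time - Time is a Float in seconds
  return 0.0 if elapsed.zero?

  timestamps.size / elapsed * 60
end

newest = Time.utc(2017, 1, 1, 12, 0, 0)
# 1,000 made-up tweets, one every 3 seconds, newest first.
timestamps = Array.new(1_000) { |i| newest - i * 3 }

puts tweets_per_minute(timestamps).round(2)
```

One tweet every 3 seconds is 20 per minute, and the sketch reports a value just above that, because 1,000 tweets span only 999 intervals. For a rough comparison between terms that small offset does not matter.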

And even though I am a big fan of Ruby, the facts are on Python’s side here. Posts tagged with #python pop up more regularly and, by the way, if you dig deeper you can tell that there are more jobs available as well. And no wonder: more people actually use it, so there is more noise, which I believe makes Python an attractive alternative to Ruby, language specifics and everything else aside.

Anyhow, I won’t insist that these examples should be taken at face value for absolutely everything, because, unfortunately, not everything is available through the REST API.

But it’s still a good source for researching something hot or viral. Plus, seriously, if we compare #python versus #ruby or #python versus #javascript, ignoring false samples, which usually won’t cause any big deviation, it should be enough to tell that something is mentioned less frequently and is therefore less hot.

It does not matter how many samples were taken in the case below, whether it’s 3, 10 or 1000: the outcome remains the same.
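Once a per-minute rate is known for each term, the comparison itself is just a sort. A minimal sketch, assuming the rates have already been collected into a hash; the `rank` helper, the hashtags and the numbers are placeholders for illustration, not measured results.

```ruby
# Orders terms from most to least frequently mentioned.
# `rates` maps a search term to its tweets-per-minute value.
def rank(rates)
  rates.sort_by { |_term, rate| -rate }.map(&:first)
end

# Placeholder values only, not real measurements.
rates = { '#alpha' => 1.5, '#beta' => 4.0, '#gamma' => 2.5 }

rank(rates).each_with_index do |term, index|
  puts "#{index + 1}. #{term} (#{rates[term]} tweets/min)"
end
```

In practice the hash would be filled by calling the popularity method on each term you want to compare.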