"EVERY MOMENT IS A MOMENT OF TRUTH IN SOCIAL MEDIA"
Over the last 20 years, cricket has become a sport beyond what it used to be. India has played 880 ODIs so far since 13th July 1974, when we played our first ODI. Incidentally, we have played the most ODIs in the world. More than half of these matches were played since the year 2000. With the digital media explosion, there is no shortage of statistics like these on the game....you name it, it is there. The fans are bombarded with detailed reviews before, during and after every match / series, interviews, minute by minute coverage, ball by ball reporting, digital scoring engines....the list of content generators for the cricket crazy fans in India is never ending. The fans are pampered.... without a doubt.
But have you ever noticed that the 'coverage' has always been about the game, and never about how the fans react to it?
The idea of this blog is to see what happens on the other side, through the fans' chitter chatter & murmurs on Twitter! Twitter is a fabulous platform in many different ways...the foremost for me: Twitter is open about sharing a small proportion of conversations for free, a boon for researchers / analysts to try, experiment and understand. To provide a perspective, this small proportion could mean 2 - 3 lac tweets a day if a topic is trending. What more can I ask for!!
The match between SA and NZ during the World Cup was the most revered one in recent times. I captured the tweets through this match. I have analyzed them, and the summary of the conversations is as follows:
The basics:
Number of tweets captured: Twitter provides 1% of randomly selected tweets on any specific topic. Twitter permits analyzing the data and sharing the findings, but not releasing the actual tweets (which I have neither the interest nor the intention to do). I have used a 'streaming' program that captures tweets as they get tweeted.
In all, I was able to gather around 2.7 lac tweets over 1.5 days before, during and after the match. This makes a dependable data set, big enough for us to read the findings without much scrutiny of the validity of the sample set.
Tweets Vs Retweets: Tweets, in my opinion, are more organic compared to re-tweets. Debates are welcome. However, many marketers like re-tweets a lot, as they widen their reach, appearing in the timelines of your followers' followers and accessing a Twitter segment which is otherwise not accessible. Keeping the relevance of re-tweets for a marketer aside, the below chart shows an interesting trend.
The timeline indicated above is re-calibrated to IST, and the match started at approximately 9:00 A.M. Before the match started, we could see tweets and re-tweets at more or less the same intensity levels. Once into the match, the fans unleash their individuality. The surge is on till the match got over around 4:00 P.M., when tweeting takes a back seat. Post match, it's all about reflecting their sentiments with that of others, by re-tweeting.
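For readers who want to try something similar, here is a minimal sketch of how such a tweets vs re-tweets timeline could be built. It is not the program I used for the chart. It assumes each captured tweet is available as a (UTC timestamp, text) pair and that a re-tweet can be recognised by its text starting with "RT @"; the function name hourly_counts and the sample rows are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# IST is UTC+5:30
IST = timezone(timedelta(hours=5, minutes=30))


def hourly_counts(tweets):
    """Count original tweets and re-tweets per IST hour.

    `tweets` is an iterable of (created_at, text) pairs, where
    created_at is a timezone-aware datetime (assumed to be UTC here).
    """
    counts = {"tweet": Counter(), "retweet": Counter()}
    for created_at, text in tweets:
        # Assumption: re-tweets carry the conventional "RT @" prefix.
        kind = "retweet" if text.startswith("RT @") else "tweet"
        # Convert to IST and truncate to the start of the hour.
        hour = created_at.astimezone(IST).replace(minute=0, second=0, microsecond=0)
        counts[kind][hour] += 1
    return counts


# Made-up sample rows, only to show the shape of the input.
sample = [
    (datetime(2015, 3, 24, 3, 40, tzinfo=timezone.utc), "What a shot!"),
    (datetime(2015, 3, 24, 3, 55, tzinfo=timezone.utc), "RT @someone: What a shot!"),
]
for kind, per_hour in hourly_counts(sample).items():
    for hour, n in sorted(per_hour.items()):
        print(kind, hour.strftime("%H:%M IST"), n)
```

Plotting the two counters against the hour gives the kind of two-line chart discussed above.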
Location: Where do the tweeples come from?
Of the top 10, 5 are from India. Please note, this includes only those tweeples who have indicated their location in their profile, or the location from where the tweet originated. I have sliced the same data by before, first half, second half and post match. It gave interesting insights. Though Indians topped the tweeting through the day, post match I could see a lot of tweets coming from London, New York, Saudi Arabia & Pakistan.
Who tweeted the most?
Some of them are spammers, like Trends247 or Ishq_Da_Warris....while others are genuine tweeters, like ajsmarty3 or Senthil557. 300 - 350 tweets in a day!!!! Smells like addiction to me!
How do they tweet? The Device:
A couple of years back, when my wife gifted me a Samsung S3, somewhere deep inside I was wondering if she should have gone for an iPhone! Now I don't have any such qualms. I have no intention of starting the Android vs iPhone war yet again; what I found rather interesting in the above chart was Mobile Web M2. That is accessing Twitter through the mobile browser. Still active. Don't know about their experience though!
Who are they talking about? Mentions:
ABdeVilliers - 7500 times in a day! Almost 20% more than the entire NZ team!! Some spammers here too!
What are they talking about: Hashtags
After analysing Twitter, I have become a big fan of #hashtags purely from a communication perspective. (Note, I am an introvert and don't communicate as much as I should or others wish.) I consider them power packed communication nuggets that summarize a wide range of emotions. I liked #won'tgiveitback and #backtheblackcaps a lot.
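The mention and hashtag counts in the last two sections boil down to pulling out every @name and #tag and tallying them. A minimal sketch, assuming the tweets are plain strings; the helper name top_tokens and the sample tweets are made up, and this is not the exact program behind the charts:

```python
import re
from collections import Counter

# Simple patterns: '@' or '#' followed by letters, digits or underscores.
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")


def top_tokens(texts, pattern, n=10):
    """Return the n most frequent matches of `pattern`, case-insensitively."""
    counts = Counter()
    for text in texts:
        counts.update(match.lower() for match in pattern.findall(text))
    return counts.most_common(n)


# Made-up tweets, only to show how the helper is called.
tweets = [
    "Go @ABdeVilliers #ProteaFire",
    "@abdevilliers what a knock #backtheblackcaps",
    "#proteafire all the way",
]
print(top_tokens(tweets, MENTION))
print(top_tokens(tweets, HASHTAG))
```

Lower-casing before counting keeps #ProteaFire and #proteafire in the same bucket. Spammy accounts still need to be weeded out by hand.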
URLs:
Be it Twitter, Facebook or any other social media platform, sharing pictures / video links has become mainstream. They get propagated through tiny URLs. I have tracked some of them. The following are the most shared links / pages:
ICC Website:
ESPNcricinfo : 2nd most shared site during SAvsNZ
A wacky twitter picture: 5th most forwarded url.
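Ranking the shared sites is again a counting exercise. A minimal sketch, assuming the tiny URLs have already been expanded to their full addresses; the helper name top_domains and the example addresses are placeholders, not the actual links from my data:

```python
from collections import Counter
from urllib.parse import urlparse


def top_domains(urls, n=5):
    """Return the n most frequently shared host names."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        # Treat www.example.com and example.com as the same site.
        if host.startswith("www."):
            host = host[4:]
        if host:
            counts[host] += 1
    return counts.most_common(n)


# Placeholder addresses, only to show how the helper is called.
shared = [
    "http://www.espncricinfo.com/some-page",
    "https://espncricinfo.com/another-page",
    "http://www.icc-cricket.com/some-page",
]
print(top_domains(shared))
```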
What's more:
These are some very basic analyses that one can do with Twitter. There are many advanced models too. Twitter being text heavy, I wanted to try my hand at analyzing the actual tweets with the aim of classifying them. I learnt a couple of models. I tried using k-NN classification (nearest neighbour) and the Naive Bayes classifier. Since I found it difficult to implement a k-NN classifier on 140 character long tweets, I decided to use the Naive Bayes classifier instead. It's one hell of a tool, I bet.
A couple of lines on how it works: Initially, one has to create a training set. After going through a few hundred initial tweets, I figured out I could clearly classify the tweets into four different groups, viz., those who are doubtful which team will win, those who support NZ, those supporting SA and those commenting generally, like "its raining in Auckland", "I will wake up at 3:00 in the morning"...so on and so forth.
The Naive Bayes classifier calculates, for every cluster in the training set, the probability of every word that appears in that cluster's tweets, remembers these probabilities, and applies them to any new tweet that gets introduced to the classifier. It then assigns the new tweet to the most probable cluster. I started off with a training set of 400 tweets, which I had classified into 4 groups. I then started iterating with newer tweets. My first iteration resulted in 17% successful classification. I reclassified the remaining tweets by hand and added them to the appropriate cluster. With every new iteration I could see the correctness of predictions increasing. After my 15th iteration and approximately 2800 tweets classified, the correctness levels reached 90% and above. I was astonished when the program classified the remaining 13000 odd tweets accurately. A snapshot of the same is given below as four word clouds: one for each cluster.
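To make the idea concrete, here is a bare-bones sketch of a Naive Bayes text classifier. It is an illustration of the technique, not the program I ran: the class name, the whitespace tokenizer, the add-one smoothing and the tiny training set are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of tweets
        self.vocab = set()                       # every word seen in training

    @staticmethod
    def tokenize(text):
        # Assumption: lower-casing and splitting on whitespace is enough.
        return text.lower().split()

    def train(self, text, label):
        words = self.tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Start from the log of the prior: the share of tweets with this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                # Add-one smoothing so an unseen word does not zero out the label.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training tweets, one or two per group.
clf = NaiveBayes()
clf.train("come on proteas we can do this", "SA")
clf.train("proteafire all the way", "SA")
clf.train("boult and mccullum on fire", "NZ")
clf.train("guptill is brilliant go blackcaps", "NZ")
clf.train("its raining in auckland", "General")
clf.train("not sure who will win this", "Doubtful")
print(clf.classify("come on proteas"))
```

The iterative step described above corresponds to calling train again with the tweets that were corrected by hand, and then classifying the next batch.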
Doubtful:
General:
What has Emraan Hashmi got to do with the World Cup??? Only god knows!!!! But those tweets are there!! :-(...I checked. Guess it's his birthday.
New Zealand supporters:
New Zealand supporters are all over Boult, McCullum, Guptill, Henry, etc. I felt their discussions / choices were centered more around individual players than the team. On the contrary, with South Africa, the team comes first. Having read some 2800 tweets myself, every time I read the SA supporters' tweets, I could very well empathize with them.
South Africa Supporters:
Proteas / Proteafire is all over! You have to remember I have removed mentions and hashtags such as #NZvsSA or #NZ etc.
I could have done more. But keeping the attention span and paucity of time in mind, I am reserving the rest for later. Some of these analyses are really interesting, such as: understanding the social network - who is talking to whom, what is being talked about / to commentators like Harsha Bhogle, perceptual mapping of teams and the associated key words (correspondence analysis - folks who have worked with me know what I am talking about ;-) ), categorizing conversations and clustering them into groups to see what major themes emerge.
Much more to do. With IPL beginning today, it's only getting more exciting. Free data, free tools to analyze, online support ecosystem...I couldn't ask for more. I am hooked.
Watch this space for more.
Thank you for coming this far. Would be glad if you could leave your comments.