Sunday, 12 April 2015

Three days into IPL

"EVERY MOMENT IS A MOMENT OF TRUTH IN SOCIAL MEDIA"

A student went to his meditation teacher and said, "My meditation is horrible!  I feel so distracted, or my legs ache, or I'm constantly falling asleep.  It's just horrible!"

"It will pass," the teacher said matter-of-factly.

A week later, the student came back to his teacher, "My meditation is wonderful!  I feel so aware, so peaceful, so alive! It's just wonderful!"

"It will pass," the teacher replied matter-of-factly.

                                                                                Zen story taken from www.theunboundedspirit.com

Three days have gone by since the start of IPL8... is IPL8 suffering a lukewarm response? I think so. The fans probably need time to come out of their World Cup fatigue. The Zen story I copied above is not meant to connect to the IPL fatigue; it has a different purpose.
     
Before getting into that, I want to rant a little about day 1 of IPL, the match between KKR and MI. The stakes in IPL, as an entertainment format of cricket, are very high for the team owners, the broadcasters and the advertisers. Hence, it is bound to be highly commercialized. In such high-decibel events, it is also common for advertisers to see opportunities for maximal branding.

I saw the first couple of overs of the KKR vs MI match on 8th April. I must admit I was flabbergasted by the volume of branding on the players' jerseys. The jersey had Gionee (a change from Nokia) on the front, in a font so big it hits you in the face, USPA on the right chest and KKR on the left, and Sansui on the back. Dish TV and Colors TV were on the right sleeve, Royal Stag & Pepsi IPL on the left sleeve... poor Morkel looked like a hoarding that walks, talks and bowls!! It was downright ugly. Most team jerseys are!!

#hashtags

Zen stories always carry contextual insight. A story that makes sense to someone in one context may not make sense in a different one. I believe #hashtags are much like Zen trickery in this respect. A couple of examples:

Have you heard about “#awesomeness”?

You might link it to many things when you read it. Tata Nano is using this hashtag. Think about it now. With Tata Nano in mind, #awesomeness sounds like a great brand fit. Hashtag popularity and usage change over time, though. #wontgiveitback does not carry much meaning now. Some are very short-lived, like the ones propagated by the news channels as their daily feed of sensation.


Since my focus is on using Twitter for cricket analytics, I thought it would help to introduce some Twitter concepts. I am sure many readers know these already; for others, this explanation might provide a broader overview.

Mechanics of Twitter:

@reply:

@AnushkaSharma So which team you are supporting  ???  KKRiders or  RCB ???  I am not stupid... This is not a stupid question...


An SRK fan is asking this question directly to Anushka Sharma. Here the fan has used the handle at the beginning of the tweet. This tweet will appear in Anushka Sharma's timeline and in the timelines of those followers of the fan who also follow Anushka Sharma. Twitter treats this as an @reply.

RT: Retweet:

RT @realpreityzinta: Waiting to see the #IPL opening ceremony but all I see is last years finals between #KXIP & KKR on TV ! Kinda strange.

Here the fan re-shares @realpreityzinta's message with his or her own followers.

@Mentions:

IPL 2015 Auction: Full list of players bought by Kolkata Knight Riders (KKR) for IPL 8 via @cricket_country http://t.co/3GhIX1bQi9

Mentions are when the user includes the @handle but doesn't begin the tweet with it. Mentions and replies are very important in building engagement. Mentions show up in the brand's stream, the user's stream and the stream of anyone following the user. The advantage of mentions: these tweets have the potential to reach Twitter users who may not be following you.

Favorites: 

The user stars a tweet from someone, indicating support for that person. This is not visible to the user's followers or to the person who sent out that tweet.
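The first three mechanics above (@reply, RT and @mention) can be told apart from the tweet text alone. The following is a minimal sketch, not Twitter's own logic: the function name `tweet_kind` and the rule of looking only at how the text begins are assumptions made for illustration. Favorites cannot be inferred from the text, so they are left out.

```python
import re

# A handle is '@' followed by letters, digits or underscores.
MENTION_RE = re.compile(r"@(\w+)")

def tweet_kind(text):
    """Label a tweet as 'retweet', 'reply', 'mention' or 'plain' from its text."""
    stripped = text.lstrip()
    if stripped.startswith("RT @"):
        return "retweet"   # old-style retweet: "RT @handle: ..."
    if stripped.startswith("@"):
        return "reply"     # handle at the very beginning of the tweet
    if MENTION_RE.search(stripped):
        return "mention"   # handle appears somewhere later in the tweet
    return "plain"

if __name__ == "__main__":
    samples = [
        "@AnushkaSharma So which team you are supporting ???",
        "RT @realpreityzinta: Waiting to see the #IPL opening ceremony",
        "Full list of players bought by KKR via @cricket_country",
        "We wanna see lungi dance once again",
    ]
    for s in samples:
        print(tweet_kind(s), "|", s)
```

The retweet check comes first because an "RT @handle" tweet also contains a handle and would otherwise be counted as a mention.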


What happened in the last 3 days:

The first three days are all about the media buzz on Twitter and how it helps a team build a brand. I think one of the best media-managed teams as of now is KKR, owned by Mr. Shah Rukh Khan. Mr. Khan is himself a celebrity, and his personality lends a lot of support to his team. I was browsing through the data and a few interesting snippets emerged. Again, this analysis is based on a sample dataset of around 3 lac tweets from Twitter, and my gratitude to Twitter remains.


Significance of SRK: How is Mr. Khan's stardom helping KKR?

173 user names carried SRK in them, and that is without counting "Shah Rukh"!! For example:

@iamsrk, 
@SRK, 
@SRKUniverseMsia, 
@NishuSRKian, etc. 

Some wacky ones which I liked:

@PrietySrkFan
@TheSRKdisciple,
@iamSRKsEYES, 
@iamsrkwife

I was expecting more interesting handles along the lines of the last two... but sadly, I couldn't find any.
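A count like the 173 above can be reproduced with a few lines. This is only a sketch: the helper name `handles_containing` is made up for illustration, and it assumes the handles have already been pulled out of the captured tweets into a list. Handles are compared in lower case, since Twitter treats them as case-insensitive.

```python
def handles_containing(handles, needle):
    """Return the distinct handles (lower-cased, without '@') that contain `needle`."""
    needle = needle.lower()
    # A set removes duplicates such as '@SRK' and '@srk'.
    distinct = {h.strip().lstrip("@").lower() for h in handles}
    return sorted(h for h in distinct if needle in h)

if __name__ == "__main__":
    sample = ["@iamsrk", "@SRK", "@SRKUniverseMsia", "@NishuSRKian",
              "@PrietySrkFan", "@IamVKohli", "@iamsrk"]
    matches = handles_containing(sample, "srk")
    print(len(matches), matches)
```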

#AskSRK

The fan following needs to be fuelled with appropriate engagement mechanisms. Remember #awesomeness? Mr. Khan uses #AskSRK. The questions sent to Mr. Khan were mostly simple:

How happy you are having Narine back in KKR  ??
If not KKR then which team is your fav
is there any possibility of watching you at todays inagural ceremony of #kkr at #Kolkata ???
Are you planning to attend all the matches played by #kkr

We wanna see lungi dance once again on dance floor of #IPL2015  opening ceremony.


Moving on, I want to present 4 charts. These charts indicate the popularity of teams, players and matches without getting into the positive or negative sentiments associated with the tweets. I am assuming that if a fan attaches a team hashtag, say #csk, to a tweet, he is associating himself with that team.
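The assumption above (a team hashtag in a tweet means the fan associates with that team) translates into a simple tally. The sketch below is illustrative only: the `TEAM_TAGS` mapping is a hypothetical, incomplete list of team hashtags, and counting each tweet at most once per team is a design choice, not necessarily how the charts were built.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")

# Hypothetical mapping from lower-cased hashtag to team label.
TEAM_TAGS = {"kkr": "KKR", "csk": "CSK", "mi": "MI", "rcb": "RCB", "kxip": "KXIP"}

def team_popularity(tweets, team_tags=TEAM_TAGS):
    """Count how many tweets carry each team's hashtag (once per tweet per team)."""
    counts = Counter()
    for text in tweets:
        tags = {t.lower() for t in HASHTAG_RE.findall(text)}
        teams = {team_tags[t] for t in tags if t in team_tags}
        counts.update(teams)
    return counts

if __name__ == "__main__":
    sample = [
        "Go #KKR! #kkr all the way",
        "#CSK whistle podu",
        "#KKR vs #MI tonight",
        "no tags here",
    ]
    print(team_popularity(sample).most_common())
```

Combined tags such as #KKRvsMI are not split into teams here; they would need their own entries in the mapping.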

Popular Teams - fans associating themselves with.



KKR clearly leads the pack, followed by CSK and MI. This is just a summation of tweets from the first three days. Hence, it is too early to draw firm conclusions.

Most engaging match



KKRvsMI - Too many factors in play... Mr. Khan's popularity, the popularity of the KKR and MI teams, and its being the first match of the tournament, to name a few.

Players - fans want to connect with most:



On Twitter, @IamVKohli is not as popular as @SirJadeja or @ImRo45 (Rohit Sharma). I also want to point out a couple of handles that feature heavily in discussions with fans and belong not to cricketers but to journalists.

Meet: @Brokencricket - Mentioned 2434 times in the first 3 days




Meet: @fwildecricket - next only to @SirJadeja in Twitter circles within cricket, a journalist by profession. Mentioned 2714 times during the first 3 days of IPL.




Teams - fans want to connect with most:


KKRiders is way above the rest. This is very similar to the associations with the teams' #hashtags.

What's brewing with KKR?  Celebrations!!!

Let's see that in detail in some other post.

Twitter provides enormous opportunities for data exploration and understanding. However, like any analysis, one needs to have a strong hypothesis to test. As I dig deeper, the layers of insight leave me with sleepless nights, wondering about the possibilities of using Twitter & social media data for good.

Recent history is laden with examples of how Twitter is being used for predictive analytics... I am trying to see if I can use my little Bayes algorithm, coupled with other indicators, to predict the team that will lift the cup. Seems far too risky for me... but I am sure I will learn a lot in the process.

See you soon.



Tuesday, 7 April 2015

Not everything is about cricket.........


"EVERY MOMENT IS A MOMENT OF TRUTH IN SOCIAL MEDIA"

Over the last 20 years, cricket has become a sport beyond what it used to be. India has played 880 ODIs so far since 13th July 1974, when we played our first ODI. Incidentally, we have played the most ODIs in the world. More than half of those matches have been played since the year 2000. With the digital media explosion, there is no shortage of statistics on the game like this... you name it, it is there. The fans are bombarded with detailed reviews before, during and after every match / series, interviews, minute-by-minute coverage, ball-by-ball reporting, digital scoring engines... the list of content generators for the cricket-crazy fans in India is never-ending. The fans are pampered... without a doubt.

But have you ever noticed that the 'coverage' has always been about the game and never about how the fans react to it?

The idea of this blog is to see what happens on the other side, through the fans' chitter-chatter and murmurs on Twitter! Twitter is a fabulous platform in many different ways... the foremost for me: Twitter openly shares a small proportion of conversations for free, a boon for researchers / analysts to try, experiment and understand. To provide a perspective, this small proportion could mean 2 - 3 lac tweets a day if a topic is trending. What more can I ask for!!

The match between SA and NZ during the World Cup was the most revered one in recent times. I captured the tweets throughout this match. I have analyzed them, and a summary of the conversations follows:

The basics:

Number of tweets captured: Twitter provides a randomly selected 1% of tweets on any specific topic. Twitter permits analyzing the data and sharing the findings, but not releasing the actual tweets (which I have neither interest in nor intention of doing). I used a 'streaming' program that captures tweets as they are tweeted.



In all, I was able to gather around 2.7 lac tweets over 1.5 days: before, during and after the match. This makes for a dependable data set, big enough for us to read the findings without worrying much about the validity of the sample.


Tweets vs Retweets: Tweets, in my opinion, are more organic than re-tweets. Debates are welcome. However, many marketers like re-tweets a lot, as they widen reach by appearing in the timelines of your followers' followers, giving access to a Twitter segment that is otherwise inaccessible. Keeping the relevance of re-tweets for a marketer aside, the chart below shows an interesting trend.



The timeline indicated above is re-calibrated to IST, and the match started at approximately 9:00 A.M. Before the match started, tweets and re-tweets ran at more or less the same intensity. Once the match is under way, the fans unleash their individuality. The surge in original tweets continues until the match ends at around 4:00 P.M., after which tweeting takes a back seat. Post match, it's all about echoing the sentiments of others by re-tweeting.
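The re-calibration to IST and the tweet / re-tweet split can be sketched as below. This is a rough illustration, not the program behind the chart: it assumes each captured tweet is available as a (timezone-aware UTC timestamp, text) pair, treats any text starting with "RT @" as a re-tweet, and uses made-up timestamps in the example.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Indian Standard Time is UTC+5:30.
IST = timezone(timedelta(hours=5, minutes=30))

def hourly_counts(tweets):
    """Bucket (utc_datetime, text) pairs by IST hour, split into tweets and retweets.

    Returns a dict: {ist_hour: Counter({'tweet': n, 'retweet': m})}.
    """
    buckets = {}
    for created_at, text in tweets:
        hour = created_at.astimezone(IST).hour
        kind = "retweet" if text.lstrip().startswith("RT @") else "tweet"
        buckets.setdefault(hour, Counter())[kind] += 1
    return buckets

if __name__ == "__main__":
    # Illustrative timestamps only, not taken from the captured data.
    sample = [
        (datetime(2015, 3, 24, 3, 45, tzinfo=timezone.utc), "What a start! #NZvsSA"),
        (datetime(2015, 3, 24, 3, 50, tzinfo=timezone.utc), "Come on Proteas"),
        (datetime(2015, 3, 24, 10, 40, tzinfo=timezone.utc), "RT @someone: What a match"),
    ]
    for hour, counts in sorted(hourly_counts(sample).items()):
        print(f"{hour:02d}:00 IST", dict(counts))
```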

Location:  Where do the tweeples come from?


Of the top 10, 5 are from India. Please note, this includes only those tweeples who have indicated their location in their profile, or tweets that carry the location they originated from. I also sliced the same data into before the match, first half, second half and post match. It gave interesting insights. Though Indians topped the tweeting through the day, post match I could see a lot of tweets coming from London, New York, Saudi Arabia & Pakistan.

Who tweeted the most?


Some of them are spammers, like Trends247 or Ishq_Da_Warris... while others are genuine tweeters, like ajsmarty3 or Senthil557. 300 - 350 tweets in a day!!!! Smells like addiction to me!


How do they tweet? The Device:


A couple of years back, when my wife gifted me a Samsung S3, somewhere deep inside I wondered whether she should have gone for an iPhone! Now I don't have any such qualms. I have no intention of starting the Android vs iPhone war yet again; what I found rather interesting in the above chart was Mobile Web M2. That's people accessing Twitter through the mobile browser. Still active. Don't know about their experience though!

Who are they talking about? Mentions:


ABdeVilliers - 7500 times in a day!  Almost 20% more than the entire NZ team!!  Some spammers here too!

What are they talking about: Hashtags

After analysing Twitter, I have become a big fan of #hashtags purely from a communication perspective. (Note, I am an introvert and don't communicate as much as I should or others wish.) I consider them power-packed communication nuggets that summarize a wide range of emotions. I liked #wontgiveitback and #backtheblackcaps a lot.



URLs:

Be it Twitter, Facebook or any other social media platform, sharing links to pictures / videos has become mainstream. They get propagated through tiny URLs. I have tracked some of them. The following are the most shared links / pages:

ICC Website:



ESPNcricinfo : 2nd most shared site during SAvsNZ


A wacky twitter picture:  5th most forwarded url.


What's more:

These are some very basic analyses that one can do with Twitter. There are many advanced models too. Twitter being text-heavy, I wanted to try my hand at analyzing the actual tweets with the aim of classifying them. I learnt a couple of models: k-NN (nearest neighbour) classification and the Naive Bayes classifier. Since I found it difficult to implement a k-NN classifier on 140-character tweets, I decided to use the Naive Bayes classifier instead. It's one hell of a tool, I bet.

A couple of lines on how it works: Initially, one has to create a training set. After going through a few hundred initial tweets, I figured out that I could clearly classify the tweets into four groups, viz., those doubtful about which team will win, those supporting NZ, those supporting SA, and those commenting generally, like "its raining in Auckland" or "I will wake up at 3:00 in the morning", and so on.

The Naive Bayes classifier calculates the probability of every word from every tweet in the training set, remembers it within the trained cluster, and applies those probabilities to each new tweet introduced to the classifier. It then assigns the new tweet to the most probable cluster. I started off with a training set of 400 tweets, which I had classified into 4 groups, and then iterated with newer tweets. My first iteration resulted in 17% successful classification. I corrected the remaining tweets by hand and added them to the appropriate clusters. With every new iteration I could see the correctness of predictions increasing. After my 15th iteration, with approximately 2800 tweets classified, the correctness levels reached 90% and above. I was astonished when the program classified the remaining 13000-odd tweets accurately. A snapshot of the result is given below as four word clouds, one for each cluster.
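To make the description above concrete, here is a minimal sketch of a word-count Naive Bayes classifier in plain Python. It is not the program used for this analysis: the class name, the simple tokenizer, the add-one (Laplace) smoothing and the choice to ignore words never seen in training are all assumptions made for illustration, and the training tweets are invented.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"[a-z']+")

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())

class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training tweets
        self.vocab = set()                       # every word seen in training

    def train(self, text, label):
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        # Words never seen in training carry no information here, so skip them.
        words = [w for w in tokenize(text) if w in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus log likelihood of each word, with add-one smoothing.
            score = math.log(n_docs / total_docs)
            for w in words:
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

if __name__ == "__main__":
    nb = NaiveBayes()
    # Invented training tweets, one or two per group.
    nb.train("Come on Proteas, bring it home", "SA")
    nb.train("Proteafire all the way", "SA")
    nb.train("McCullum is on fire for the Black Caps", "NZ")
    nb.train("Boult and Guptill will win it for NZ", "NZ")
    nb.train("It is raining in Auckland", "general")
    nb.train("No idea who wins this one", "doubtful")
    print(nb.classify("Proteas will bring the cup home"))
    print(nb.classify("raining again in Auckland"))
```

The iterative routine described above would wrap this in a loop: classify a fresh batch, correct the wrong labels by hand, call `train` on the corrected tweets, and repeat until the hit rate is acceptable.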

Doubtful:

General:


What has Emraan Hashmi got to do with the World Cup??? Only god knows!!!! But those tweets are there!! :-(... I checked. Guess it was his birthday.

New Zealand supporters:

New Zealand supporters are all over Boult, McCullum, Guptill, Henry, etc. I felt their discussions / choices were centered more around individual players than the team. On the contrary, with South Africa, the team comes first. Having read some 2800 tweets myself, every time I read the SA supporters' tweets, I could very well empathize with them.

South Africa Supporters:

Proteas / Proteafire is all over! You have to remember I have removed mentions and hashtags such as #NZvsSA or #NZ etc.
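The clean-up mentioned above (dropping mentions and hashtags before building the word clouds) might look like the sketch below. It is only an illustration: the function name is made up, and stripping URLs as well is an extra assumption on top of what is described. The resulting word frequencies are the kind of input a word cloud tool would take.

```python
import re
from collections import Counter

# Matches @mentions, #hashtags and (as an added assumption) http/https links.
NOISE_RE = re.compile(r"(?:@|#)\w+|https?://\S+")
WORD_RE = re.compile(r"[a-z']+")

def word_frequencies(tweets):
    """Count words across tweets after removing mentions, hashtags and links."""
    counts = Counter()
    for text in tweets:
        cleaned = NOISE_RE.sub(" ", text.lower())
        counts.update(WORD_RE.findall(cleaned))
    return counts

if __name__ == "__main__":
    sample = [
        "Proteas all the way #NZvsSA @OfficialCSA",
        "Proteas http://t.co/abc",
    ]
    print(word_frequencies(sample).most_common(3))
```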

I could have done more.  But keeping the attention span and paucity of time in mind, I am reserving them for later.  Some of them are really interesting such as:  understanding the social network - who is talking to whom, what is being talked about / to commentators like Harsha Bhogle, perceptual mapping of teams and the associated key words (correspondence analysis - folks who have worked with me know what I am talking about ;-) ), categorizing conversations and clustering them into groups to see what major themes emerge.

Much more to do. With IPL beginning today, it's only getting more exciting. Free data, free tools to analyze, online support eco-system... I couldn't ask for more. I am hooked.

Watch this space for more.

Thank you for coming this far.  Would be glad if you could leave your comments.