Hey Klout, Adding More Decimal Places Does Not Make Your Score More Accurate

Posted: October 26th, 2011 | Author: | Filed under: Klout, PeerIndex, Statistics | No Comments »

Klout has been hyping up their score changes for a week now. The CEO Joe Fernandez has claimed that this makes the score more accurate, more transparent, and may cure some forms of cancer (well maybe not the last claim). Let’s just say I haven’t been this disappointed since the 2000 election. Let’s start with their first claim: accuracy. See figure 1, my new score:

It’s exactly the same graphic as before, but with two decimal places. While my 8th grade Chemistry teacher may be glad that they are using more significant digits, I honestly don’t care. They were there before, just not displayed. Lame.

In their blog post, they claim: “This project represents the biggest step forward in accuracy, transparency and our technology in Klout’s history.” They support this vague claim with the histogram below, showing the differences in Klout scores, before and after the change:

This histogram leaves tons of open questions. Is this different from your normal daily shift in scores? The histogram reminds me of a t-distribution with a fatter positive tail. If more people are signing up for Klout than are leaving, that’s probably what it should look like anyway as users hook up more networks and gradually become more active online. The graphic doesn’t show that your score is any better, just that it changed. That’s not impressive at all.
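One way to make the "is this different from a normal day?" question concrete is to summarize the score changes on an ordinary day and after the update, then compare the two. The sketch below is only an illustration: the numbers are made up (Klout has published neither dataset), and `summarize`, `ordinary_day`, and `after_update` are hypothetical names.

```python
from statistics import mean, pstdev


def summarize(deltas):
    """Return the mean, standard deviation, and skewness of score changes."""
    m = mean(deltas)
    s = pstdev(deltas)
    # Population skewness: positive values mean a fatter right tail.
    skew = sum((d - m) ** 3 for d in deltas) / (len(deltas) * s ** 3)
    return {"mean": m, "stdev": s, "skew": skew}


# Hypothetical score changes, for illustration only.
ordinary_day = [-2, -1, -1, 0, 0, 0, 0, 1, 1, 2]
after_update = [-3, -1, 0, 0, 1, 1, 2, 4, 7, 9]

baseline = summarize(ordinary_day)
update = summarize(after_update)

print("ordinary day:", baseline)
print("after update:", update)
```

Only if the post-update summary looks clearly different from the baseline does the histogram say anything about the change itself, and even then a difference is not evidence of better accuracy.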

My beef with Klout remains simply that the service provides us with no real validation or explanation of our scores. They don’t show us how many times we have been RT’ed, mentioned, etc. On Google, you can look up your PageRank; on app stores, you can see your average rating and number of ratings. On Klout, you are told that your true reach has increased, but not what that implies or how you can verify it.

Klout is still the social influence measurement leader, but with PeerIndex rapidly improving (and better in many ways in my opinion), and new competitors such as Proskore and Kred popping up, Klout should be worried. I’ll have a review of both Proskore and Kred up shortly as well so you can easily compare them for yourself.


Klout improves score by making it less transparent and even harder to explain

Posted: August 29th, 2011 | Author: | Filed under: Klout, PeerIndex, Statistics | No Comments »

After taking a few weeks off from reaming Klout, their newest “improvements” have left me with no choice but to write a sardonic and snarky response. Klout has added 5 new services (Instagram, Flickr, tumblr, Last.fm, and Blogger) and removed ANY secondary statistics from our profile pages. I’m still not sure which is worse, just that both are stupid. I’ll start by criticizing the addition of new services with a simulated conversation between Klout and myself.

Part 1: A conversation with Klout about their new signals

Alex: This brings the total services to 10. Really Klout, you need 10 services?
Klout: Of course this will help make your Klout score even better!
Alex: But you didn’t do a good job with just Twitter and Facebook, how can I expect you to do a good job with 10?
Klout: More data always improves the performance of complicated, black box, machine learning algorithms like our own.
Alex: That’s actually false.
Klout: Ummmm, look dude, I’m just a data whore and want to sell your data to the man.
Alex: So you just want all of my data to sell it to the man and give me nothing in return?
Klout: We actually have a terrific Klout perks program. I see you’ve received two Klout perks.
Alex: Yup, you sent me a Rizzoli and Isles gift pack, a TV show on a network I don’t have and literally hadn’t heard of before receiving the gift pack. Did I mention that the gift pack came with handcuff earrings?
Klout: But what about your other Klout perk, a party at Rolo, a store in SF that sells jeans. Careful analysis of your Twitter, Facebook, Foursquare and LinkedIn data led us to believe that you like or wear jeans.
Alex: Everyone wears jeans. That’s similar to predicting that I like to go on vacation or eat tasty food. These jeans happened to be $175, which doesn’t sound like much of a perk to me.

On top of this, Android users actually can’t even connect their Klout accounts to Instagram because the app is iPhone only. Ironically, the Klout blog just posted about the average Klout of iPhone and Android users, finding the former beat out the latter 42.0 to 40.6. Perhaps the comparison would be more equal if Android users were allowed to connect 10 services rather than 9? Does MG Siegler actually need more Klout?

Part 2: Klout removes any accountability from website

Finally, let’s discuss the complete lack of transparency imposed by their recent banishment of the majority of profile stats. Here is a screen shot of my “Network Influence” before:

and after:

You will notice that the supporting stats are gone. Though this absence makes it much harder for me to criticize the inconsistencies in their score, it also takes away most of the utility I received from Klout. Unless they have their own Twitter analytics, most people don’t have access to this info; that’s one of the reasons Klout was cool. It indulges my latent nerd narcissism. How many RTs do I get? How many @ mentions? How many unique ones? Now I just get a number with little explanation. Luckily, Klout competitor PeerIndex still has much of that info:

From Klout’s point of view, I completely understand why they would want to add more services: greater reach, more data, more partners, etc. I suppose they could justify the removal of more specific stats by saying that things could get too crowded on the main page, but then put the data on another page, don’t take it away. Twitter and Facebook still drive the large majority of usage. Do you really think Blogger cares if their stats aren’t on the main page? Seems nefarious to me.


50 signals used to compute your Klout score

Posted: August 3rd, 2011 | Author: | Filed under: Klout | 3 Comments »

In my ongoing quest to deconstruct Klout, I’ve decided to begin to tackle the question “How is my Klout computed?” by looking at what signals make up an individual’s score. Klout CEO Joe Fernandez stated that his company’s score is computed using at least 50 signals. This post is my best guess at those 50 variables. Below I have a breakdown of putative signals by source (Twitter, Facebook, LinkedIn, etc.).

Twitter

  1. followers
  2. following – followers
  3. total RT
  4. weighted total RT
  5. unique RT
  6. weighted unique RT
  7. RT/tweet
  8. @mentions
  9. weighted @mentions
  10. unique @mentioners
  11. weighted unique @mentioners
  12. @mentions/tweet
  13. weighted @mentions/tweet

Facebook

  1. friends
  2. total likes
  3. likes/post
  4. total comments
  5. comments/post

LinkedIn

  1. recommenders
  2. likes
  3. comments
  4. connections

The astute among you will notice that only 22 signals are listed above. However, this fails to account for time, one critical aspect of Klout. Below I have a plot of the Network Influence subscore of my Klout score from a few days ago.

You will notice a big drop towards the beginning of the plot. This occurred exactly one month after my initial Klout post that was tweeted by Robert Scoble. That was the point at which my Klout began to increase significantly (it has since decreased significantly). If we include each of the signals above over all time and over the past month, that yields 44 signals. In addition, the phrase “In the past 90 days” appears in the new Klout UI (pictured above), so I don’t think it’s a huge leap to infer that each signal is also used over a 90 day period, yielding 66 signals. Finally, Klout now allows you to connect your Foursquare and YouTube accounts, so I assume they are tracking friends, checkins, comments, mayorships, YouTube thumbs up, YouTube comments, etc., yielding an even larger signal total.
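The counting argument above can be sketched as a simple enumeration. The signal names are just labels for the guesses listed earlier in this post, not anything Klout has confirmed, and the three time windows are the assumption made in the paragraph above.

```python
# Guessed base signals per network (labels are illustrative, not Klout's).
BASE_SIGNALS = {
    "twitter": [
        "followers", "following_minus_followers", "total_rt",
        "weighted_total_rt", "unique_rt", "weighted_unique_rt",
        "rt_per_tweet", "mentions", "weighted_mentions",
        "unique_mentioners", "weighted_unique_mentioners",
        "mentions_per_tweet", "weighted_mentions_per_tweet",
    ],
    "facebook": [
        "friends", "total_likes", "likes_per_post",
        "total_comments", "comments_per_post",
    ],
    "linkedin": ["recommenders", "likes", "comments", "connections"],
}

# Assumed time windows: all time, past month, past 90 days.
WINDOWS = ["all_time", "past_month", "past_90_days"]

base_count = sum(len(names) for names in BASE_SIGNALS.values())

# One signal per (network, base signal, time window) combination.
all_signals = [
    f"{network}.{name}.{window}"
    for network, names in BASE_SIGNALS.items()
    for name in names
    for window in WINDOWS
]

print(base_count)        # base signals
print(len(all_signals))  # base signals x time windows
```

Adding Foursquare and YouTube entries to `BASE_SIGNALS` would push the total higher still.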

I don’t actually think that all of these “raw” signals are being used directly to calculate scores; that would be naive. I’m sure scores/totals are transformed (perhaps log), then normalized on a scale from 0 to 100, similar to the Klout score. I also think it’s likely that several of the individual signals listed above are multiplied or otherwise combined to create composite signals. As an example, it’s impressive if you are retweeted often OR if you receive many @ mentions; however, it’s super impressive/kloutastic if you are retweeted often AND receive many @ mentions. A composite signal may capture that interplay.
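Here is a minimal sketch of what such a transform and composite could look like. Everything in it is an assumption for illustration: the log transform, the `MAX_COUNT` ceiling, and the choice of a geometric mean as the "multiplied" composite are guesses, not Klout's actual method.

```python
import math


def log_scale(value, max_value):
    """Map a raw count onto 0-100 with a log transform (an assumed choice)."""
    if max_value <= 0:
        return 0.0
    clipped = min(max(value, 0), max_value)
    return 100.0 * math.log1p(clipped) / math.log1p(max_value)


def composite(a, b):
    """Geometric mean of two 0-100 signals: high only when both are high."""
    return math.sqrt(a * b)


MAX_COUNT = 10_000  # hypothetical ceiling used for normalization

# Many RTs but no @ mentions: the composite collapses to zero.
rt_only = composite(log_scale(5000, MAX_COUNT), log_scale(0, MAX_COUNT))

# Moderate RTs AND moderate @ mentions: the composite stays high.
both = composite(log_scale(500, MAX_COUNT), log_scale(500, MAX_COUNT))

print(rt_only, both)
```

A composite like this would sit alongside the individual RT and @ mention signals rather than replace them, so the "OR" case still earns credit through the raw signals.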

In closing, I think someone with some time and access to the Klout API could use these signals to reconstruct the Klout algorithm. If you’d like to try, shoot me an email and I’d be happy to help in my spare time.


Klout perks: Nudie jeans party at Rolo SF

Posted: June 24th, 2011 | Author: | Filed under: Klout | No Comments »

Thursday June 23, I attended my first Klout perk party. Apparently I’ve built up enough Klout by bashing Klout to deserve an invite. The event was showcasing Nudie Jeans at the store Rolo in the SOMA district of San Francisco.


They had free food:

and a DJ:

If you’ve ever wondered what I’d look like as a hipster, here’s a shot of me in some very tight hipster jeans.

For those curious, this pair is the Slim Jim Org. Dry Dark, which can be yours for $179. Overall it was a fun event and one that may signify a move by Klout into the local space. This was the first event of its kind.


An interview with Klout

Posted: June 20th, 2011 | Author: | Filed under: Klout | 2 Comments »

After my initial Klout blog posts, I followed up with their Marketing Manager Megan Berry @meganberry and Director of Ranking Ash Rust @ashrust in a series of emails. I sent them a barrage of questions; below are their answers.

Q: How will Klout deal with identical individuals across multiple networks?
K: We look at each platform holistically to try and determine what the signals of influence are. We then perform sophisticated analysis to weight the different platforms appropriately for each person.
A: No information was conveyed in this answer. In academia and finance the word “sophisticated” is a completely loaded term roughly translating to “it’s actually trivially easy, but I think you are too stupid to understand.”

Q: My Foursquare friends are a strict subset of my Facebook and Twitter friends. Will they double count?
K: Follower and friend count are really not part of what we do — it is about ability to drive action.
A: I thought this was a decent answer. I have a few friends that are very active on foursquare, but relatively inactive on Twitter. If my activity drives actions on both Twitter and foursquare, I should get more credit.

Q: CEO Joe Fernandez stated: “Klout Score is not about followers or your activity level but about how people react to your content.” This is a bit vague. Now that we have more than 140 characters, can you elaborate a bit?
K: Yes, we don’t believe that followers or friends are a good measure of influence. Instead we’re looking at the engagement you get from people (i.e. RTs, @msgs, likes, etc.) and how influential those people are.
A: I agree that followers/friends should be secondary to actions indicating that individuals are actively engaged with your content (i.e. RTs, @msgs, likes, etc.). I also agree that a Robert Scoble or Michael Arrington RT or @ reply should be worth far more than my mom RTing something I say.
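To illustrate the weighting idea, here is a rough sketch in which each RT counts in proportion to the retweeter's own influence score. The function name, the handles, and the scores are all hypothetical, and this is a guess at the general shape of such a signal, not Klout's formula.

```python
def weighted_engagement(actors, influence, default=1.0):
    """Sum engagement events, weighting each by the actor's 0-100 influence."""
    return sum(influence.get(actor, default) / 100.0 for actor in actors)


# Hypothetical influence scores, for illustration only.
influence = {"scobleizer": 85, "mom": 10}

one_scoble_rt = weighted_engagement(["scobleizer"], influence)
five_mom_rts = weighted_engagement(["mom"] * 5, influence)

print(one_scoble_rt, five_mom_rts)
```

Under these made-up numbers a single RT from a highly influential account outweighs five RTs from a low-influence one.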

Q: Why do you feel you are better than competitors such as Peerindex and twittergrader?
K: We are the emerging standard in this industry — we are used by over 2000 applications and major brands to understand and measure online influence.
A: Worst answer ever, even worse than your parents saying “because I said so.” I more or less agree with their assessment that they are the emerging industry standard, however, would they still be if they didn’t get a 1 year jump start over all other competitors? What if PeerIndex’s infrastructure were more scalable and could handle the same scale as Klout? I expected some statement assessing their relative quality in terms of ranking or infrastructure, not a catch 22 or tautological response.

Q: I think the new K+ system is awesome, but am worried about spam. What steps are you taking to ameliorate the risk?
K: We’re watching this very carefully to understand how people are using it. We also limit people to 5 +K’s a day.
A: Here’s my favorite example so far:

This wasn’t spam; the label was generated by Klout for Daniel Bogan @waferbaby. I strongly believe that Daniel is authoritative on unicorns so I even voted for his Klout in this area.

His current Klout topics (which still include unicorns) are available here. It also seems like Klout slightly changed their shade of orange from the pics above.

On a more serious note, I think the K+ system is an incredibly important step in the evolution of Klout. The next step is to provide topic specific scores. Using these they can start to tackle the holy grail of influence measures: individualized influence scores. A serious problem with existing Klout scores is that they remove individual context from the equation. Justin Bieber will forever have a Klout of 0 for me, even though his systemwide Klout is 100. It would be easy to check that Justin Bieber has no Klout for the topics I am interested in (apps, startups, statistics, mongodb, etc.). This approach is much more computationally expensive and harder to get right. Tweets, Facebook posts, etc. are not labeled with topics, so these must be inferred. This is a VERY difficult problem due to the small amount of text. I’m sure Klout and PeerIndex are both working very hard to tackle the problem. Whoever gets that right will take the “influence” market.
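As a rough sketch of what an individualized score could look like, assume topic specific scores already exist (the hard part, as noted above). The function below averages someone's topic scores over the topics I care about; the function name, the topic labels, and all the numbers are hypothetical.

```python
def personal_influence(my_interests, their_topic_scores):
    """Average someone's topic scores, weighted by my interest in each topic."""
    total_weight = sum(my_interests.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(
        weight * their_topic_scores.get(topic, 0.0)
        for topic, weight in my_interests.items()
    )
    return weighted / total_weight


# Equal interest in four topics (hypothetical weights).
my_interests = {"apps": 1, "startups": 1, "statistics": 1, "mongodb": 1}

# Hypothetical topic scores on a 0-100 scale.
pop_star = {"pop music": 100}
startup_blogger = {"startups": 80, "apps": 60}

print(personal_influence(my_interests, pop_star))
print(personal_influence(my_interests, startup_blogger))
```

With no overlap between the pop star's topics and mine, the personalized score is zero regardless of the systemwide score.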

Here are several questions they refused to answer:

  • What % of Klout systemwide is attributed to Twitter v Facebook?
  • Will adding another network ALWAYS increase your score? If not always, empirically, what % of the time does an increase occur?
  • In a recent Kloutchat, the statement was made: “Nearly 50 variables in generating klout score but it all boils down to how people react to your content” What’s the most interesting/surprising variable that you are willing to divulge?

It’s not surprising that Megan and Ash did not answer all of my questions. I think the answer to the first question is roughly 85/15, but they won’t say that publicly because a) it might piss off Facebook if they think they are “underweighted” and b) they don’t want people outside the company arguing about how this should be weighted. I’m sure they have had tons of discussions about this internally. For the second question, the answer is yes, until Klout tells us otherwise. I won’t go on a rant about the silliness of this; however, if you want to boost your score, attach your FB account. I’ll do a longer “Klout SEO” post in a few weeks.

I sent PeerIndex CEO Azeem Azhar the same group of questions, and will post his responses when I hear back. Next post, I’ll answer the question: “If I were creating my own Klout/PeerIndex/Twittergrader competitor, what signals would I use?” I bet I can come up with a set of 50 variables very close to those used by Klout.


Comparing Klout competitors and alternatives: PeerIndex and Twittergrader

Posted: June 8th, 2011 | Author: | Filed under: Klout, PeerIndex | 2 Comments »

After spending two blog posts on the shortcomings of Klout, it only seems fair that I look into the quality of its competitors PeerIndex and Twitter Grader.

Like Klout, both consume your Twitter stream and provide a summary score of your “influence” between 0 and 100. PeerIndex also allows you to connect your Facebook, LinkedIn and Quora accounts, while Twitter Grader is limited to Twitter. In my previous post I claimed that any good measure of influence should have the following properties:

  1. Ordering should make sense in the real world – the score should roughly represent the degree to which one is influential or has clout
  2. The score should not be easy to game – people should not be able to hack their score in a few days by getting bots to RT, squatting on hashtags, or simply connecting a Facebook account
  3. The score should be monotonic – if another member has higher stats than me in ALL categories, then he/she should have a higher score

The primary competitor is PeerIndex, so we will start by seeing if their score satisfies the above conditions more than Klout. Here is my PeerIndex summary:

Though overall PeerIndex satisfies the three rules better, the first thing I noticed is a bug: my PeerIndex is reported as 50 at the top of this screenshot and 44 at the bottom. PeerIndex focuses on three areas: Activity, Authority, and Audience (as compared to Network Influence, Amplification Probability, and True Reach on Klout). These individual numbers are also different at the top and bottom of the screenshot. Though I don’t have a great idea of how these scores are calculated, Activity and Audience (followers) sound much more easily gamed to me. Authority is in theory a great measure, but in practice it falls short. I encountered several other bugs. Mouseover elements in the graph don’t always disappear, and some accounts just aren’t able to be added to the comparison. @Williamsonwines (my favorite winery) and @uberjon, a product marketing executive at Facebook and X-Googler, are two examples. Several other issues and bugs are highlighted in the comparison screenshot below:

PeerIndex claims that @feltron should have an audience score of 0; however, the well known data analyst and designer had 11,219 followers and was listed 907 times at the time of this post. @coachella, the epic music festival in Indio, CA, has an activity score of 0, though the account has tweeted 436 times, including several times immediately preceding this blog post. Finally, @MummNapaWinery has 0’s across the board even though the account has 312 tweets, 3,122 followers, and is one of my favorite wineries in Napa. I also find it ironic and mildly awkward that PeerIndex gives itself a low authority score and Klout a slightly higher one.

For due diligence purposes, I performed the same group comparisons as in my original Klout post, but I didn’t find the same monotonicity issues. As a result, they are harder to poke fun at, so I placed them at the bottom of this post. You will notice a few individuals are left out relative to the original Klout comparisons. The UI requires 6 or fewer people, so I took a few out randomly. If you stare at the authority scores in these examples long enough, I think you’ll agree that they are kind of wonky (check out Vic Gundotra and Carla Borsoi, VPs at Google and AOL, respectively). Put more succinctly:

PeerIndex Advantages:

  • No or fewer monotonicity issues
  • Facebook, LinkedIn, and Quora already integrated (Klout is just adding LinkedIn)
  • Twitter Elite Lists are VERY interesting: Top Users, Top Women, Top Brands, Top Cities

PeerIndex Disadvantages:

  • Can’t see or track your score over time
  • Fewer details in summary
  • VERY slow to update (~7 days initially and updates only every couple of days)
  • Authority score needs more explanation

In summary, PeerIndex is a legitimate competitor, but they need to fix the user-facing bugs I highlighted and really speed up their scoring cycle. If a user shows up to the site and can’t immediately access their score, they will have HUGE retention issues. One of the most underappreciated features of Klout, regardless of your feelings concerning the legitimacy of its score, is its infrastructure. The fact that they can ingest the Twitter and Facebook streams, process the data, and update every day is an incredible engineering feat, especially for a startup of its size. Perhaps there is an unseen tradeoff between speed and quality of score within Klout and PeerIndex.

Next we consider Twitter Grader.


Their approach is totally opaque. The summary simply lists stats that I can get from my own Twitter account, a number from 0 to 100, and a relative rank. I have no idea where this rank comes from, especially because there are way more than 9 million people on Twitter. Perhaps this is the number of accounts ever scored on Twitter Grader? I certainly hope not, because that would be incredibly biased. They do provide an article in their help center: How does Twitter Grader Calculate Twitter Rankings. Honestly their score may be great, but I just don’t have enough information and their site is lacking many of the features found in PeerIndex and Klout.

Twitter Grader Advantages:

  • Pulls your data and calculates score instantly
  • Associated with Hubspot (which has a great reputation)
  • It only uses Twitter (see also disadvantages)

Twitter Grader Disadvantages:

  • It only uses Twitter
  • Can’t see or track your score over time
  • Lacks features

In my next post on the subject, I’ll have more followup from the ranking folks at Klout and the CEO of PeerIndex.

As previously mentioned, the PeerIndex score comparisons mirroring those from the original Klout post can be found below:

Comparison 1

Alex Braunstein (me), @alexbraunstein – Statistician, Research Scientist at Chomp, X-Googler
Binh Tran, @binhtran – the co-founder and CTO of Klout
Chomp, @chomp – app search engine
Vic Gundotra, @vicgundotra – SVP and head of social at Google
Carla Borsoi, @u_m – VP of Consumer Insights at AOL

Comparison 2

Paul Graham, @paulg – the fearless leader of Y Combinator.
Y Combinator, @ycombinator – startup incubator
500 Startups, @500startups – startup incubator
Adria Richards, @adriarichards – a tech consultant and popular blogger (also my roommate)

Comparison 3

Tim Ferriss, @tferriss – author of the 4 Hour Workweek and 4 Hour Body
Matt Cutts, @mattcutts – head of web spam team at Google
MG Siegler, @parislemon – my favorite writer for Techcrunch
Klout, @klout – the service I’m trashing in this post
Jeffrey Zeldman, @zeldman – designer, writer, and publisher

Comparison 4

Robert Scoble, @scobleizer – blogger, tech evangelist, and author
Perez Hilton, @perezhilton – master of celebrity gossip
Charlie Sheen, @charliesheen – #winning
Guy Kawasaki, @guykawasaki – entrepreneur and former Chief Evangelist at Apple
Justin Bieber, @justinbieber – never saying never


Klout reacts

Posted: June 2nd, 2011 | Author: | Filed under: Klout | 5 Comments »

Seems like I struck a nerve with my earlier post about Klout. Binh Tran, the CTO of Klout, Megan Berry, marketing manager at Klout, and Klout itself are all now following me on Twitter. In addition, I received a lengthy response from Ash Rust, the Director of Ranking at Klout, which I have included in full at the end of the post.

Ash’s three main points were:

  • Klout is just beginning and has flaws
  • Your Klout Score is about quality not quantity
  • Adding additional networks should increase your Klout

I appreciate Ash’s candid response. On the first point, he’s right. It’s unrealistic for me or anyone to expect perfection of Klout or any of the competing metrics/companies, especially at this relatively early stage of Klout. Think about Google 2 years after it launched and how far it’s come since then. Still, companies need to know what users find wrong with their products to iterate and improve. I didn’t receive hundreds of RTs because my writing was so exceptional or witty; I received them because I articulated a set of issues seen by others in Klout.

Ash’s second and third points seem to contradict each other. Categorically stating that adding another network will always increase your score seems to be a victory for quantity, not quality. In the Klout Chat yesterday, CEO Joe Fernandez announced that Foursquare and LinkedIn will soon be added. Without the proper Twitter/Facebook balance in the current system, I worry that adding two additional networks to the mix will exacerbate existing issues. Additionally, I wonder how Klout will deal with identical individuals across multiple networks. My Foursquare friends are a strict subset of my Facebook and Twitter friends. Do they double count? Should I get any credit at all for adding friends already accounted for elsewhere in the system?
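The double counting worry can be made concrete with a toy example. The names and networks below are invented, and the sketch assumes the same person can be recognized across networks, which is itself a hard problem.

```python
# Hypothetical audiences; Foursquare is a strict subset of the other two.
twitter = {"ann", "bob", "cat", "dan"}
facebook = {"bob", "cat", "eve"}
foursquare = {"bob", "cat"}


def naive_reach(*networks):
    """Add up audience sizes per network, counting overlaps repeatedly."""
    return sum(len(network) for network in networks)


def deduplicated_reach(*networks):
    """Count each distinct person once across all networks."""
    return len(set().union(*networks))


print(naive_reach(twitter, facebook))                     # before Foursquare
print(naive_reach(twitter, facebook, foursquare))         # after, naive
print(deduplicated_reach(twitter, facebook))              # before Foursquare
print(deduplicated_reach(twitter, facebook, foursquare))  # after, deduplicated
```

If scores behave like the naive sum, connecting a network that adds no new people still raises the total; with deduplication it does not.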

As I have time over the next few days, I’ll gather a few unanswered questions from the #Kloutchat in addition to others I have. I’ll send Ash a list that they will hopefully answer. Let me know if you have a few to add to the list.

As promised, here is Ash’s full response:

Hi Alex,
I’m the Director of Ranking here at Klout and wanted to respond directly to some of the points you raised here.
1) Thanks a lot for writing this.
It’s great feedback on the understandability of our score and mirrors a lot of the (intense) debates we have internally around how the score works and what data to deliver to our users.
2) Klout is just beginning.
We believe we’re at the very first stage of development for this paradigm, much like online document search was in 1998 when Google was founded, so we can expect some growing pains especially given the volume of data we process. That said, we know we need to do better and we’re working hard to improve, we have a team of excellent scientists working on improving the score.
3) Your Klout score is about quality not quantity.
While some users may have amassed many thousands of friends and followers, those people may not be listening or may not even be real people at all; this is why we use our own audience metric: True Reach. We also assess the influence of each person in your audience, so if someone you interact with is very influential that can have a much larger impact on your score than a group of people with lower levels of influence; for example if @BarackObama retweets you, it’ll increase your score more than if I do.
4) Adding additional networks to your Klout should increase your score.
We can only measure the data we have. If you add a network, like Facebook, and are influencing people on that network, then it should increase your score; assuming you’re influencing people on that network. If I influence 10 people on Facebook and then add my Twitter account, where I influence 3 people, Klout can now see me influencing 13 people, hence the score increase.
I hope this answers some of your questions and please feel free to follow up with me directly.
Thanks
—-
Ash Rust
Director of Ranking | Klout
http://klout.com
ash [at] klout dot com
@AshRust


Why your Klout score is meaningless

Posted: June 1st, 2011 | Author: | Filed under: Klout, Statistics | 45 Comments »

As a PhD statistician and search quality engineer, I know a lot about how to properly measure things. In the past few months I’ve become an active Twitter user and very interested in measuring the influence of individuals. Klout provides a way to measure influence on Twitter using a score also called Klout. The range is 0 to 100. Light users score below 20, regular users around 30, and celebrities start around 75. Naturally, I was intrigued by the Klout measurement, but a careful analysis revealed some serious issues with the score.

Everything in life can be measured. Some quantities live on natural measurement scales: height, weight, temperature, etc. Some quantities are derived measurements: happiness, deliciousness, hunger, etc. Though all are useful measurements, research has repeatedly shown derived measurements to be inconsistent and not trustworthy individually. Specifically, if two individuals tell you their happiness levels are an 8 and a 9 on a scale of 10, we have no way to know:

  • what this means for each individual without significant amounts of context
  • which individual is “happier” even if 8 is less than 9

I argue that Klout is far more similar to a derived measurement and has several suboptimal properties. Specifically, there are 3 basic, desirable properties the Klout score should satisfy:

  1. Ordering by Klout should make sense in the real world – the score should roughly represent the degree to which one is influential or has clout
  2. The score should not be easy to game – people should not be able to hack their klout in a few days by getting bots to RT, squatting on hashtags, or simply connecting a Facebook account
  3. The score should be monotonic – if another member has higher stats than me in ALL categories, then he/she should have a higher score

To demonstrate the issues Klout has with these principles, we provide 4 groups of Klout score comparisons:

  • a set of individuals with Klout in the 40-49 range
  • a set of individuals with Klout in the 55-64 range
  • a set of individuals with Klout in the 70-79 range
  • a set of individuals with Klout >= 80

The four groups were chosen to span the Klout range and contain bloggers, executives, tech pundits, and celebrities of varying levels of activity in social media, notoriety, influence, importance, etc.

Group 1 (Klout 40-49)

Alex Braunstein (me), @alexbraunstein – Statistician, Research Scientist at Chomp, X-Googler
Ben Keighran, @benkeighran – the CEO of Chomp
Binh Tran, @binhtran – the co-founder and CTO of Klout
Chomp, @chomp – app search engine
Vic Gundotra, @vicgundotra – SVP and head of social at Google
Carla Borsoi, @u_m – VP of Consumer Insights at AOL

Let’s consider a few pairwise comparisons. First, Ben’s stats dominate mine except for likes per post and comments per post; however, his Klout score is 7 points lower than mine. Next, Binh’s stats completely dominate my own in EVERY category, often by very large factors, yet we have identical Klout scores. Carla’s stats also completely dominate mine, but her score is lower. Finally, consider Chomp and Vic Gundotra. Vic’s stats blow Chomp out of the water, yet his Klout score is lower. In the “real world” sense of the word clout, Vic should dominate this group. The group 1 comparisons demonstrate the Klout score violating rules 1 and 3 from above.
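Rule 3 is easy to check mechanically. The sketch below flags every pair where one account has higher stats in ALL categories yet does not have a higher score. The profiles, stat names, and numbers are invented for illustration and are not the real figures from the screenshots.

```python
def dominates(a, b):
    """True if a is strictly higher than b in every shared stat category."""
    shared = a.keys() & b.keys()
    return bool(shared) and all(a[key] > b[key] for key in shared)


def monotonicity_violations(profiles):
    """Return (dominant, dominated) pairs where the dominant score is not higher."""
    violations = []
    for name_a, a in profiles.items():
        for name_b, b in profiles.items():
            if name_a == name_b:
                continue
            if dominates(a["stats"], b["stats"]) and a["score"] <= b["score"]:
                violations.append((name_a, name_b))
    return violations


# Hypothetical profiles, for illustration only.
profiles = {
    "user_a": {"score": 45,
               "stats": {"followers": 5000, "total_rt": 900, "unique_mentioners": 300}},
    "user_b": {"score": 45,
               "stats": {"followers": 400, "total_rt": 50, "unique_mentioners": 20}},
    "user_c": {"score": 30,
               "stats": {"followers": 100, "total_rt": 10, "unique_mentioners": 5}},
}

print(monotonicity_violations(profiles))
```

Here `user_a` beats `user_b` in every category but has the same score, which is exactly the kind of pair the comparisons above point out.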

Group 2 (Klout 55-64)

Paul Graham, @paulg – the fearless leader of Y Combinator.
Y Combinator, @ycombinator – startup incubator
500 Startups, @500startups – startup incubator
Adria Richards, @adriarichards – a tech consultant and popular blogger (also my roommate)
Stefanie Michaels, @adventuregirl – go to person for everything travel

In group 2, my roommate has a higher Klout score than Paul Graham? Really? By 5 points? Paul has 6x more followers, 2x total RTs, and 4x as many unique RTs, but he hasn’t linked his FB account. Adria has incredibly low FB stats (she uses it sparingly), but apparently that still gives her a tremendous boost. Adding a FB account is far too easy a way to game your score higher. I understand that Klout wants to incentivize the attachment of FB accounts and keep growing virally, but this aspect of the Klout score seems broken. Additionally, the pairwise comparison of Y Combinator and Paul is confusing. Paul’s stats are much higher, but they are assigned the same score. One could argue that different, perhaps more Klout-tastic, people are following Y Combinator; however, I find that unlikely given that Paul is in charge of it. Finally, it’s wrong that Adventure Girl’s Klout is so low. She has been named one of the top 100 people on Twitter, has been featured in Time magazine, etc., but her Klout is only two points higher than Adria’s.

Group 3 (Klout 70-79)

Tim Ferriss, @tferriss – author of the 4 Hour Workweek and 4 Hour Body
Jack Dorsey, @jack – Executive Chairman of Twitter and CEO of Square
Matt Cutts, @mattcutts – head of web spam team at Google
MG Siegler, @parislemon – my favorite writer for Techcrunch
Klout, @klout – the service I’m trashing in this post
David Pogue, @pogue – tech guy from the NYT
Jeffrey Zeldman, @zeldman – designer, writer, and publisher

Things get very confusing in this group. Jack Dorsey’s stats dominate those of David Pogue, but his score is 4 points lower. Matt Cutts has 4000 more total RTs but 1.5M fewer followers relative to Jack Dorsey, so his Klout score is 1 point higher? I’ll go out on a limb and state that 4000 incremental RTs seem FAR less valuable than 1.5M incremental followers. Klout, the company, has fewer followers, total RTs and unique RTers by a factor of at least 6, but 7K more unique mentioners, so Klout’s Klout score is 4 points higher than Jack’s? But if unique mentions are so valuable, how can Jack Dorsey have a lower score than Matt Cutts when he has 16K additional unique mentioners? This is just the start of the inconsistencies.

Without FB, MG Siegler’s score would likely be 10 points lower. Jeffrey Zeldman’s blog is super high quality, but does he deserve to have more Klout than David Pogue? Again, Facebook puts him over the top. I think that Klout’s score is far too high, though perhaps it’s not surprising Klout does well on its own metric. Finally, I included Tim Ferriss not just because I’m a huge fan, but because his stats provide a useful counterpoint for even more interesting pairwise comparisons. They will lead you to several more contradictions concerning the relative value of followers, RTs, unique RTers, and unique mentioners.

Group 4 (Klout >= 80)

Robert Scoble, @scobleizer – blogger, tech evangelist, and author
Perez Hilton, @perezhilton – master of celebrity gossip
Charlie Sheen, @charliesheen – #winning
Guy Kawasaki, @guykawasaki – entrepreneur and former Chief Evangelist at Apple
Justin Bieber, @justinbieber – never saying never

The pairings of Scoble/Hilton and Sheen/Kawasaki again demonstrate the severe miscalibration regarding Facebook scores. Also, I’m not sure I trust any system which has Justin Bieber as most influential.

In conclusion, there are some serious inconsistencies with Klout that render it nearly meaningless in some circumstances. It often does not correctly order individuals in terms of how influential they are, is easy to game higher simply by adding a Facebook account, and does not respect some very basic monotonicity rules. Put simply, it acts like a derived measurement. From this analysis, I have gleaned the following rough rules of thumb for understanding your Klout score:

  • Connecting an additional account (e.g., Facebook) will ALWAYS increase your Klout.
  • The degree to which your followers are influential seems to be irrelevant or matter very little.
  • The differential between the number of people you follow and the number who follow you seems to be irrelevant or matter very little.
  • In terms of value to your Klout score: follow < RT < unique RT < unique mention but this can be inconsistent
  • In terms of value to your Klout score: like < comment but this can be inconsistent

To be fair, Klout does not want their score to be completely transparent. Then it would be easy to rip off and even easier to game. That being said, it should be possible to respect the three conditions I enumerated and still keep a lid on their secret sauce. As I have time, I’m going to mess around with the Klout API a bit and gather more comprehensive data to further demonstrate the points made in this post, including a similar study concerning the Klout of companies/brands. Additionally, I will submit several questions regarding my analysis to Joe Fernandez’s (the CEO of Klout) Klout chat, and hope the company follows up. I’ll post any details/answers I receive here.

More info about Klout can be found in Techcrunch articles about their initial launch, series A funding, series B funding, addition of Facebook to their ranking, and their crunchbase profile.

As any good statistician should, I need to qualify my analysis. There is of course selection bias in the examples enumerated above. Although not always as egregious, these head-scratching scores are the rule, not the exception. All data was pulled on 5/29/11, and may not reflect current scores. Finally, please remember that this is my personal blog and reflects my opinion alone. In particular, it does not reflect the opinions of any employer past, present, or future.