Dec 11

The Hashtag Fail Whale

If you’ve spent time trying to analyze Twitter data, you have undoubtedly come across the topic problem.

The Problem
“What’s this tweet about?”
“Are there other tweets that are related?”

Somewhere along the way, Twitter, it seems, decided the hashtag was enough to answer the above.

Hashtags are the wild west of naming conventions. Anyone can create or use one (and most people don’t). Further, with a character-limited tweet, proper tagging cuts into tweet content.

But as a user, I want to be able to find all tweets about a news story, an event, a book that was just released. Relying on Twitter users to manually add hashtags to do that is a fail whale of a different color.

The Solution
Why hasn’t Twitter borrowed from Delicious – a service that finds the balance between the freedom and discovery of a folksonomy, and the clarity and utility of a taxonomy?

Why not auto suggest tags as someone writes the tweet? These tags wouldn’t count against the character limit. And if someone doesn’t select or create a tag, Twitter would auto generate one (and designate it differently from a user generated/selected one).
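A rough sketch of how that could work, in Python. Everything here is an assumption made for illustration: the 140-character limit, the tiny hard-coded `KNOWN_TAGS` catalog, the word-overlap ranking, and the `Tag`, `Tweet`, `suggest_tags`, and `post` names. The point is only that tags live outside the tweet text and carry a marker saying whether a person or the system chose them.

```python
import re
from dataclasses import dataclass, field

CHAR_LIMIT = 140  # assumed limit; it applies to the text only, not to tags

# Hypothetical tag catalog; a real service would learn this from usage.
KNOWN_TAGS = [
    "top chef",
    "top chef season 8",
    "top chef season 8 episode 7",
    "top chef cheftestant smith",
]


@dataclass
class Tag:
    name: str
    source: str  # "user" if the author picked or created it, "auto" otherwise


@dataclass
class Tweet:
    text: str
    tags: list = field(default_factory=list)


def suggest_tags(draft, catalog=KNOWN_TAGS):
    """Rank catalog tags by how many words they share with the draft."""
    words = set(re.findall(r"[a-z0-9]+", draft.lower()))
    scored = [(len(words & set(tag.split())), tag) for tag in catalog]
    scored = [(score, tag) for score, tag in scored if score > 0]
    scored.sort(key=lambda pair: -pair[0])  # stable: ties keep catalog order
    return [tag for _, tag in scored]


def post(draft, chosen=None, catalog=KNOWN_TAGS):
    """Build a tweet; fall back to one auto-generated tag if none was chosen."""
    if len(draft) > CHAR_LIMIT:
        raise ValueError("tweet text is over the character limit")
    if chosen:
        tags = [Tag(name, "user") for name in chosen]
    else:
        suggestions = suggest_tags(draft, catalog)
        tags = [Tag(suggestions[0], "auto")] if suggestions else []
    return Tweet(draft, tags)
```

Calling `post("Smith nailed the quickfire on Top Chef tonight")` with no chosen tags attaches the best-matching catalog tag and marks it `"auto"`, so it can be displayed differently from a tag the author picked.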

Just as I can search for an @ to connect with a person, making a # (or a new version of a tag) work the same way could make Twitter even more powerful, and make it more money.

A Consumer Use Case
Imagine I just watched Top Chef last week and wanted to tweet about a cheftestant.  I start drafting my pithy tweet, and the auto suggested tags include: top chef, top chef season 8, top chef season 8 episode 7, top chef cheftestant smith.

Since I’m commenting on Cheftestant Smith and her performance in episode 7 of the current season, I select the last two tags.

Later, when a fan of Cheftestant Smith (or Smith herself) is searching for tweets, she doesn’t have to run an @ people search (and if Smith used Twitter, the auto suggest would offer her @ handle as well).  More usefully, instead of relying on a #topchef tag, searching on top chef would bring up all the tags discussed above, letting a user see more related tweets (not just those hashtagged), less noise (of the kind NLP searches create), and drill down (to a certain episode, ingredient, etc.).
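The search and drill-down behavior can be sketched with tags stored as paths. This is only an illustration: the `TagIndex` class, its method names, and the idea of representing a tag as a tuple such as `("top chef", "season 8", "episode 7")` are assumptions, not anything Twitter offers.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical index that files tweets under hierarchical tag paths."""

    def __init__(self):
        self._tweets = defaultdict(list)  # tag path (tuple) -> list of tweets

    def add(self, tweet, *paths):
        """File one tweet under one or more tag paths."""
        for path in paths:
            self._tweets[tuple(path)].append(tweet)

    def search(self, *prefix):
        """Return tweets tagged with the prefix or with any tag beneath it."""
        n = len(prefix)
        seen, hits = set(), []
        for path, tweets in self._tweets.items():
            if path[:n] == prefix:
                for tweet in tweets:
                    if tweet not in seen:  # a tweet may carry several tags
                        seen.add(tweet)
                        hits.append(tweet)
        return hits

    def drill_down(self, *prefix):
        """List the next-level tags available under the prefix."""
        n = len(prefix)
        return sorted(
            {path[n] for path in self._tweets if path[:n] == prefix and len(path) > n}
        )


index = TagIndex()
index.add(
    "Smith crushed it in episode 7",
    ("top chef", "season 8", "episode 7"),
    ("top chef", "cheftestant smith"),
)
index.add("Season 8 is the best yet", ("top chef", "season 8"))
index.add("Loving the new runway looks", ("project runway",))
```

Here `index.search("top chef")` returns both Top Chef tweets whether or not anyone typed #topchef, and `index.drill_down("top chef")` lists the narrower tags a user could click into.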

But how does this make Twitter money?

The Business Case

1. Tag Bidding

2. Page Buying

3. Tag Structuring

Using the previous use case, imagine where Bravo can get value.  First, they bid on auto suggest tags — they can help structure the conversation.  And auto suggest will bring in both “paid” and “organic” tags to ensure the user still finds value in the tags.
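One way to picture mixing paid and organic suggestions is a short Python sketch. The `blend` function, the cap of one paid slot, the four-slot suggestion box, and the sample bids are all made up for illustration; the idea is simply that the highest bids get a limited number of slots and relevance-ranked organic tags fill the rest.

```python
def blend(organic, paid_bids, slots=4, max_paid=1):
    """Mix paid and organic tag suggestions.

    organic:   tags already ranked by relevance to the draft
    paid_bids: {tag: bid amount} from sponsors (hypothetical data)
    Returns a list of (tag, "paid" | "organic") pairs, at most `slots` long.
    """
    # Highest bidders win the limited paid slots.
    paid = sorted(paid_bids, key=paid_bids.get, reverse=True)[:max_paid]
    result = [(tag, "paid") for tag in paid]
    # Fill the remaining slots with organic tags, skipping duplicates.
    for tag in organic:
        if len(result) >= slots:
            break
        if tag not in paid:
            result.append((tag, "organic"))
    return result[:slots]


suggestions = blend(
    organic=[
        "top chef",
        "top chef season 8",
        "top chef season 8 episode 7",
        "top chef cheftestant smith",
    ],
    paid_bids={"top chef finale sweepstakes": 2.5, "bravo": 1.0},
)
```

Capping the paid slots is the design choice that keeps the box useful to the person writing the tweet; where that cap sits is a business decision this sketch does not try to answer.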

Then, for each tag, a “page” can be registered/bought.  When someone clicks a tag, they go to that page that aggregates the conversations, but also allows for custom content as well — sponsored links blossom into sponsored tags and pages.

Finally, organizations can create “tag structures.”  In the Top Chef example, Bravo could create a taxonomy of tags, creating order and hierarchy around the folksonomy: show-level tags, episode-level tags, etc.

Jul 10

Email AI

I’m in the middle of a couple of books on complexity, and they’ve affected me.  In the email hell that is my work life, I started to think about an emergent/adaptive inbox.  Here’s what I’m thinking:

Decaying Relevancy:  Imagine all incoming email has a “freshness” date.  It came in X time ago.  That email is either read or unread.  There’s quite a bit of intelligence in these two dimensions.  Email I care about will be read while fresh.  Email I drudge through — or that is complex — will be read and stale.  Email that is unread and stale is useless.  That leaves fresh unread email.  That’s what I really care about making sense of.
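The two dimensions can be sketched as a four-way bucket. This is a minimal illustration, not a design: the 24-hour freshness window, the `Email` record, and the `bucket` function are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

FRESH_WINDOW = timedelta(hours=24)  # assumed cutoff between fresh and stale


@dataclass
class Email:
    sender: str
    received: datetime
    read_at: Optional[datetime] = None  # None means still unread


def bucket(email, now, fresh_window=FRESH_WINDOW):
    """Place an email in one of the four read/unread x fresh/stale buckets."""
    if email.read_at is not None:
        # For read mail, what matters is how old it was when it got read.
        age_at_read = email.read_at - email.received
        return "read-fresh" if age_at_read <= fresh_window else "read-stale"
    # For unread mail, what matters is how old it is right now.
    age = now - email.received
    return "unread-fresh" if age <= fresh_window else "unread-stale"
```

The "unread-fresh" bucket is the one worth making sense of; "unread-stale" is the pile that can probably be ignored.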

Sender Benchmarking:  Now is when it gets interesting.  Imagine that I know, for each sender, a distribution on read/unread and freshness.   That distribution is going to tell a lot — some shapes will be fat-headed, where 80% of the emails I read right away.  Some will be fat-tailed — where I read very few of them right away.  Based on these distributions, I can start to categorize and prioritize my email (or rather, it can be done adaptively based on my normal interaction with email).
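As a sketch of the benchmarking step, each sender's history can be reduced to a single number: the share of their emails read while still fresh. The function names, the 24-hour window, the 80%/20% cutoffs for "fat-headed" and "fat-tailed", and the sample history are assumptions; a fuller version would keep the whole distribution rather than one rate.

```python
from collections import defaultdict


def sender_profiles(history, fresh_hours=24.0):
    """Compute, per sender, the fraction of emails read while fresh.

    history: iterable of (sender, hours_until_read), with None for never read.
    """
    counts = defaultdict(lambda: [0, 0])  # sender -> [read while fresh, total]
    for sender, hours in history:
        counts[sender][1] += 1
        if hours is not None and hours <= fresh_hours:
            counts[sender][0] += 1
    return {sender: fresh / total for sender, (fresh, total) in counts.items()}


def shape(read_fresh_rate, head_cutoff=0.8, tail_cutoff=0.2):
    """Label a sender by where the mass of their distribution sits."""
    if read_fresh_rate >= head_cutoff:
        return "fat-headed"  # most of their mail gets read right away
    if read_fresh_rate <= tail_cutoff:
        return "fat-tailed"  # very little of it gets read right away
    return "mixed"


# Made-up history: (sender, hours until read, or None if never read).
history = [
    ("boss", 0.5), ("boss", 1), ("boss", 2), ("boss", 3), ("boss", 30),
    ("newsletter", None), ("newsletter", None), ("newsletter", 100),
    ("newsletter", None), ("newsletter", None),
]
profiles = sender_profiles(history)
```

Nothing here asks the user to label anything; the profile falls out of normal reading behavior, which is the adaptive part of the idea.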

Behavioral Conclusion:  The last area that rounds out my adaptive email system is what happens with the read email.  My biggest problem with work email is actually read mail that is sitting in my inbox.  I either need to reply, archive, or take other action.  The first two conclusions are part of my normal email workflow.  Adding a follow-up action usually puts an email in purgatory.  This is where the freshness comes in.

Bringing It All Together

So my problems fall out as follows:

  1. Unread important email
  2. Read, non-concluded, important email
  3. Read, non-concluded, unimportant email
  4. Unread, unimportant email

The key unknown is importance.  Using sender distributions, I can determine importance as a function of freshness.  Adding behavioral conclusion to the freshness metrics, we can now calculate % of items read and % concluded per sender.  We can also weight these by the time it takes to do each.

By creating benchmarks/norms, each email can be given an importance rating.  High importance items, with longer times in the unread and/or non-concluded buckets, receive the highest priority.  Just by organizing my email inbox into unread-important and read-non-concluded views, prioritized by freshness, I know I’d be quite a bit more productive.
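A minimal sketch of those two views follows. The scoring rule (sender importance multiplied by hours spent waiting), the 0.5 importance threshold, and the `Item`, `priority`, and `views` names are assumptions chosen to make the idea concrete, and the importance numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Item:
    subject: str
    sender: str
    hours_waiting: float  # time spent unread or read-but-not-concluded
    read: bool
    concluded: bool = False  # replied to, archived, or otherwise handled


def priority(item, importance_by_sender):
    """Important mail that has waited longer scores higher (assumed rule)."""
    return importance_by_sender.get(item.sender, 0.0) * item.hours_waiting


def views(items, importance_by_sender, threshold=0.5):
    """Split the inbox into the two working views, highest priority first."""
    def score(item):
        return priority(item, importance_by_sender)

    unread_important = sorted(
        (i for i in items
         if not i.read and importance_by_sender.get(i.sender, 0.0) >= threshold),
        key=score, reverse=True,
    )
    read_non_concluded = sorted(
        (i for i in items if i.read and not i.concluded),
        key=score, reverse=True,
    )
    return unread_important, read_non_concluded


importance = {"boss": 0.9, "newsletter": 0.1}  # e.g. taken from sender benchmarks
inbox = [
    Item("Budget", "boss", 10, read=False),
    Item("Lunch?", "boss", 2, read=False),
    Item("Weekly digest", "newsletter", 50, read=False),
    Item("Contract review", "boss", 5, read=True),
    Item("Survey", "newsletter", 40, read=True),
    Item("Done already", "boss", 1, read=True, concluded=True),
]
unread_view, pending_view = views(inbox, importance)
```

With this data the unread view lists "Budget" ahead of "Lunch?" and drops the newsletter entirely, while the pending view surfaces "Contract review" ahead of "Survey".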

Other interesting by-products could be a way to score how productive you are by day/daypart.  You could also create a feedback mechanism for the people you interact with via email, showing how important their emails are to you.  Finally, you could break apart a sender’s importance distribution to allow them to explicitly rank a message on how important they think it is.  Their rating could then be matched to how your emergent system rates it, and productive feedback loops could ensue.

Dec 09

Machine Gun Photography

Kottke is on to something here.  Seems like a perfectly logical progression.

Just as the introduction of the machine gun fundamentally changed warfare, so the affordable high-resolution digital video camera will change photography. Now you don’t have to wait for exactly the right moment for the perfect shot; just take 10 minutes of HD video and find the best shots later.

Jan 09


Every once in a while I come across a blog/site that, in every post, inspires me, challenges me, engages me.  S&W’s Pulse Laser, found on links from Kottke, is my source today — and about 5 posts in, the best combination of 5 posts, topics, concepts, themes yet.  How can I work with these guys?