Over the past six months, there have been some interesting changes in the Twitter API. Seeing as those in the anisphere have integrated with the platform as a primary medium for conversation, I thought I’d share some of the positive aspects I’ve seen during my interactions with the system. (This may be a boy-talk post, sorry loves.)
One of the more technical changes we’ve seen is that status IDs are now large, seemingly random numbers. Though they remain fairly sequential, I’m hesitant to assume that their order matches creation timestamps 100% of the time. There are likely 20 explanations as to why this is, but the common denominator rests with the nodes and the distributed messaging queue (Kestrel), which would only be hindered by atomic iteration across N nodes. The purpose of these nodes is far simpler, and it is one of the reasons we sometimes see duplicate statuses posted: they receive a message and distribute or handle it accordingly.
Now here is the interesting implication of these identifiers, which rests largely on the premise that an ID denotes only a fuzzy creation timestamp. With that in mind, the ordered timeline we see is based largely on timestamps rather than IDs. If the two really are that independent, what prevents us from updating the timestamp while keeping the same identifier? Or, for that matter, editing the entire entry?
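To make the premise concrete, here is a minimal Python sketch of a timeline ordered by timestamp, with the ID used only to break ties. The `Status` class, the `timeline` function, and the sample IDs are all hypothetical illustrations, not anything taken from Twitter's API.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Status:
    """Hypothetical status record: the ID is only a fuzzy hint of creation order."""
    id: int
    created_at: datetime
    text: str


def timeline(statuses):
    # Order by timestamp; the ID is used only to break exact ties.
    return sorted(statuses, key=lambda s: (s.created_at, s.id))


base = datetime(2010, 1, 1, 12, 0, 0)
statuses = [
    Status(id=1003, created_at=base, text="first"),
    Status(id=1001, created_at=base + timedelta(seconds=1), text="second"),
    Status(id=1002, created_at=base + timedelta(seconds=2), text="third"),
]

# The IDs are out of order, yet the timeline is still well defined.
ordered_ids = [s.id for s in timeline(statuses)]

# "Editing" a status: same identifier, new timestamp and text.
edited = replace(
    statuses[0],
    created_at=base + timedelta(seconds=3),
    text="first (edited)",
)
reordered_ids = [s.id for s in timeline([edited] + statuses[1:])]
```

If ordering really depends only on the timestamp, the edited status simply moves to its new position while every reference to its ID stays valid.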
Just something to think about, though with the way Twitter is presenting itself in the streaming API, editing of message content may be completely unnecessary in the future.
Real-time Event Piping
When the real-time stream was released last year, I was somewhat boggled as to why the company would want to do this while already facing instability in the application, but after mingling with the real-time streams, I can understand the benefit for Twitter and its users. First off, the real-time stream offers no persistent data whatsoever, which is a hint. This means that if the client did not receive an update in real time, it won’t receive it at all without a fallback. The reasoning here is that messaging streams are part of the distribution system and have little or no reliance on emitting data that has been saved into the database.
A simple explanation of the benefit of streaming is that none of this functionality requires database storage. Messaging queues, nodes, receptors, dispatchers, and the like can operate with little or no state, and by connecting users on both ends of I/O, Twitter is able to bypass the database on various levels. (TL note: disk-based databases and persistent storage are slow)
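A toy Python sketch of that idea follows: a stream that only forwards messages to whoever is connected at the moment and stores nothing. The `Stream` class and its method names are my own invention for illustration and say nothing about how Twitter actually implements its streams.

```python
class Stream:
    """Hypothetical in-memory stream: forwards messages, stores nothing."""

    def __init__(self):
        self.subscribers = {}  # client name -> callback

    def connect(self, name, callback):
        self.subscribers[name] = callback

    def disconnect(self, name):
        self.subscribers.pop(name, None)

    def publish(self, message):
        # Deliver to whoever is connected right now; there is no history.
        for callback in list(self.subscribers.values()):
            callback(message)


received = []
stream = Stream()

stream.connect("client", received.append)
stream.publish("a")

stream.disconnect("client")
stream.publish("b")  # nobody is listening, so this message is simply gone

stream.connect("client", received.append)
stream.publish("c")
```

After reconnecting, the client holds "a" and "c" but never sees "b". Recovering it would take a separate fallback, such as a request to the database-backed REST API.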
In essence, Twitter is slowly becoming purely a distribution method rather than a content host, and in light of streams, the platform can offer more comprehensive data, the way it used to, because it no longer needs to worry about rendering every user’s historical timeline. Is it safe to assume this is what Twitter wants? I believe the last focus is a good indicator.
Mention Streams – Conversation
As long-time users know, early Twitter consisted of a fairly raw and simple timeline approach. Following a user meant every status was visible, regardless of conversational semantics (relations and @mentions). During 2009, Twitter was battling performance issues and developed a solution to alleviate timeline churn: selective mentions.
The idea behind withholding mentions from timelines where the recipients were not actively followed was, in my opinion, purely a performance decision. Brief reasoning: given two related users, the followers they have in common will almost always be a substantially smaller subset of either user’s followers. The implication here is that the number of relevant timelines which require an update is drastically reduced for a single status. While some users may find this a privacy feature, the Streaming API may argue otherwise.
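The reasoning can be sketched in a few lines of Python. The follower sets and the `fanout` function are hypothetical and only illustrate the set arithmetic; they are not Twitter's actual delivery logic.

```python
# Hypothetical follower sets: who follows each account.
followers = {
    "alice": {"u1", "u2", "u3", "u4", "u5"},
    "bob": {"u4", "u5", "u6"},
}


def fanout(author, mentioned=None):
    """Return the set of home timelines that need an update for one status."""
    if mentioned is None:
        # Plain status: every follower of the author gets it.
        return set(followers[author])
    # Selective mentions: only users who follow both parties get it.
    # (Delivery to the mentioned user is left out of this sketch.)
    return followers[author] & followers[mentioned]


plain = fanout("alice")
reply = fanout("alice", mentioned="bob")
```

Here a plain status from alice touches five timelines, while her reply to bob touches only the two that follow both of them.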
Many users are aware of the real-time Twitter, and for a while a number of them toyed with the “all replies” toggle, which made visible replies to users they were not following. This functionality mimicked Twitter’s pre-2009 timeline, though by now, I’m sure many users find it cluttered or not useful (blame the clients). However, the current “all replies” feature takes another step into robust conversation.
Enabling all replies does allow the user to view all statuses created by their subscriptions, but there are additional updates, which are often relevant to conversation. These extra updates are directed at followed users, regardless of whether the viewer follows the author. A small example of how this works: if 6ry follows anya_fennec and Homuhomu_ mentions anya_fennec in a status, Homuhomu_’s update becomes visible to 6ry.
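As a rough Python sketch of that rule, using the example accounts above: the `following` map, the mention regex, and `visible_in_stream` are assumptions of mine meant to mirror the behavior described, not the Streaming API's real matching logic.

```python
import re

# Hypothetical follow graph: 6ry follows anya_fennec and nobody else.
following = {"6ry": {"anya_fennec"}}

# Simplified @mention pattern: letters, digits, and underscores.
MENTION = re.compile(r"@(\w+)")


def visible_in_stream(viewer, author, text):
    """Apply the "all replies" rule as described in the paragraph above."""
    followed = following.get(viewer, set())
    if author in followed:
        # Every status from a followed user is visible.
        return True
    # Otherwise it is visible only if it mentions someone the viewer follows.
    mentioned = set(MENTION.findall(text))
    return bool(mentioned & followed)


shown = visible_in_stream("6ry", "Homuhomu_", "@anya_fennec did you see this?")
hidden = visible_in_stream("6ry", "Homuhomu_", "talking to nobody in particular")
own = visible_in_stream("6ry", "anya_fennec", "good morning")
```

In this sketch, Homuhomu_'s status reaches 6ry only when it mentions anya_fennec; an unrelated status from the same author does not.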
Mind you, this is not how the REST API – generally what we see on twitter.com – functions, but because the streams practically skip the database overhead, the platform is capable of such functionality. In my opinion, this is exactly what Twitter wants and precisely what clients should be written to handle efficiently and in an organized manner.
That’s all for now, but hopefully the picture of what Twitter is and where it’s going is a little clearer.