Baleen Whales of the Internet Ocean

A few days ago, Facebook announced plans to construct a second data center in Prineville, OR. The expansion mirrors the user growth from 350 to 750 million, and the increasing servers and storage required for all the conversations and the billions of photos uploaded every month.

Twitter, with a much smaller user base, racks up a billion tweets per week. Google has been indexing tens of billions of webpages every few days for a decade or more, and can easily subsume the firehose of Twitter data and incorporate it into search results.

Obviously for Google, Facebook and Twitter, an expansion is nothing like the traditional factory, assembly or warehouse model. It is about the ability to accommodate and process more data. Since they never charge users directly (and never will), their revenues have little relationship with the amount of data they process. Some would simplistically consider them as ad-driven businesses. And indeed they often use terms from the ad industry as a first approximation of the very innovative mechanisms by which they broker user engagement and attention to buyers.

It helps to think of these companies as the baleen whales of the Internet ocean.

Humpback Whale

A fin, humpback or blue whale gulps in ten thousand gallons of water as it dives and lunges through krill-rich waters. It spouts the water through its baleen, and ingests the millions of tiny krill and microscopic plankton trapped within. It does this a few times per dive, hundreds of dives per day, adding up to thousands of pounds of food.

If you think about it, that is an incredible amount of work. Dive, lunge, gulp, surface, spout, eat. Repeat. And it seems a bit tedious compared to the coordinated packs of orcas who hunt seals and otters. Regardless, the baleen whale approach is calorically efficient and ecologically successful. In fact, lunge feeding correlates with largeness.

To the whales that are Facebook, Google and Twitter, the billions of photos, links, videos, tweets, documents and webpages are as gallons of water to be gulped and spouted. Their techno-genetic adaptations to this environment take the form of MapReduce and Cassandra. The algorithms and ad platforms are the baleen that trap the krill of monetizable attention.

Another way of evaluating fitness in the whale ecological niche is to look for the development of such unique adaptations. On that count, going by what is open-sourced and known monetization plans, Twitter is currently the weakest of the whales.

Fail Whale

Obviously, whales can survive only at “scale”. A minnow cannot gulp enough water to find nourishment that can sustain even its tiny body.

Mapping the analogy back, small companies cannot monetize like Google or Facebook, but investors love whales. So, when VCs say that they invest in people, they are looking for the embryonic startup that can grow into a whale of a company. And they are willing to gestate it with hundreds of millions of dollars.

As a founder of a two-person startup, it is worth checking within yourself if you have the vocation and the DNA to grow into a whale. If you do, the mechanisms and monetization will express themselves.


Slopegraphs and Envisioning Information

Edward Tufte’s books are an enjoyable reference on information design and visualization. His concepts on data-ink ratio and analysis of historical maps, brochures and illustrations were a revelation on how less can be more.

Some of Tufte’s ideas such as sparklines have seen widespread usage in presenting analytics and trends. Less used, and perhaps less-catchily-named is the slopegraph.

Charlie Park has a great post about it, with some examples in the wild, along with the nuances of labeling and automatically generating slopegraphs.

Envisioning information is an art, requiring deep and intuitive understanding of the data and presentation goals. A website can today generate and sprinkle sparklines like text in a web page. There are several attempts at automating slopegraph generation.

Large amounts of data are now publicly available, along with the tools to structure and collate them. Some even provide visualization widgets. A service that connects unfolding events with appropriately selected visual information would be helpful. All one asks of such a service is that it not perpetuate chartjunk.


Tuning In To People

Michael Fogus talks about his increased impedance mismatch with regard to the Hacker News community. Granted, communities become predictable, which is why, predictably, I found his post via Hacker News.

I would tend to agree with his takeaways:

Great commentators transcend context. Individuals are where its at.

Intuitively, this makes sense because, if you respect someone, you tend to weigh their opinions more, especially in contexts that are adjacent, or with occupational overlap. This is true for the spectrum of users and topics on Hacker News.

He also describes how he follows comments from selected Hacker News commentators. That is an interesting subset of what led me to write a Hacker News Bot. In my case, I wanted to surface postings and links to which interesting people contributed comments.

I came across an even more intriguing example of following great commentators. Marshall Kirkpatrick, in responding to Twitter’s acquisition of Backtype, describes how he used Backtype features to monitor and be alerted to blogosphere comments by a select list of people.

Focusing on people comes naturally to us. However, when a medium other than air is interposed between us, we get stuck in the peculiarities of the media. Compare a conversation in a coffee shop with the discussion during a phone conference or corresponding through letters. And this is just through person-to-person media; broadcast or narrowcast media bring their own twists.

Even in the case of the people-following experience that is Twitter, the mechanics of following, the psychology of follower counts and side-effects in terms of influence scores can skew – if not dominate – the interactions.

So, what Marshall and Fogus are doing here is, in my opinion, a purer use case. It's reaching out, not strutting about.

They also exemplify two broad strategies in listening. One is to be selective, the other is to try and be more aware. One gets you fewer but more interesting links to read, the other (with some work) lets you stay up to date and be in the know. Both are tuning in to people: one is focused on a narrow channel, the other is monitoring a broad spectrum.

How do you build a service that best combines these intents and strategies? How would you use it? What features would you want?

Let us know.


Do You Want Free Engraving With That?

Via Eli Dourado, a speculation on: Why does Apple offer free engraving?

The real reason Apple offers free engraving is to weaken the secondary market. iP*ds are durable goods. Apple has a monopoly on iP*ds, but it still has to compete with the products of its former self. If people get tired of their iP*ds or decide they want to upgrade to a newer model, they can sell their devices to other consumers, who in turn are not giving their money to Apple.

Who wants a weird engraving chosen by the previous owner on his iP*d?

Who indeed? Buying something with the next sale in mind is not something I do. So, naturally, I ordered ours engraved.

Our iPad2 engraving - Improved Means to an Unimproved End

With apologies to Henry David Thoreau:

“Our inventions are wont to be pretty toys, which distract our attention from serious things. They are but improved means to an unimproved end, an end which it was already but too easy to arrive at.”

Would you buy this iPad, if it were for sale?


On Traction - My 0.001 BTC

A retrospective on the ideas behind Bitcoin in terms of getting traction.

Don’t stop at an idea.

Back in 1998, Wei Dai described b-money, a scheme for a group of untraceable digital pseudonyms to pay each other with money and to enforce contracts amongst themselves without outside help. Other proof-of-work schemes, such as Hashcash, cite that paper, and are themselves utilized by Bitcoin. Nick Szabo described a similar scheme and associated economics in bit gold.

But an idea becomes real only when you start implementing, tackle details, surmount objections and show it working.

Make no little plans.

The sages of cryptography, Rivest and Shamir, published two micropayment schemes akin to digital coins “for making small purchases over the Internet”.

Compare that to the opening sentence by Nakamoto – “Commerce on the Internet has come to rely almost exclusively on financial institutions serving as trusted third parties to process electronic payments”.

Both papers introduce essentially the same arguments typical of a micropayment scheme. One brings to mind candy machines and newspapers, while the other evokes a new world order.

Embrace controversy.

Sometimes people make pathetic attempts to create a controversy. But if you think big (see above), then others will do it for you, embroiling notables.

Accept the controversy, even if you are hesitant about being misrepresented, inaccurately portrayed, or associated with something you would rather avoid.

Let others profit.

Bitcoin is designed from the protocol layer to help participants profit. The reward for tracking everyone’s spending and generating transaction blocks is that you get to give yourself Bitcoins.

In addition to bootstrapping the economy, it attracts precisely the kind of people who promote the project, provide the trading and exchange infrastructure and spark a speculation frenzy.

Finally, let others drive it forward.

1BTC = ~20USD today, according to Mt. Gox


Bitcoin Mania

In the past two months, Bitcoin rapidly cycled through the entire mainstream news cycle of discovery, buzz, hype and punditry. Fueled by a report on peddlers and feted by pirates, we only needed furor by politicians to complete the picture. With this increased attention, speculation in Bitcoins has reached new highs, with one Bitcoin trading for 21 USD, up 200% in the last week.

Bitcoin was publicly outlined three years ago by Satoshi Nakamoto as a peer-to-peer electronic cash system. Specifically, it is a cryptographic mechanism to ensure that this cash cannot be spent twice. As in any p2p system, the mechanism was designed to not require a trusted third party.

In Bitcoin, the electronic coin is simply a chain of digital signatures, akin to a paper note that you co-sign with the next person when you spend it onward. In crypto-speak, you sign a hash of the previous transaction and the public key of the next owner and add them to the end of the coin.

However, this chain of ownership should be verifiable by the receiving party at the time of the transaction. A paper banknote is verifiable through physical ownership since counterfeiting is difficult due to security features supplemented by serial numbers. Bitcoin verification requires a radically different scheme. Not just because the coin is a string of text and therefore easily copiable, but also because the creators do not want a central mint to sign and verify the text. The approach taken is to instead track spending and safeguard against double spending in the chain of ownership as the coin passes from one person to another.

All transactions are announced publicly, widely broadcast and “known” by the network. That is, a majority of nodes on the network agree on a particular chain of ownership certifying that the previous owner did not sign any earlier transactions before giving a coin to you.

The system utilizes a distributed p2p timestamp server, based on proof-of-work, which is a hash that takes a moderate amount of CPU time (work) to generate, but is near instantaneous to verify. This makes it harder for counterfeiters to cheat or forge ownership records. Each node collects transactions for several coins into a block, and tries to find the proof-of-work for that block. If it does, it broadcasts that block to all nodes, who verify that the transactions and the hash are valid. If the block is accepted, its hash is chained to the next block as all nodes get busy trying to find proof-of-work for it.
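The collect, hash and verify loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Bitcoin client's actual code: the `DIFFICULTY` value, the JSON block layout and the function names are all made up for the example, and a single SHA-256 with a leading-zeros test stands in for the real protocol's hashing and target rules.

```python
import hashlib
import json

# Assumption for this sketch: a block is "proven" when its SHA-256 hex digest
# starts with this many zeros. The real network uses a numeric target instead.
DIFFICULTY = 4


def block_hash(prev_hash: str, transactions: list, nonce: int) -> str:
    """Hash the previous block's hash, the transactions and a nonce together."""
    payload = json.dumps(
        {"prev": prev_hash, "txs": transactions, "nonce": nonce},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_proof_of_work(prev_hash: str, transactions: list, difficulty: int = DIFFICULTY):
    """Try nonces one by one until the hash meets the difficulty (the slow part)."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(prev_hash, transactions, nonce)
        if digest.startswith(target):
            return nonce, digest
        nonce += 1


def verify_block(prev_hash: str, transactions: list, nonce: int, digest: str,
                 difficulty: int = DIFFICULTY) -> bool:
    """Recompute one hash and compare (the fast part)."""
    return (
        digest.startswith("0" * difficulty)
        and block_hash(prev_hash, transactions, nonce) == digest
    )


# Two chained blocks: the second block commits to the hash of the first.
genesis = "0" * 64
txs1 = ["alice->bob:1", "bob->carol:1"]  # hypothetical transactions
nonce1, hash1 = find_proof_of_work(genesis, txs1)

txs2 = ["carol->dave:1"]
nonce2, hash2 = find_proof_of_work(hash1, txs2)

print(verify_block(genesis, txs1, nonce1, hash1))              # honest block
print(verify_block(genesis, txs1 + ["eve->eve:99"], nonce1, hash1))  # tampered block
```

Finding a nonce takes tens of thousands of hash attempts on average at this difficulty, while checking one takes a single hash. Tampering with any transaction changes the block's hash, which in turn invalidates every later block chained to it.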

The genius of the system is that the node that found the proof-of-work gets to insert for itself a reward of (currently 50) Bitcoins as the first transaction in its block. This is what is called mining. The supply of bitcoins thus generated is to be capped at approximately 21 million.
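The cap of roughly 21 million follows from the reward schedule. The arithmetic below is a sketch that relies on two details not stated in this post: the reward halves every 210,000 blocks, and amounts are counted in integer units of one hundred-millionth of a bitcoin. The constant and function names are made up for the example.

```python
UNITS_PER_BTC = 100_000_000          # smallest unit: 1e-8 BTC (assumption, see lead-in)
INITIAL_REWARD = 50 * UNITS_PER_BTC  # the 50 BTC reward mentioned above
HALVING_INTERVAL_BLOCKS = 210_000    # blocks between halvings (assumption, see lead-in)


def total_supply_units() -> int:
    """Sum the block rewards over every halving period until the reward hits zero."""
    total = 0
    reward = INITIAL_REWARD
    while reward > 0:
        total += reward * HALVING_INTERVAL_BLOCKS
        reward //= 2  # integer halving, so the reward eventually rounds down to zero
    return total


total_units = total_supply_units()
total_btc = total_units / UNITS_PER_BTC
print(total_btc)  # just under 21 million
```

It is a geometric series: 210,000 blocks times 50 BTC times (1 + 1/2 + 1/4 + ...), which approaches 21 million and falls slightly short because of the integer rounding.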

Obviously the creators of Bitcoin are well familiar with the functions of money as both a medium of exchange and a store of value. You could consider that the worth of a bitcoin is the amount of CPU, memory, network and electricity costs expended into generating it. Of course, since the bitcoins generated are a reward for helping people track the spending of bitcoins, this is highly circular, literally bringing forth the bitcoin economy by a combination of an act of faith and the straps of one’s own boots (or bits).

It is worth noting some specific facts about Bitcoins:

  1. It is definitely a fiat currency, however much touted otherwise. The difficulty of the proof-of-work and the reward for finding it are mandated by the Bitcoin software and network protocol, similar to how central bank committees control money supply.
  2. It is not strictly anonymous. All transactions are publicly announced, and require the public key of the recipient. At some point, bitcoin users will use their email addresses, or not change keys for every transaction. Data analysis systems less powerful than those used by ad servers on the web can be used to correlate transactions and identify patterns, and possibly tie spending with online identities.
  3. A Bitcoin collection has to be stored in digital wallets and transacted through software interfaces provided by bitcoin clients. Your holdings are therefore susceptible to worms, viruses, computer failures and even user errors. Just like cash, if you lose it, it's gone.

The Bitcoin system can be attacked for gain or to induce loss.

  1. The system relies on CPU power to maintain chains of ownership, essentially one-CPU-one-vote. Those who can bring large scale, high compute GPU nodes to the party get to mine more bitcoins.
  2. All chains of ownership are based on agreement by a majority of nodes. A dishonest entity or cartel can gain majority on the network, and forge ownership chains.
  3. An entity that controls the majority of the client distribution may fork the client software to tweak the protocol for their own benefit.

Regardless, speculators have latched onto Bitcoin as an investment vehicle. If you consider this a capped commodity or a limited edition (only 21M will be “minted”), you might want to get yours today. Or if you see how this appeals to libertarian-anarchists, anti-government types, you may be able to profit from it. So far there has been no downside for the early miners and even the early buyers. Some of them have not been shy in promoting and blogging about it.

Bitcoin logo Tulip bulb

Bubbles take on a life of their own, until they pop. At the height of the Tulip Mania in Amsterdam, a single tulip bulb would sell for more than 10 times the annual income of a skilled craftsman. When do people decide that holding a commodity of limited convertibility and no intrinsic value is not aligned with their investment portfolio? When do tulip bulbs go back to just being a gardener’s delight?

Ultimately, Bitcoin is a reminder that even though money is simply a medium of exchange for productive services, it becomes a force of its own and exerts a powerful effect on everything we do.


Inviting the First Mavenns

This week, we sent out the first set of invites to people. Mavenn is still minimal but usable. We have working infrastructure for several kinds of information streams. The display is spartan, and some actions and possibilities are not quite obvious. But it's heartening to see people appreciate little tricks with bookmarklets and ease of use.

It is fulfilling to make a sustained push to wrap up to-do items, close issues and work towards a release. Most importantly, our friends have something tangible to view and interact with, which makes it easy for them to point out where we suck and what we can do better.

In train-speak, we are just getting underway and still switching tracks, but I don't mind tooting the horn.


Sign up for an invite: http://mavenn.com
@askmavenn on twitter

Photo by nozoomii


Alternative Sources

Tim Bray succinctly and specifically highlights how the stories being told by the mainstream news media are either information-free, wrong, or immaterial.

Japan

As a positive counterpoint, he commends a few sources in particular.

1. Martyn Williams from IDG, via his tweetstream, reporting just-the-facts and images and videos from past and present. 

2. Randall Munroe (of xkcd fame), who has a very nice Montessori-math style radiation dose chart.

3. Charles Stross, a science fiction writer who of course has a knack for telling stories. Read about his guided tour of a nuclear reactor complex.

4. The MIT Nuclear Information Hub. A previous, highly optimistic post is also now hosted and updated there. 

The mainstream media is of course doing what it does best. But, as always, the broader connections and deeper understanding come from mavens - some famous, many unknown. And what they have to say is sometimes unexpected, but always rewarding.

Thank you. 


Google Reader And Friends

I have not used Google Reader or other such feedreaders for over two years. Scobleizer, who processes and disseminates information for a living, described a year ago his reasons for the same. Here are my pronouncements and predictions on how Google Reader and other feedreaders fare in a social world.

0) Scale 
Everything that Google does is at scale, and Reader is no exception. Hundreds of thousands of feeds are pinged, refreshed, indexed and managed for millions of people several thousand times per hour. Anything you wish to subscribe to is probably already in the system. No one else can come close.

1) Interface 
Five years ago, Google Reader launched with an ambitious JavaScript-driven interface. Collapsible panels, pop-up dialogs, keyboard shortcuts, and much more. To me, it seemed a bit cluttered and inconsistent even then. Today it is both dated and cluttered.

2) Organization 
Google has added organizational elements such as labels, folders and bundles to manage subscriptions.  I must confess that I have not kept up with the nuances of these different organizational levels. I prefer some organization (to none) but also like to lean on others (e.g. twitter lists) to figure out what's worthwhile. I can neither commend nor find fault. 

3) Social 
Google started sharing with a built-in advantage - your Google chat and email contacts, and eventually Google Buzz followers. You can share things with them with a single click, or by adding a note. Some of my contacts do that. But many of my friends are tweeting their cool finds and writing comments on Facebook, so I don't have much shared stuff in Google Reader.

4) Friends
You can't make new friends through Google Reader. There is no way to resonate your likes and dislikes with a group of people. Read your feeds in pseudo-solitude and share with your existing contacts perhaps, but you cannot connect with interesting strangers by discussing news, events and information.

5) Community 
Unlike Digg, Reddit or Hacker News, there is not a community of people who care about keeping the place well tended and hospitable. I spend many hours a week on Hacker News and others spend a lot more. That cuts into the time or the motivation for feed-reading.

6) Freshness 
Google Reader is no longer a "News" Reader. New information shows up first through tweets. If you absolutely, positively, must know about something in real-time, Google Reader simply does not work. 

7) Patterns 
Professional bloggers, startup founders, leading edge marketers and many others already have a deep understanding of their domain. They care about patterns and events that alter their patterns. Numerous 3rd party tools exist to provide these insights, but they are aimed at content providers, not sophisticated readers.

8) Openness
It may come as a surprise, but Google is quite reticent about providing APIs to some of its core services. It may be for valid and logical reasons such as privacy policies, but Google's data about feed subscriptions (including FeedBurner statistics) and article sharing is private and inaccessible for analysis.

Each of the points above is an area for innovation, refinement and competition. Twitter and Facebook and the mobile ecosystem around them may be luring users and attention, but they are also creating alternatives and possibilities. I believe this area is ready for a new approach. 


The Curious Time Teller

Our morning breakfast routine with the kids includes several inquiries, imperatives and reminders regarding time. Some of us are ready on time, yet get delayed lingering over the comics or savoring chai, and need to be prodded to look at the clock. Others watch the clock on the phone, the microwave or the car radio, restless to leave or arrive. Few things are as exciting as being early at the bus stop, or greeting the teachers during their morning coffee. 

Today, on the way to school, our little time-teller wondered: "How did the first person who made a clock set its time?"

I did not know, so we started thinking this through. The easiest way, as usual, is to respond with a question. In this case: "How did we track time before clocks?"

Within a few minutes, we had covered the movement of the sun, sun-dials, water clocks, hourglasses and talked about Babylonians and their preference for the number 60. Leaving aside the enthusiasm for a sundial project, it took me many more minutes to realize that the question is deeper than the mechanics of how to measure time.

Let's say you have an accurate and highly reliable timepiece. What time do you set it to?

Well, you look at your phone, or hear the time on radio, or tune to the atomic clock in Colorado, or..., but wait, how did the first clock-maker set the time? 

It turns out that mechanical clocks have been made for much longer than you would expect. You could observe noon when the sun is directly overhead, set the clock to "12", and then ensure it can track 24 hours to the next noon. For cuckoo clocks and those counting only the hour, this would be reasonably sufficient. But the precision watchmaker tracking minutes and seconds needs astronomical help.

Mostly because of the tilt of the earth's axis of rotation and its elliptical orbit around the sun, the length of the solar day varies through the year. The sun's analemma is an asymmetric figure-of-eight, and the clock "time of noon" drifts from day to day, with a total spread of about 30 minutes over the year. Ancient astronomers across diverse continents from America to India knew this, which led to the equation of time.
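To get a feel for the size of that drift, here is a sketch using a common textbook approximation of the equation of time. The coefficients and the day-81 offset come from that approximation, not from this post or from any historical table, and the function name is made up for the example. The result is good to within a minute or so, which is enough to see the shape.

```python
import math


def equation_of_time_minutes(day_of_year: int) -> float:
    """Approximate sundial time minus clock time, in minutes, for a day of the year.

    Positive values mean the sun reaches its noon position before the clock says 12.
    """
    b = 2 * math.pi * (day_of_year - 81) / 365
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)


# Scan a whole (non-leap) year for the extremes.
values = {day: equation_of_time_minutes(day) for day in range(1, 366)}
sun_fastest_day = max(values, key=values.get)  # sundial furthest ahead of the clock
sun_slowest_day = min(values, key=values.get)  # sundial furthest behind the clock
spread = values[sun_fastest_day] - values[sun_slowest_day]

print(sun_fastest_day, round(values[sun_fastest_day], 1))
print(sun_slowest_day, round(values[sun_slowest_day], 1))
print(round(spread, 1))
```

With this approximation the sundial runs about 16 minutes ahead of the clock around the start of November and about 14 minutes behind in mid-February, which is where the spread of roughly 30 minutes comes from.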

The real solar time is that shown by the sun-dial. Mechanical clocks cannot easily be built to show that time. Neither does it make sense for them given that their strength is their reliable counting mechanisms. This discrepancy with true solar time became especially noticeable with improvements in the accuracy and precision of clock mechanisms.  

It is not surprising then, that Christiaan Huygens, the inventor of the pendulum clock, was also the first to publish correct tables for the equation of time. For John Flamsteed, the first Astronomer Royal of Britain, this was part of the job, and he published even more accurate tables for Greenwich, the site of the Royal Observatory.

As an aside, both Huygens and Flamsteed were Fellows of the Royal Society of London for the Improvement of Natural Knowledge. These and other Fellows combined scientific theory, empirical observations and applied technology to vastly improve time measurement.

As is true of technology to this day, timepieces became better and cheaper and affordable to the common citizen. Timekeeping set the cadence of the Industrial Revolution, changed social customs and transformed how people lived, worked, taught and learned. 

All that remained was standardization, which is having the same "time" all across a nation, regardless of local noon. That process was driven by the needs of safe navigation and transportation.

As the leading maritime nation, the British (and the Royal Society) were at the forefront of determining a ship's position by latitude as well as longitude. Fixing latitude is easy, and methods for it had been developed many centuries earlier, but a fix for longitude required more than 200 years of concerted efforts and government incentives. The quickest and most reliable method became possible only with the availability of accurate marine chronometers.

Time equals longitude: knowing the time at a fixed reference point, along with the observed local solar time, ships could fix their longitude. The reference time tracked by ships' chronometers was the time at Greenwich, known as Greenwich Mean Time. (And conveniently for calculations, the meridian through Greenwich was longitude zero.)
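The arithmetic behind "time equals longitude" is short: the earth turns 360 degrees in 24 hours, so each hour of difference between local solar time and Greenwich time is 15 degrees of longitude, and each degree is 4 minutes of local time. The sketch below uses made-up function names and, as a simplification, ignores the equation-of-time correction a navigator would apply to the observed solar time.

```python
DEGREES_PER_HOUR = 360 / 24    # the earth turns 15 degrees per hour
MINUTES_PER_DEGREE = 24 * 60 / 360  # 4 minutes of local time per degree


def longitude_degrees(local_solar_hours: float, greenwich_hours: float) -> float:
    """Longitude from two clock readings in hours; east is positive, west negative."""
    difference = local_solar_hours - greenwich_hours
    # Wrap into the range [-12, 12) so readings either side of midnight still work.
    difference = (difference + 12) % 24 - 12
    return difference * DEGREES_PER_HOUR


def local_time_offset_minutes(longitude: float) -> float:
    """How far local solar time runs ahead of (+) or behind (-) Greenwich."""
    return longitude * MINUTES_PER_DEGREE


# The sun is at its local noon while the chronometer shows 14:00 Greenwich time,
# so the ship is two hours "behind" Greenwich.
print(longitude_degrees(12.0, 14.0))    # -30.0, i.e. 30 degrees west

# A hypothetical town 5 degrees west of Greenwich keeps local time 20 minutes behind.
print(local_time_offset_minutes(-5.0))  # -20.0
```

The same 4 minutes per degree explains why towns a few degrees of longitude apart kept local times that differed by a quarter of an hour or so.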

But landlubbers remained stubborn in their use of local time, and even on a small island like Britain, there were differences of 15 to 20 minutes in the local time between major cities. This is not a big deal, until you have steam-powered locomotives. Suddenly this small difference can lead to confusion in train schedules. This led to specifying train timetables and setting railway clocks in terms of Railway Time.

The confusion went beyond missed trains. With the increase in train speeds and frequency, engineers, conductors and station masters had to develop detailed schedules for train crossings and track switchings. Train operators and railroad employees using different local times would lead to disruptions and accidents. As a result, railroad companies were the first to standardize time across their network, developing technical means and operational methods to synchronize clocks.

The synchronization was facilitated by the growth of communication networks. Telegraph links were usually laid beside railway lines, and soon they were used to electrically transmit time signals from Greenwich. Eventually, the Standard Time (or GMT) was mandated by national law in Britain. The story was repeated wherever railways went, so trains connected people in more ways than one. 

With air travel and now the Internet, it is routine to hop across dozens of Standard Time Zones within a "day" or coordinate with people across time zones. The zero-hour time zone is a successor of GMT and is referred to as Zulu or Coordinated Universal Time (UTC). UTC is tracked with precise atomic clocks, and requires leap second adjustments to keep in step with the slowing of the earth's rotation. Airplane pilots, military commanders, multi-national companies and worldwide travelers rely on it for safety and punctuality. And life is better for web programmers when all transactions are logged in UTC.
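As a small illustration of that last point, here is one way a web application might stamp its log lines in UTC. The `log_line` helper and the sample event are hypothetical; the sketch only uses Python's standard `datetime` module and a fixed UTC-8 offset in place of a real time zone database.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def log_line(event: str, when: Optional[datetime] = None) -> str:
    """Format a log entry with its timestamp converted to UTC.

    Timestamps without zone information are rejected rather than guessed at.
    """
    if when is None:
        when = datetime.now(timezone.utc)
    if when.tzinfo is None:
        raise ValueError("timestamp must carry a time zone")
    return f"{when.astimezone(timezone.utc).isoformat()} {event}"


# A purchase made at 9:30 in the morning, eight hours behind UTC.
local_zone = timezone(timedelta(hours=-8))
local_time = datetime(2011, 3, 15, 9, 30, tzinfo=local_zone)

print(log_line("order placed", local_time))
# 2011-03-15T17:30:00+00:00 order placed
```

Storing every entry this way means logs from servers in different zones sort and compare correctly, and conversion to a reader's local time is left to the display layer.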

Time is now broadcast by radio waves, embedded into GPS satellite signals and burned into computer firmware, so we don't have to think about how to set the time on some watch. Unless, of course, you are given to wondering or like periodic reminders of the time it took to set time. 

(Railway Clock - credit Alex Drennan)

 


About

At mavenn, we are building a collective for monitoring, sharing and discovering information of professional and personal value.

On our blog here, you will find our thoughts as well as the latest updates. Along the way, we will also be sharing tips, tricks and trends and things we learn.
