The Data Ecology

by Josh Patterson ~ April 19th, 2009. Filed under: Brainstorming, Self Organization, The Data Ecology.

Newly Forming Ecosystem

Where there is an ecosystem, there are local experts. An outsider can muddle through an unfamiliar wilderness at some level, but to thrive or to survive a crisis, he’ll require local expertise. Gardeners regularly surprise academic experts by growing things they aren’t supposed to be able to grow because, as local experts, they tune into the neighborhood soil and climate.

From the book Out of Control: The New Biology of Machines, Social Systems, & the Economic World by Kevin Kelly.

Image courtesy of Boston.com’s “The Big Picture”: mountainous countryside near Maelifellssandur, Myrdalsjökull Region, Iceland. Once the young lava fields of Iceland cool down, life begins anew little by little. Ice, wind and water flatten and carve out shapes to begin with, then, during the summer, bacteria, lichen and fungi prepare the soil for plants, in particular mosses which adapt to an environment which remains difficult. These plants colonise the most favourable sites and terrain little by little, forming a new ecosystem. (© Yann Arthus-Bertrand)

Much like the physical world, our current web of data on the internet is growing and adapting to its environment. Every day this web of linked data becomes a more intertwined ecosystem, growing alongside other regions of linked data and linking those sub-ecosystems together. Each area of this ecosystem, just like in nature, has its own peculiarities and personality. There is a lot of movement in the linked data space, from the concept of “the cloud” to technologies like Azure and RDF. Some themes I keep seeing are:

  • The inherent social nature of the web
  • Interconnectedness vs the “sandbox” mentality
  • Implications of interconnectedness
  • Wiring into open ecosystems and creating more value

The web is, and will continue to be, a very social place. The nature of intelligence is social, so the (open?) social graph will naturally be a conduit of related information. Some data will be explicitly linked via discovery technologies like XRD and LRDD, which are key in this space; other data will be implicitly linked through what our friends are talking about or doing. I believe that FriendFeed and “follow” are early “primordial soup”-type social discovery techniques. As this space evolves we’ll continue to see projects such as laconi.ca and magnolia2 pop up: projects that allow for far more decentralization of user information, yet employ smart and simple federation techniques so that identity can emerge from the nebulous pockets of personal information scattered around the web.

Interconnectedness is another facet of an ecosystem, as the output of one system is the input of another. Companies can no longer afford to dictate how the internet as a “platform” works, because the only value is in the network effect; they have to realize that value lies in the connections, not so much in the information itself. Far too many companies try to create stockholder value by building a small closed ecosystem (a “sandbox”) and then trapping users inside it, holding them hostage with no other choices after lock-in. Today the internet is about value coming from choice, and “sandboxes” limit this value through their constrictive nature. In the new linked data ecosystem, a sandbox or data island has little to no value: it does not benefit from relationships with other data, and it provides less context through fewer connections. We are quickly coming to a day where how we connect our data determines how much value that data can have, because it determines or limits how the data can interact with the greater data ecosystem.
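To put a rough number on “value is in the connections”, here is a minimal sketch that counts how many pairwise links are possible between data sets. It assumes, purely for illustration, that every pair of data sets inside the same boundary could be linked; the function name and the scenario of twelve data sets are made up, and counting pairs is only a crude stand-in for value.

```python
def possible_links(n: int) -> int:
    """Number of distinct pairs among n data sets: n * (n - 1) / 2."""
    return n * (n - 1) // 2

# Hypothetical scenario: 12 data sets, held either in four sandboxes
# of three data sets each, or in one open ecosystem.
sandbox_sizes = [3, 3, 3, 3]

# Inside sandboxes, data can only link to data in the same sandbox.
closed_links = sum(possible_links(size) for size in sandbox_sizes)

# In the open case, any data set can link to any other.
open_links = possible_links(sum(sandbox_sizes))

print(f"sandboxed: {closed_links} possible links")
print(f"open:      {open_links} possible links")
```

Under these assumptions the same twelve data sets go from 12 possible links to 66 once the walls come down, which is the intuition behind a data island providing “less context through fewer connections”.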

In the current climate, sandboxes lose value with time due to the lack of connections. With open linked data, the ecosystem is always growing, and so are its connections, so data accrues value naturally with time. I liken this to how money can earn interest in your bank account, but not in a mason jar buried in your back yard. An interesting example of connecting data together was a mashup between Twitter and Yahoo data:

That inspired Singh to create the TweetNews mashup, combining the real-time search Yahoo’s BOSS tools with the freshness of Twitter. As an added bonus each story listed in TweetNews’ results also shows the relevant tweets, which themselves often have additional links. A quick search this morning for “flight 1549” yielded seven unique links from just the top result.

TweetNews is not only a fantastic resource, but might well be the best mashup we’ve ever seen. The remarkable part is that Singh was able to create it with less than one hundred lines of code — a testament to the power of Yahoo Boss and APIs like Twitter’s.

Mashups like this are the basis for a whole class of simple interconnected web building blocks that will evolve the data ecosystem towards new and interesting places. We need connections not only between sites, but between data as well, to create feedback (which we established earlier as a driver of growth in a system). Feedback along a system of connections in an ecosystem will create the building blocks for all sorts of new and interesting systems.
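To make the building-block idea concrete, here is a minimal sketch of the kind of join a mashup like this might perform: take search results, attach the tweets that link to each one, and rank by how much people are talking about it. This is not Singh’s code and it calls no real API; the sample data, field names and the `mash_up` function are all hypothetical stand-ins.

```python
from collections import defaultdict

# Hypothetical sample data standing in for a search API response
# and a stream of tweets. All titles, users and URLs are made up.
search_results = [
    {"title": "Plane lands in river", "url": "http://example.com/river"},
    {"title": "Airline statement", "url": "http://example.com/statement"},
    {"title": "Older story", "url": "http://example.com/old"},
]
tweets = [
    {"user": "alice", "links": ["http://example.com/river"]},
    {"user": "bob", "links": ["http://example.com/statement",
                              "http://example.com/river"]},
    {"user": "carol", "links": ["http://example.com/unrelated"]},
]

def mash_up(results, tweets):
    """Attach tweets to the results they link to, then rank by tweet count."""
    tweets_by_url = defaultdict(list)
    for tweet in tweets:
        for url in tweet["links"]:
            tweets_by_url[url].append(tweet)
    merged = [{**r, "tweets": tweets_by_url.get(r["url"], [])} for r in results]
    # sorted() is stable, so ties keep the search engine's original order.
    return sorted(merged, key=lambda r: len(r["tweets"]), reverse=True)

ranked = mash_up(search_results, tweets)
for story in ranked:
    print(len(story["tweets"]), story["title"])
```

The whole thing is a dictionary lookup keyed on URL. The two data sources know nothing about each other; the shared identifier is what lets them be wired together.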

The interaction of an ecosystem’s parts at the local level creates the ecosystem through emergent ripples. The local interactions we start with dictate which possible set of stable states we could arrive at. Open and free local interactions will more likely result in a more democratic and decentralized web data ecosystem. Closed and proprietary local parts will result in very muted, controlled, and colorless reactions. The freedom of choice at the local level is what drives any dynamic decentralized system, and users need to be able to choose where their data goes, and how it gets used. However, every ecosystem has to start somewhere, and the mechanics to boot up a successful ecosystem can be daunting.

In the book “Out of Control”, Kevin Kelly talks about recreating a lost biological ecosystem from scratch, and the difficulties involved in such a process:

The problem Wingate faced was the perennial paradox that all whole system makers confront: where do you start? Everything requires everything else to stay up, yet you can’t levitate the whole thing at once. Some things have to happen first. And in the correct order.

This is basically the issue of having a highly interdependent system that needs the parts to grow together in a cooperative but autonomous manner. Kelly goes on to make several other observations about ecosystems:

  • It was very easy to arrive at a stable ecosystem, if you didn’t care what system you arrived at.
  • The larger an Ecosphere is, the longer it takes to stabilize, and the harder it is to kill it. But once in gear, the collective give and take of a vivisystem takes root and persists.
  • Evolution not only evolves the functioning community, but it also finely tunes the assembly process of the gathering until the community practically falls together.

Taken together, these observations mean that an ecosystem doesn’t just happen in an instant; it has to be “booted up”, so to speak, and beyond that it has to develop organically and cooperatively. The advantage for any biological ecosystem is summed up by Kelly as:

Given a slim foothold, the remarkable latent power in interconnected green things can launch the law of increasing returns: “Them that has, gets more.”

This brings us back to our system of linked web data. I hypothesize that the same effect holds for linked web data: there is, and will continue to be, remarkable latent power in interconnected linked data (and services), which will continue to launch the law of increasing returns. I also assert that linked web data and dynamic discovery will be a driving force in the next wave of web technology. Linking of data and automated discovery (via techniques such as metadata discovery / LRDD) will be the catalyst for co-evolution in future web iterations, but the evolution of these discovery and connection mechanisms won’t be without its growing pains.
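As a rough sketch of what automated discovery could look like in practice, the snippet below reads an XRD-style descriptor and pulls out the links it advertises. Several things here are assumptions: the descriptor is a made-up sample for an imaginary user, every URL and `rel` value in it is invented, the namespace is the one used by the OASIS XRD 1.0 format, and the step where a client locates and fetches the descriptor (the part LRDD addresses) is skipped entirely.

```python
import xml.etree.ElementTree as ET

# Namespace used by XRD 1.0 documents (assumed here for the sample).
XRD_NS = "http://docs.oasis-open.org/ns/xri/xrd-1.0"

# Hypothetical descriptor; the subject, rel values and hrefs are invented.
SAMPLE_XRD = """<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Subject>http://example.com/users/josh</Subject>
  <Link rel="http://example.com/rel/profile"
        href="http://example.com/users/josh/profile"/>
  <Link rel="http://example.com/rel/feed"
        href="http://example.com/users/josh/feed.atom"/>
</XRD>
"""

def discover_links(xrd_text):
    """Return (subject, {rel: href}) parsed from an XRD-style descriptor."""
    root = ET.fromstring(xrd_text)
    subject = root.findtext(f"{{{XRD_NS}}}Subject")
    links = {
        link.get("rel"): link.get("href")
        for link in root.findall(f"{{{XRD_NS}}}Link")
    }
    return subject, links

subject, links = discover_links(SAMPLE_XRD)
print("resource:", subject)
for rel, href in links.items():
    print(f"  {rel} -> {href}")
```

The point of the sketch is that a program which has never seen this resource before can still learn where its related data lives, without anyone hard-coding the wiring in advance.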

Having a thriving, growing vivisystem of interconnected data is a lofty idea, but there are many considerations to weigh under this scenario.

  • Technological hurdles
  • Not all data will be linked
  • Not every application needs to be open source
  • Privacy and Control balance
  • Threats to entrenched institutions, Opportunities for the grassroots players

Technology is generally a double-edged sword; we come up with great ideas that can advance the art, but then have to find design tradeoffs so that current technology can meet future needs. The first generation of the web had us in a single-document mindset with some slightly dynamic interactions. The second generation (the current one, or “web 2.0” as O’Reilly calls it) has been about moving towards social networks, with an explosion of rich online media and personal publishing. All of this has left an abundance of information about us scattered across the web.

I believe the next wave of the web will focus on dynamic discovery and federation of these resources to produce more sophisticated and intuitive social media. Currently, however, we struggle to find and link this data together in an easy and cohesive manner, as we have bits of data scattered across a very distributed data store (the internet). Looking at it from a computer science historical standpoint, I think the biggest issue with respect to linked data is essentially web 2.0’s version of the shared memory and interprocess communication problem. I think this topic will have to be widely discussed as we approach issues such as:

How will we share and relate Facebook data with MySpace data, and pull that merged data store into a third-party application securely and intuitively enough for the average user?

Another issue that I’ve seen brought up is the fact that not all data should be linked together. I think this is very important in the sense that data control should be granular enough that we (the users and owners of our data) can make those distinctions at our own discretion. Data ownership and privacy will continue to be big topics in this arena.
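To illustrate what “granular enough” might mean, here is a minimal sketch of merging profile data from two services under a per-field sharing policy chosen by the user. Nothing here reflects how Facebook or MySpace actually expose data; the records, field names, policy format and the `merge_shared` function are all hypothetical.

```python
# Hypothetical profile records from two services.
facebook_profile = {
    "name": "Josh",
    "email": "josh@example.com",
    "friends": ["ann", "raj"],
}
myspace_profile = {
    "name": "Josh P.",
    "music": ["jazz"],
    "friends": ["raj", "lee"],
}

# The user's own policy: only the listed fields may leave each service.
share_policy = {
    "facebook": {"name", "friends"},
    "myspace": {"music", "friends"},
}

def merge_shared(sources, policy):
    """Merge only the fields the user has chosen to share from each service."""
    merged = {}
    for service, profile in sources.items():
        allowed = policy.get(service, set())  # no policy entry means share nothing
        for field, value in profile.items():
            if field not in allowed:
                continue
            if isinstance(value, list):
                existing = merged.setdefault(field, [])
                for item in value:
                    if item not in existing:  # union lists without duplicates
                        existing.append(item)
            else:
                merged.setdefault(field, value)  # first service to share wins
    return merged

view = merge_shared(
    {"facebook": facebook_profile, "myspace": myspace_profile},
    share_policy,
)
print(view)
```

The third-party application only ever sees `view`. The email address never leaves its service because the user did not list it, and a service with no policy entry shares nothing by default.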

Regardless of the hurdles, the companies that continue to wire into the data ecology will find their net returns magnified over time. The companies that continue to live in their sandbox will have to rely on considerable barriers to entry surrounding their market in order to continue operation. Automated discovery and wireup of services will be a key driver in the bootup of the web data ecology. More and more, the data ecology will tie together the desktop, web, traditional media, organizations, entities and closed systems into a computing space that is more open and interconnected than ever before.
