Categories
Academia Personal Publishing

On parallels with academic citation networks…

As ever when I’ve written something long and vaguely serious, I can’t think of anything to talk about for days afterwards. So to try and break me back into the writing habit, I’m going to talk a bit about the response that Discussion and Citation in the Blogosphere has received. As NSLog() has pointed out, it’s not the most revolutionary of posts, but I think sometimes it’s still important to state what we believe to be obvious – either to have it challenged or because other people don’t find it obvious. I think both types of reaction have taken place in this particular case.

(1) A few responses to comments

I’m going to start off by looking at a couple of the comments that I received about the piece. Jumping right in, these were (1) that I didn’t talk about the kind of indented hierarchical threaded-discussion boards (in which discussion can take a much more non-linear approach than my diagram suggested) and that (2) my diagram of micro-paradigm shifts was too neat and doesn’t mirror reality (Microdocs).

Firstly I’d like to say straight-away that they are – of course – both right. Real-life is always messier than abstractions, and I could never hope to have talked about all the kinds of online discussion boards that exist.

In the case of the indented-threading models – all I can say in my defence is that the piece I was trying to write wasn’t so much about the directionality or linearity of message-board discussions, but more about the filtering mechanisms implicit in the system. Another commentator also pointed out that some message-board systems allow trackback on individual posts. Here I can only say that there’s a certain degree of bifurcation going on there – I can’t see a way in which those people within the social system of the board itself can help the filtering process for strangers, except by moving outside it and linking to it from outside (say from a weblog). And he also talks about weblog / message-board hybrids – with which, again, I can only say that I wasn’t specifically familiar. There are a lot of interesting models for online fora – and I hope people forgive me for concentrating for the most part on the one that most people are familiar with… I think the most important thing that I want to say about this stuff is that I was definitely not undermining the importance of message-board technology in community-building. I’m a dyed-in-the-wool advocate of message-boards and have been playing with some new models in moderation and administration over at Barbelith Underground for several years now.

As regards my diagram being too regular and not reflecting reality (again cf. Microdoc’s diagram of this debate) – where they see difference, I see considerable similarity. Let’s call those posts that have one or fewer inward links “supporting” posts, and all those with more than one “structural” posts. If one does this, then even at this early stage it’s clear that only a couple of posts are driving the discussion forward. At the moment the debate has bifurcated (I specifically mention that as a possibility in the last post) – and no doubt one of those branches will be taken further by a subsequent structuring post at some point. While the reality will always be messier than the abstracted diagram, I believe that (if we give the debate time enough to develop) the two diagrams will come to look more and more similar.
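A minimal sketch in Python of the supporting / structural split described above, assuming a debate is recorded as a list of (source, target) link pairs. The post names and the `classify_posts` helper are made up for illustration and do not come from the Microdoc diagram.

```python
from collections import Counter


def classify_posts(links):
    """Label each post by its number of inward links.

    links: iterable of (source, target) pairs, meaning "source links to target".
    More than one inward link -> "structural"; one or fewer -> "supporting".
    """
    inbound = Counter()
    posts = set()
    for source, target in set(links):  # ignore repeated links between the same pair
        posts.update((source, target))
        if source != target:  # a post linking to itself is not counted
            inbound[target] += 1
    return {
        post: "structural" if inbound[post] > 1 else "supporting"
        for post in sorted(posts)
    }


# Hypothetical debate: "a" is the opening post, "d" a later summary post.
links = [("b", "a"), ("c", "a"), ("d", "a"), ("e", "d"), ("f", "d"), ("g", "e")]
labels = classify_posts(links)
structural = [post for post, label in labels.items() if label == "structural"]
print(structural)
```

The threshold of "more than one" inward link is taken straight from the definition in the paragraph above; any real debate would probably want a threshold that scales with the size of the discussion.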

(2) On parallels with academic citation networks

Now I’m going to turn to another common response to the post. A few people have argued that (i) the existence of peer review mechanisms and (ii) an expertise-based barrier to entry make academic filtering mechanisms very different from weblogging ones. I’ve seen this position articulated on a few sites – particularly 2lmc, commonplaces and a comment by Ross Mayfield on Many to Many – but I’m going to concentrate (yet again) on the response from Microdoc because it’s the most succinct and clear:

There is a substantial difference between writing an academic paper and having it published in comparison to blogging. In the academic world, I write a paper, have my peers review it, and then I submit it for publication where it may go through another review process, and eventually be published and it is from that paper that has two or three reviews that people will cite in their papers. That is, the academic paper is already “authorized” or “reviewed” and therefore has some weight already.

This is certainly true – there is a substantial barrier to entry in writing academic work. You have to be (to an extent at least) an expert in your field before your words will be seen by the rest of the community. And that means you also have to be an expert in your field before you can cite another article as well (although you don’t have to have the same level of expertise in the field of the article that you’ve cited).

But once you are inside that community of people, what then? Articles are not cited an equal number of times, nor are they given the same value within the community – these mechanisms of citation and linkage appear to occur in almost exactly the same way as within weblogs. Individual scholars choose who to cite through a complex balancing act of who they wish to give credit to, who directly inspires them, who they have to employ to back up their arguments and which articles have achieved such value and ubiquity that you can’t have a discussion about a given subject without citing them (this last one is more common among graduate students pursuing a doctorate). Some of these citations consist of nothing more than a vote – a gesture that the article concerned is pertinent to a discussion. Often articles (or books) crystallise a discussion and are treated as a baseline from then on.

Essentially – the only difference that having barriers to entry into the community makes is that the criteria for judging whether a piece of writing is worth linking to may be different. The mechanisms, however, remain identical. Certain articles get cited, others do not. Discussion happens in a series of discontinuous leaps – sometimes collapsing back onto itself, sometimes bifurcating – with the community self-filtering the good stuff to where it’s most likely to be seen.

Categories
Net Culture Personal Publishing Social Software

Discussion and Citation in the Blogosphere…

A few days ago a stunningly interesting article was published on Microdoc News called Dynamics of a Blogosphere Story which aimed to look at exactly how a story or discussion moved through weblog space. I’ve been thinking along similar lines for a while now – at least partly as a way of articulating my problems with the iWire Scaling Clay Shirky piece. I’ve been trying to put down on paper why I think the iWire assertions are incorrect and to develop an alternative model of how discussion can occur usefully through the ‘blogosphere’. In fact more than that – I wanted to illustrate why I believe the system works to actually generate better discussion than a simple discussion board – by (on average) helping to hide the bad content and making it easier to find the good content. I most recently wrote something that gestured in this direction (How do we find information in the blogosphere?)

The Microdoc News piece is particularly illuminating because it’s dragged some actual examples into the fray. After examining 45 “blogosphere stories” they found four kinds of posts and a relatively predictable pattern of their usage, with an initial weighty post generating an explosion of smaller fragmentary reactions, commentaries and votes (cf Casting the microcontent vote). These posts are then aggregated or collected into another weighty post, which itself might have the potential to push forward the debate. Their four example posts are:

  1. Lengthy opinion and molding of a topic around between three to fifteen links with one of those links the instigator of the story;
  2. Vote post where the blogger agrees or disagrees with a post on another site;
  3. Reaction post where a blogger provide her/his personal reaction to a single post on another site;
  4. Summation post where the blogger provide a summary of various blogs and perspectives of where a blog story has got to by now.

I’ve been working in similar directions as this – in an attempt to resolve the questions, “Can you have good discussion across the blogosphere?”, “What is the nature of that discussion?” and “How does it differ from message-board conversation?”. And I think the answer lies – yet again – in going back to the beginning and looking at the way the web in general (and weblogs in particular) operate like an academic citation network.

The origins of the web are highly academic. So it’s hardly a surprise that the combined use of hypertext and discrete blocks of content comes to mirror academic citation in research papers. Apart from a few wry-eyebrow-raising academics, I think most of us would agree that the idea that useful debate cannot happen in academic discourse is patently absurd. After all, the vast bulk of academic research in both the humanities and sciences is published as part of an ongoing conversation involving statements and citations.

The weblog sphere has taken on a great many of the characteristics of the distributed academic community’s citation networks – just at a much smaller, faster and more amateur level. Consensus can emerge (briefly or otherwise), reputations are made (deservedly or not), arguments occur regularly (usefully or otherwise). Nonetheless, discussions do occur, they do progress and they do reach conclusions. But it’s happening at a granularity of paragraphs rather than articles. It’s happening at a scale of hours rather than months.

The Microdoc article could easily have been written about citation networks in academic literature. And when we realise this, then lots of other things become clear too. The answers to my earlier questions are beginning to come into focus. And they remain basically simple answers too:

  • “Can you have good discussion across the blogosphere?”
    There are clear analogues for the way discussion over the blogosphere operates. One of those is academic / scientific discourse. This suggests (although it doesn’t prove) not only that we can have good discussion over the blogosphere, but that it was almost optimised in such a way as to make it inevitable.

  • “What is the nature of that discussion?”
    Perhaps we can answer that now by comparing the Microdoc article with studies of academic discourse like Kuhn’s Paradigm Shifts.

  • “How does it differ from message-board conversation?”
    If we know what the answer to the previous question is, then maybe we can answer this one by a simple direct comparison.

So here’s my suggestion of how we can usefully conceive of discussion occurring across the blogosphere (and I think it’s a model that’s practically explicit in the Microdoc article, so forgive me if it’s boring). We should think of it as a kind of micro-paradigm shift – a kind of hyperactive academia, where discussion moves forward in discontinuous chunks – with an initial weighty post articulating a position that is then commented upon, challenged and cited all over the place. But the debate doesn’t move forward until someone manages to articulate a position of sufficient weight and resonance to shift the emphasis of the discussion to their new position.

The weight of these debate-structuring posts can often be measured in terms of aggregated insight – in which case it’s a purely progressive model – an individual synthesizes all the interesting comments made by everyone else and pushes it slightly further, generating a new baseline from which the conversation can continue. On occasion, however, it would still be possible that an individual’s reputation would be weighty enough that everything they say defines the scope of the debate – that smaller dissenting voices would not be heard – and the debate would be carried behind a leader of some kind. And of course there are the times where a debate fragments or polarises, where more than one of these structuring posts occurs roughly simultaneously, or with radically different views – bifurcating any debate. Nonetheless, debate remains a series of discontinuous leaps, structured by impactful posting.

Here’s a diagram that illustrates how I think discussion happens between weblogs:

This ties in well with my previous article on finding information in the blogosphere. Because the smaller posts with negligible insight, voting or replicated insight are less likely to be linked to, then they’re also less likely to be read. And yet their value remains – they represent the arbiters (in a distributed fashion) of what should be being read. The posts that one is directed to most quickly are these structural posts – places where some kind of micro-paradigm shift has occurred.

I’m going to end now with a bit of a brief discussion about the differences between this kind of debate and the kinds of discussion that one finds on message-boards. I’m going to start off with a comparative diagram:

On the left, you can see a normal piece of discussion – as it would occur on a threaded message-board. In this example, the top post is the first, the second post cites the first, the third also cites the first while the fourth cites both the third and the second but not the first. In this debate there is no filtering mechanism of any kind. If the second post is entirely off-topic or contains spurious information, then it remains very clearly in the context of the thread. And if that thread is linked to from elsewhere, there can be no simple evaluation of what posts are considered more worthwhile than others (1) – the thread is either good or it is not.

On the right, you can see a simplified diagram of the passage of a discussion through a citation network. If there are filtering mechanisms functioning through the community (in our case people choose who to link to based on whatever personal preference they wish to express) then the most important structural posts will self-locate towards the middle, generating a clear (almost linear) movement of discussion from first principles towards a conclusion of some kind. The conclusion itself may never be met – consensus may never be fully reached – but positions with regard to this evolving dominant narrative will be reached by everyone. Those posts which are merely “I agree” or “I disagree” will be filtered from the public consciousness, even as they have fulfilled a valuable function in directing people towards the next structural post in their debate.

So – what does this all mean? In essence I’m arguing that debate across weblogs self-organises in a pretty useful way. But I’m not going to pretend that it operates perfectly or that we can’t do anything to improve it. However, it seems to me that rather than bemoaning the things that make debate across weblogs different, we should be trying to grease the wheels of those mechanisms. It’s my personal belief (and one that I’ve expressed before) that things like trackback and Daypop work so well because they are specifically building upon – enhancing – the mechanisms that make webloggia operate effectively in the first place. If you’re looking for more specific suggestions, then I think that a balkanisation of blogdex would help those mechanisms work more effectively within smaller communities with different and more distinct interests. After that, I have no idea. That’s where you people come in…

Footnote: (1) Obviously Slashdot has made gestural moves in this direction, but there are some interesting differences between the way the distributed community of webloggers evaluate one another and the way it is handled on Slashdot.

Categories
Random

Jesus! Won't you people stop for a moment?!

Jesus! What’s wrong with you people at the moment? I don’t have time to talk about all the cool things out there, let alone all the things that just tweak my interest. Damn you for making me linklog. Damn you to hell and back…

Blogs-Clogging the net?

Entertainment

Geek Stuff

Design Stuff

There are more sources for this stuff than I can actually count – but almost all of them will be on my list of weblogs on the right. If you liked this stuff, I’d very much recommend you wander through them at later leisure…

Categories
Random

Crème de Webloggia…

The last week and a half or so has seen me in a fairly odd headspace – I’ve been pursuing a number of work-related leads, talking to substantial numbers of intriguing people and helping out friends with projects large and small. I have a number of forms I have to fill in and a number of phone calls I should be making. I have – of course – a number of bills to pay.

All this being the case, I’ve not had a chance to post as much as I would normally like. I’ve got a piece on the boil that I want to try and get out over the weekend if possible, but in the meantime you’re going to have to make-do with scraps.

  • Interview with Brent Simmons
    Among other really interesting insights comes a surprising and plausible statement: “I probably wouldn’t hire anybody for anything unless they had a weblog.”
  • Bucket ‘O’ iPods
    “At that price,” an Apple source said, “we’re actually losing money on each iPod.” “But we make it up in volume,” he claimed.
  • Scott Mills’ Gay Bar
    What total unmitigated wanker thought this was a non-insulting, non-degrading, non-offensive idea? No, really. I want to know…
  • Aaron Swartz’ Buffy Epiphany
    “On February 21, 2003 I watched my first episode of Buffy the Vampire Slayer. Since then, I have watched every aired episode of Buffy, spinoff Angel, and creator Joss Whedon’s other show, Firefly, from the beginning, in order…”
  • Making Water Go Uphill
    This year’s Chelsea Flower Show includes an Escher-esque fountain in which water travels uphill. My mother is at the show this year, with a floral arrangement from Wroxham Flower Club.
Categories
Random

Pair.com and MT comments…

After a large number of protests from users of this site, I’m going to have to open up my comments-related problems to the floor. But first things first, I want to talk really briefly about pair.com – a hosting organisation that I honestly can’t say enough nice things about. They’re reasonably priced, helpful and have been genuinely reliable over the last few years. I’ve still got the Barbelith Underground running on a pair server, and with remarkably few problems… But there has been one thing that they haven’t been particularly ideal for: hosting Movable Type based sites.

In fact, my problems with running Movable Type on Pair kept me using Blogger for about a year longer than I’d expected. Pair have this kind of time-out running on cgi-processes that effectively means that (unless you know to run everything through cgiwrap) any decent-sized MT operation (say saving a new post) may cause the system to throw a wobbly. And you can forget importing large numbers of posts from other systems. I had months of trouble with that.

I want to make it clear that once I did install the site through cgiwrap, I didn’t have any trouble – so in a sense my only gripe is that Pair are different enough from other hosts in the world to require you to go through a different installation process. I can’t exactly blame them for that. And other people’s experience of their hosting may vary from mine, of course…

Anyway – back to my problem. I’ve got MT running cheerfully on Pair’s servers now. It’s all very smooth – except in one particularly difficult area. Everyone who uses my site realises quite quickly that I’ve got a problem with comments. Often when someone attempts to post a comment to the site, they get back a 500 server error. In fact the comments have normally been saved and it’s just the weblog page that isn’t rebuilt. But people don’t tend to realise this, so there’s routine multiple-posting. It’s profoundly annoying. Does anyone have any brilliant ideas about how I could fix this problem? Is there something that I can do to make it less likely to happen? Or do I need to go and attack Pair with pinking shears?

Categories
Technology

Highly unoriginal thoughts about mobile devices…

Notes from a conversation with Dan Hill pertaining (in particular) to address books on mobile phones. I make no claim to their originality or their novelty. Almost certainly they’re on page six of a really well known influential book that I almost certainly should have read by now…

Thought one: The mobile phone address book as a web of trust. This is really trivial, but it’s also really powerful – the telephone numbers in your mobile phone all identify actual people (however you decide to encode the metadata of their names). The telephone number is like the unique ID that you give a record in a database. So what does it mean if a pair of phones have each other’s numbers in their address books? Doesn’t it imply a relationship? Perhaps even a similarity? Maybe it even means that you’re more likely than average to like each other? So if you pinged every phone that’s got internet access (and the phone was happy for you to do this) you could pretty easily make a social network map of pretty much everyone in the country. This is not a new idea.
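A rough sketch in Python of the reciprocal-number idea, assuming each phone's address book is available as a set of stored numbers. The four-digit numbers and the `mutual_contacts` helper are invented for illustration, and a real version would need the consent step mentioned above.

```python
def mutual_contacts(address_books):
    """Return the pairs of phones that hold each other's numbers.

    address_books: dict mapping a phone number to the set of numbers
    stored on that phone. Each pair is reported once, as a sorted tuple.
    """
    edges = set()
    for owner, contacts in address_books.items():
        for contact in contacts:
            if contact == owner:
                continue  # a phone storing its own number implies nothing
            if owner in address_books.get(contact, set()):
                edges.add(tuple(sorted((owner, contact))))
    return sorted(edges)


# Made-up address books, keyed by made-up numbers.
books = {
    "0001": {"0002", "0003"},
    "0002": {"0001"},
    "0003": {"0004"},
    "0004": {"0003", "0001"},
}
pairs = mutual_contacts(books)
print(pairs)
```

Only the reciprocal pairs become edges in the social network map; a one-way entry (someone has your number but you don't have theirs) is left out here, which is a design choice rather than anything the idea requires.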

Thought two: Self-assembly address books. So you’ve lost your phone and with it you’ve lost all of your numbers. So you ring up two or three of your friends and they amend their record to your new number and you add their numbers to your phone. Then you trigger the ‘fix my address book’ trigger and sit back and watch. Your phone pings your friends’ phones. Their phones ping their friends’ phones. Everyone who has your old number in it is informed of your new number, and they ping your phone and build in the reciprocal links. And those people who appear most interconnected between the groups of friends you’ve mentioned are also added to your phone. An instant sense of your social network. An instant way of grabbing your local space… This is probably not a new idea.

Thought three: Distributed 192. 192 was (until very recently) the telephone number for directory enquiries in the UK. You ring it, tell them the name and address of the person you’re looking for and they give you a number. Brilliant. Except if you don’t have their address of course. And it costs money and stuff. And it doesn’t work with mobiles. So what if instead of doing that, you typed a search term, “Coates”, into your phone and got it to ping everyone in your address book, aggregate the results and display them to you. Wouldn’t that be easier? I don’t know whether this is a new idea or not. I would doubt it.
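A minimal sketch in Python of that lookup, assuming each phone can answer a query against its own address book (modelled here as a name-to-number dictionary). The numbers, most of the names and the `distributed_lookup` helper are hypothetical; the network "ping" is replaced by a plain dictionary access.

```python
def distributed_lookup(my_number, address_books, term):
    """Ask every phone in my address book for entries matching a search term.

    address_books: dict mapping a phone number to that phone's own
    address book, itself a dict of name -> number.
    Returns a dict of number -> sorted list of the names it was filed under.
    """
    term = term.lower()
    results = {}
    for friend_number in address_books.get(my_number, {}).values():
        # Stand-in for pinging the friend's phone over the network.
        friend_book = address_books.get(friend_number, {})
        for name, number in friend_book.items():
            if term in name.lower():
                results.setdefault(number, set()).add(name)
    return {number: sorted(names) for number, names in sorted(results.items())}


# Made-up data: phone "0001" knows two friends, who each know some Coateses.
books = {
    "0001": {"Dan": "0002", "Kate": "0003"},
    "0002": {"Tom Coates": "0001", "Jane Coates": "0005"},
    "0003": {"J. Coates": "0005", "Dan": "0002"},
}
matches = distributed_lookup("0001", books, "coates")
print(matches)
```

Grouping by number rather than by name is one way of aggregating the results, since different friends will file the same person under different names.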

Thought four: Collaborative work over mobile phones. So you’ve got a web-of-trust and you have a communications medium. So basically that’s Friendster then with a rather more intensive old-skool version of instant messaging (let’s call it “speech”). I wonder if there are people out there working on social software for phones. Or maybe social software that doesn’t actually have much of a human interface at all, something that’s really collaboratively sense related. Like a cyber-pet with two buttons that you can press – one if you really like a place and one if you really hate it. And then that’s geocoded and shared through your web of trust (because you’re similar to people you know). When you go into a place that everyone dislikes, your cyberpet freaks out. And if you go to a place that everyone likes, it starts to purr pleasantly in your pocket… I bet someone has thought of that as well…

Categories
Random

The End of Days…

I’ve just seen the final episode of Buffy. Here is my initial response:

I don’t really know what to say, because I’ve kind of been excited
about the whole thing but also kind of dreading it, because it would
have to be good, really – it would have to be good *enough* to be
a fitting end to all that had gone before, and I think it was. I really
think it was…

I’m delighted Angel came in and got out of it so soon. I’m delighted
that they didn’t push the Angel / Spike conflict and I’m really glad
that finally, at the end of days, they let things lighten up – to return
to the simplest elements.

Buffy’s realisation is interesting and unexpected. You get so used to
these bits of lore that you forget that someone had to think of them
eventually – be they man or god – and that things can be remade
differently…

There were a lot of people they could have killed for cheap effect, but they
didn’t. The deaths were gruesome but valiant. Anya has never
been more glorious. Almost unmourned though, which was a bit creepy…

Andrew survived. He should and he did. He’s there to show that there’s
nothing simple about redemption. It’s not just throwing yourself in a pit.
Buffy’s always been better than that. And the little girls all around the
world… Awesome touch. Really nice.

Spike. Spike. Spike. They needed more Buffy subtext and explanation of this
stuff, but he was still pretty awesome. And I don’t know as yet whether
that means the move to Angel is a con or not. Certainly he’s never been
more glorious. It makes me wonder about the vampire with a soul
thing from Angel. Is his story over?

Willow the White, Xander and Dawn in the corridors. Shopping. Malls. School Buses and Slayers. It might not have been the ending that we dreamed of, but only because some parts (small parts maybe) weren’t dreamable about. And if you were worried about the scale of Willow’s tiny evil temple last year – then you’re not going to be let-down this year. This is pretty damn huge…

Categories
Net Culture

Is the UK falling behind?

Everywhere I look at the moment there are people working in the same areas as me going to conferences and festivals. God I’m jealous. They’re going to BlogTalk in Austria or they’re going to Digital Genres in Chicago or they’re going to Reboot in Copenhagen. But apart from my desperate overwhelming desire to go to all of these events (particularly after the world-expanding experience of ETCon) there’s only one thing I’ve really noticed about all these events. Absolutely none of them are happening in the UK.

But it’s not only conferences that we’re lacking. With a few limited exceptions, I think that the UK is beginning to fall behind (or is not moving fast enough to catch up with) the US in talking about and developing the kind of thing that is being discussed at these events. Weblogs are a trivial but obvious example. The States has developed a certain amount of respect for the possibilities of the form, to the extent that acclaimed journalists feel comfortable starting weblog-style sites. And these sites seem to be gaining widespread core appeal from the rest of the country – weblogging has gone mainstream in the US so quickly and effectively that it’s almost commonplace for writers of an equivalent standard to Julie Burchill to start their own sites.

In the UK, the only major newspaper to talk about weblogs in any ongoing or serious fashion is The Guardian. In the States (and in the international news media – ie. International Herald Tribune TV) it seems much more widespread. In the States’ tech community (ETCon for example), weblogs are also fairly central to people’s research into how information technology and the internet are affecting people – how potentially they could empower them (or – on occasion – discussing whether they’re disempowering them). Both AOL and Microsoft are working on – or rumoured to be working on – weblogging tech.

There’s a lot less of this stuff in the UK, and I think it’s a terrible shame, since we should be in a much better position than the rest of Europe to be at the head of this trend (since weblog software and weblogs themselves are often English-language). There’s a hell of a lot of potential for business around this stuff as well – so why isn’t it happening here? In fact there’s a whole exciting new raft of people thinking about, talking around and working in these areas, and none of it appears to be happening here in the UK… I think maybe that’s beginning to get me down…

Categories
Random

The Cat in the Hat…

Tom Coates

This is essentially a vague attempt to make it look like I’m not the most boring person in the world who only ever writes or thinks about weblogs, social software and work.

Categories
Personal Publishing Social Software

How do we find information in the Blogosphere?

It has become almost a truism in critical examinations of the Blogosphere to talk about how – with the explosion in weblog numbers – it becomes difficult to find the best insights on any given subject. I first came into contact with the clear expression of this idea in an article called Scaling Clay Shirky but it’s recently been pretty much everywhere…

I believe that there are some legitimate concerns in these sentiments, but I think fundamentally they miss the point – it’s my opinion that replication of content online and a massive increase in the number of people posting about a specific issue does not constitute a problem for the blogosphere, but instead one of its most significant advantages. In fact I’d go further and say that where there are problems, these can be resolved by simply speeding up the self-organising mechanisms that are implicit within the blogosphere, which is, I think, what sites like Daypop, Blogdex, Popdex and Technorati are currently doing, albeit in a reasonably primitive way. But I’m getting ahead of myself. Today I’m just going to talk about How do we reach 100% information saturation on any given subject in the blogosphere without reading anywhere near 100% of the weblogs in it? Or to put it another way: With everyone posting lots, does the system help me find the good stuff?

  • Before I start though – here’s a simplified, and easier to assimilate / read pdf version of what I’m about to say: scaling_clay_shirky.pdf [75k]

Let’s start off by aggregating all the possible insights about a given subject from all the weblogs that specifically refer to it. This total aggregation will represent 100% of the information available on the subject in the blogosphere at a point in time.

If information was distributed evenly throughout webloggery and weblogs were read randomly then take-up of information would be linear and stable – in order to get 100% of the insights, you’d have to read 100% of the weblogs.

Linear gradient

[In this first graph I’ve plotted on the left the amount of information that you’ve managed to assimilate versus (on the right) the percentage of the weblogs that you’d have to read in order to get that amount of information – in the very specific special case that information is distributed evenly and randomly. The features of this “special case” will gradually be removed over the rest of the article. Another point I should perhaps clarify is that I’ve tried to conceive of the bottom axis as also including the order in which one reads the weblogs – that should become clearer through the article…]

However, we know it to be the case that information will not be distributed evenly throughout these weblogs. Many weblogs will contain limited information of any kind. Some will contain a lot. Many will contain replicated information that could easily be found on other sites.

Graph reaches 100% earlier

In this graph, ignore for the moment the dotted lines on the left. They represent nothing but the uncertainty of the beginning of the curve. This diagram takes into account that weblogs have different levels of insight within them, and that information is often replicated (either by active memetic spread or because the insights are simple and common). In the vast majority of cases then – even given that you’re still reading weblogs in a totally arbitrary order – it’s likely that you’ll get extremely close to the 100% saturation point a significant way before you’ve read 100% of the available weblogs.

In practice – again assuming that you were reading the weblogs in a random order – it would be impossible to gauge the particulars of the curve that led up to the near-as-dammit-to-100% information saturation point. A sample curve would probably be organised in a series of steps – with gradual accretion of insight being the norm, but with occasional significant massive leaps also occurring.

The line becomes a series of progressive steps

Now – all these models have been based upon the assumption that the order in which the weblogs are read will be random. In fact nothing could be further from the truth. Some weblogs are clearly more likely to be read – this is not necessarily purely based upon the value of their contributions, but it’s not completely distinct from such valuations either. It would probably be fair to say that on average well-linked-to sites are more likely (albeit perhaps only incrementally) to contain insight than sites which are not linked to at all. Secondly, if someone does produce content of value and insight on any specific subject, then it is more likely to be linked to – which in turn increases the likelihood that an individual will visit the site in question.

Both of these criteria suggest that (in our attempts to reach the 100% insight threshold) we will be more likely to be initially directed to high-insight sites than low-insight sites. This changes our graph substantially.

The graph starts strong and levels off close to 100%

It seems likely, in other words, that even if there’s a limited tendency for sites with more insight to be read first – then the information accretion would be remarkably steep initially and then level off dramatically close to the 100% saturation point.
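A toy simulation in Python of the curves described so far, offered as a sketch under loudly stated assumptions rather than as evidence. Everything in it is invented: 200 weblogs, 50 possible insights, most weblogs holding only a few insights, and inbound links that loosely track how many insights a weblog holds. It compares reading in random order against reading the most-linked weblogs first.

```python
import random


def make_weblogs(n_weblogs, n_insights, rng):
    """Build (inbound_links, insights) pairs for a made-up blogosphere."""
    weblogs = []
    for _ in range(n_weblogs):
        # Assumption: most weblogs carry few insights, a handful carry many.
        size = min(n_insights, 1 + int(rng.expovariate(1 / 3)))
        insights = frozenset(rng.sample(range(n_insights), size))
        # Assumption: links loosely follow insight count, plus some noise.
        links = size + rng.randint(0, 3)
        weblogs.append((links, insights))
    return weblogs


def coverage_curve(ordered_weblogs):
    """Fraction of all available insights seen after each weblog is read."""
    available = set().union(*(insights for _, insights in ordered_weblogs))
    seen, curve = set(), []
    for _, insights in ordered_weblogs:
        seen |= insights
        curve.append(len(seen) / len(available))
    return curve


def weblogs_needed(curve, threshold=0.95):
    """How many weblogs must be read before coverage reaches the threshold."""
    for count, covered in enumerate(curve, start=1):
        if covered >= threshold:
            return count
    return len(curve)


rng = random.Random(42)  # fixed seed so the run is repeatable
weblogs = make_weblogs(200, 50, rng)

random_order = weblogs[:]
rng.shuffle(random_order)
link_order = sorted(weblogs, key=lambda weblog: weblog[0], reverse=True)

random_curve = coverage_curve(random_order)
link_curve = coverage_curve(link_order)
print("random order:", weblogs_needed(random_curve), "weblogs to reach 95%")
print("most-linked first:", weblogs_needed(link_curve), "weblogs to reach 95%")
```

Because "100%" here means everything that appears anywhere in the simulated weblogs, both curves always end at full coverage; the interesting part is how early each one gets close to it, and that depends entirely on the assumed link between insight and inbound links.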

Hypothetical conclusions: For any given body of information on weblogs, no matter the rate of replication of information or the number of people who post exactly the same comments, close to 100% of the available insight can be reviewed by reading a disproportionately small number of sites – sites that will – as a rule – be among the first that a reader stumbles across through their normal browsing and research patterns.

Related Hypotheses perhaps worth exploring: (1) The larger the number of posts about a subject (and hence the more likely replication) the smaller the proportion of those sites that need to be read in order to have reviewed close to 100% of the available insight. (2) The size of the available insight will increase as the number of posts about a subject increases (although perhaps not in linear proportion).