Categories
Personal Publishing

The Horseless Carriage…

This is a slightly rewritten and polished up version of a talk I gave to a Six Apart event (cf. On being on the panel at Blogs in Action) at London’s Polish Club a few weeks ago. It’s kind of a personal history of and exposition around weblogs and webloggery. This version has had some of the more colourful language removed. Warning – it’s quite long and it’s a bit of a mystery to me how I managed to get through it all in ten minutes.

Today I’m going to talk about the horseless carriage. Well. Kind of.

In the middle ages, people walked to places. If they didn’t walk, then they rode horses. Some attached carts to their horses. Some time later, those carts got suspension. That’s when they started to get really popular because people’s arses didn’t hurt so much. In the late eighteenth century – joy – the steam-based self-propelled vehicle was devised. By the way, many of the most notable advances in steam vehicles occurred in England – with the Lunar Society in Birmingham. These devices were known as ‘horseless carriages’. You may be familiar with their descendants. You may have used one of them to get to this event today.

That then, was the history of the horseless carriage – and I’ll come back to that. But now I want to look at a parallel history – that of the weblog. Many histories of webloggia have been written to date. The best is by Rebecca Blood and it’s called Weblogs: A History and Perspective. Reading it, you can get a real sense about all the technologies and analogues and developments that led up to a person pointing at something for the first time and going, “that thing there is a weblog”…

To summarise – some people say the very first site on the web, Tim Berners-Lee’s homepage, was the first weblog. On it he linked to new sites, services and pages as they started to appear. That experience rings nostalgic bells for me. I can remember a community of webloggers that did the same for each new person who joined us in writing rubbish on the internet. Those were the days.

The word itself, though, was a much much later development – and came in two stages. First Jorn Barger coined the term ‘weblog’ to describe his site Robot Wisdom. Subsequently, Peter Merholz shortened the term to the more commonly used ‘blog’. So he’s the person you should be looking to blame for that one…

So we had a name, but not necessarily any sense of what these sites were for. There was enormous debate about their essential character. These first practitioners saw the ‘point’ of a weblog as linking. The core of the enterprise for these people was finding cool sites and referencing them. Although perhaps that’s overstating it a bit, since even then people had a sense of the personal as well. Dave Winer famously said of weblogs that each had a human guide that you got to know. He talked about camaraderie and politics. I think he was right. I’ll get back to that too.

Alongside the people who viewed weblogs as linklogs, another group of sites that viewed themselves as ‘online journals’ were emerging from the more arty side of the internet. I was involved in this arty community and remember seeing sites like Ember.org and flaunt.net explore what it was to self-represent and write online. Glassdog.net was one of the many hubs of this period of artistry and creativity. The whole space was excitingly expressive, but not particularly social. Alongside all this, the LiveJournallers made their moves. All in all these were very similarly structured takes on a completely different direction.

After September 11th, yet another group of people manifested on the internet and chose weblogs as their means of expression. These people talked in terms of punditry and they thought of themselves as following in the footsteps (to praise or bury) of Glenn Reynolds. These people wanted to express opinions, debate and comment on what was going on in the world around them. These ‘warbloggers’ at times have started various snowballing news stories – particularly in the States.

And we’re still not done – linkers, journal-keepers and pundits have most recently been joined by the commercial people. America’s Gawker Media publishes weblogs around a whole range of subjects from politics to sex. Group weblogs like BoingBoing have gone pro as well – and they are, let me tell you, making a decent amount of money out of it. Yay Google Ads!

All of this means that when people talk about weblogs they tend to talk about them in terms of other things they knew about beforehand that seem to be good analogies. Which one of these is right?

  • A weblog is a thing that links to sites elsewhere on the web
  • A weblog is a kind of online journal
  • A weblog is a kind of diary
  • A weblog is about personal publishing
  • A weblog is amateur journalism

Well, I’ll leave you with that question for a minute and talk a bit about my experiences online. When I started my site in 1999 it was already a year or two since Jorn Barger had coined the term. I was one of the very first batch of people to start using Blogger in the first months after it opened its doors to the public. When I started there were probably only a few hundred webloggers in the world. I think I started the day after Anil Dash. Each new weblogger who came online was a cause for celebration. We used to link to everyone. We’d redesign every week. Happy times.

Over the years I’ve watched as things that now seem obvious got invented. I’ve seen the concept of the permalink appear – and change everything completely. I’ve watched Movable Type appear, get huge and take over the world – and I’ve watched dozens of people try and create less capable knock-offs. I’ve seen the idea of a single-page archive manifest itself and then take a billion years to get into Blogger. I’ve seen the appearance (and I fear death) of Trackback. And the slow process to get comments on everyone’s sites. All the simple structures we operate with now – all the things we take for granted as being obvious – each was invented. And I’m lucky enough to remember each one turning up.

Alongside all of this, my own use of my site has changed. I started off writing about things I found on the internet. Then I started talking about things I did with my friends and what I thought about work. I explored the freedoms of being anonymous in public. Then my friends found out. Then I had this terrible pseudo abortive relationship and started writing secret stuff in the code of the site and being all melodramatic in public. Then I got excited about some specific subjects. Then it felt like everyone was impinging on my life and that everyone knew I had a weblog. Then I felt like I had no privacy. Then I stopped writing so much about my personal life and concentrated on writing on the things I cared about. Then the things I started writing about started to coincide with the things I was working on. Then I started getting into conversations with people who shared my professional capacity, and it started having an impact on my working life. And that’s where you find me today – writing a mixture of personal stuff and ludicrously complicated stuff about social software.

And with all that experience, here’s what I’m here to say. Here are my observations about weblogs and weblogging:

  • The number of weblogs that get a lot of traffic each day is pretty tiny in comparison with the number of weblogs in the world. And when I say ‘a lot of traffic’ I mean people who get (say) a thousand page impressions a day.
  • The average weblogger – the mainstream weblogger (and there are millions of the little buggers) – ‘publishes’ to a tiny audience. Many of these sites are password protected or completely private.
  • Weblogs are quick to write and as such they lend themselves to a slightly informal voice. People will tend to write kind of like they speak.
  • Most sites are one-per-person – this is the most natural state of affairs.
  • People cannot naturally stay on one subject – if you start writing a weblog about something in particular then in the end you’ll almost inevitably fail. The only things you can write about honestly and consistently for a long period of time are the things you’re doing and the things that you care about at any given period of time. These things will be perpetually in a state of flux, even if patterns will emerge…
  • The only exceptions to those rules are if you’re doing something for money or if there’s a group of you doing it. That stuff’ll keep you on track and no mistake.
  • Running and writing a weblog is a social activity – it’s communicative – and you respond to what other people write when you write. It’s no fun at all if you’re writing into a vacuum.
  • The average weblogger will often start off thinking that they can be anonymous and write about what they want. But they can’t and they figure that out soon enough when they get fired or when their boyfriend leaves them or their friends start looking at them funny or refusing to shake their hands when they’re at work.
  • Webloggery tends to successfully transmit itself down social and interest networks. People who enjoy writing weblogs tend to do so because they got into weblogging through a friend. When they come online they know that they’re already never going to be alone. They’ll have someone to read what they write.

Fundamentally, I think I can bring all of this down to one core statement – that for the vast majority of weblogs in the world the core over-arching principle is that of personhood. The weblog becomes an extension of yourself. A suit you wear, if you will. It’s like you’re controlling a whole prosthetic version of yourself. The tone of voice will be personal. The individual weblog like the individual person will benefit from feeling like they’re a part of a community. A weblogger’s community will work best when it has connections to (and overlaps with) that weblogger’s real-life community. Most importantly, an individual weblog/weblogger will care about what they care about and nothing more.

This means that whatever you’re planning to use weblogs for, you’ll find them most naturally useful if you keep the individual at the heart of the enterprise. That means if you’re interested in knowledge management, or in community generation or in using them for publishing or whatever, then you should keep that idea of an individual voice at the centre of your thinking. Even Fleshbot has a sense of an individual behind it – an editorial tone that feels personable. Not that I’ve been. And I’m sure none of you have either. And nor should you.

And if you’re thinking about the future of weblogs then you should think in terms of how to support the individual in their conversations and engagements and social networks. Maybe that means developing Livejournal explicit-relationships functionality. Or maybe it means bits of Flickr. Or maybe it means better conversation-tracking.

But it’s all got to be about the individual. And my preferred way of expressing that is that a weblog is a representation of a person. I think that’s behind all of the definitions that people came up with before. I think that’s the core principle that stands behind the idea that a weblog is a journal, or that they’re collections of links that I’ve seen on the web or that they’re space for an individual to undertake a new form of journalism, etc. etc.

And my contention – to bring us right back to the beginning – is that all of those statements (“A weblog is a kind of diary”) are kind of like saying that “A car is a horseless carriage”. For years we had to describe cars by reference only to things that had come before, but we don’t need to do that any more. Enough time has passed that we can describe a car without talking about its origins or its analogues – without talking about things that are kind of like it. Now we can start to conceive of a car without thinking of horses and suspension and traps and carriages.

And I think we can now start to do the same thing with weblogs – we can push past describing them as being like empty books that people write in and keep private, we can say that they’re not just records of sites that we’ve visited, we can stop trying to compare what we’re doing with the actions of professional journalists writing for newspapers. I think we’ve got past that now. I think we can call a blog a blog.

Categories
Personal Publishing Social Software

Trackback is dead. Are Comments dead too?

I think it’s time we faced the fact that Trackback is dead. We should state up front – the aspirations behind Trackback were admirable. We should reassert that we understand that there is a very real need to find mechanisms to knit together the world of webloggers and to allow conversations across multiple weblogs to operate effectively. We must recognise that Trackback was one of the first and most important attempts to work in that area. But nevertheless, we have to face the fact – Trackback is dead.

It has been killed by spam and by spammers – by the sheer horror of ping after ping pushing mother/son incest and bestiality links. It has been killed by the exploitation of human beings quite prepared to desecrate the work of tens of thousands of people in order that they should scrabble together a few coins. It has been killed by the experience of an inbox overwhelmed by the automated rape of our creative endeavours.

In a way it should have been predictable from the beginning – we should probably all have spotted that functionality that allows individuals to place links on other people’s sites could be exploited by spammers. Some people did spot these problems, but even they had no sense of the scale. Their responses were – at best – muted. But now I think we have to accept that the evidence is in. The situation is clear and it is not good. We’re engaged in an arms race with the worst kind of people, an arms race that has raged across other communications media and which we show no sign of winning. For me, the negative experience of dealing with trackbacks has long-since overwhelmed the benefits it brings. For these reasons, I’m turning off all incoming Trackbacks on plasticbag.org from this moment on.

Of course the problem isn’t restricted to Trackbacks. The systems we’re using to manage comments on our sites are probably under even more strain from spammers. The only reason I’m prepared to put up with this in the short term is because the comments seem to be more useful to more people at the moment. But I’m clear in my mind – we’re rapidly approaching a crisis here as well, and it is likely to be one that ends in the abandonment of comment systems as well.

And how to solve this problem? I don’t think it’s a matter of iterative improvements. I don’t think this problem can be solved by engaging in the arms race. MTBlacklist has saved my life, but it’s a patch, not a fix. No, any solution for this problem will be conceptually distinct from our current approaches. It could be a centralised approach – letting professionals manage the data that links our communities together. It could be a radically decentralised one beyond what we’re working with at the moment. I don’t know for certain. But I think we should be looking back to the origins of the weblog and seeing how things operated then.

Originally there was no weblog spam and yet conversation and discussion still existed. If an individual posted something and another individual wanted to respond to it, they simply wrote a post on their own site linking to the original. This environment was entirely free of spam. It was completely clean. I can’t help thinking that maybe we need to start thinking in terms of approaches like that – where there is no automated functionality that could be robotically exploited. Or perhaps we should be looking in other directions – how can we abstract out the kind of social networks that lie behind Flickr to structures that we could overlay across the internet as a whole. A question I think we should be asking is how could we build services that let you decide precisely which groups of people should be able to see, link to, ‘trackback’ or comment on the work you do in a decentralised, disaggregated way?

But this is to get ahead of ourselves. Today, we are here to mourn the passing of a great friend and a solution designed for happier and less cynical times. Trackback, I come to praise and bury you. May you rest in peace…

Categories
Personal Publishing

On being on the panel at Blogs in Action…

Last night I was the opening act of a sexy little conference about how people might use weblogs that Alastair Shrimpton of the UK branch of Six Apart was hosting at the Polish club near Imperial College. I was a fairly late addition to the schedule, but I don’t think I roamed too far off the point. Suw Charman wrote some insanely intense and accurate notes about the whole thing over on Strange Attractor, including this near-perfect transcript of a part of my talk:

If you want to use blogs for what they’re most naturally useful for, if you’re trying to exploit what makes them brilliant, keep the individual at the heart of it. Knowledge management, or community building, or publishing Wonkette style, keep the individual at the core, be conversational. Even Fleshbot has an editorial tone, not that I’ve ever been, and neither have you, and nor should you.

It was a very interesting evening all things considered, although I sometimes get the impression that I look like I’m having too much fun or am misbehaving when I do these things. Hopefully I didn’t say anything too out on a limb. I’ll probably stick up my notes at some point.

I think the best speaker of the evening was John Dale who has been putting together Warwick Blogs for Warwick University (which looks like a pretty stunning implementation of the weblog concept inside an academic context). I think the part of his talk that surprised me most was that of everyone I’ve ever seen trying to market and publicise weblogs they seem to have done it best. They had a whole series of pretty stunning stickers and posters and fridge magnets that they distributed all over the campus. I’ve never understood why weblogging companies don’t explicitly target these venues – surely if you get them when they’re young, imaginative and have a lot of free time then they’re likely to stick with you for years. Here’s a lifted image from the Warwick blogs site to give you a sense of the way they branded the thing:

Checking out his site, I see John was at ETech too. It’s a shame we didn’t meet each other in that context too. That could have been fun.

Categories
Personal Publishing

Three things I wrote ages ago on weblogs, publishing and community…

For a variety of reasons I’ve been digging up some old stuff on the publishing of weblogs that I’ve written on this site or for conferences or whatever, and I thought I’d reference them again here because I was surprised by how much I still agree with them (and how much they’re still relevant) even though they’re a couple of years old:

  • Some ways that mainstream web media can interact with the “revolution” in personal publishing… (Powerpoint – 3Mb-ish)
    “Rather than treating weblogs as an object to be studied or as a territory to be claimed, mainstream publishers should be looking to build tools that increase interaction between the two types of site – making both better in the process…”
  • On super-distributed and super-localised online communities (Powerpoint – 5Mb-ish)
    “The weblog world is a super-distributed community where – much like newspaper columnists – there are ongoing and involved discussions and conversations happening not on one site – but distributed across many hundreds of thousands of sites, each one radically personalised – a representation of its creator in cyberspace…”
  • Why Content Publishers shouldn’t host weblogs (February 2003)
    “There is no reason to assume that being in the position to encourage the take-up of weblogging will mean that you’ll keep the ones you want to keep using your service. In fact: 1. The longer someone has been weblogging, and the more invested they are in it, the more likely it is that they’re going to want to get a domain name of their own. 2. These same people are also likely to want to use extended functionality at some point and will probably try and move to a dedicated application or provider who can more adequately fulfil their weblogging needs. 3. A dedicated long-term weblogger may not wish to be associated with the brand of your service any more and may choose to leave.”
Categories
Personal Publishing

On hybridised RSS feeds as evidence of a need for weblog refactoring…

Right then – I feel a bit like I’ve got the wind behind me and it might not last so I’m going to plough right on into another subject before the demons of fear crawl up my leg. Dave Shea’s written a really insightful little piece (after Haughey) on the current trend for hybridised RSS feeds merging del.icio.us feeds and Flickr photostreams with normal weblog posts. Here’s some of the best stuff:

The problem I have is quite similar to what Matt describes: when new items show up in my newsreader from people I enjoy reading, I’m often mildly disappointed when it’s simply a new camera phone image, or a couple of sparsely-described links to stuff I’ve already seen. I’ll go one further though, and say this about the practice: it’s really damaging the signal-to-noise ratio of content I otherwise love.

And all I can say is that I couldn’t agree more. My RSS feed at the moment is a monstrous atrocity. It’s vile and clumsy and ugly and infuriating. But it’s as vile as it is because that’s where the software and systems that I want and need to use have led me. I want to fix it. I’d love to make it better, but to do so I’d have to sacrifice something somewhere along the way.

When I started weblogging it was all about the links, and the little asides and the one-liners and guff like that. I believe fundamentally that weblogs are communicative and social rather than being publishing ((Weblogs and) The Mass Amateurisation of (Nearly) Everything) and this seemed like a natural register for that kind of thing. Everything was nicely informal and easy. But the systems had their problems – Blogger-style permalinks were an ugly and clumsy way to reference particular pieces of commentary, so people moved towards using Movable Type with its individual archives and built-in comments. Page-per-post sites, though, require a different form of writing – people change their interactions and the sites become less agile and less cross-conversational. Shorter posts get lost, posting becomes more of an effort and many things that one might like to talk about in passing get thrown away. There are benefits, it’s clear – you write longer, better, more considered things. But they’re not the same things that we used to be writing. You can see some of the transitions that occurred when I moved to using Movable Type in the visualisations that people did for me last year: Visualisations lead to Self-Knowledge.

I think this shift was really what caused the desire for people to start link-logs. Sites on the internet that we responded to emotionally but couldn’t find time to write Movable Type-length posts about were getting no commentary at all. People who were busy found that they simply weren’t writing anything for their sites at all – the length of posts that using MT seems to inspire (in my case anyway) started to be incompatible with post-work energy levels. The concept of a simple, throwaway linklog seemed to present a cure to this situation. People could post things and get their thoughts out into public quickly and easily. Or – just as importantly – they could keep track of the links that they wanted to keep track of – back to the weblog as personal link-organiser. It seemed like the perfect solution.

Certainly it didn’t seem to matter much whether or not the links were unique or whether everyone else in the world had posted them too. This was the time that saw the emergence of what I call microcontent voting. The more people linked to something, the more people saw it, obviously – but now it was becoming an exponential relationship rather than a linear one. This was because of the newly significant presence of aggregators like Blogdex, Daypop, Popdex, Technorati. Now there was an effective feedback loop – if something got the attention of a certain number of people in the ecosystem, it would be brought to the attention of even more. A site that got only a few links could be at the top of the aggregators within a day, and experience thousands of visits immediately.

The links had – in part – ceased to be just something you did individually and instead became something that you did as part of a community that one way or another helped information bubble up. I think this found its best expression to date in Hotlinks, which I really think needs only to be abstracted into a more generic service to take over the world.

But linklogs had their own problems – how should they be integrated into a weblog site? Should they be individually permalinkable or should they be aggregated in daily clumps? Should they sit in central bars or be relegated to little-read sidebars? Different weblog authors found different solutions to these design problems – with kottke and Anil probably being the groundbreakers in this area. But it still felt a little clumsy. The weblogs were capturing everything again – covering a whole range of content from long-form essays through to the smallest particles of link data – but it wasn’t sitting together well – the weblog softwares didn’t seem (and still don’t) to have found a way to really consolidate this kind of combination management / conversation / publishing role. I personally strongly felt that my link-logs should be posted in daily digests as part of my main weblog. I’d done groups of links as posts for years, and I didn’t see why that should change now. But these things were far from simple to perform.

The introduction of moblogging caused another problem – the infrastructure for handling phone to weblog stuff effectively had never seemed to emerge in an elegant, simple and un-hacky way – clearly they’d need different templates for a start, and again decisions had to be made about how to integrate them with the design and layout of sites. Should the photos be in a sidebar? Should they be a different and separate weblog? Should they be posted each day, or immediately as the photos were taken and coming in?

The two sites that – for me – changed this picture enormously were del.icio.us and Flickr in that they both provided me with new tool-sets for managing stuff and they both also gave me mechanisms for posting to my weblog in ways that seemed to afford more benefits than they had costs. Firstly, del.icio.us gave me the ability to organise my links more effectively than my weblog had – and made the process of refinding my links much much easier – and some slightly provisional settings do exist for publishing a daily digest to a weblog. So I get the space to file my links in a way that makes sense to me, and get to expose this action back to people. Except of course it’s not a finished piece of functionality – I can’t give it alternative formats for the title, I can’t change how each link is formatted and I can’t stop it publishing everything to both my site and weblog with extra fields of information exposed that I don’t want people to see (like the del.icio.us tags that each post has). I can hide these tags on the web because I control the stylesheets. But it’s much harder (rightly) to do that kind of thing in an RSS reader. The consequence? The posts generated by del.icio.us in my RSS feed are ugly and feel clumsy – they’re functional, but they’re not how I would have them…

And the same is true of my Flickr photostream. The pictures that I take are aspects of my life, and I want them to be exposed to people in the same way as my overt posts are – but I have no flexibility. I can have them posted directly to my site, but then they don’t feel cleanly on my site in that they’re still hosted elsewhere. And I can’t aggregate them into clumps usefully – every photo appears as I take it, and I can’t make a daily archive of them to be posted into the body of my site. So Feedburner becomes the best option for bringing these very separate things together – except it has design problems of its own. Titles of Flickr photos don’t seem to update and the integration of feeds – while beautifully elegant technically – does seem to create unbalanced or confusing feeds to experience… And if you’re asking why I want to keep everything together in one Feedburner feed at all, it’s because the functionality that Feedburner affords me in tracking the number of people reading the damn feed is so incredibly useful to me. And I wouldn’t get that ease or accuracy of calculation by having multiple feeds…
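For what it’s worth, the kind of daily clumping described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the item tuples, the daily_digests function and the sample entries are all made up for the example, and nothing here reflects how Flickr, del.icio.us or Feedburner actually work.

```python
from collections import defaultdict
from datetime import datetime


def daily_digests(items):
    """Group (timestamp, source, title) tuples into one digest per calendar day.

    Hypothetical helper: the tuple layout is an assumption made for this sketch.
    """
    by_day = defaultdict(list)
    for stamp, source, title in items:
        by_day[stamp.date()].append((stamp, source, title))

    digests = []
    for day in sorted(by_day):
        # Within a day, keep items in the order they were posted.
        entries = sorted(by_day[day])
        digests.append({
            "date": day.isoformat(),
            "title": f"Links and photos for {day.isoformat()}",
            "entries": [f"[{source}] {title}" for _, source, title in entries],
        })
    return digests


# Made-up sample items standing in for photo and link feeds.
items = [
    (datetime(2005, 1, 10, 18, 5), "links", "An article about feeds"),
    (datetime(2005, 1, 10, 9, 30), "photos", "Breakfast"),
    (datetime(2005, 1, 11, 12, 0), "photos", "Lunch"),
]

digests = daily_digests(items)
for digest in digests:
    print(digest["title"])
    for entry in digest["entries"]:
        print("  " + entry)
```

Each digest could then be published as a single post, so a reader’s feed gets one item per day rather than one per photo or link.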

Phew! So that’s the history of all weblog functionality in a nutshell, which wasn’t quite what I was expecting to write. But the point is that all through the history of weblogs, the technologies have opened up new doors and created new problems. Different functionalities make it possible to do one thing much more easily or effectively, but they come with a smaller cost elsewhere. We’re definitely moving in a positive direction, but each time we make a leap to a new level of functionality, things get more complicated and fractured and difficult for a while. Our feeds are ugly, and they don’t quite work right and neither do our sites. But this is because the technologies that we’re using to organise and collate our lives aren’t quite communicating perfectly and aren’t splicing themselves together in the way that we might like. And things are getting ever more complicated, and we need to do something about it.

And I’m beginning to think that the thing we have to do is start to reconsolidate and refactor the weblog concept itself. We need to take a step back for the first time in years and re-ask the question – what is it for? How do we find something hard and shiny in the middle of all these hybridised trends and make it the ideal shape to support all the other services that will grow upon and around it? In a whole range of issues – from the collation of our browsing to the handling of our photos, from the posting of our opinions to the way we’re relating to our social networks – the traditional weblog format is starting to buckle. So rather than concentrating on the specifics of clashing informational streams in our feeds and looking to fix them, I’m going to make the problem even larger and ask – are these clashes evidence of something more seriously broken? Does anyone really have any idea what we do next?

Categories
Personal Publishing

On Six Apart and Livejournal…

So Six Apart have bought Livejournal after all. Here’s a brief story about how I didn’t find out about it… I was chatting to Mena yesterday about something else and couldn’t resist querying her about the rumours. Very patiently, and I suspect with some kind of tiny smile just about visible on her concealed-by-IM face, Mena politely informed me that she didn’t respond to rumours and speculation. To which the only real response was, “If you won’t respond, will Anil?”

I didn’t get a direct answer to that question, but I did promise to let Mena know if Anil spilled the beans. I figure she’d want to give him a present or a bunch of flowers or something. She’s so nice. But then I forgot to ask him anyway. And so it goes…

Anyway, the news was announced relatively late last night as far as I can tell, which means that those of us on this side of the Atlantic probably heard about it before most of the US webloggers. Mena’s done a comprehensive write-up on Mena’s Corner – as has Brad on his Livejournal. They’re both pretty excited about the acquisition – and I think with good reason. There are people who aren’t so sure about the whole thing – and I think their anxieties are pretty reasonable (Danah’s post on the subject is probably one of the best), but I really think there are more opportunities here than problems. Bringing Livejournal into the weblogging fold (or bringing weblogs closer into the Livejournal fold) could be tremendously interesting. I’d be interested to see whether there was any further development of Livejournal-style networking and social networking stuff. It would seem to have lots of possibilities (and even webloggers love this stuff – check out their adoration of Flickr if you don’t believe me). And if it was possible to bring some of that overt social stuff over into MT or Typepad – by cross-platform protocols or whatever – then I can’t help but think that would be a good thing. I’m also really interested in the role of Typekey in all of this stuff.

The real threat to the successful integration of the whole thing is – I think – not that Livejournal’s community is put under threat, but that their understated aesthetic has a kind of punk-cool and authenticity that could be polished out of existence by the really terribly good designers at SixApart. Livejournal has the feel of a grass-roots, bottom-up, wildly successful community project, where MT, Blogger et al kind of don’t. The slightly ramshackle look-and-feel is a core part of that, I think.

Anyway, all in all a pretty interesting day for webloggery and Livejournalhood. I’m really interested in how Blogger fits into all this though. At the moment, it’s pretty clear that the fire and the dynamism is with SixApart. They’re the people innovating (at least publicly). Which leaves me with two questions: (1) are Blogger becoming also-rans, or (2) what the hell are they planning?

PS. Hey, guess what! Latest rumour I heard was that SixApart were going to acquire the previously-public sector, non-weblog-producing national-broadcaster of the UK, the BBC. The move is expected to slightly increase their staffing levels from something like 70 people to something more like 35,000 while reinforcing their otherwise lacking “Eastenders” arm of the business! Ok, maybe I made that last bit up…

Categories
Personal Publishing

Visualisations lead to self-knowledge…

This is absolutely the last post on visualisations of plasticbag.org data unless someone sends in something else really cool – and this one is more of a clarification than anything else. Daniel Boyd created this really cool model of post time that had accurately predicted both the times I woke up and went to bed, but it had seemed to go a little pear-shaped as a visualisation at the time I switched to using MT. I’ll post the original graph below for those of you too lazy to look at it in its original context (slackers):

Anyway, the confusion emerged at the beginning of January 2003 when suddenly it looked like I was staying up all night and posting at completely random times of the day. I instinctively felt like this could not be right – and by instinctively I mean that I knew it wasn’t right because I’ve become mostly old and predictable in my early thirties. Enter Tom Carden again, who writes:

“The time of day chart is great, but I suspect that more careful analysis is needed here. For instance, I believe that the move to MT coincided with the enabling of comments. I suspect that’s why the time chart suddenly goes strange. Here’s one without the comments. It’s as you might expect – pretty much the same times, but less frequent.”

He’s also generated a visualisation that shows comments as green dots and trackbacks as blue. It’s harder to interpret these on images reduced so much, so if you’re gripped by this whole subject area you might want to check out his two larger-scaled graphs: Without Trackbacks and Comments and With Trackbacks and Comments. The smaller version looks like this:

By way of interpretation then, I think it’s clear that since moving to MT my posting has dropped enormously (there are other reasons for this of course – including my rather hardcore nine months working on Radio 3). Perhaps more interesting is that the same posting patterns during the day appear to have continued (first posts between eight and ten in the morning – last ones between 12 and 2am) but there is more deviation from this practice. Whether this just represents being at conferences in other time zones or something isn’t clear to me.

Looking at the comments and trackbacks, it seems clear that they’re much more consistent during the day and night – which probably reflects a fairly international audience (or a lot of insomniacs and filthy drunks). There does appear to be a bit of a concentration of comments at least between midday and six UK time – although that may just be a result of comments lagging after specific posts. That would also account for the fairly heavy striping in comments up and down the page – on occasion I seem to write something that a lot of people want to comment upon. The rest of the time – not so much… There are also some fairly clear stripes of total inactivity emerging – February/March-ish of this year seems a complete dead zone as does a good period around two-three months ago.

I think generally what this little project has taught me is that statistics and visualisations are both really good fun and that they can expose to you patterns and behaviours and causalities that you may have suspected were there all along, but couldn’t be certain about. Along with projects like Audioscrobbler, I feel like I’m starting to get a strange and exotic new statistical understanding of my life. I can look at these diagrams and see myself in them. It makes me want graphs of the fibrous content of my excrement automatically generated by my loo – and a complete breakdown of the percentage of my floor which is covered with rubbish and books at any given moment in time. Somewhere in this stuff is self-knowledge!

Categories
Personal Publishing

Three more sets of visualisations…

Wow, so it’s nearly three in the morning, which is basically four in the morning since the clocks only changed a week ago, and I still appear to be up and awake and completely uninterested in sleep. I may as well take this opportunity to post a few more visualisations of the weblog data that I posted up a few days ago (Five Years of Weblog Data vs. The Visualisations). This batch pushes us in a few interesting new directions – some of which are so interesting in fact that I have no concept whatsoever what they might signify or represent. In fact let’s start with those:

These two were undertaken by Dan Kaminski and were “run through the Phase Space Visualization process popularized by Michael Zalewski” using Phentropy. I will confess straightaway that I’ve spent a limited amount of time trying to work out what they mean, but remain as yet unsuccessful.

The second set of visualisations came in from Daniel Boyd and is a plot of when during each day I posted over the last five years:

The first three years here are astonishingly consistent and – as Daniel himself pointed out – the data would suggest that I go to bed between midnight and 2am and that I get up each morning around eight. This is stunningly accurate and highly representative of my normal behaviour. What’s less clear is what happens after the beginning of 2003 where the data starts to look remarkably less organised. I’m still not clear why this has happened – and in fact quite suspicious of how accurate that bit is. It seems to me more likely that some part of the data is corrupt around the beginning of 2003 than that my entire posting behaviour changed overnight. Except to say that yet again this phase shift seems to coincide with switching to using Movable Type.
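For anyone who fancies trying the same trick on their own archive, here is a rough Python sketch of how bedtimes might be read off post timestamps. It is not Daniel's method (I don't know how he built his plot); the function names, the `max_posts` threshold and the sample data are all made up for illustration, and it assumes the timestamps are already in local time.

```python
from collections import Counter
from datetime import datetime


def hour_histogram(timestamps):
    """Count posts per hour of the day (index 0 = midnight to 1am)."""
    counts = Counter(t.hour for t in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


def longest_quiet_stretch(histogram, max_posts=0):
    """Find the longest run of consecutive hours with at most `max_posts` posts.

    The run is allowed to wrap past midnight. Returns (start_hour, length).
    """
    hours = len(histogram)
    best_start, best_length = 0, 0
    for start in range(hours):
        length = 0
        while length < hours and histogram[(start + length) % hours] <= max_posts:
            length += 1
        if length > best_length:
            best_start, best_length = start, length
    return best_start, best_length


# Made-up sample: one post at each of these hours on an arbitrary day.
sample_hours = [8, 10, 12, 14, 16, 18, 20, 22, 0, 1]
sample = [datetime(2001, 3, 5, hour, 15) for hour in sample_hours]
print(longest_quiet_stretch(hour_histogram(sample)))
```

On the made-up sample the quiet stretch starts at 2am and lasts six hours. With several thousand real posts hardly any hour will be completely empty, so `max_posts` would probably need raising, and anything that isn't a post (comments, say) would need filtering out first.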

Which brings me to the last visualisation. When I last posted these visualisations I pointed out that they seemed to suggest that the switch to Movable Type had resulted in longer posts and a drop in the rate of posting. In response Anil Dash mooted that Movable Type might be making my posts better thought out. Now of course, there’s a clear value judgement there, and one that is not necessarily correct. It’s quite conceivable that I’ve just got more wordy and long-winded and that MT has supported or even caused that shift. It’s not even clear that weblog posts should be better thought out – Matt has suggested to me in the past that another equally convincing model might be to think of MT as having broken the paradigm of incredibly fast and easy informal peer-to-peer publishing that Blogger created and that was initially its USP.

So it’s just as well that we have Richard Soderberg out there. He decided to run a Flesch-Kincaid reading analysis on each of my posts over the last five years and plot them on a graph. The solid darker line in the middle is the reading level in the US educational system that might be required to be able to read each post.

As you can see, the readability level of my site changes dramatically around the same time that I move to using MT. Over the first three years it seems to settle around eight on the Flesch-Kincaid scale. But after moving to MT, the average appears to be more around 10. More interesting still is that the extremes in both directions have started to diminish. There are few posts that plummet down to the infantile depths of the scale and few that stretch upwards towards unintelligibility. Richard’s also provided a useful CSV of some key metrics. When you investigate it, you’ll see that each post has its complexity described as either childish, acceptable, ideal, difficult or unreadable. I wonder if the same metrics apply with writing for the web as for elsewhere (we’re normally told by usability people that sentences should be shorter and punchier and articles not as long when we write online) but if they do the site has definitely moved much more heavily towards the ‘ideal’ intelligibility rating and that does appear to coincide with a move to MT.
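If you want to run the same kind of test on your own posts, the Flesch-Kincaid grade level is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. Below is a minimal Python sketch of it. This is not Richard's script: the function names are invented, and the syllable counter is a crude vowel-group heuristic that will miscount plenty of English words (silent e's, for one), so treat the scores as ballpark figures.

```python
import re


def count_syllables(word):
    """Very rough estimate: one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Approximate US school grade needed to read `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(word) for word in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


print(flesch_kincaid_grade("The cat sat on the mat."))
```

Short sentences of one-syllable words can score below zero, which is presumably where the 'childish' end of the scale lives.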

So there you go – three new sets of visualisations and a stunning revelation that they should put on the marketing bumph. That’s right, ladies and gentlemen, apparently MT actually makes you smarter!

Categories
Personal Publishing

Five years of plasticbag.org: The Visualisations

Five years of plasticbag.org – it has passed in a flash. It’s seen me move from temp jobs, through journalism school to more temp jobs, from multiple roles at Time Out, to working at emap, designing UpMyStreet Conversations (among others) and doing R&D work for the BBC. The last five years have seen webloggia change from a couple of hundred dorks mucking around on the internet to a few million dorks mucking around on the internet and being talked about at conferences. It’s seen the world go from millennial angst to millennial hope, only to see 9/11 happen and our countries declare war on Afghanistan and Iraq. In my personal life I’ve lived in three major homes, stayed on innumerable floors and been to America a fair few times. I’ve moved from writing about stuff on the web to stuff in my life and back to stuff on the web again and had a small but statistically significant number of particularly disastrous relationships. God knows if I can manage another five years like the last five (I don’t know if I’d be able to survive it to be honest), but if I do I think maybe I’ll be looking for a party to celebrate…

Anyway, a few days ago I put up a dump of every post ever published on plasticbag.org for people to rip apart as they pleased. I thought some people might decide to do visualisations or to analyse word frequency or link frequency or whatever. To be honest, I’ve not had the most overwhelming response ever (but then again it’s not like I was giving away free chocolate bars or anything), but I’ve really enjoyed the stuff I have received. I think perhaps the concept would be more appealing and generally useful if (as New Media Hack suggested) more people opened up their archives in a similar style. Still, never mind. Here we go:

Our first batch of analysis comes from Cal Henderson who has basically used the data at his disposal to take the piss out of me. A few weeks ago I got a bit moody with Matt Jones after he complained that I was starting every post I was writing with the word “So…” (here’s the grump in question). So what has Cal done? He’s established the horrible truth of the situation – here’s a graph of how many posts I’ve started with the word “So” over time:

As you can see – a startling indictment and as Cal said to me on AIM, “evidence that you’re getting worse”. More evidence in that direction comes from Tom Carden who sent in three visualisations of increasing complexity. The first diagram is a simple model of posting frequency. The graph is separated into five separate blocks (at the bottom of the diagram) and each day is represented by a vertical line. The stronger the colour of the line, the more posts happened on that day:

As you can see from the visualisation, I really seem to have found my stride towards the end of my first year of weblogging (Nov 1999 to Oct 2000) – and throughout 2001 I’m posting very regularly. 2002 starts slightly more slowly, but then my post-frequency goes through the roof for a while before apparently starting a slow long drift off towards irregularity which flattens off around nine months ago at an almost total absence of posting. (You can see that image at its full resolution here).
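Both of the counts behind these first two graphs (posts that open with "So", and posts per day) are easy to reproduce. Here is a hedged Python sketch, assuming the export has already been parsed into `(datetime, body)` pairs; that representation, the function names and the sample posts are all invented for illustration, and this is not Cal's or Tom's actual code.

```python
import re
from collections import Counter
from datetime import datetime


def posts_per_day(posts):
    """Map each calendar date to the number of posts made on it."""
    return Counter(when.date() for when, _ in posts)


def so_starts_per_month(posts):
    """Count posts per (year, month) whose first word is 'So'."""
    counts = Counter()
    for when, body in posts:
        # \b stops words like "Sofa" or "Some" from being counted.
        if re.match(r"\s*so\b", body, flags=re.IGNORECASE):
            counts[(when.year, when.month)] += 1
    return counts


# Made-up posts, purely to show the shape of the input.
example_posts = [
    (datetime(2004, 10, 1, 9, 0), "So here we go again"),
    (datetime(2004, 10, 1, 23, 0), "Sofa shopping is hell"),
    (datetime(2004, 11, 2, 8, 0), "So… about that"),
]
print(posts_per_day(example_posts))
print(so_starts_per_month(example_posts))
```

Drawing one vertical line per day, with colour strength taken from the per-day count, would give something in the spirit of the frequency strip.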

Tom’s next step was to try and incorporate into the graph some sense of post length. Which resulted in this diagram (which I’ve distorted slightly to make it easier to explain):

So one clear consequence of me posting less often appears to be that I have – unfortunately – become a bit of a blowhard. Look at how much longer the posts are! (The larger version of this visualisation is here). And when you bring it all together, you get this stunning piece of work:

There’s a bigger version of this particularly complicated graph here. The red line indicates a moving average of post length (over 25 posts). That looks like it was fairly solid for the first three years and then suddenly started to get substantially longer towards the beginning of the fourth segment. This coincides with an apparent drop in post-frequency (each post is represented by a vertical grey line, where they overlap they get brighter – you can see this most closely at the bottom of the graph).

You may well ask what it was that caused my post-length to go up and my post frequency to drop so dramatically? Well it turns out, looking at my archives, that this happens at precisely the same time as I switched to using Movable Type instead of Blogger – which just goes to show how much the tool helps dictate the form of your writing online.

The purple line indicates the moving average post length (over a seven day period rather than over 25 posts). This has vacillated a lot over the last five years, but appears to be reaching new lows in the last six-nine months (as well as the occasional odd new high). This is probably a direct result of work pressures. However it doesn’t appear to have had an enormously negative effect – the green line indicates cumulative total of words written on plasticbag.org and – although maybe it’s starting to flatten a little – seems to be an almost totally linear rise over the last five years. The blue line indicates the cumulative total of posts on plasticbag.org however – and that really does appear to have changed quite dramatically. If all these trends continue in the way they seem to be going at the moment, you can look forward to one post a year around the length of a novel. You lucky bastards.
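The lines on that graph boil down to two simple calculations: a trailing average and a running total. A minimal Python sketch, assuming the input is just a list of per-post word counts in date order (the function names and the tiny sample are mine, not Tom's):

```python
from itertools import accumulate


def moving_average(values, window=25):
    """Trailing mean over the last `window` values (fewer near the start)."""
    averages = []
    running_total = 0
    for index, value in enumerate(values):
        running_total += value
        if index >= window:
            # Drop the value that has just fallen out of the window.
            running_total -= values[index - window]
        averages.append(running_total / min(index + 1, window))
    return averages


def cumulative(values):
    """Running total, e.g. total words written up to each post."""
    return list(accumulate(values))


word_counts = [1, 2, 3, 4]  # made-up post lengths
print(moving_average(word_counts, window=2))
print(cumulative(word_counts))
print(cumulative([1] * len(word_counts)))  # cumulative number of posts
```

The seven-day version would need the timestamps as well, since the window is then a span of time rather than a fixed number of posts.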

Anyway, that’s your lot – that’s all the visualisations I’ve had in so far. I’m hoping to get a few more from lollygaggers and slugabeds, but in the meantime thank you to Cal and Tom for spending their time so ill-advisedly, and thank you all for being part of my life for more or less of the last five full years. Now I must get back to doing something slightly more useful with my time. xx

Categories
Personal Publishing

Five years of weblog data to rip apart as you please…

This weblog – originally located at barbelith.com but which subsequently moved to its current location at plasticbag.org – will hit its fifth birthday on Monday. That’s five full years of random plasticbag.org posts – 4175 of them in fact, plus 1517 links in the linklog (before I moved over to using delicious to manage them in the last couple of weeks). In terms of the non-linklog posts alone that works out at over two posts a day, each and every day of each and every week, of each and every month, of each and every year since November 1999.

In terms of words written it’s difficult to be precise. I’ve exported all the posts from my site, worked out the number of words in a given header, multiplied that by the number of posts and removed that from the total number of words that BBEdit tells me are in the exported dump. Clearly this is not going to be a terribly accurate way of measuring word count (God knows what HTML does to these things) but if you believe what BBEdit tells you, I’ve written in excess of 1.1 million words over the last five years. To put that in perspective, English versions of the Bible have only around 750,000 words in them. I’ve written a bible and a full third of a sequel.
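The back-of-an-envelope sums above are easy enough to check. Here is a small Python sketch; the 4175 posts, 1.1 million words and 750,000-word Bible come from this post, while the function name and the use of 365.25 days per year are assumptions for illustration.

```python
def estimate_body_words(total_words, header_words_per_post, post_count):
    """Subtract the per-post header overhead from the dump's raw word count."""
    return total_words - header_words_per_post * post_count


# Figures quoted in the post; 365.25 days per year is an approximation.
posts_per_day = 4175 / (5 * 365.25)
bibles_written = 1_100_000 / 750_000

print(round(posts_per_day, 2))   # just over two posts a day
print(round(bibles_written, 2))  # comfortably more than one Bible
```

The header-words figure isn't given here, so `estimate_body_words` is left for you to call with numbers from your own export.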

Which brings me to why I’m mentioning all this stuff. I’m not sure I believe those figures. Hell, I’m not sure I want to believe those figures. But one thing is clear to me – there’s a lot of data here in one form or another – and there must be any number of ways to visualise that data or explore it or rip it apart or whatever. So I thought maybe what I should do is just open it all up – stick up a big MT export of everything I’ve done to date – and see if anyone out there can think of any interesting visualisations or ways of processing or graphing it. Obviously, I have no expectations – it could easily be that no one finds this the slightest bit interesting. But if you do, let me know (normal e-mail address tom [at] plasticbag [dot] org) and of course if you want me to post what you’ve done, or link to it in any way, then I’d be more than delighted to do so. It would be great to be able to stick something up that I’ve got from you guys on the fifth anniversary itself…

So here’s the dump: Every full post made to plasticbag.org over the last five years including – in an intriguing self-reflexive twist – this one. Have fun with it!