Personal Publishing

Five years of weblog data to rip apart as you please…

This weblog – originally located at but which subsequently moved to its current location at – will hit its fifth birthday on Monday. That’s five full years of random posts – 4175 of them in fact, plus 1517 links in the linklog (before I moved over to using delicious to manage them in the last couple of weeks). In terms of the non-linklog posts alone that works out at over two posts a day, each and every day of each and every week, of each and every month, of each and every year since November 1999.
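For anyone who wants to check the "over two posts a day" figure, here is a rough sketch of the arithmetic. The post count is the one quoted above; the average year length of 365.25 days is an assumption, since the exact start and end dates aren't given here.

```python
def posts_per_day(post_count: int, years: float, days_per_year: float = 365.25) -> float:
    """Average posts per day over a span of years (days_per_year is an assumed average)."""
    return post_count / (years * days_per_year)


# 4175 non-linklog posts over five years, as quoted in the post.
rate = posts_per_day(4175, 5)
print(f"{rate:.2f} posts per day")
```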

In terms of words written it’s difficult to be precise. I’ve exported all the posts from my site, worked out the number of words in a given header, multiplied that by the number of posts and removed that from the total number of words that BBEdit tells me are in the exported dump. Clearly this is not going to be a terribly accurate way of measuring word count (God knows what HTML does to these things) but if you believe what BBEdit tells you, I’ve written in excess of 1.1 million words over the last five years. To put that in perspective, English versions of the Bible have only around 750,000 words in them. I’ve written a bible and a full third of a sequel.
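The estimate described above could be sketched roughly like this. It is only an illustration of the subtraction, not the exact procedure used: the function name, the tiny sample dump and its header layout are all made up for the example, and counting "words" by splitting on whitespace is an assumption that won't match BBEdit's own counting.

```python
def estimate_body_words(total_words: int, overhead_words_per_post: int, post_count: int) -> int:
    """Total words in the dump minus the per-post header overhead."""
    return total_words - overhead_words_per_post * post_count


# Hypothetical per-post wrapper: a few header lines plus an entry separator.
header = "AUTHOR: tom\nTITLE: example\n-----\nBODY:"
trailer = "--------"
bodies = ["first post body here", "second post"]

# Build a toy dump in which every post carries the same overhead.
dump = "\n".join(f"{header}\n{body}\n{trailer}" for body in bodies)

overhead = len(header.split()) + len(trailer.split())
total = len(dump.split())
print(estimate_body_words(total, overhead, len(bodies)))

# The comparison in the post: about 1.1 million words against roughly 750,000.
print(round(1_100_000 / 750_000, 2))
```

The approach only holds up if every post carries about the same amount of header text, which is one reason the result should be read as a ballpark figure.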

Which brings me to why I’m mentioning all this stuff. I’m not sure I believe those figures. Hell, I’m not sure I want to believe those figures. But one thing is clear to me – there’s a lot of data here in one form or another – and there must be any number of ways to visualise that data or explore it or rip it apart or whatever. So I thought maybe what I should do is just open it all up – stick up a big MT export of everything I’ve done to date – and see if anyone out there can think of any interesting visualisations or ways of processing or graphing it. Obviously, I have no expectations – it could easily be that no one finds this the slightest bit interesting. But if you do, let me know (normal e-mail address tom [at] plasticbag [dot] org) and of course if you want me to post what you’ve done, or link to it in any way, then I’d be more than delighted to do so. It would be great to be able to stick something up that I’ve got from you guys on the fifth anniversary itself…

So here’s the dump: Every full post made to over the last five years including – in an intriguing self-reflexive twist – this one. Have fun with it!

12 replies on “Five years of weblog data to rip apart as you please…”

So is that 4175 including this one, or excluding this one?
(The fact that one or two of your posts cover how to export a Blogger blog to MT makes it confusing trying to parse the txt file!)

i don’t want to believe you seriously compare your five years of blogging to the bible. or maybe it does make sense, only to ask oneself, and what do my one million words do as compared to the bible? i mean, so people will rip, tear, visualize, therefore adding some value and a bit more hype to your page, but what then? and is your million written better than the couple million said in that time, because you can play around with the 7.4mb file, and if yes, do we all sign up for the microsoft’s ‘i will log everything in my life’ project?? cool down, i suggest, our society is obsessed with numbers and dates and anniversaries, but John Fiske said about popcultural content that it is broken and incomplete and not much value or attention should be devoted to it and that’s maybe that. i’ll honestly have fun with something else – though i’m sure the linguist, the databaser, the someone else might truly have fun with your data and in that sense it might be useful.

Well of course I’m not comparing my lurid writings to what’s in the Bible. What’s on my site – for example – has some vague resemblance to reality. Much of what I write is also – surprising as it might seem – true. There too it differs from the Bible. On occasion I have changed my opinions over time and indeed apologised for stupid stuff I have said in the past. That’s yet another way in which it differs from the Bible. I will confess to my stuff not being as imaginative (or promising as much or affecting as many people) as the Bible – but then again very few people have taken what I’ve written as gospel and fought, killed or been killed as a result of it. So – you know – maybe we’re even, eh?
Joking aside – the reason this site is called is because I wanted a name that reflected mass production, disposability and triviality (as well as modernity). I don’t have any pretensions that what I’m doing here is literature. I don’t have any pretensions that what I do here is important. If it’s interesting to a few people, then I’ll be happy. And it does – on occasion – seem to be interesting to a few people.
Having said that I don’t think my individual weblog is important, I do also believe there is value in the trivial and in the minutiae of people’s lives and opinions – I believe that an individual’s ability to express their voice online is a good and beautiful thing even if no one is listening. I think that possibly the explosion of personal writing that has happened around the web – the fact that millions of people are maintaining and writing about their lives is interesting and important and valuable. So no – I’m not seriously suggesting that my writing is as important as the bible, any more than I am suggesting that I am important as a person – but I would argue that people as a whole are interesting and important, just as I would argue that weblogs as a whole are interesting and important.
So am I here shouting about how great I am and trying to get people excited about messing around with my data? Well, there might be some value in the analysis of a weblogger’s posting habits over five years. I’d be interested in a comparative analysis with other webloggers – we might start to notice some trends. But no – of course I’m not expecting everyone out there to eagerly grasp the data I’ve stuck out there and spend their every waking hour playing with it. If the linguist, the databaser and the someone else have fun with it, then that’s enough for me…

Congratulations Tom (grovel grovel please link to me)
Did you know these stats about plastic bags?
We use an estimated 1 billion plastic bags in Scotland every year.
Plastic bags take up to 1000 years to break down
In Ireland a charge on plastic bags introduced in 2002 resulted in a 90% reduction in their use and a drop in litter.
Every day Scotland throws out enough waste to fill Murrayfield Stadium – and the amount is growing by 2% a year.
Plastic, including plastic bags, is a major hazard to wildlife. According to the Marine Conservation Society’s (MCS) Beachwatch 2003 Report, based on 135 km of UK coastline, plastic items accounted for over 50% of the litter found, including 5,831 plastic bags, the equivalent of 43 plastic bags for every kilometre of coastline surveyed.
In the recent survey coordinated by the Marine and Coastal Zone Research Institute in the Netherlands, scientists found that 96% of dead fulmars studied had plastic fragments in their stomachs, double the amount found in fulmars in the early 1980s.
More essential information on the state of the planet at:

I must be more picky. My newsletter/blog is about to turn 8 years old and I’ve only got 10,000 links and a half million words. But hey, that’s just the public edition. likes it though… they selected it as one of the top 10 blogs on the internet. If it wins the votes for #1 I’ll get lots and lots of Starbucks coffee …or will I? There is not a Starbucks within several hundred miles of me.

Five years’ worth of Boing Boing posts in one file!
Hey! Today is Boing Boing’s fifth bloggaversary — that is, it’s been five years since Mark posted the very first post on the Boing Boing blog. In that time, we’ve posted a little more than 17,000 stories and entries here. To celebrate our first half-d…
