There’s been an enormous amount of good stuff around about tags and folksonomies recently, which I’ve not really had enough time to interrogate fully. One particularly interesting experiment has been the Cloudalicious service. Cloudalicious was apparently inspired by the Grafolicious service which tracks changes in the rate of bookmarking for any given URL as well as creating browsable interfaces for getting to grips with tags. Cloudalicious takes this one stage further – showing how the actual tags that people use to describe a given URL change over time. This blurry mess of semantic data is known as a ‘Tag Cloud’.
But what do changes in a tag-cloud mean? Probably the most obvious underlying cause for a change in the words used to describe a site would be that the site itself has changed. You could probably use an analysis of the changing tag-cloud to get a handle on what’s happening to the site. That’s quite interesting.
After that – or alongside that – another underlying cause could be a change in the vocabulary around a subject. At a really grand level, if you can imagine a one hundred year tag-cloud around a gay novel, then it might start with lots of people using the tag invert, with this gradually giving way to homosexual, then gay and potentially after that, queer.
There’s a really nice illustration of this on a weblog called P.S. which has a post called Tagclouds and cultural change. In it, there are a lot of illustrations of the take-up of the tag ‘Ajax’. You could argue this one in a couple of ways – a new concept emerges and a weblog might change direction to deal with it. In that case it’s just about the content changing. But for the most part the examples that the article uses are about specific unchanging individual articles, not whole weblogs. The vocabulary around the posts is changing, not the posts themselves. In the following graph from that article, Ajax is the pale blue line that – over time – becomes the tag of choice for the article in question:
But there’s also a third potential cause for changes in a tag-cloud over time – that people might approach the very act of tagging differently – that their understanding of what they’re doing might develop. This is a change in the nature of tagging itself. And this is what I want to talk about really briefly.
Matt Webb and I did a fair amount of work around tagging with a project called Phonetags that I never get time to properly write up. As we were working on it, we came to realise that each of us had a radically different understanding of what a tag was. Matt’s concept was quite close to the way tagging is used in del.icio.us – with an individual the only person who could tag their stuff and with an understanding that the act of tagging was kind of an act of filing. My understanding was heavily influenced by Flickr’s approach – which I think is radically different – you can tag other people’s photos for a start, and you’re clearly challenged to tag up a photo with any words that make sense to you. It’s less of a filing model than an annotative one.
When I came to use del.icio.us I approached tagging in the way that made sense to me from Flickr. So any and all links were covered with loads of keywords with no thought for how they ought to clump together. I just tried to describe what the link was about in some way. Joshua and I had a bit of an argument about the way I was using it, actually. The browsing interface didn’t really suit an approach that had an enormous number of orphaned tags. You can get a sense of how out of control it all got with this visualisation of my tags. At the end of the argument I said to Joshua that it was almost like he was treating tags as folders. And he replied, exasperated, that this was exactly what they were. It was just that now an object could exist comfortably in a number of folders so you didn’t have to enforce an arbitrary hierarchy on your filing…
So we have two radically different forms of tagging that really have very little in common with one another – which leads to the question, is there room for two different paradigms here (at least) or will there be some refactoring and adaptation that moves us towards one or other model?
To help answer this question, here’s a representation of the tag-clouds surrounding my weblog over time (you can see the original in context on Cloudalicious):
So this basically traces my weblog over the last year. Each coloured line represents a particular tag – its height on the graph indicating its ‘weight’ – how often it is used in relation to the other tags. Here’s where it gets interesting – there’s at least one really significant shift of emphasis that happens over the year, between the blue and the red lines. This really does look like an ongoing shift of emphasis in the community of people who have bookmarked my site. And here’s the really interesting bit – the two tags are almost exactly the same. The blue one is blogs and the red one is blog. But why such a dramatic shift between the two tags?
Now of course, this is only one weblog and it’s difficult to come to any significant conclusions based on one example like this. But we could use it to form a hypothesis for other more technical people to test elsewhere. So here is that hypothesis – that the shift from people using blogs to blog represents the increasing dominance of a Flickr-style paradigm of tagging. Imagine the process of annotating a weblog – if you tag it with ‘blogs’ it seems clear that you are adding it to a collection of some kind. ‘Blogs’ is clearly the name of a folder which houses links to weblogs rather than an attempt to describe the weblog itself. But tagging something with the term “blog” suggests quite the opposite – to tag a link ‘blog’ suggests that I’m attempting to describe the link not as belonging to a bin labelled ‘blogs’ but simply as a ‘blog’ in and of itself. It is my conjecture, therefore, that the folder metaphor is losing ground and the keyword one is currently assuming dominance.
To test if this theory is correct – to see if one model of tagging is becoming dominant over another – should be relatively simple. You could use tag-stemming to spot tags with common roots in popular URLs, and then look for significant changes in their proportionate usage over time. I’d be particularly interested in tags that described the format of the object on the page (article vs. articles, quiz vs. quizzes, searchengine vs. searchengines) rather than the subject (trees, nuclear fission, cats). If someone were to do this kind of research then I’d be delighted – because it’s those kinds of studies and observations of user behaviour that allow us to design better interfaces to support these innovations.
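For anyone who fancies having a go, here is a very rough sketch in Python of what that test might look like. Everything in it is an assumption on my part: the `naive_stem` function is a crude plural-stripper standing in for a proper stemmer, the `(period, tag)` pairs are made-up sample data rather than anything pulled from del.icio.us or Cloudalicious, and the function names are purely illustrative. It works out each tag's 'weight' per period (its share of all the tags applied in that period) and then, for a given stem, what fraction of uses are the singular form.

```python
from collections import Counter, defaultdict


def naive_stem(tag):
    """Crude plural stripper (an assumption, not a real stemming algorithm)."""
    t = tag.lower()
    if t.endswith("zzes"):
        return t[:-3]            # quizzes -> quiz
    if t.endswith("ies") and len(t) > 4:
        return t[:-3] + "y"      # categories -> category
    if t.endswith("s") and not t.endswith("ss") and len(t) > 3:
        return t[:-1]            # blogs -> blog
    return t


def tag_weights(bookmarks):
    """Map each period to {tag: share of all tags applied in that period}."""
    counts = defaultdict(Counter)
    for period, tag in bookmarks:
        counts[period][tag.lower()] += 1
    return {
        period: {tag: n / sum(c.values()) for tag, n in c.items()}
        for period, c in counts.items()
    }


def singular_share(bookmarks, stem):
    """Per period, the fraction of uses of this stem that are the bare form."""
    by_period = defaultdict(Counter)
    for period, tag in bookmarks:
        t = tag.lower()
        if naive_stem(t) == stem:
            by_period[period][t] += 1
    return {
        period: by_period[period][stem] / sum(by_period[period].values())
        for period in sorted(by_period)
    }


# Hypothetical sample data: (period, tag) pairs for a single URL.
sample = (
    [("2005-01", "blogs")] * 3 + [("2005-01", "blog")] + [("2005-01", "design")]
    + [("2005-06", "blogs")] + [("2005-06", "blog")] * 3 + [("2005-06", "design")]
)

weights = tag_weights(sample)
shares = singular_share(sample, "blog")
for period, share in shares.items():
    print(period, round(share, 2))
```

If the singular share for format-type stems climbs steadily across lots of popular URLs, that would lend some support to the hypothesis; if it wanders about or holds steady, it wouldn't. A real study would want a proper stemmer and far more data than this toy example.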