Social Software

Two cultures of fauxonomies collide…

There’s been an enormous amount of good stuff around about tags and folksonomies recently, which I’ve not really had enough time to interrogate fully. One particularly interesting experiment has been the Cloudalicious service. Cloudalicious was apparently inspired by the Grafolicious service which tracks changes in the rate of bookmarking for any given URL as well as creating browsable interfaces for getting to grips with tags. Cloudalicious takes this one stage further – showing how the actual tags that people use to describe a given URL change over time. This blurry mess of semantic data is known as a ‘Tag Cloud’.

But what do changes in a tag-cloud mean? Probably the most obvious underlying cause for a change in the words used to describe a site would be that the site itself has changed. You could probably use an analysis of the changing tag-cloud to get a handle on what’s happening to the site. That’s quite interesting.

After that – or alongside that – another underlying cause could be a change in the vocabulary around a subject. At a really grand level, if you can imagine a one hundred year tag-cloud around a gay novel, then it might start with lots of people using the tag invert, with this gradually giving way to homosexual, then gay and potentially after that, queer.

There’s a really nice illustration of this on a weblog called P.S. which has a post called Tagclouds and cultural change. In it, there are a lot of illustrations of the take-up of the tag ‘Ajax’. You could argue this one in a couple of ways – a new concept emerges and a weblog might change direction to deal with it. In that case it’s just about the content changing. But for the most part the examples that the article uses are about specific unchanging individual articles, not whole weblogs. The vocabulary around the posts is changing, not the posts themselves. In the following graph from that article, Ajax is the pale blue line that – over time – becomes the tag of choice for the article in question:

But there’s also a third potential cause for changes in a tag-cloud over time – that people might approach the very act of tagging differently – that their understanding of what they’re doing might develop. This is a change in the nature of tagging itself. And this is what I want to talk about really briefly.

Matt Webb and I did a fair amount of work around tagging with a project called Phonetags that I never get time to properly write up. As we were working on it, we came to realise that each of us had a radically different understanding of what a tag was. Matt’s concept was quite close to the way tagging is used in del.icio.us – with an individual the only person who could tag their stuff and with an understanding that the act of tagging was kind of an act of filing. My understanding was heavily influenced by Flickr’s approach – which I think is radically different – you can tag other people’s photos for a start, and you’re clearly challenged to tag up a photo with any words that make sense to you. It’s less of a filing model than an annotative one.

When I came to use del.icio.us I approached tagging in the way that made sense to me from Flickr. So any and all links were covered with loads of keywords with no thought for how they ought to clump together. I just tried to describe what the link was about in some way. Joshua and I had a bit of an argument about the way I was using it, actually. The browsing interface didn’t really suit an approach that had an enormous number of orphaned tags. You can get a sense of how out of control it all got with this visualisation of my tags. At the end of the argument I said to Joshua that it was almost like he was treating tags as folders. And he replied, exasperated, that this was exactly what they were. It was just that now an object could exist comfortably in a number of folders so you didn’t have to enforce an arbitrary hierarchy on your filing…

So two radically different forms of tagging that really share very little in common with one another – which leads to the question, is there room for two different paradigms here (at least) or will there be some refactoring and adaptation that moves us towards one or other model?

To help answer this question, here’s a representation of the tag-clouds surrounding my weblog over time (you can see the original in context on Cloudalicious):

So this basically traces my weblog over the last year. Each coloured line represents a particular tag – its height on the graph indicating its ‘weight’ – how often it is used in relation to the other tags. Here’s where it gets interesting – there’s at least one really significant shift of emphasis that happens over the year, between the blue and the red lines. This really does look like an ongoing shift of emphasis in the community of people who have bookmarked my site. And here’s the really interesting bit – the two tags are almost exactly the same. The blue one is blogs and the red one is blog. But why such a dramatic shift between the two tags?
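
As a rough illustration of what ‘weight’ means here, the sketch below computes each tag's share of all tag uses in a single time window. It is only an assumption about how such a weight could be calculated, not a description of how Cloudalicious actually does it; the function name `tag_weights` and the sample data are made up for the example.

```python
from collections import Counter


def tag_weights(tags):
    """Return each tag's share of all tag uses in one time window."""
    counts = Counter(tags)
    total = sum(counts.values())
    # An empty window simply yields an empty dict (no division happens).
    return {tag: n / total for tag, n in counts.items()}


# Hypothetical tags applied to one URL during one month (illustrative only).
sample = ["blogs", "blogs", "blog", "design"]
weights = tag_weights(sample)
```

Plotting these shares window by window would give one line per tag, which is the kind of graph described above.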

Now of course, this is only one weblog and it’s difficult to come to any significant conclusions based on one example like this. But we could use it to form a hypothesis for other more technical people to test elsewhere. So here is that hypothesis – that the shift from people using blogs to blog represents the increasing dominance of a Flickr-style paradigm of tagging. Imagine the process of annotating a weblog – if you tag it with ‘blogs’ it seems clear that you are adding it to a collection of some kind. ‘Blogs’ is clearly the name of a folder which houses links to weblogs rather than an attempt to describe the weblog itself. But tagging something with the term “blog” suggests quite the opposite – to tag a link ‘blog’ suggests that I’m attempting to describe the link not as belonging to a bin labelled ‘blogs’ but simply as a ‘blog’ in and of itself. It is my conjecture, therefore, that the folder metaphor is losing ground and the keyword one is currently assuming dominance.

To test if this theory is correct – to see if one model of tagging is becoming dominant over another – should be relatively simple. You could use tag-stemming to spot tags with common roots in popular URLs, and then look for significant changes in their proportionate usage over time. I’d be particularly interested in tags that described the format of the object on the page (article vs. articles, quiz vs. quizzes, searchengine vs. searchengines) rather than the subject (trees, nuclear fission, cats). If someone were to do this kind of research then I’d be delighted – because it’s those kinds of studies and observations of user behaviour that allow us to design better interfaces to support these innovations.
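
A minimal sketch of that test might look like the following. Everything in it is an assumption made for illustration: the input is taken to be a list of `(period, tag)` pairs for one URL, `naive_stem` is a deliberately crude plural-stripper rather than a real stemming library, and the sample data is invented.

```python
from collections import defaultdict


def naive_stem(tag):
    """Very rough plural stripping; a real study would use a proper stemmer."""
    tag = tag.lower()
    if tag.endswith("zzes"):
        return tag[:-3]  # quizzes -> quiz
    if tag.endswith(("ses", "xes", "ches", "shes")):
        return tag[:-2]  # boxes -> box
    if tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]  # blogs -> blog, articles -> article
    return tag


def singular_share(events, stem):
    """For each period, the share of uses of `stem` among all tags sharing that stem."""
    singular = defaultdict(int)
    total = defaultdict(int)
    for period, tag in events:
        tag = tag.lower()
        if naive_stem(tag) != stem:
            continue
        total[period] += 1
        if tag == stem:
            singular[period] += 1
    return {p: singular[p] / total[p] for p in sorted(total)}


# Hypothetical (period, tag) pairs for one URL (illustrative only).
events = [
    ("2005-01", "blogs"), ("2005-01", "blogs"), ("2005-01", "blog"),
    ("2005-06", "blog"), ("2005-06", "blog"), ("2005-06", "blogs"),
    ("2005-06", "design"),
]
share = singular_share(events, "blog")
```

A rising share for the singular form across many popular URLs, and across format tags such as article or quiz, would be consistent with the hypothesis; it would not by itself rule out other explanations.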

23 replies on “Two cultures of fauxonomies collide…”

I agree. I have been doing research into sharing mobile media for about 2 1/2 years and one thing we did early on was include tags. We actually extended the idea to have tag types such as location, date, proximity etc, but always kept the content of the tag completely freeform. This allowed for fine grained tagging like “back from australia” or “Australia” where both are free form but when you know the first is a date and the second is a location you can do some nifty things.
It was then interesting doing this work to watch del.icio.us and Flickr explode and people finally start to ‘get’ it.
Right now I am working on a CMS (finishing actually) that will soon (hopefully) be made public and freely available, which works on the idea of tagging everything that goes into it. It then uses that information to make groupings and metalinks between the documents created in the CMS as well as other files added such as images, videos, audio, spreadsheets, zip files and calendars etc etc.
Once you have this type of data one simply has to create a ‘view’ for it. Be it an RSS feed of new pictures or a calendar to use in iCal, or a basic blog type interface, ‘podcasts’ or even PDF documents.
The current (working) use case goes something like this: I have a document called “My imac” with some keywords that also describe the document or its contents. I then might upload media using my phone or iPhoto with keywords like ‘iMac’. These are then grouped with the document and become as much a part of the document as items which are manually assigned.
Of course one can edit what’s actually associated later, and things can be hidden and so forth. But it is already showing its worth when managing huge amounts of media across a CMS AND!! local file systems, because once the data is in the file and it’s uploaded – which is a one step action in most cases – the media is where it should be.

Having just recently signed up for a del.icio.us account I’ve been slowly and diligently moving all of my bookmarks over from Safari during the past few days.

For me, having a centralized and OS independent bookmark resource which I can slowly build over time until it is a (hopefully) vast repository of things that interest me is quite enticing.

I’ll get to the point . . . Moving all of these bookmarks over to del.icio.us I had to decide how to tag them. At first I sorted them as if they were in folders ie: “reviews”, “tutorials” etc. Then, slowly, I began sliding more toward an item-based approach ie: “review”, “tutorial” etc. I think this fits the whole tag user-model a lot better as instead of having a variety of folders containing the same bookmarks each bookmark exists solely unto itself but with a related keyword. In my brain each of these keywords has a variety of clothes-line links to other keywords and bookmarks.

At the moment I’m trying to limit myself to 100 tags; whether or not this is possible, and/or defeats the whole purpose of tags, is uncertain. But if I have upwards of 500 tags to eventually maintain I’ll have a mess of a time allocating them to each incoming bookmark.

I hope at least some of this makes sense . . .

Interesting. There is a meaning shift. Not sure I see the correlation of blog(s) shortening as a shift to annotating. Perhaps a Tag Cloud of Flickr might help us see it. Otherwise it may be people adapting with tired fingers.

I think it’s important to note that what I’m proposing here is a hypothesis rather than a conclusion. It seems to me, though, that it should be eminently testable, even if – at the moment – I don’t quite have the brain power to work out how.

Great way of distinguishing two different ways tags are being used. I’m more on the keyword side myself. I wanted to share how I’ve been using the tags ‘blog’ and ‘blogs’ on delicious, it’s a little different than the way you suggest. When I bookmark an article that is from a blog, or the blog’s homepage itself, I’ll use ‘blog’ since it helps to identify what type of source it is. When I tag something ‘meta’-blog, ie a page about blogs or blogging, then I use ‘blogs’ to distinguish. People look at what these keywords mean differently, like you’ve explained, and while I agree this is worth examining I’m not sure what conclusions can be drawn yet.

What Luke said. I think I do both too, but not as consistently.
I have almost as many tags as posts, and the majority of my tags are only used once. It does clutter the interface (I keep asking for a filter) but I don’t think it devalues for me.
Actually, I just looked at the new interface where “view tags as list” and “sort by freq” pretty much solve my problem. Tag cloud view could do with being more heavily weighted though, I think.
Some things I think of as being thrown into multiple folders/buckets (events for example) and others are given keywords so I can find them again (wimbledonization for example). The latter could often be done by searching the extended/title fields instead, I suppose.
It’s interesting to note that my most popular tags are mainly folder-style groups, and the (ahem) long tail is mainly keywords.

When I started using del.icio.us, I assumed I’d be using folder-style tags. Instead, I find myself using keywords; few of my tags are used more than once. Some folders have crept in, though, like the inevitable “toread”.
The beauty of tagging is that folder tags and keyword tags can coexist in the same system. I think that’s a big improvement over having folders in one place (the GUI) and keywords in another (the search box).

I’m in the process of COMBINING a hierarchical category style structure with tags at my website. What I find useful is the concept of using tags to describe *meaning* or *topic* that a piece of information conveys or references. I save categories for *location* or *type of media* e.g. audio, video, blog, article, etcetera. Categories to me work best for stuff like physics. Tags are best for making “meaning.” And for things which change their meaning based on the individual who does the interpreting, or the context of the environment…

This is really interesting. I’ve never had a Flickr account, but I think my use of del.icio.us has been moving towards what you are calling Flickr-style tagging since I started tagging. And now that I think about it, I think this is a natural trend. Over time, we start to tag for finding more than for organizing. We originally approach tags as folders because that’s what we’ve used before, but the more we go back to find things we’ve tagged before, the more we realize that if we added more tags, things would be easier to find, and the organization that we learned from folders doesn’t matter with tags.

I like Ross’ idea, but since Flickr is a narrow folksonomy the individual data points (people with their various interpretations of meaning) are lost. It is the individual data points that allow for true tracking of meaning change and usage.
I am seeing a bunch of text services (not photo nor media organization sites) moving to narrow folksonomies, including EVDB and Feedster (proposed), which is creating more of a mess than an advantage. The problem relates to the missing point of interpretation, the individual. The del.icio.us model (broad folksonomy) works very well for text based systems as the tagging adds another level of metadata to help find the item. With EVDB, the tags do not allow for following one person’s interpretation of a term, which means the tags get muddied and unuseful extremely quickly.
Measuring Flickr’s use of a tag is problematic unless the photos are of the exact same object. Flickr’s narrow folksonomy ties the tags directly to the object so there is no way of interpreting meaning and popularity, nor change of tag usage over time for the same object. This is a very important distinction to keep in mind when researching tags/folksonomy and change over time.

To describe a thing or to describe the category of things that the thing belongs to… that is the question.
Personally, I’m a thing tagger because I think in terms of ‘Aboutness’. There’s a really interesting look at ‘Aboutness’ in the context of image classification here which draws the distinction between the naming of topic and function. The kicker quote from this is “Words are made more specific by grouping them with other words”. You can see this with Flickr to great effect. So intrinsically, if you’re inclined to describing a thing by naming a category it belongs to, you may not necessarily gravitate toward a faceted solution (whereby you name many different topics to which a thing may pertain). Or would you? Peter Van Dijck would…
At first I thought Peter was missing the folksonomy point, which says “peace, love and tagging, baby.. don’t put me in a box, even if it is multidimensional… man”. Raising the level of commitment to tag isn’t a great idea IMO because it becomes a barrier to participation, which helps nobody. But I guess he’s just in the “Gimmie structure! I NEEED STRUCTURE!!” camp. I’d be interested if Mr Matt would like this solution? Is there a place for both? Or will it emerge like all other internet tagging – haphazardly, leaving it to someone like Google to find the killer algorithm that will make sense of it all? Answer me Tom. Tell me now!

For what it is worth, I tend to tag homepages with “folder” like tags (“articles” and “blog” are two common entries for me) and specific items/articles/blog entries with “keyword” like tags.

Hi tom,
Very thoughtful article, which I mostly agree with. Another issue which is not often touched on with tags is the monolingual view of them. The Web community involved in all these projects is mostly English-speaking, for historical, geographical, etc. reasons.
Most of the UIs developed for these services are in English. The developers being English-speaking, that seems logical. So English users will participate first, then people who have a good understanding of English as well as their native language.
The first tags are set by an English-speaking community in an English environment. If you want to benefit from the power of the community in terms of being findable or relating to others’ findings, you are more likely to tag in English.
But that’s where the problem starts. As a “bilingual” user (French/English):
Should I tag in French or in English?
Will it depend on the language of the content for del.icio.us? What about if I tag a page in Japanese with English or French?
When it comes to photos, what language should I use?
And if I use French, how will an English user, a Japanese user, etc. find my content?
What about people who don’t know at all what “cow” means, but know the meaning of “vache”, or “niu” (Chinese) or “ushi” (Japanese) or even better “牛” (here the kanji for cow)?
There are a lot of issues which show that flat tagging is not neutral at all in the way we classify and has strong cultural influences.

Interesting analysis. To make your hypothesis testable, one approach would be to formalise a little more.
“…to tag a link ‘blog’ suggests that I’m attempting to describe the link not as belonging to a bin labelled ‘blogs’ but simply as a ‘blog’ in and of itself…”
Hmm, let’s say we have a set called B which contains all blogs. This sounds like the bin. But then again, wouldn’t it also capture the information contained in each individual thing being “a ‘blog’ in and of itself”?
I believe there is a distinction there, which has shown up in a slightly different form when people have tried to formalise tags in the Resource Description Framework (RDF*). It’s seemed a better fit when a layer of indirection has been inserted into what’s being modelled – rather than saying the things themselves should go in bins, it’s the *concept* of the things that are being classified. There is a spec for doing this in RDF, the Simple Knowledge Organisation System (SKOS). From the SKOS Core guide:
SKOS Core allows you to model a set of concepts as an RDF graph. Other RDF applications, such as FOAF, allow you to model things like people, organisations, places etc. as an RDF graph.
* it’s appropriate to try doing this with RDF for both theoretical and practical reasons – the RDF (+OWL) model can be viewed as a Description Logic, a formalism based on first-order logic that’s grown from knowledge representation techniques like Semantic Nets and frames. Get the model right and you can express the information contained in tags, preserve the semantics while you shift it around. But the cool bit is once you have the formalism sorted, tools are already available for working with RDF/OWL allowing you to make inferences based on the info you’ve captured.
An immediate application of this is to be able to maintain the info about who’s done the tagging and what tagging scheme has been used. So it might be useful to distinguish between your use of the Flickr tag ‘blog’ and my use of it, but we might use the tag ‘blog’ differently, the latter perhaps being mergeable.

I’ve been using tags as something in between. I’m putting things into buckets and describing them, at the same time. When I tag something, I try to use a controlled vocabulary for terms that I often use, instead of various synonyms, but I might also put in tags for its unique aspects, the ones that I’ll want it for later.

Ross’s ‘tired fingers’ idea should be quite easy to correct for – just find out the average trend in changing tag lengths over time, and see if the change in blog(s)-type cases is greater than the overall trend.
Alternatively, you could probably find some examples where the annotation-type tag (blog) is longer than the folder-type (blogs). I can’t think of an example right now, but I’m sure they exist.
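
The correction suggested above could be sketched roughly as follows. This assumes tag data is available as `(period, tag)` pairs; the helper names and the sample data are hypothetical, and comparing only the first and last periods is a simplification of "the average trend".

```python
from collections import defaultdict


def mean_tag_length(events):
    """Average tag length per period over (period, tag) pairs."""
    lengths = defaultdict(list)
    for period, tag in events:
        lengths[period].append(len(tag))
    return {p: sum(v) / len(v) for p, v in sorted(lengths.items())}


def length_change(events):
    """Change in mean tag length between the first and last period."""
    means = mean_tag_length(events)
    periods = list(means)
    return means[periods[-1]] - means[periods[0]]


# Hypothetical data: all tags on a URL, and the blog(s) pair only.
all_events = [
    ("2005-01", "blogs"), ("2005-01", "design"),
    ("2005-06", "blog"), ("2005-06", "design"),
]
pair_events = [e for e in all_events if e[1] in ("blog", "blogs")]

overall = length_change(all_events)     # background 'tired fingers' trend
pair_only = length_change(pair_events)  # change within the blog(s) pair
```

If tags in general are getting shorter at about the same rate as the blog(s) pair, tired fingers would be enough to explain the shift; a noticeably larger drop for the pair would suggest something else is going on.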

There is of course a much simpler explanation to the shift between blogs and blog. Imagine you use even more words, initially: blog, blogs, blogging, bloggers, .. ? Then you notice that you use all of these more or less interchangeably. It soon becomes a real mess. Do you tag a new URL blog or blogs or ??? So eventually you rationalize and adopt the use of the word stem: blog. It wouldn’t take too many people adopting this strategy to result in the small reversal observed in the graph.
How do you test my version? Well for one thing, each user who shifted should also show a decrease in the use of blogging, blogger, etc.
So here is a simple explanation that does not require the postulation of significant “mind shifts” of some sort. The simplest explanation should always be tested first!!
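
One hedged way to run that check is sketched below. It assumes per-user data in the form `(user, period, tag)`, with periods written as sortable strings such as "2005-06"; the variant list, the cutoff and the sample data are all invented for illustration.

```python
from collections import Counter

# Assumed list of other 'blog' variants; a real study would derive this from the data.
VARIANTS = {"blogging", "blogger", "bloggers"}


def variant_use_by_user(events, cutoff):
    """Count each user's variant-tag uses before the cutoff and from it onwards."""
    before, after = Counter(), Counter()
    for user, period, tag in events:
        if tag.lower() in VARIANTS:
            # Period strings like "2005-06" compare correctly as plain text.
            (before if period < cutoff else after)[user] += 1
    return {u: (before[u], after[u]) for u in set(before) | set(after)}


# Hypothetical tagging history for two users (illustrative only).
events = [
    ("ann", "2005-01", "blogs"), ("ann", "2005-02", "blogging"),
    ("ann", "2005-07", "blog"),
    ("bob", "2005-01", "blogger"), ("bob", "2005-08", "bloggers"),
]
usage = variant_use_by_user(events, "2005-06")
```

Under the rationalisation explanation, users who moved from blogs to blog should look like ann here, with variant use dropping after the shift, rather than like bob.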

When I come across a much older blog post like this and read the comments, often I like to reply to a specific comment that was previously posted. I noticed here that visitors cannot reply directly to a specific previous comment, but have to refer to it at the bottom of the comment list. That hinders the discussion. It would be helpful if you added the ability to reply directly to each individual comment.
That’s just my 2 cents. Thanks.
