Methods for the social archiving of mailing lists…

Imagine you’re on a mailing list where the URLs that people share are archived in some form, and that this indirectly creates some kind of archive or directory. Imagine that this archive has generally been maintained by hand and in a formal taxonomic structure. Imagine that the weight of maintenance started to get the list owner down and they decided they could no longer justify the time they’d need to spend on it. How to distribute the work effectively? How to maintain the utility of the directory without killing the people who maintain it? What follows are some freeform, stream of consciousness-style notes written off the top of my head. Better out than in.

Your most obvious territory for thought might be the categorisation scheme and how to dismantle the formal structure in favour of something lighter and less complex to maintain. The most obvious change of direction you could make here would be to move towards a folksonomic tagging approach. But a true folksonomy must emerge from the overlaying of many people’s efforts, otherwise what you’ve got is a personal, informal (or just plain bad) taxonomy. So straightaway you may end up having proliferated the work rather than reducing it. It may be more distributed, but is it any more likely to get done? That’s a difficult question to answer.

Skipping away from the question of annotation for a moment, let’s look at how to get the first-order objects (links) into the database in the first place. One approach would be to put every unique URL sent to the list directly into a database. Conceivably you could organise those URLs by tags imported from other locations – for example you could just go and get the folksonomic information for that URL from del.icio.us.
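As a rough sketch of the "every unique URL goes straight into a database" idea, the snippet below pulls URLs out of a message body and records each one only once. The names `extract_urls` and `add_message`, the dictionary standing in for a database, and the deliberately loose URL pattern are all assumptions made for illustration.

```python
import re

# Deliberately loose pattern (an assumption): http(s) URLs up to the next
# whitespace, quote or closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(body):
    """Return the URLs found in an e-mail body, in order, without duplicates."""
    seen = []
    for match in URL_PATTERN.findall(body):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def add_message(directory, sender, body):
    """Record each URL once, remembering who first posted it."""
    for url in extract_urls(body):
        # setdefault leaves an existing entry untouched, so a repeat posting
        # does not overwrite the first sender.
        directory.setdefault(url, {"first_sender": sender, "tags": []})
    return directory
```

Tags imported from elsewhere could then be appended to the `tags` list for each URL; how you would fetch them is left open here.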

There are problems with this approach of course. For a start, you then have a repository of information about links that’s completely free of editorial and doesn’t necessarily represent the context in which the URL was originally shared. That is to say, you don’t have any of the original poster’s thoughts on the link, just the link and some tags. You could append the whole e-mail to the URL, but then you’re stuck if the list is private.

An alternative: when an e-mail is sent to a list containing a URL, why not get the server to reply immediately to the original poster with a post containing a link to the place they could annotate or categorise their link to be added to the directory. That way the link originator could take responsibility for their particular piece of maintenance and the directory could grow through the individual actions of multiple individuals. Conceivably, links could be added to the database immediately they’re sent to the list, but not made ‘public’ until they’ve been annotated by either the link originator or the list owner. Because you’d be able to track the originators of the e-mails, you could then easily create a queue of URLs to subsequently annotate or approve.
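The "stored immediately, public only once annotated" workflow can be sketched roughly as below. The `Link` and `Directory` classes and their method names are hypothetical, and sending the actual reply e-mail is left out; this only models the queue and the rule about who may publish a link.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    url: str
    originator: str
    annotation: str = ""
    tags: list = field(default_factory=list)
    public: bool = False  # hidden until someone annotates it


class Directory:
    def __init__(self, owner):
        self.owner = owner
        self.links = {}

    def submit(self, url, originator):
        """Store the link as soon as it hits the list, hidden by default."""
        return self.links.setdefault(url, Link(url, originator))

    def queue_for(self, person):
        """Unpublished links this person may annotate (everything, for the owner)."""
        return [link for link in self.links.values()
                if not link.public and person in (link.originator, self.owner)]

    def annotate(self, url, person, annotation, tags):
        """Attach an annotation and tags, then make the link public."""
        link = self.links[url]
        if person not in (link.originator, self.owner):
            raise PermissionError("only the originator or list owner may publish")
        link.annotation, link.tags, link.public = annotation, list(tags), True
        return link
```

Keeping the queue as a query over unpublished links, rather than a separate list, means the originator and the list owner always see the same state.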

There’s still a problem here, although it’s not a big one. If you take the folksonomic approach to categorisation then you’d have to rely on the individual’s personal taxonomy rather than on the wisdom of crowds bubbling up ‘correct’ categorisations. So then you have to ask yourself whether there are ways that you could usefully allow other people to enhance these URLs with more information after the originator or list owner has done the initial work. One option is to mine del.icio.us or another social bookmarking site as I proposed above. The other might be to allow other users on the mailing list to add their own annotations and tags to the link concerned. A server could intercept all e-mails containing links and add an additional link to a place where they could be annotated subsequently. The readers would automatically see the original link and then a link to a place where they could annotate the item. My big concern here is that individuals would be compelled by the software to move the conversations about links off-list and thus deform or split the conversation more than necessary.

One sideline… Of course you don’t necessarily need to get people to follow a URL to add in their information about a link – particularly if they’re the originators. Another approach might be to send the originator an e-mail (as above) with an identifiable string in the subject. Then simply replying to that e-mail with a message containing only a paragraph of text or a few Flickr-style tags could add those tags and that annotation to the database. One anxiety there might be people accidentally incorporating great tracts of their previous e-mail into their annotations. Not ideal. Too fragile.
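To make the fragility concrete, here is a minimal sketch of parsing such a reply. The `[link:…]` subject marker, the `tags:` line convention and the function name `parse_reply` are invented for this example; the quoted-text handling only covers `>`-prefixed lines and a simple "On … wrote:" attribution line, which is exactly where real mail clients would trip it up.

```python
import re

# Hypothetical marker the server would put in the subject of its request.
TOKEN = re.compile(r"\[link:([A-Za-z0-9]+)\]")


def parse_reply(subject, body):
    """Return (link_id, annotation, tags), or None if the subject has no marker."""
    found = TOKEN.search(subject)
    if not found:
        return None
    kept, tags = [], []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith(">"):
            continue  # quoted text from the earlier e-mail
        if re.match(r"On .* wrote:$", stripped):
            break  # attribution line: everything after it is the old message
        if stripped.lower().startswith("tags:"):
            tags.extend(stripped[5:].split())  # space-separated, Flickr style
            continue
        if stripped:
            kept.append(stripped)
    return found.group(1), " ".join(kept), tags
```

Anything a mail client quotes without a `>` prefix or a recognisable attribution line would end up in the annotation, which is the failure mode described above.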

On the other hand, instant messaging in the Twitter model might provide some good options. Imagine if all users on a mailing list added their IM details to their profile, and added a bot to their IM friends charged with handling their mailing lists. When a message was sent to the list with a URL in it, the originator could be sent an IM request to describe the URL and everyone else on the mailing list could be sent the URL without comment. Once they’d observed the link, they could simply reply with their own comments or annotations which would then be saved to a database. Easy. If they didn’t want to keep getting URLs, a simple ‘off’ command could cease the flow…

The most obvious problem there would be if another URL came in as you were typing or if there were substantial communication delays, but I suspect these could be resolved one way or another.

Another option: individuals could choose to categorise URLs within the mailing list by hand using a third party service like del.icio.us which could then be aggregated by a local piece of software. They could either use their own personal accounts and mark things ‘for:{name of list}’ or they could use a shared account. This way you could bootstrap off other tools rather than build everything yourself. The most obvious problem: Is this work that people would want to do? If it is, would using del.icio.us (and conceivably then having to change accounts if you were already a user or having to mix in other people’s links with your personal linkstream) be a greater impediment than another approach? Tricky one.

A few other approaches leap to mind, but I think I’ll leave it there. If anyone has any other ideas, I’d really appreciate hearing them. A good way to think around the territory would be to think about which groups of people could do the various tasks associated with saving or annotation. In some models it’s likely to be the posting user who does everything, in others their peers take on spotty bits of work and all the annotation. In still others you can imagine a dedicated admin doing all the work, and in others still, people off-list completely could be categorising and annotating what people on-list are doing. Finding the correct approach will rely on working out where the motives for contributing might be for each group of people and how to build something that meets that particular group’s needs. Any thoughts?

Wouldn’t that method be something more like trackback comment?
In fact, would that be a decent in-between in your archive of comments? Trackbacks are (sometimes) nice because they place a bit of the contextual sentence around the link when they get posted back onto blogs.
This way you could have your del.icio.us style folksonomy and in the equivalent to the comment area of del.icio.us you could have the surrounding sentence to the URL. Thus you would get something like: “…so I think that Tom says it best at [URL] because blah, blah…” then your tags.
That doesn’t solve the taxonomy problem though. Here’s another off-the-cuff thought though. If the service/directory was really worth something to its users, what about a situation where you had to edit, say 10 URLs a week to maintain your membership. Is that a daft and obnoxious way of getting help or a kind of BitTorrent idea of contributing for the common good so you don’t leech?

4 replies on “Methods for the social archiving of mailing lists…”

If you’ve got people sending the same url more than once, you could take the emails containing a url and find words that are the same in each email and tie them to the url, filtering out noise words. Basically like tagging, except you’d have to wait for a word to appear more than once before you could count it and link it to that url

I think you’d have to do a lot of weird stuff to make that work – filtering out a hell of a lot of common words. I wonder if you could weight submitted words in some way in terms of variation of frequency from the norm. Probably you’d need a hell of a lot of people tagging to make an approach like that work at all.
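A rough sketch of how the repeated-word idea and the weighting idea might fit together is below. The noise-word list, the function names and the add-one smoothing are assumptions made for illustration; a real list would need a far longer noise list and far more messages per URL.

```python
import re
from collections import Counter

# A tiny, illustrative noise-word list (an assumption; real ones are much longer).
NOISE = {"the", "a", "an", "and", "of", "to", "is", "it", "this", "that", "on", "in", "for"}


def words(text):
    """Distinct lower-cased words in a message, ignoring URLs and noise words."""
    text = re.sub(r"https?://\S+", " ", text)
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in NOISE}


def candidate_tags(emails_for_url, all_emails, min_emails=2):
    """Words seen in at least min_emails messages about a URL, scored against list-wide use."""
    local = Counter(w for e in emails_for_url for w in words(e))
    background = Counter(w for e in all_emails for w in words(e))
    scores = {}
    for word, count in local.items():
        if count >= min_emails:
            local_rate = count / len(emails_for_url)
            # Add-one smoothing so a word missing from all_emails cannot divide by zero.
            background_rate = (background[word] + 1) / (len(all_emails) + 1)
            scores[word] = local_rate / background_rate
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))
```

Words that turn up everywhere on the list score low even when they repeat for a given URL, while words that are unusual for the list but common around that URL rise to the top.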

A few years ago I had a side project that played with automated aggregation of news articles and the potential of assigning an automated taxonomy for a stream of aggregated articles. I found auto taxonomies quite hard (surprise!). I ended up creating a ‘dictionary’ that used keywords, link location and a few other tricks to get it working, but even then it was far from perfect.
I wouldn’t use this kind of method for an email list.
What I would do is ask the list users to add tags when they post a url using ‘email shortcuts’, for example someone posting a link would do this: http://www.yahoo.com {title: Yahoo! tags: search, blah, etc descript: etc}.
You’d have all email sent on the list parsed by the server with the url and the shortcuts added. List users could then edit (if they wished) the link info through a web interface.
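Parsing the shortcut format from the example above might look something like this. It is only a sketch: it assumes the three field names shown (`title`, `tags`, `descript`), that tags are comma-separated, and that field values never contain a field name followed by a colon. The function name `parse_shortcuts` is made up.

```python
import re

# A URL followed by a {...} block, as in the example above.
SHORTCUT = re.compile(r"(https?://\S+)\s*\{([^}]*)\}")
# Each field runs until the next field name or the end of the block.
FIELD = re.compile(
    r"(title|tags|descript):\s*(.*?)\s*(?=(?:title|tags|descript):|$)", re.S)


def parse_shortcuts(body):
    """Return one dict per 'URL {shortcuts}' pair found in an e-mail body."""
    results = []
    for url, inner in SHORTCUT.findall(body):
        info = {"url": url, "title": "", "tags": [], "descript": ""}
        for key, value in FIELD.findall(inner):
            if key == "tags":
                info["tags"] = [t.strip() for t in value.split(",") if t.strip()]
            else:
                info[key] = value
        results.append(info)
    return results
```

URLs posted without a shortcut block are simply ignored by this parser, so they would still need to be picked up some other way and edited through the web interface.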
