Right. Now. This is interesting. Google Base has launched and is both pretty weird and pretty interesting. The concept is fundamentally pretty simple – it’s almost like a completely open content management tool where you can post a recipe or a personal profile or a classified ad or whatever kind of thing you want. The item is then added to the internet as a standalone page – A recipe for Beef & Broccoli Over Shells for example. You can then contact the poster and navigate through similar items by tags (here called ‘labels’) or search through the complete database to find events, jobs, news, products, reference articles or whatever other type of data you want to define and submit.
From a personal perspective, I don’t quite get it – there’s no obvious reason I can think of for an individual to post a recipe to the service – but from a business perspective it’s really interesting. Basically it’s a complete circumvention of the problems with the Semantic web which abandons decentralisation and microformats completely. If your company has a database of things (whether that be products or pictures or weblog posts or news articles or whatever) that it represents on the web, then Google Base suggests that you should not wait to be spidered and nor should you expect them to do all the heavy-lifting to work out what your site is about. Instead, you just bulk upload all your data to Google directly and associate each entry with your corresponding page on the web. Google get an enormous amount of new useful data to organise and present to people, while the businesses or start-ups that use the service get new interfaces created for their content, and a greater findability and navigability for their data, products and services. And when Google creates an API for the service, suddenly every data source that uses them has an API as well. That’s pretty astonishing.
It’s not all positive for the businesses or start-ups, of course. It consolidates the idea of Google or a parallel search engine as the definitive place to find out information of any kind (rather than the local brand that you usually associate with events, restaurants or whatever). And that kind of corresponds to a larger question about whether Google is gradually and systematically eating the web. And I think there are larger problems too – the lack of any form of solid identifiers that will indicate whether you’re talking about the same film or book (in the review space at least) seems to me to be an issue. But generally it’s pretty interesting.
Which brings me to a fun challenge to my old employers. My old colleague Mr Biddulph (who has been freelancing for the BBC for a few months) and Mr Hammersley (of RSS, web services and utilikilt infamy) have been working on a representation of the BBC Archives Infax database for a few months. They’ve written about it in two pieces: The BBC’s programme catalogue (on Rails) and Hot BBC Archive Action. So why not make this content more explorable and searchable (and help define the way the web understands TV and radio programming) by bulk-submitting the entire massive database to Google Base? That would be an extraordinarily interesting move…
A couple of other interesting pieces:
10 replies on “In which Google Base launches…”
You know, you and Simon both posted and linked to each other as if the other had written first. Did you post from the same room or something?
Perhaps Yahoo! will eat itself 😉
“Basically it’s a complete circumvention of the problems with the Semantic web which abandons decentralisation and microformats completely.”
Gosh, that’s fundamentally wrong. Even Dave Winer, notorious Semantic Web pooh-pooher, has said about Google Base “For Semantic Web people –> validation!”.
So why not make this content more explorable and searchable (and help define the way the web understands TV and radio programming) by bulk-submitting the entire massive database to Google Base?
Or why not just keep it, y’know, public? As far as I know Google’s still a shareholder-owned, profit-making entity (unlike, say, the BBC). I know that Web 2.0 is all about the enclosure of the commons, the monetisation of freely contributed labour and the appropriation of collective creativity, but I don’t see why the BBC should help the process along quite this vigorously.
“there’s no obvious reason I can think of for an individual to post a recipe to the service”
If I post my recipe repertoire I (and my family, and my friends) can access it from anywhere in the world, and look at just the chicken dishes, just the meat dishes, just the vegan dishes etc. very easily. I look forward to the free and simple blog hosting service which could do this.
I think you’re right, and maybe this will finally be a platform for writing more intelligent agents.
To Phil: Okay – let me clarify. What I mean is NOT that the idea of structured data on the web is a bad idea, because it’s very definitely not. What I mean is that this is an attempt to completely circumvent the idea of people putting that meaning into markup. The idea of semantic information connecting on the internet is totally important and right, but the traditional semantic web interpretation was to do it in front-line HTML, before moving towards a more parallel, link-rel microformats style approach. The Google thing is about skipping past all of that nasty decentralised stuff that I believe in towards a huge centralised bin of aggregated databases. Which I think is interesting, and might work, and is probably really powerful, but isn’t the ideal implementation.
To Phil: Making data available for everyone to use is keeping it in the public sphere. Google would just be the first to use it. It’s not a zero sum game – Google can have the data and the BBC can have the data with both winning. In fact, the BBC benefits most from getting people to its programmes, and should be looking to any mechanism that helps that kind of thing happen. You could make an argument that the data should only be made available for non-profit-making organisations, but I’d argue that this will end up crippling the ability for the BBC to make its programming discoverable when everyone’s going on demand.
To Phil’s second comment though – completely! If you want to put content out in public in a useful way, then this is a way to do it, although I wonder how many members of the public will understand the possibilities of structured data in that way. There are clearly uses, I just think it would make more sense for the individual to do it in a space that they felt they owned (or in fact, did own).
“Google Base suggests that you should not wait to be spidered […] you just bulk upload all your data to Google directly and associate each entry with your corresponding page on the web.”
Good idea until…
“Friendly Funtime Website http://www.funttimeexample.com” that redirects to PORN PORN SPAM PORN or “Great Investments Co http://www.greatinvestexample.com” redirects to http://www.istealmoney.com
Just look at e-mail spam for the thousands of tricks and lies people will resort to when they send their own data.
Aha, I misread! I read “Basically it’s a complete circumvention of the problems with the Semantic web (which abandons decentralisation and microformats completely)” whereas I think you meant something more like “Basically it’s circumventing the problems with the Semantic web by abandoning decentralisation and microformats completely”.
This meant that I was very confused by your response until it suddenly clicked, and yes, I agree. 🙂
From how Google Base has been described, it sounds almost like a Flickr of all content. Maybe I’m missing something, but I can’t figure out how to use it. When I search for something, all my results are links to other websites. Nothing takes me to a page at Google Base like the shells recipe you linked to.