Traffick - The Business of Search Engines & Web Portals

Monday, September 04, 2006

Taxonomy for Fun and (Google's) Profit? Community Image Tagging

Google Image Labeler is eliciting intelligent commentary around the virtual campfire, as one might expect.

It seems Google needs to improve the quality of its Image Search by tagging the images. What better way to go about it than luring an army of volunteer taggers? Hey, where have we heard this story before? Remember ODP?

Accurately describing elements of an image in few words isn't as complex as editing directory categories.

Today, sites like Flickr and YouTube thrive on tagging. First, contributors of uploaded images, and later, other members of the community, tag their material as well as they can. It's a rough-and-ready form of classification that's attracted much interest and plenty of argument pro and con, in parallel with the general debate over whether Web 2.0 is really anything, let alone an advance over what came before.

Well, it is an advance, or Google wouldn't be doing this. Tags help users find images, there's no doubt about that.

And now begins the great experiment with different incentive systems and value systems. It looks as if properties like Flickr and YouTube have pretty accurate taggers, perhaps because those engaged in tagging genuinely get it and are genuinely trying to be helpful. At this juncture, by contrast, Google seems to be running into the odd problem with insincere and malicious taggers, at least if the "editorial comment" type tags I'm seeing on Google Video are any indication. But the random "double-verification" approach to tagging is ingenious compared to hierarchical command-and-control systems. Where editors and their "bosses" know one another, they can rig up a corruption scheme; this system, by contrast, seems to pair editors up with people they don't know and cannot know. That isolates cheaters, Panopticon-like. I'm going to give it a try, just to check it out.

If accurate tagging requires the equivalent of professional editorial staff, but you're running it like a kind of community effort involving nebulous rewards, because professional staff could never get to everything... it seems likely that odd usage/contribution patterns will arise, as they have before. In ODP, there were "meta" editors and high-output editors who developed expertise and did much more work than most of the rest, but also ran the risk of developing blinders of sorts. *Why* did they do so much more than others?

When it comes to Wikipedia, the same phenomenon has occurred. The "spontaneous outpouring of community input" is driven by a cadre of prolific editors, followed by a long tail of occasional helpers. What does it all mean? I'm not sure, except that it speaks to the competitiveness of some people, even when trying to win at something that doesn't really benefit them, and benefits a "community" in a way that is as yet unproven.

In this case the mega-taggers probably can't wreck anything -- especially with the random competitive tagging method tied to points -- so the end result is better search. If Google Video tags currently stink, they can perhaps assign "points" to those folks who want to go in and clean up all those tags too. Google, of course, profits, but there is a certain inherent fascination with watching something work better as taggers get involved. Then again, I'm not 100% sure it's worth anyone's time to accurately tag a Japanese teenager singing a karaoke version of "Barbie Girl."

We debated this subject here way back with the ODP case. To get truly professional editorial results consistently, in some cases you have to pay people; in other cases, you don't. With a poorly-thought-out incentive system (quality depends on commitment and skill level as well as incentives and sanctions for bad behavior), alternative (corrupt) compensation schemes can arise.

So, some thinking had to go into it. Google doesn't have a real "vertical" or "spontaneous face to face society" feel to it, but it does of course have the advantage of a lot of money and a willingness to experiment with various filtering and incentive systems. So - it looks like a sawoff. They can find a way to overcome the shortcoming of their bigness.

Either way, tagging is moving search forward. Probably the most intriguing nascent tagging experiment, for me, is Amazon's. Books are being tagged as we speak, first by authors, then by prolific reviewers... and... later, everyone else? Or not? Regardless, the result seems to be a parallel form of taxonomy that arises spontaneously out of community effort (assuming reasonable expertise in the community), as opposed to getting the Library of Congress category right, or some other method that might have existed in the past. From a tag, bringing up all known books about "beanstalks" *tagged as such* is only a click away. That's not the same as doing a raw keyword search for beanstalks. Tagging has shades of past information science efforts, obviously, but it's happening here and now in a specific kind of way, and it would be a mistake to dismiss its impact.
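To make the difference between a tag lookup and a raw keyword search concrete, here is a minimal sketch. The two book records, their blurbs, and the function names are all hypothetical, made up for illustration; this is not how Amazon actually stores or searches tags.

```python
from collections import defaultdict

# Hypothetical catalog entries, invented for this example.
books = [
    {
        "title": "Jack and the Beanstalk",
        "blurb": "A boy trades a cow for magic beans.",
        "tags": {"beanstalks", "fairy tales"},
    },
    {
        "title": "Startup Growth",
        "blurb": "How to grow a company like a beanstalk.",
        "tags": {"business"},
    },
]


def build_tag_index(catalog):
    """Map each tag to the titles that people explicitly tagged with it."""
    index = defaultdict(list)
    for book in catalog:
        for tag in book["tags"]:
            index[tag].append(book["title"])
    return index


def keyword_search(catalog, word):
    """Return titles whose title or blurb merely contains the word."""
    word = word.lower()
    return [
        book["title"]
        for book in catalog
        if word in (book["title"] + " " + book["blurb"]).lower()
    ]


tag_index = build_tag_index(books)
tagged = tag_index["beanstalks"]            # only books tagged as such
matched = keyword_search(books, "beanstalk")  # any book mentioning the word
print(tagged)
print(matched)
```

The tag lookup returns only the book someone decided was about beanstalks, while the keyword search also pulls in the business book that just happens to use the word.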

One more thought: vis-a-vis PageRank and anchor text... hasn't linking always been like tagging? It's a mistake to say that Google eschewed metadata because they didn't look at meta keyword tags. They were just looking at different tags, and still do. :) For a long, long time, a high proportion of website publishers voluntarily "tagged" their links with something a little more informative than "click here"... just because the web gurus said it was a good thing to do.

Edit: after playing the "game," I ran across this excellent post on O'Reilly Radar, which explains that Image Labeler is based on Prof. Luis von Ahn's "ESP Game". On Search Engine Watch, Danny Sullivan confirms this in a Postscript, having heard back directly from Prof. von Ahn. As an aid to tagging images, it's clear to me as a player that the type of "ESP" that is involved in playing the game optimally is not going to lead all by itself to the kind of thorough tagging we see on other sites. The best way to get the most points is to match your partner's labels as many times as possible in a timed session. And the only way to do that is to quickly type in the least complex words possible. Sure, Google might tuck away your unmatched, more complex words, but to get the most points, you and your partners will soon learn that you should aim for the least complicated word possible to describe some part of the photo: e.g. ocean, sky, people, woman, man, office, desk, etc. Screenshots of something complicated, such as a spreadsheet, are most easily matched when partners type in the heading of a column or any prominent word in the screenshot. A complex (but known) type of logo will be best matched with your partner if you both type in "logo." And so on.
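The matching rule described above can be sketched roughly as follows. This is only my guess at the general idea, assuming a label counts when both partners type it for the same image; the function name, the lowercase normalization, and the sample labels are hypothetical, and Google's actual scoring is not something I know.

```python
def agreed_labels(labels_a, labels_b):
    """Return the labels both partners typed, in the order player A typed them.

    Assumption: labels are compared case-insensitively after trimming spaces.
    """
    seen_b = {label.strip().lower() for label in labels_b}
    agreed = []
    for label in labels_a:
        normalized = label.strip().lower()
        if normalized in seen_b and normalized not in agreed:
            agreed.append(normalized)
    return agreed


# Hypothetical session: two strangers labeling the same beach photo.
player_a = ["ocean", "sky", "catamaran", "people"]
player_b = ["Sky", "beach", "ocean", "sailboat"]

matches = agreed_labels(player_a, player_b)
print(matches)  # ['ocean', 'sky']
```

Note what survives: the generic words both players reach for. "Catamaran" and "sailboat" describe the same object more precisely, but neither counts because the partners didn't agree on it.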

On a final, final note: I suppose "tagging" is slang for "graffiti." This kind of tagging is something like the opposite of graffiti, especially when the sober, straight taggers are assigned to clean up the "Google Video Graffiti."

Posted by Andrew Goodman



