Sunday, March 04, 2007
Tom Foremski at SiliconValleyWatcher lists all the ways that we the people are expected to help search engines do their jobs. His point: search technology isn't pulling its weight compared with the human element.
Eye-opening at first, but rather than a blow-by-blow response, why not sum it up this way:
More detailed response:
- Should you do extra work to label your content or install sitemaps? Hmm, only if you want to be found.
- We're talking about publishers, not people: purveyors of information (and/or products and offers) in a hugely competitive, open environment. Tom's article, for example, sports ads for conferences as well as Edelman, the world's largest PR firm. Seems like a tag or two might be a decent tradeoff for the exposure. You don't expect the engine to actually write the content for you, so what's a bit of extra metadata between friends? As for "people," it's the users who are getting a good deal out of the extra work you might do to label your content.
In short, the claim that "people should just find me" is a bit like building an all-graphics site and hoping people will find you when they search for "guitar pick." Or sitting on your back porch strumming "Galveston" and praying you'll be invited onto American Idol.
- Tags or labels are indispensable when it comes to some kinds of content, such as videos or photos.
- If something is useful or popular enough, depending on the community, third-party tagging can be helpful. What's the incentive to do this? Interesting question. What's my incentive to type this sentence? But yes, I think there is a huge pile of unlabeled stuff that will probably stay unlabeled because there is no incentive to label it. That doesn't mean search engines aren't going to try to "organize it and make it universally accessible."
- The article's general tone seems to suggest that the search engines are stingy about "sending their robots around." Far from it! Even relatively unpopular sites are spidered frequently nowadays.
- Search engines have advanced in many ways over the past few years. One of them is their sheer storage capacity. Index size is a huge challenge, which brings us to:
- The claim that corporate search engines are doing a better job of letting publishers take the lazy way out is a bit odd. It's a much smaller dataset, so stuff is much easier to find. But I'll grant that it is interesting that some of these technologies are quite good at recognizing industry-specific patterns, and autocategorizing content -- no user tagging required. But that's a whole internal debate in the info retrieval field. I'm sure some companies use human categorization!
- Search is a bit like matchmaking, and the meaning of "search" has expanded. Take the emerging field of local search. Now add the premise that "refine is the new search" (I don't think it really is on its own, but users definitely want to be able to "drill down" to exactly what they want by telling the search engine more). And hey, why not toss in the idea of geolocation & mapping. So I'm a user and I'm looking for a hardware store, let's say. Let's say I also want to find a hardware store that sells a certain brand of doorbell. I'd prefer it be within 20 minutes' driving distance. And I want to find one that is "open 24 hrs." (just for argument's sake). None of that is ever going to be findable without a huge amount of research, unless of course the "publisher" (the hardware store owner) is willing to upload that information in a structured format. By uploading that info, you help buyer and seller connect more easily. By not uploading it, you choose "not to be on the map." It's your choice.
- Things like Google Base are arguably research projects to help Google find out which categorization schemas are common in a given industry, or in a whole category like brick and mortar retail. (If "open 24 hrs." is a common one, then maybe it'll come up more often in search and navigation databases as a yes/no item down the road, let's say.)
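To make the hardware store example concrete, here's a rough sketch in Python of why structured fields matter. Everything in it is made up for illustration: the `StoreListing` fields, the store names, the "Acme Chimes" brand, and the idea that driving time arrives as a precomputed number. It is not how Google Base or any real engine stores listings; it only shows that once a publisher supplies the fields, "refine" becomes a simple filter.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class StoreListing:
    """Hypothetical structured listing a store owner might upload."""
    name: str
    brands: FrozenSet[str]   # brands the store says it carries
    drive_minutes: int       # assumed: travel time already computed for this searcher
    open_24_hours: bool      # the yes/no item from the example


def refine(listings: List[StoreListing], brand: str,
           max_drive_minutes: int, require_24_hours: bool) -> List[StoreListing]:
    """Keep only the listings that satisfy every refinement the user asked for."""
    return [
        store for store in listings
        if brand in store.brands
        and store.drive_minutes <= max_drive_minutes
        and (store.open_24_hours or not require_24_hours)
    ]


# Invented sample data, purely for illustration.
LISTINGS = [
    StoreListing("Elm Street Hardware", frozenset({"Acme Chimes"}), 12, True),
    StoreListing("Harbour Tools", frozenset({"Acme Chimes"}), 35, True),
    StoreListing("Main Street Supply", frozenset({"Acme Chimes"}), 8, False),
    StoreListing("Corner Fixit", frozenset({"Other Brand"}), 5, True),
]

matches = refine(LISTINGS, brand="Acme Chimes",
                 max_drive_minutes=20, require_24_hours=True)
print([store.name for store in matches])
```

A store that never uploaded its brands or hours simply can't appear in `matches`, which is the "not on the map" outcome in miniature.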
Labels: galveston, google base, local search, mapping, metadata, search engines, sitemaps