Meaning-Based Search Redefines Web Sleuthing
Andrew's METAGUIDE - Searching for a Better Way #3 - Meaning-Based Search Engines
This year, several companies have
arrived on the scene promising to improve on traditional keyword
searching through "meaning-based" search technology. Their premise
is a good one: keyword searching is "dumb" and can return many irrelevant
results.
When I search for the word "portal,"
I mean web portal; I don't want to see material about a science
fiction game called Portals. When I'm seeking information about "The
Big Tuna," I don't want to go on a fishing expedition. I just want
to know whether Bill Parcells plans to coach again. (Note to self:
suggest this meaning for "Big Tuna" to the Oingo staff.)
I've recently talked with three companies that are working on this
problem. Two of them, Oingo and Simpli, are developing
proprietary lexicons that let users zero in on a particular
meaning of a given keyword. The third, ejemoni, is developing more
ambitious technology that scans the text of a document and analyzes
the relationships among its words in order to place the document in a
specific category describing what it is about overall.
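To make the lexicon idea concrete, here is a minimal sketch in Python. The real Oingo and Simpli lexicons are proprietary, so everything below (the LEXICON table, its context words, and the function names) is invented for illustration and makes no claim about how either product actually works. It shows only the general shape: a keyword maps to several candidate meanings, the user picks one, and results are kept only if they also mention a word associated with that meaning.

```python
# Hypothetical sense lexicon: keyword -> {meaning: context words}.
# All entries are invented for illustration.
LEXICON = {
    "portal": {
        "web portal": {"web", "site", "internet", "search"},
        "doorway": {"door", "gate", "entrance"},
        "science fiction game": {"game", "fiction", "player"},
    },
}


def senses(keyword):
    """List the meanings the lexicon knows for a keyword (empty if none)."""
    return sorted(LEXICON.get(keyword, {}))


def filter_by_sense(keyword, sense, documents):
    """Keep documents that mention the keyword plus at least one
    context word tied to the chosen meaning."""
    context = LEXICON[keyword][sense]
    kept = []
    for doc in documents:
        words = set(doc.lower().split())
        if keyword in words and words & context:
            kept.append(doc)
    return kept


if __name__ == "__main__":
    docs = [
        "The portal site added internet search",
        "Open the portal in this game",
    ]
    print(senses("portal"))
    print(filter_by_sense("portal", "web portal", docs))
```

A keyword that is missing from the table simply returns no meanings, which is why lexicons of this kind need constant upkeep.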
All of these technologies have potentially widespread applications.
As one contributor to Traffick Forums argued, however, at this stage
they don't do a lot that a conscientious searcher couldn't do for
themselves with the simple use of Boolean operators such as AND
or NOT.
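For readers who have never bothered with Boolean operators, the following Python sketch shows what AND and NOT amount to. The three-document corpus and the function name boolean_search are made up for illustration; real search engines index and match text in far more elaborate ways.

```python
# Hypothetical mini-corpus; ids and text are invented for illustration.
DOCS = {
    "portal-news": "the leading web portal adds news and search",
    "portal-game": "a science fiction game called portals where each portal leads to a new world",
    "tuna-recipe": "grilled big tuna steaks with lemon",
}


def boolean_search(docs, must_have, must_not_have=()):
    """Return ids of documents containing every AND term and no NOT term."""
    hits = []
    for doc_id, text in docs.items():
        words = set(text.split())
        has_all = all(term in words for term in must_have)
        has_banned = any(term in words for term in must_not_have)
        if has_all and not has_banned:
            hits.append(doc_id)
    return sorted(hits)


if __name__ == "__main__":
    # Plain keyword search: both "portal" documents come back.
    print(boolean_search(DOCS, ["portal"]))
    # portal NOT game: the science fiction game drops out.
    print(boolean_search(DOCS, ["portal"], ["game"]))
```

The catch is that the searcher has to guess which words to exclude, which is the chore the meaning-based engines propose to take over.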
Let's take a peek at these three entrants into the meaning-based
search field. We're sure to hear more from them in the future.
Oingo
http://www.oingo.com
Oingo is more typical of a Silicon Valley startup than the other two: it's brash,
young, fun, and likely to outwork you if it can't out-think you.
They've got a working product, and they've got it now. Their team
of linguists has built a large lexicon of common meanings for search
terms, and the company is now offering its technology "open source"
as a front end for any directory or site that wishes to use it.
The default directory being used to demo the service is the ever-present
Open Directory Project.
If you try Oingo, you'll see where they're headed. For the time
being, however, it's not about to replace my favorite search engines.
(Lately, I have been using Google and Ixquick, two that I find tend
to provide highly relevant results without a whole lot of effort
in devising search terms.)
Down the road, however, the Oingo team feels it's only a matter
of time before a major search company finds the technology useful.
This could well be true. Major search and portal companies today
are not shy about adding on a combination of external technologies
to ensure better results. Go2Net uses Direct Hit (the popularity
engine); MSN offers Looksmart directory results; various others
have chosen the Open Directory for categorized results. Meaning-based
search is going to find its way into the mix one way or another.
SimpliFind
http://www.simpli.com
SimpliFind has the same basic idea as Oingo, but seems to have a somewhat heavier
complement of scientific muscle on board from the likes of Brown
and Princeton Universities. Its lexicon, called WordNet, was developed
over a long period of time by cognitive and linguistic scientists
at Princeton.
A test of the product is satisfying. This technology is sure to
find its way into many databases, and might become a force on the
Internet.
Then again, holes in the database reinforce the fact that SimpliFind,
like Oingo, is going to have to rely on considerable customer-driven
customization and brute force to respond to very human twists and
turns in language, history, commerce, and popular culture. I searched
for "Watergate" and Simpli came back with "No Meaning Found." Now
there's some social amnesia for you! (Oingo has them beat on that
one, which underscores the fact that high-level cognitive science
alone won't be enough to make this technology practical.)
One question for the scientists: will XML (eXtensible Markup Language)
make their current approach irrelevant? Tomorrow's
Internet is going to be about more than determining the
different meanings of words in the English dictionary. XML may
allow meanings to become hard-wired to ever more particular contexts,
and so make search technology ever more useful. We would then be
able to search for documents, companies, publications, products,
people, spare parts, geographic locations, stock prices, and so
on, without seeing all the other junk with similar keywords. At
least that's what I read in The Economist magazine.
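Here is a rough sketch of what "hard-wired" meaning could look like in practice, using Python's standard xml.etree.ElementTree module. The catalog markup, the type attribute, and the entries themselves are all hypothetical; no real XML vocabulary is implied. The point is only that once each record declares what kind of thing it is, a search can ask for a company named "portal" and never see the game.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; element names, the "type" attribute, and all
# entries are invented for illustration.
CATALOG = """
<catalog>
  <item type="product"><name>Portals</name><description>Science fiction game</description></item>
  <item type="company"><name>Example Portal Co.</name><description>Runs a web portal</description></item>
  <item type="publication"><name>Example Weekly</name><description>News magazine</description></item>
</catalog>
"""


def find_by_type(xml_text, item_type, keyword):
    """Return names of items of the given type whose name contains keyword."""
    root = ET.fromstring(xml_text.strip())
    matches = []
    for item in root.findall("item"):
        name = item.findtext("name", default="")
        if item.get("type") == item_type and keyword.lower() in name.lower():
            matches.append(name)
    return matches


if __name__ == "__main__":
    print(find_by_type(CATALOG, "company", "portal"))
    print(find_by_type(CATALOG, "product", "portal"))
```

Here the disambiguation comes from the markup rather than from a lexicon, which is why this approach could sidestep the dictionary problem entirely.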
ejemoni
http://www.ejemoni.com
At this point, we can only speculate about the power of ejemoni, another sophisticated
startup working on meaning-based search. Ejemoni is well-financed
by an influential angel investor, and has what some observers believe
may be a major scientific breakthrough on its hands. The core idea
appears to be the ability to find related documents by analyzing
the content of whole documents and placing them into an overall
category similar to the Library of Congress classification system.
A cool feature that may be made possible with ejemoni's technology
is the ability to highlight a whole paragraph or even several paragraphs
of text, and search for related documents based on all of the words
you highlight. The company stresses that the algorithm used by ejemoni
will not simply be looking at keyword density but will genuinely
analyze the meanings of documents based on word relationships. Obviously,
there is a lot of potential in a search technology that works better
as you feed it more words. It might even be able to approximate
what Ask Jeeves only pretends to do, which is to understand your
questions! And yes, Jeeves, ejemoni is already voice-recognition
ready, according to the company.
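Since ejemoni's algorithm is not public, the Python sketch below does not attempt to reproduce it. It swaps in a much simpler, well-known stand-in (word-count vectors compared by cosine similarity), which is closer to the keyword-overlap approach the company says it goes beyond. The function names and sample texts are invented. It is included only to illustrate the query-by-passage idea: the highlighted text becomes the query, and a longer passage gives the ranking more evidence to work with.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_passage, documents):
    """Order documents from most to least similar to the passage."""
    query = vectorize(query_passage)
    scored = [(cosine(query, vectorize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


if __name__ == "__main__":
    docs = [
        "coach parcells may return to football coaching",
        "fresh tuna fishing trips off the coast",
    ]
    print(rank("will the football coach return", docs)[0])
```

A system that genuinely analyzed word relationships would need to recognize that "coach" and "coaching" are related, which plain word counts like these cannot do.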
At this stage, it's too early to see ejemoni in action. We'll catch
up with this one again later.
Sites featured in this article
Oingo - http://www.oingo.com
SimpliFind - http://www.simpli.com
ejemoni - http://www.ejemoni.com
READ THE REST OF THE SERIES: Searching for a Better Way
1. Popularity Engines
2. Meta Search Engines
3. Meaning-Based Search Engines
4. Natural-Language Search Engines
5. Expert Guide Sites
6. Pay-Per-Click Search Engines