"People who assume that Google has everything really miss relevant items," said Tamas Doszkocs, a computer scientist at the National Library of Medicine (NLM). He has been working for almost a decade on a metasearch engine called ToxSeek, which scours toxicology and environmental health databases at government agencies. The site, accessible during its beta-testing phase, is scheduled to launch later this year.
Beyond its metasearch capabilities, ToxSeek uses clustering, another new search technique. With clustering, algorithms sort search results into groups based on textual and linguistic similarities.
For example, a ToxSeek user could search for "cancer" and "smoking," and the system would return results categorized by a variety of subheads, including the information's source, topic and type.
Clustering surfaces results that would otherwise appear near the end of ranked lists, and it lets users survey the information landscape before digging in.
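To make the idea concrete, here is a minimal sketch of grouping search results by textual similarity. It is not ToxSeek's algorithm, which the article does not describe. It assumes a simple word-overlap (Jaccard) measure and a greedy single-pass grouping, and the function names, the stopword list, the 0.25 threshold and the sample result titles are all made up for illustration.

```python
import re

# Illustrative stopword list (an assumption, not taken from any real system).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "with"}

def tokens(text):
    """Lowercase a snippet and return its set of non-stopword terms."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Fraction of terms shared by two token sets (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_results(snippets, threshold=0.25):
    """Greedy single-pass clustering: each snippet joins the first cluster
    whose seed snippet is similar enough; otherwise it starts a new cluster."""
    clusters = []  # list of (seed_tokens, member_snippets)
    for snippet in snippets:
        terms = tokens(snippet)
        for seed_terms, members in clusters:
            if jaccard(terms, seed_terms) >= threshold:
                members.append(snippet)
                break
        else:
            clusters.append((terms, [snippet]))
    return [members for _, members in clusters]

# Hypothetical result titles, invented for this example.
results = [
    "Smoking and lung cancer risk in adults",
    "Lung cancer risk among smoking adults",
    "Benzene exposure limits in drinking water",
    "Drinking water exposure limits for benzene",
]
groups = cluster_results(results)
for i, group in enumerate(groups, start=1):
    print(f"Cluster {i}: {group}")
```

With these four titles, the two smoking items land in one group and the two benzene items in another. A production engine would likely use richer linguistic features and label each group, but the principle of grouping by shared vocabulary is the same.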
One of the earliest adopters of clustering in the government is the Homeland Security Digital Library. The library, maintained by the Homeland Security Department and Naval Postgraduate School, deployed a version of ToxSeek more than six months ago.
ToxSeek has been released as an official service of the National Library of Medicine.