OK, I know this is a bit much – three posts in a row related to Powerset, with a Part 2 to the Powerset demo post still to come. But Technology Review just published an article on Powerset, and there are a few interesting nuggets in it.
A key component of the search engine is a deep natural-language processing system that extracts the relationships between words; the system was developed from PARC’s Xerox Linguistic Environment (XLE) platform. The framework that this platform is based on, called Lexical Functional Grammar, enabled the team to write different grammar engines that help the search engine understand text. This includes a robust, broad-coverage grammar engine written by PARC.
The article also mentions a semantic search engine being developed by IBM.
IBM is also in the midst of developing a semantic search engine, code-named Avatar, which is targeted at enterprise and corporate customers; it’s currently in beta testing within IBM. Project manager Shivakumar Vaithyanathan says that the hardest problems to overcome with natural-language search are finding a way to extract higher-level semantics from large documents while at the same time preserving precision and speed.