Monday, June 29, 2009

Ascot and Ballmer

Recently I watched the video above, in which Steve Ballmer explains his view of innovation in search.

Here is the transcript of the interesting part:

"The truth of the matter is: there's gonna be more innovation in search in the next 10 years than the last 10. It's still true today that the average search, at least in English, is 2.2 words 'cause we've all figured out that if you type more into a search engine, you get worse results. That's crazy. Search engine should be able to do better, if they understand more about what you mean, if they understand more about the semantics of the documents. The user interface in search hasn't changed in years. We're exploring some changes."

For the first time, I told myself that Ballmer was right.

He's right on the following 4 points:

1- "Innovation in search"
2- "Paradox in search"
3- "Semantics in search"
4- "Interface in search"

1- is related to the Singularity. According to the law of accelerating returns, there is a paradigm shift every decade. In 1999, we got Google with PageRank. In 2009, we got Wolfram and Bing. But neither Wolfram nor Bing has induced any change. They still rely on the search bar paradigm (*). To me, the next paradigm will be either "Natural Language Processing" or a "Personal PageRank" (= a ranking algorithm that assesses the relevance of a given document not against the majority of users but against a specific user profile).
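To illustrate what ranking "against a specific user profile" could mean, here is a minimal Python sketch. It is not PageRank and not an algorithm from the Ascot Project: the documents, topic weights, profiles and the function names `personal_score` and `rank_for_user` are all hypothetical, and the score is a plain weighted sum chosen only for simplicity.

```python
# Hypothetical documents, each described by topic weights (made-up data).
DOC_TOPICS = {
    "jaguar-animal": {"wildlife": 1.0},
    "jaguar-car": {"cars": 1.0},
    "jaguar-os": {"software": 1.0},
}

def personal_score(doc_topics, profile):
    """Weighted sum of the user's interest in each topic of the document."""
    return sum(weight * profile.get(topic, 0.0)
               for topic, weight in doc_topics.items())

def rank_for_user(docs, profile):
    """Order document ids by personal score, highest first (ties by id)."""
    return sorted(docs, key=lambda d: (-personal_score(docs[d], profile), d))

# Two hypothetical users typing the same query, "jaguar".
developer = {"software": 0.9, "cars": 0.1}
biologist = {"wildlife": 0.8}

print(rank_for_user(DOC_TOPICS, developer))
print(rank_for_user(DOC_TOPICS, biologist))
```

The same query gives a different ordering for each profile, which is the whole idea: relevance is measured against one user rather than against the crowd.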

2- The more keywords you input, the more information you input, the more accurate the results should be. The main reason for that paradox, at least from my perspective, is that current search engines still use a Boolean way of handling the keywords. One consequence is that each keyword is regarded as a new condition. But in fact, users do not necessarily want to specify additional conditions.
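The paradox in point 2- can be reproduced on a toy corpus. The Python sketch below is only an illustration under simple assumptions: the four documents are made up, `boolean_and` treats every keyword as a mandatory condition, and `ranked_overlap` is one possible alternative (counting matched keywords), not the method used by any real search engine or by the Ascot Project.

```python
# Toy corpus: document id -> text (hypothetical data).
DOCS = {
    "d1": "cheap flights to paris",
    "d2": "paris hotel deals",
    "d3": "cheap hotel in paris near the louvre",
    "d4": "train tickets to london",
}

def tokenize(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())

def boolean_and(query, docs=DOCS):
    """Return ids of documents that contain every query keyword."""
    terms = tokenize(query)
    return sorted(d for d, text in docs.items() if terms <= tokenize(text))

def ranked_overlap(query, docs=DOCS):
    """Rank documents by number of matched keywords, with no hard filter."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(text)), d) for d, text in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [d for score, d in scored if score > 0]

for q in ["paris", "cheap paris", "cheap paris hotel", "cheap paris hotel breakfast"]:
    print(q, "->", boolean_and(q), "|", ranked_overlap(q))
```

With the Boolean handling, each extra keyword can only shrink the result set, and the four-keyword query returns nothing at all. With the overlap ranking, the extra keywords refine the order instead of eliminating documents.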

3- Search engines should be able to understand the meaning of the user's request instead of relying on keywords that can be ambiguous. That's why we get false positive matches if we are not cautious enough.

4- "Interface in search": Current search engines still use that narrow, flattened small window where users input ASCII keywords.

Neither 2-, nor 3-, nor 4- is solved by Bing.

Are they solved by the Ascot Project? 3- is not, definitely. We do not deal with semantics at all. I think that the real solution will come from natural language understanding, as a part of natural language processing.
For 2-, we have a solution. For 4-, we have one as well :)

We are not pretending to have invented the next paradigm, because we haven't. But we think we have something that represents an additional step towards natural language processing. From now on, users will be able to input as many keywords as they want...

(*) The search bar paradigm consists of the AND/OR relations and the search bar itself.


  1. I don't think natural language queries are coming soon. The main problem is not really the algorithms themselves, it's the resources (once again we end up talking about them). The problem is that web resources, especially HTML pages, are really difficult to parse semantically: dozens of DTDs, non-valid pages, images containing text and, more specifically, language errors (syntax, misspellings, drunk people writing their blogs, etc.). Therefore I really think the concept of personal robots will be the next big move, for multiple reasons:
    - No need to rewrite the ranking algorithm
    - Easier to get personal data from users, hence easier to serve them targeted ads, hence more money
    - Will help to speed up the development of the Semantic Web.

  2. But the idea of robots is not new. They are already used to parse documents.
    Or are you talking about robots that have a certain artificial intelligence?

  3. Exactly: robots that know what I want and what I am interested in, and that provide me with what I really need, even without my searching for it.

  4. How would these robots be able to do any better than the current algorithms and NLP analyzers if they rely on the same methods of analyzing the web?

    I think the post is very straightforward, recognizing honestly what the Ascot Project can do and cannot do.