New data services 2: Wolfram|Alpha

In March this year, überscientist Stephen Wolfram, of Mathematica fame, revealed to the world that he was working on something new, something big, something different. I first heard of this through semantic web prophet Nova Spivack, who is not known to get excited by less-than-revolutionary projects. That, plus the fact that the project was announced so shortly before its release, helped build anticipation to huge levels.


Wolfram|Alpha describes itself as a “computational knowledge engine” or, simply put, as an “answer engine”. Like Google and other search engines, it tries to provide information based on a query. But while search engines simply try to find the keywords of the query in their indexed pages, the answer engine tries to understand the query as a question and form an educated answer. In a sense, this is similar to the Freebase project, which aims to put all the world’s knowledge in a database where links can be established across items.

It attempts to detect the nature of each word in the query. Is that a city? A mathematical formula? A foodstuff? An economic variable? Once it understands the terms of the query, it gives the user all the data it can to answer.

Here for instance:

[Image: wolframalpha-2]
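The idea of detecting the nature of each term can be pictured with a toy sketch. This is only an illustration, not how Wolfram|Alpha actually works: the lookup table `KNOWN_TERMS`, the regular expression and the function names are all made up for the example, and a real engine would rely on far larger curated data and proper language parsing.

```python
import re

# Hypothetical, hand-made lookup table (an assumption for this sketch).
KNOWN_TERMS = {
    "paris": "city",
    "tokyo": "city",
    "banana": "foodstuff",
    "rice": "foodstuff",
    "gdp": "economic variable",
    "inflation": "economic variable",
}

# Very rough pattern for something that looks like a formula:
# only digits, the letters x and y, and a few operators.
FORMULA_PATTERN = re.compile(r"^[0-9xy+\-*/^().=]+$")


def classify_term(term):
    """Guess the nature of a single term of the query."""
    key = term.lower()
    if key in KNOWN_TERMS:
        return KNOWN_TERMS[key]
    looks_numeric = any(ch.isdigit() or ch in "+-*/^=" for ch in key)
    if FORMULA_PATTERN.match(key) and looks_numeric:
        return "mathematical formula"
    return "unknown"


def classify_query(query):
    """Return a (term, guessed nature) pair for each word of the query."""
    return [(term, classify_term(term)) for term in query.split()]


if __name__ == "__main__":
    print(classify_query("gdp Paris"))
    print(classify_query("x^2+1"))
```

Once every term has a guessed nature, the engine can decide which data to pull together for the answer. That second step is left out of this sketch.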

Using the same “find, access, process, present, share” diagram as before:

Wolfram|Alpha’s got “find” covered. More about that below.

It lets you access the data. If data have been used to produce a chart, then there is a query that will retrieve those bare numbers in a table format.

Process is perhaps Wolfram|Alpha’s forte. It will internally reformulate and cook your query to produce all the meaningful outputs it is capable of.

The presentation is excellent. It is very legible, consistent across the site, efficient and unpretentious. When charts are provided, which is often, they are small but both relevant and informative: only the necessary data are plotted. This is unusual enough to be worth mentioning.

Wolfram|Alpha doesn’t allow people to share its outputs per se, but since a given query will produce consistent results, users can simply exchange queries or pass along links to a successful query result.
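As a rough illustration of exchanging queries as links, a query can be turned into a URL with a few lines of Python. The URL pattern used here (`https://www.wolframalpha.com/input/?i=...`) is an assumption based on what result pages show in the browser’s address bar, and `build_query_link` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base address of a result page (not taken from any official documentation).
BASE_URL = "https://www.wolframalpha.com/input/"


def build_query_link(query):
    """Encode a plain-text query as a link that can be sent to someone else."""
    # urlencode escapes spaces and special characters so the link stays valid.
    return BASE_URL + "?" + urlencode({"i": query})


if __name__ == "__main__":
    print(build_query_link("gdp france"))
```

Because the link contains nothing but the query, the recipient sees whatever the engine returns for that query at the time they open it.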

Now back to finding data.

When a user submits a query, the engine does not query external sources of data in real time. Rather, it uses its internal, Freebase-like database. This, in turn, is updated from external sources when possible.
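The difference between querying external sources in real time and answering from an internal store can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of Wolfram|Alpha’s internals: the class `CuratedStore`, the function `fetch_from_external_source` and the placeholder data are all invented for the example.

```python
class CuratedStore:
    """Toy internal database: answers come from here, never from the outside."""

    def __init__(self):
        self._facts = {}

    def refresh(self, fetch):
        """Update the store from an external source (a callable returning a dict)."""
        self._facts.update(fetch())

    def answer(self, entity, attribute):
        """Look up a fact locally; returns None when the store has no data."""
        return self._facts.get((entity, attribute))


# Counter used to show how often the external source is actually contacted.
calls = {"count": 0}


def fetch_from_external_source():
    """Stand-in for a statistical office or any other outside source."""
    calls["count"] += 1
    return {("example country", "population"): "placeholder value"}


store = CuratedStore()
store.refresh(fetch_from_external_source)  # happens occasionally, not per query

first = store.answer("example country", "population")
second = store.answer("example country", "population")
print(first, second, calls["count"])
```

Both answers come from the internal store, and the external source is contacted only when `refresh` is called. The price of this design is that an answer is only as fresh as the last update.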

For each query, sources are available. Unfortunately, they are listed only for general categories. For instance, all country-related information shares the same list of sources: some are accurate and dependable (national or international statistical offices), while others are less reliable or harder to verify (such as the CIA World Factbook, or what’s cited as “Wolfram|Alpha curated data, 2009”). And to me that’s the big flaw of this otherwise impressive system.

Granted, coverage is not perfect, but that can only improve. The syntax is not always intuitive, and getting results to appear in a particular way can be very elusive, but this, too, will gradually get better. Being able to verify the data presented is another matter: either it is possible or it is not. I’m really looking forward to this.

 

jerome

 
