1. The Tokyo Institute of Technology announced the details of “Tsubame 2.0,” the university’s next-generation multi-petaflops supercomputer, which will start operation in the fall of 2010.
The computation capacity of the system is 2.39 PFLOPS (petaflops, double-precision), which would rank second in the “Top500” supercomputer ranking as of June 2010.
“It will be the first petaflops computer in Japan,” said Satoshi Matsuoka, professor at the Global Scientific Information and Computing Center (GSIC) of the university. “And it will be the first world-class supercomputer system for our university.”
Construction of the system, to be carried out by NEC Corp and Hewlett-Packard Co, has not yet begun.
The system has a “vector-scalar mixture architecture,” Matsuoka said. But its graphics processing units (GPUs) account for 90% of the total computation capacity, making the system behave more like a vector computer.
For the last three years, I.B.M. scientists have been developing what they expect will be the world’s most advanced “question answering” machine, able to understand a question posed in everyday human elocution — “natural language,” as computer scientists call it — and respond with a precise, factual answer. In other words, it must do more than what search engines like Google and Bing do, which is merely point to a document where you might find the answer. It has to pluck out the correct answer itself.
The producers of “Jeopardy!” have agreed to pit Watson against some of the game’s best former players as early as this fall. To test Watson’s capabilities against actual humans, I.B.M.’s scientists began holding live matches last winter.
On one day of test matches against humans, Watson won four of six games.
“Jeopardy!” champions hit the buzzer first about 50% of the time and answer 85–95% of clues correctly.
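As a quick sanity check on those figures, the implied conversion rate — clues a champion both wins the buzz on and then answers correctly — can be worked out directly. The numbers below are just the ones quoted above, not additional data:

```python
# Benchmark implied by the championship statistics above:
# champions buzz in first ~50% of the time and answer 85-95% correctly.
buzz_rate = 0.50
for accuracy in (0.85, 0.95):
    converted = buzz_rate * accuracy
    print(f"accuracy {accuracy:.0%}: clues converted ~ {converted:.1%}")
```

So a champion converts roughly 42–48% of all clues on the board, which is the bar Watson has to clear.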
I.B.M.’s true target is to create search engines better than Google’s and answer engines better than WolframAlpha.
Many of the statistical techniques Watson employs were already well known to computer scientists. What sets Watson apart is its enormous speed and memory. Taking advantage of I.B.M.’s supercomputing heft, Ferrucci’s team fed millions of documents into Watson to build up its knowledge base — including, he says, “books, reference material, any sort of dictionary, thesauri, folksonomies, taxonomies, encyclopedias, any kind of reference material you can imagine getting your hands on or licensing. Novels, bibles, plays.”
No single algorithm can simulate the human ability to parse language and facts. Instead, Watson uses more than a hundred algorithms at the same time to analyze a question in different ways, generating hundreds of possible solutions. Another set of algorithms ranks these answers according to plausibility; for example, if dozens of algorithms working in different directions all arrive at the same answer, it’s more likely to be the right one. In essence, Watson thinks in probabilities.
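A toy sketch of that ranking idea — not I.B.M.’s actual code; the scorers and confidence values below are invented for illustration. Each “algorithm” proposes candidate answers with a confidence score, and candidates that many independent algorithms converge on accumulate the highest total plausibility:

```python
from collections import defaultdict

# Each dict stands in for one of Watson's many independent scoring
# algorithms, mapping candidate answers to a confidence in [0, 1].
# (All names and numbers here are made up for illustration.)
candidates_by_algorithm = [
    {"Toronto": 0.3, "Chicago": 0.6},   # e.g. a geographic scorer
    {"Chicago": 0.7},                   # e.g. a text-passage scorer
    {"Chicago": 0.5, "New York": 0.4},  # e.g. a knowledge-base scorer
]

# Aggregate: answers proposed by several algorithms accumulate plausibility.
scores = defaultdict(float)
for proposals in candidates_by_algorithm:
    for answer, confidence in proposals.items():
        scores[answer] += confidence

# Rank candidates by total plausibility, highest first.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranked[0][0])  # the answer most scorers converge on
```

Here simple summation stands in for Watson’s far more sophisticated learned ranking, but it captures the core intuition: agreement among independent methods raises confidence.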
At first, a Watson system could cost several million dollars, because it needs to run on at least one $1 million I.B.M. server. But Kelly predicts that within 10 years an artificial brain like Watson could run on a much cheaper server, affordable by any small firm, and a few years after that, on a laptop.
* Medical – answering questions about health problems
* Call centers for banks, retail, and government agencies – improving automated voice-response systems
3. Digging into Data projects are expected to run until March 31, 2011, and apply supercomputing power to humanities research. The ultimate goal is to model humanity. Imagine predicting humanity’s social evolution on a massive scale the same way scientists model and predict climate patterns.
* A group of researchers, including some at McGill University, plans to collect 23,000 hours of music across every style and region imaginable and digitally analyze the data to find the underlying structures of global music.
* Another project proposes taking digitized classical Greco-Roman texts and automatically “enriching” them by having a computer run through the text and tag it with links, much the same way text on a website contains hyperlinks to other pages. For example, the word “Athens” would be automatically tagged with a classical map of the city, or a link to the city’s Wikipedia page.
* A computer will examine almost 200,000 recently digitized records of trials from the Old Bailey, London’s central criminal court, and generate assertions based on what it reads – for example, hypotheses about the lives of lower-class London residents, or the punishments for certain crimes.
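The “enrichment” step described in the Greco-Roman text project above could be sketched roughly like this; the gazetteer, URLs, and function name are invented for illustration and are not from the actual project:

```python
import re

# Hypothetical sketch: scan a classical text for known place names and wrap
# each in a hyperlink. A real project would use a much larger gazetteer and
# proper named-entity recognition rather than plain string matching.
gazetteer = {
    "Athens": "https://en.wikipedia.org/wiki/Classical_Athens",
    "Sparta": "https://en.wikipedia.org/wiki/Sparta",
}

def enrich(text: str) -> str:
    # \b word boundaries keep "Athens" from matching inside "Athenian".
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, gazetteer)) + r")\b")
    return pattern.sub(
        lambda m: f'<a href="{gazetteer[m.group(1)]}">{m.group(1)}</a>', text
    )

print(enrich("The envoys sailed from Athens to Sparta."))
```

Each recognized place name comes back wrapped in an HTML link, exactly the kind of hyperlink-style tagging the project describes.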