Machine-learning techniques could revolutionize materials science

Researchers are using computer modeling and machine-learning techniques to generate libraries of candidate materials by the tens of thousands. Even data from failed experiments can provide useful input1. Many of these candidates are completely hypothetical, but engineers are already beginning to shortlist those that are worth synthesizing and testing for specific applications by searching through their predicted properties — for example, how well they will work as a conductor or an insulator, whether they will act as a magnet, and how much heat and pressure they can withstand.
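As a rough illustration of what searching a library by predicted properties can look like, here is a minimal sketch. The `Candidate` record, its field names, the thresholds and the four placeholder entries are all assumptions made for this example; they are not taken from any of the databases discussed in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    formula: str        # placeholder label, not a real compound
    band_gap_ev: float  # predicted band gap; 0.0 means metallic (a conductor)
    is_magnetic: bool   # predicted magnetic ordering
    max_temp_k: float   # predicted temperature the material can withstand


def shortlist(library, min_gap_ev, min_temp_k, magnetic=None):
    """Return candidates that meet every threshold, widest band gap first."""
    hits = [
        c for c in library
        if c.band_gap_ev >= min_gap_ev
        and c.max_temp_k >= min_temp_k
        and (magnetic is None or c.is_magnetic == magnetic)
    ]
    return sorted(hits, key=lambda c: c.band_gap_ev, reverse=True)


# Invented numbers, for illustration only.
LIBRARY = [
    Candidate("A2B", 0.0, True, 900.0),
    Candidate("AB3", 3.2, False, 1500.0),
    Candidate("A3C", 1.1, False, 700.0),
    Candidate("BC2", 4.5, False, 1300.0),
]

# Look for heat-tolerant insulators: wide band gap, high temperature limit.
insulators = shortlist(LIBRARY, min_gap_ev=3.0, min_temp_k=1200.0)
print([c.formula for c in insulators])
```

A real screen would run over tens of thousands of entries and many more properties, but the principle is the same: filter on predictions first, then synthesize only what survives.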

The hope is that this approach will provide a huge leap in the speed and efficiency of materials discovery, says Gerbrand Ceder, a materials scientist at the University of California, Berkeley, and a pioneer in this field. “We probably know about 1% of the properties of existing materials,” he says, pointing to the example of lithium iron phosphate: a compound that was first synthesized2 in the 1930s, but was not recognized3 as a promising replacement material for current-generation lithium-ion batteries until 1996. “No one had bothered to measure its voltage before,” says Ceder.

At least three major materials databases already exist around the world, each encompassing tens or hundreds of thousands of compounds. Marzari’s Lausanne-based Materials Cloud project is scheduled to launch later this year. And the wider community is beginning to take notice. “We are now seeing a real convergence of what experimentalists want and what theorists can deliver,” says Neil Alford, a materials scientist who serves as vice-dean for research at Imperial College London, but who has no affiliation with any of the database projects.

However, the journey from computer predictions to real-world technologies is not an easy one. The existing databases are far from including all known materials, let alone all possible ones. The data-driven discovery works well for some materials, but not for others. And even after an interesting material is singled out on a computer, synthesizing it in a laboratory can still take years. “We often know better what we should be making than how to make it,” says Ceder.

In 2003, Ceder and his team first showed how a database of quantum-mechanics calculations could help to predict the most likely crystal structure of a metal alloy — a key step for anyone in the business of inventing new materials.

In the past, these calculations had been long and difficult, even for supercomputers. The machine had to go through an inordinate amount of trial and error to find the ‘ground state’: the crystal structure and electron configuration in which the energy was at a minimum and all the forces were in equilibrium. But in their 2003 paper, Ceder’s team described a shortcut. The researchers calculated the energies of common crystal structures for a small library of binary alloys (mixes of two different metals) and then designed a machine-learning algorithm that could extract patterns from the library and guess the most likely ground state for a new alloy. The algorithm worked well, slashing the computer time required for the calculations.
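The general idea of learning from a library of computed energies can be sketched with a toy example. This is not the algorithm from the 2003 paper: it swaps in a plain nearest-neighbour vote, and the element labels, descriptors and energies are invented for illustration.

```python
from collections import Counter

# Invented energies (eV per atom) of three common structures for a small
# library of binary alloys. Labels and numbers are illustrative only.
ENERGIES = {
    ("A", "B"): {"fcc": -0.30, "bcc": -0.10, "hcp": -0.20},
    ("A", "C"): {"fcc": -0.25, "bcc": -0.05, "hcp": -0.22},
    ("D", "B"): {"fcc": -0.02, "bcc": -0.40, "hcp": -0.15},
    ("D", "C"): {"fcc": -0.04, "bcc": -0.35, "hcp": -0.11},
}

# Hypothetical per-element descriptors: (atomic radius, electronegativity).
DESCRIPTORS = {
    "A": (1.40, 1.9),
    "B": (1.30, 1.6),
    "C": (1.20, 1.7),
    "D": (1.60, 1.1),
    "E": (1.45, 1.8),
}


def ground_state(energies):
    """The ground state is the structure with the lowest energy."""
    return min(energies, key=energies.get)


def features(alloy):
    """Concatenate the descriptors of the two elements in the alloy."""
    return DESCRIPTORS[alloy[0]] + DESCRIPTORS[alloy[1]]


def distance(a, b):
    """Euclidean distance between two alloys in descriptor space."""
    return sum((x - y) ** 2 for x, y in zip(features(a), features(b))) ** 0.5


def guess_ground_state(new_alloy, k=3):
    """Let the k most similar library alloys vote on the likely ground state."""
    neighbours = sorted(ENERGIES, key=lambda known: distance(new_alloy, known))[:k]
    votes = Counter(ground_state(ENERGIES[n]) for n in neighbours)
    return votes.most_common(1)[0][0]


prediction = guess_ground_state(("E", "B"))
print(prediction)
```

The guess costs almost nothing to compute, so the expensive quantum-mechanical calculation only has to confirm one or two likely structures instead of searching through all of them.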

The idea then gave birth to two separate projects. In 2006, Ceder started the Materials Genome Project at MIT, using improved versions of the algorithm to predict lithium-based materials for electric-car batteries. By 2010, the project had grown to include around 20,000 predicted compounds. “We started from existing materials and modified their crystal structure — changing one element here or another one there and calculating what happens,” says Kristin Persson, a former member of Ceder’s team who continued to collaborate on the project after she moved to the Lawrence Berkeley National Laboratory in California in 2008.
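The substitution step Persson describes can be sketched as follows. The parent compound is the lithium iron phosphate mentioned earlier; the list of swaps and the helper names are assumptions chosen for this example, not the project's actual rules, and the step that calculates the properties of each variant is left out.

```python
def substitute(parent, swaps):
    """Yield variants of `parent` with exactly one element replaced."""
    for i, (element, count) in enumerate(parent):
        for replacement in swaps.get(element, ()):
            variant = list(parent)
            variant[i] = (replacement, count)
            yield tuple(variant)


def formula(composition):
    """Render a composition such as (("Li", 1), ("O", 4)) as 'LiO4'."""
    return "".join(f"{el}{n if n != 1 else ''}" for el, n in composition)


# Parent compound: LiFePO4, written as (element, count) pairs.
PARENT = (("Li", 1), ("Fe", 1), ("P", 1), ("O", 4))

# Assumed substitution choices, for illustration only.
SWAPS = {"Fe": ("Mn", "Co", "Ni")}

candidates = [formula(v) for v in substitute(PARENT, SWAPS)]
print(candidates)
```

Each variant would then go to a quantum-mechanical calculation to see whether it is stable and what properties it is predicted to have.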

In June 2011, the White House announced the multimillion-dollar Materials Genome Initiative (MGI). The initiative has invested more than US$250 million into software tools, standardized methods to collect and report experimental data, centres for computational materials science at major universities and partnerships between universities and the business sector for research on specific applications.

Materials databases are works in progress, and their curators spend a good share of their time adding more compounds and refining the calculations, which, they admit, are far from perfect. The codes tend to be quite good at predicting whether a crystal is stable or not, but less good at predicting how it absorbs light or conducts electricity, to the point of sometimes making a semiconductor look like a metal. Marzari notes that even for battery materials, the area in which computational materials science has had its greatest successes, standard calculations still have an average error of half a volt, which makes a large difference to performance. “The truth is, some errors come with the theory itself: we may never be able to correct them,” says Curtarolo.

There are still many hurdles to overcome before materials genomics can live up to its promises. One of the largest is that computer simulations still give few clues about how an interesting material can be made in a lab, let alone mass-produced. “We come up with interesting ideas for new compounds all the time,” says Ceder. “Sometimes it takes two weeks to make it. Other times we still can’t make it after six months, and we don’t know whether we haven’t done the right thing, or it just can’t be made.”

Both Ceder and Curtarolo are trying to develop machine-learning algorithms to extract rules from known manufacturing processes to guide the synthesis of compounds.

Another limitation is that materials genomics has so far been applied almost exclusively to what engineers call functional materials: compounds that can perform a task such as absorbing light in a solar cell or letting electrical current pass in a transistor. The technique does not lend itself well to studying structural materials, such as steel, that are needed to build, for example, aircraft wings, bridges or engines. This is because mechanical properties such as a material’s springiness and hardness depend on how it is processed, something that quantum-mechanical codes by themselves cannot describe.

Source: Nature