Date: 5-07-2019, 05:55

An artificial intelligence system with no built-in knowledge of chemistry has rediscovered the periodic table and pointed scientists to promising new materials. To do this, it analyzed 3.3 million abstracts of scientific papers.
The achievement is described in a paper published in the journal Nature by a group led by Anubhav Jain of Lawrence Berkeley National Laboratory, USA.
There is a sad joke among scientists today that it is easier to rediscover something than to find information about it. According to an estimate made five years ago, 114 million (!) English-language scientific publications were available on the Internet, and many new ones are added every day.
Even in narrow fields, whether the study of the solar wind or of thermoelectric materials, so many articles are published that a researcher could not read them all even by doing nothing else from morning to evening, every day.
The scientific community is trying to solve this problem by building more sophisticated search engines, databases and automatic analysis tools. For now, though, keeping up with everything happening in one's field still demands an excessive amount of work from a scientist.
Jain's team contributed to solving this problem by creating an artificial intelligence system based on Word2vec technology.
This method is purely linguistic in nature. Each word is represented as a set of n numbers (coordinates); in other words, it becomes a point in n-dimensional space.
The computer tracks how often words occur near one another in the text and assigns their coordinates on that basis. The assumption is that words with nearby coordinates have similar meanings.
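The idea of turning co-occurrence into coordinates can be sketched in a few lines of Python. This is a deliberately simplified, count-based illustration, not Word2vec itself (which learns its coordinates by training a small neural network to predict neighboring words). The function names, the window size and the toy sentences are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import math

def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up toy corpus (hypothetical sentences, not real abstracts).
toy_corpus = [
    "sodium is a reactive alkali metal",
    "potassium is a reactive alkali metal",
    "argon is an inert noble gas",
    "neon is an inert noble gas",
]
vectors = cooccurrence_vectors(toy_corpus)
sim_alkali = cosine(vectors["sodium"], vectors["potassium"])
sim_mixed = cosine(vectors["sodium"], vectors["argon"])
```

In this toy corpus "sodium" and "potassium" appear in identical contexts, so their vectors are more similar to each other than either is to "argon", even though the code knows nothing about chemistry.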
Jain and colleagues wanted to know how this approach would cope with the scientific literature on materials science. They were particularly interested in thermoelectric materials, which convert temperature differences into electrical voltage (or vice versa).
The researchers fed the system 3.3 million abstracts of scientific articles published in more than a thousand journals between 1922 and 2018. The system identified about half a million distinct words in them and represented each word as a set of two hundred coordinates.
The authors emphasize that no information about chemistry or physics was built into the program. The system learned all of its “knowledge” from the abstracts. That makes the results all the more surprising.
For example, the researchers looked at the coordinates that the name of each chemical element received in the 200-dimensional space. Projecting this picture onto a plane, they obtained a kind of periodic table. The elements were grouped by their nature: noble gases together, alkali metals together, diatomic non-metals together, and so on.
Left: elements grouped by artificial intelligence. Right: the same groups in the periodic table.
Berkeley Lab illustration.
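Projecting 200-dimensional coordinates onto a plane can be done in several ways, and the article does not say which one the authors used. The sketch below assumes principal component analysis (PCA) computed with NumPy's SVD, and it uses randomly generated stand-in vectors rather than real element embeddings. The function name `project_to_plane` is hypothetical.

```python
import numpy as np

def project_to_plane(vectors):
    """Project row vectors onto their first two principal components."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
# Hypothetical stand-ins for 200-dimensional word vectors: two tight clusters,
# imitating two families of elements whose names occur in similar contexts.
center_a = rng.normal(size=200)
center_b = rng.normal(size=200)
group_a = center_a + 0.01 * rng.normal(size=(4, 200))
group_b = center_b + 0.01 * rng.normal(size=(4, 200))
points = project_to_plane(np.vstack([group_a, group_b]))
```

Each row of `points` is a pair of plane coordinates. Vectors that were close together in the original space stay close together in the projection, which is why families of similar elements can show up as visible groups.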
“Without knowing anything about materials science, [the program] learned concepts such as the periodic table [of Mendeleev] and the crystal structure of metals,” says Jain.
If the computer has mastered materials science so well, can it pick out effective thermoelectrics from among numerous materials? The authors tested this by searching for the names of substances whose coordinates lie as close as possible to the word "thermoelectric".
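A search of this kind can be sketched as ranking candidate vectors by their cosine similarity to the vector of a target word. This is a minimal sketch and assumes cosine similarity as the measure of closeness. The three-dimensional vectors and the placeholder names `material_a`, `material_b` and `material_c` are made up for illustration and do not come from the study.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def rank_by_similarity(target, candidates, top_n=10):
    """Return the top_n (name, similarity) pairs, most similar first."""
    scored = [(name, cosine(target, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]

# Made-up 3-dimensional vectors (the study used 200 dimensions).
thermoelectric = [1.0, 0.0, 0.0]
candidates = {
    "material_a": [0.9, 0.1, 0.0],
    "material_b": [0.0, 1.0, 0.0],
    "material_c": [0.5, 0.5, 0.0],
}
ranking = rank_by_similarity(thermoelectric, candidates, top_n=2)
```

Here `material_a` ranks first because its vector points in almost the same direction as the target word's vector, and `material_b`, which is orthogonal to it, is left out of the top two.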
The program produced a top-10 list of materials. For each of them, the researchers calculated the power factor, a quantity that characterizes how effective the material is as a thermoelectric.
It turned out that for every selected substance this value was higher than the average for all known thermoelectric compounds. For the top three materials, it was higher than for 95% of known thermoelectrics.
But the power factor is not the only thing that matters for a promising thermoelectric. The substance must also be inexpensive, safe and easy to manufacture. Does the artificial intelligence take these properties into account? And can it predict which substances experts should focus on in future studies? According to the authors, yes.
For example, the scientists gave the system the same task, finding thermoelectrics, twice. The first time they provided it with publications from before 2008, and the second time with publications up to 2018.
In the first run, the system selected a top five of materials. Three of them also appeared at the top when publications up to 2018 were used, and the other two contained rare or toxic elements.
According to the researchers' calculations, the system identified materials that experts would turn their attention to over the next ten years four times more often than it would have by picking at random.
"This study shows that if this algorithm had been in use earlier, some materials could have been discovered years sooner," says Jain.
In other words, the information about promising compounds was already contained in scientific articles, but the community did not notice it in time.
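A comparison such as "four times more often than chance" can be read as a ratio of two hit rates: the share of predicted materials that were later studied, divided by the share of all candidate materials that were later studied. The sketch below assumes that reading. The function name `enrichment` and all counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
def enrichment(predicted, later_studied, candidates):
    """Ratio of the hit rate among predictions to the hit rate of a random pick."""
    hit_rate = len(predicted & later_studied) / len(predicted)
    base_rate = len(candidates & later_studied) / len(candidates)
    return hit_rate / base_rate

# Hypothetical counts, not taken from the study.
candidates = {f"m{i}" for i in range(100)}       # all materials considered
later_studied = {f"m{i}" for i in range(10)}     # 10 of the 100 were studied later
predicted = {"m0", "m1", "m50", "m51", "m52"}    # 2 of these 5 were studied later
ratio = enrichment(predicted, later_studied, candidates)
```

With these made-up counts the hit rate among predictions is 2/5 and the base rate is 10/100, so the predictions do four times better than a random pick.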
Together with the results of such retrospective "predictions", the authors publish