Nanoinformatics (the use of data science and machine learning to explore the complex structure/property relationships in nanoscale materials) is just starting to emerge.
Machine learning requires no input assumptions: if a relationship (pattern) exists in the data, it will emerge naturally, regardless of whether it was foreseen and targeted in the original series of experiments or simulations.
Humans are very good at visual pattern recognition, and researchers with an intimate familiarity with their material and data would be remiss not to draw on this ability as part of their research.
By reducing the dimensionality of the data to generate 2D maps of the nanoparticles, each encoded by one characterising feature at a time, the researchers could easily compare maps to pick out trends.
Algorithms that can reduce the dimensionality of this kind of data into 2D maps include t-distributed stochastic neighbour embedding (t-SNE) and self-organising maps (SOMs), also referred to as Kohonen networks.
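As an illustration of the general workflow (not the authors' own pipeline), the sketch below uses scikit-learn's t-SNE implementation to project a toy set of high-dimensional "particle descriptors" onto a 2D map; the feature data here is synthetic and stands in for real nanoparticle characterisations.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Toy stand-in for per-particle descriptors: two populations of
# particles, each described by a 10-dimensional feature vector.
features = np.vstack([rng.normal(0.0, 0.1, (15, 10)),
                      rng.normal(1.0, 0.1, (15, 10))])

# Embed the 10-D descriptors into 2-D; each point on the resulting
# map can then be coloured by one characterising feature at a time.
embedding = TSNE(n_components=2, perplexity=5.0,
                 init="random", random_state=0).fit_transform(features)

print(embedding.shape)  # one (x, y) map coordinate per particle
```

Colouring the same embedding repeatedly, once per feature, is what makes the side-by-side visual comparison of 2D maps possible.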
Maps based on the SOM algorithm comprise a grid of units that act as “neurons”, each initialised with a random value. In the training stage, for each data point presented, the algorithm searches the grid for the unit whose value most closely matches it, by computing differences. The value of this “best matching unit”, and of the units close to it on the grid, is then updated to pull it toward the matching data point. t-SNE is similar in some ways, but positions points according to probabilities of pairwise similarity, so distances and directions on the rendered map are not as meaningful.
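The SOM training loop described above can be sketched in a few lines of NumPy. This is a minimal, generic implementation for illustration, not the authors' code; the grid size, decay schedules, and the synthetic two-cluster "descriptor" data are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=(8, 8), epochs=20, lr0=0.5, sigma0=2.0):
    """Minimal self-organising map: each grid unit holds a weight
    vector; the best-matching unit and its grid neighbours are pulled
    toward each presented data point."""
    n_rows, n_cols = grid
    weights = rng.random((n_rows, n_cols, data.shape[1]))  # random start
    coords = np.stack(np.meshgrid(np.arange(n_rows), np.arange(n_cols),
                                  indexing="ij"), axis=-1)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            # Decay learning rate and neighbourhood radius over training.
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 0.5
            # Best matching unit: smallest difference to the data point.
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # Gaussian neighbourhood on the grid around the BMU.
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                       / (2.0 * sigma ** 2))
            # Weight the BMU and its neighbours toward the data point.
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights

# Toy "nanoparticle descriptors": two clusters in a 3-D feature space.
data = np.vstack([rng.normal(0.2, 0.05, (50, 3)),
                  rng.normal(0.8, 0.05, (50, 3))])
som = train_som(data)  # an 8x8 grid of trained 3-D weight vectors
```

After training, each data point can be assigned to its best matching unit, turning the grid into a 2D map on which particles with similar descriptors land close together.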
Barnard and Sun apply both algorithms to two sets of data: one for silver nanoparticles and one for platinum nanoparticles. They show how structure/property relationships can be identified using both algorithms.
“Some of the structure/property relationships we identify were already known, and some were more nuanced and previously hidden, because the curse of dimensionality prevented straightforward interpretation using conventional methods,” Barnard says.