The ability to predict, design, and control systems stems from our ability to reduce their dimensionality to a few key variables that accurately capture much of the system's behavior. In theoretical physics, this approach has led to some of the most precise predictions ever made and to the ability to control physical systems with comparably high precision. In complex systems, such as biological and technological systems, we do not have this degree of predictability or control: these systems are so high-dimensional that it has been daunting for human scientists to identify a reduced set of relevant variables that describes them accurately. Complex systems present two major challenges to a reduced description of their behavior: (1) they are high-dimensional (much more so than the systems typically treated in physics, so the existing tools do not carry over and new ones are needed) and (2) the mapping between micro-scale and macro-scale descriptions can be many-to-many. A many-to-many mapping is effectively probabilistic, which can make accurate prediction of an individual microstate impossible. An example of (2) is the genotype-to-phenotype map: many genotypes can produce a given phenotype, and the phenotypic landscape is itself dynamic, so a single genotype can correspond to many phenotypes depending on the environment.
Current machine learning algorithms aim either to predict microstates directly or to identify macrostates without the ability to design microstates. What is needed is a machine learning framework that predicts macrostates while retaining enough information to sample microstates with the specified behavior, so that the framework predicts distributions of microstates rather than a single microstate.
Researchers at Arizona State University have developed a machine learning framework that can automatically identify macroscale behaviors in complex systems and sample specific instances that exhibit those behaviors. For example, when provided with trajectory data, the framework can automatically identify energy as a relevant variable for describing the motion of a pendulum, and then specify how to set up the pendulum so that its motion has a chosen energy. In a more complex example, the framework can be applied to Turing patterns (i.e., stable spatial patterns that emerge in reaction-diffusion chemical systems). The framework can identify the parameters that produce Turing patterns of a given size, and can then be used to design specific examples with that behavior that were not in the original training set. Applied to complex systems, this provides the first automated method for identifying predictive regularities in data by reducing dimensionality to just a few variables (macrostates) while also making it possible to design a new system, outside the original training data, that exhibits the same macroscale behavior.
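To make the pendulum example concrete, the following is a minimal sketch of what "sampling microstates consistent with a macrostate" means. It is not the framework itself: here the macrostate (energy) is written down from textbook mechanics rather than learned from trajectory data, and the names `energy` and `sample_microstates`, as well as the values of `MASS` and `LENGTH`, are assumptions made for illustration.

```python
import math
import random

# Illustrative constants (assumed, not taken from the source).
G = 9.81       # gravitational acceleration, m/s^2
LENGTH = 1.0   # pendulum length, m
MASS = 1.0     # bob mass, kg


def energy(theta, omega):
    """Macrostate: total mechanical energy of the microstate (theta, omega).

    theta is the angle from the downward vertical (rad) and omega is the
    angular velocity (rad/s). Potential energy is zero at the lowest point.
    """
    kinetic = 0.5 * MASS * (LENGTH * omega) ** 2
    potential = MASS * G * LENGTH * (1.0 - math.cos(theta))
    return kinetic + potential


def sample_microstates(target_energy, n, rng):
    """Return n microstates (theta, omega) whose energy equals target_energy.

    Many different microstates share one energy, so the result is a set of
    distinct initial conditions with the same macroscale behavior.
    """
    if target_energy < 0:
        raise ValueError("target_energy must be non-negative")
    # Largest angle reachable at this energy (pi if the pendulum can rotate).
    ratio = target_energy / (MASS * G * LENGTH)
    theta_max = math.acos(1.0 - ratio) if ratio < 2.0 else math.pi
    states = []
    for _ in range(n):
        theta = rng.uniform(-theta_max, theta_max)
        potential = MASS * G * LENGTH * (1.0 - math.cos(theta))
        # Clamp tiny negative values caused by floating-point rounding.
        kinetic = max(target_energy - potential, 0.0)
        speed = math.sqrt(2.0 * kinetic / MASS) / LENGTH
        omega = rng.choice((-1.0, 1.0)) * speed
        states.append((theta, omega))
    return states


if __name__ == "__main__":
    for theta, omega in sample_microstates(5.0, 5, random.Random(0)):
        print(f"theta={theta:+.3f} rad  omega={omega:+.3f} rad/s  "
              f"E={energy(theta, omega):.6f} J")
```

Each printed state is a different way to set the pendulum that yields the same energy. The sketch draws the angle uniformly for simplicity, which is a design shortcut and not necessarily the distribution the framework would sample from.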
The core capabilities of the framework are (1) automated discovery of macroscale descriptions of complex systems (reduction in dimensionality to a few key variables); (2) design of systems that have a specified macroscale description, by sampling microstates consistent with the identified macrostates; and (3) prediction of behavior at the macroscale when accurate predictions for specific systems are not possible.
- Machine learning algorithm that can be used for:
  - Weather prediction, financial market prediction, and other time series prediction
  - Nanotechnology design, medicine design (drug discovery), chemical design, and other design problems that require automated parameter design and sampling based on identified parameters
Benefits and Advantages:
- General purpose algorithm for identifying predictive macroscale properties of complex systems
- Predicts future behavior based on identified macroscale behavior, and allows sampling of microstates that are consistent with observed macroscale data
- Only framework that can both identify macrostates and sample system parameters to allow design of new microstates with a specified macroscale behavior