Paper Notes - Learning to Simulate Complex Physics with Graph Networks

Here, we present a general framework for learning simulation, and a single model implementation that yields state-of-the-art performance across a variety of challenging physical domains, including fluids, rigid solids, and deformable materials interacting with one another. Our framework, which we call the "Graph Network-based Simulator" (GNS), represents the state of a physical system with particles, expressed as nodes in a graph, and computes dynamics via learned message passing. Our results show that the model can generalize from single-time-step predictions on training data with thousands of particles to different initial conditions, thousands of time steps, and at least an order of magnitude more particles at test time. The model is robust to hyperparameter choices across various evaluation metrics: the main determinants of long-term performance are the number of message-passing steps, and mitigating error accumulation by corrupting the training data with noise. Our GNS framework is the most accurate general-purpose learned physics simulator to date, and holds promise for solving a wide variety of complex forward and inverse problems.

Realistic simulators of complex physics are invaluable to many scientific and engineering disciplines, but traditional simulators can be very expensive to create and use. Building a simulator can take years of engineering effort, and generality must often be sacrificed for accuracy within a narrow range of settings. High-quality simulators also require substantial computational resources, which makes scaling them up prohibitive. Even the best simulators are often inaccurate, due to insufficient knowledge of the underlying physics and its parameters, or to the difficulty of approximating them. An attractive alternative to traditional simulators is to use machine learning to train a simulator directly from observed data, but standard end-to-end learning approaches struggle with large state spaces and complex dynamics.

Here, we propose a general framework for learning simulation from data: the "Graph Network-based Simulator" (GNS). Our framework imposes an inductive bias in which rich physical states are represented as graphs of interacting particles, and complex dynamics are approximated by learned message passing among the nodes.

We implemented the GNS framework in a deep learning framework and found that it can learn to accurately simulate a variety of physical systems in which fluids, rigid solids, and deformable materials interact with one another. Our model also generalizes to systems that are much larger, and run for much longer, than those it was trained on. While previous learned simulation approaches have been highly specialized to particular tasks, we found that a single GNS model performed well across dozens of experiments and was generally robust to hyperparameter choices. Our analysis shows that performance is determined by a few key factors: the model's ability to compute long-range interactions, inductive biases for spatially invariant computation, and training procedures that reduce error accumulation over long simulated trajectories.

General learnable simulation

Let X^t denote the state of the world at time t. Applying physical dynamics over K time steps yields a trajectory of states (X^t0, ..., X^tK). A simulator, s : X → X, models the dynamics by mapping preceding states to causally consequent future states. We denote a simulated "rollout" trajectory as (X̃^t0, ..., X̃^tK), computed iteratively by X̃^t(k+1) = s(X̃^tk) at each step. The simulator computes dynamics information reflecting how the current state is changing, and uses it to update the current state to a predicted future state. Numerical differential-equation solvers are one example: the equations compute the dynamics information, i.e., time derivatives, and the integrator is the update mechanism.
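The iterative rollout described above can be sketched as follows (a minimal illustration; the `simulator` callable and the toy drift dynamics are hypothetical stand-ins, not the paper's model):

```python
import numpy as np

def rollout(simulator, x0, num_steps):
    """Iteratively apply a one-step simulator s: X -> X to produce
    the trajectory (x0, s(x0), s(s(x0)), ...)."""
    trajectory = [x0]
    for _ in range(num_steps):
        trajectory.append(simulator(trajectory[-1]))
    return trajectory

# Toy "simulator": each state is an array of particle positions that drift.
drift = lambda x: x + 0.1
traj = rollout(drift, np.zeros(3), num_steps=5)
```

The same loop applies whether `simulator` is an analytic solver or a learned model; only the one-step map changes.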

A learnable simulator, s_θ, computes the dynamics information with a parameterized function approximator, d_θ : X → Y, whose parameters θ can be optimized for some training objective. Y = d_θ(X) represents the dynamics information, whose semantics are determined by the update mechanism. The update mechanism can be seen as a function that takes the current state and uses d_θ to predict the next state. Here we assume a simple update mechanism, a semi-implicit Euler integrator, where Y represents accelerations. More sophisticated update procedures that call d_θ multiple times, such as higher-order integrators, are also possible.
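A semi-implicit Euler update of this kind can be written in a few lines (a sketch with unit time step; the function name is ours):

```python
import numpy as np

def euler_update(position, velocity, acceleration, dt=1.0):
    """Semi-implicit Euler step: the predicted acceleration first updates
    the velocity, and the *new* velocity then updates the position."""
    new_velocity = velocity + dt * acceleration
    new_position = position + dt * new_velocity
    return new_position, new_velocity

p, v = euler_update(np.array([0.0]), np.array([1.0]), np.array([0.5]))
```

Because positions and velocities are recovered from the predicted accelerations this way, the network only ever needs to output Y = acceleration.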

Message passing on simulation graphs

Our learnable simulation approach adopts a particle-based representation of the physical system: the state X is represented by N particles, where each x_i is the state of particle i. Physical dynamics are approximated by interactions among the particles, e.g., exchanges of energy and momentum between neighbors. How particle-particle interactions are modeled determines the quality and generality of a simulation method, i.e., which effects and materials can be simulated, and under which conditions the method works well or poorly. We are interested in learning these interactions, which in principle should allow us to learn the dynamics of any system that can be expressed as particle dynamics. It is therefore crucial that, across different values of θ, the learned interaction functions span a wide range of particle-particle interaction behaviors.

Particle-based simulation can be viewed as message passing on a graph. Nodes correspond to particles, and edges correspond to pairwise relations between particles, over which interactions are computed. Methods such as smoothed particle hydrodynamics (SPH) can be understood in this framework: the messages passed between nodes might correspond to, e.g., evaluating pressure using a density kernel.

We exploit this correspondence between particle-based simulators and message passing on graphs to define a general-purpose d_θ based on graph networks (GN). Our d_θ has three steps: an encoder, a processor, and a decoder.

Encoder definition. The encoder X → G embeds the particle-based state representation X as a latent graph, G^0 = Encoder(X). The node embeddings are learned functions of the particle states x_i. Directed edges are added to create paths between particle nodes that may interact. The edge embeddings are learned functions of the pairwise properties r_ij of the corresponding particles, e.g., the displacement between their positions, spring constants, etc. The graph-level embedding can represent global properties such as gravity and magnetic fields (though in our implementation we simply appended those as input node features).

Processor definition. The processor computes interactions among nodes via M learned message-passing steps, generating a sequence of updated latent graphs and returning the final graph, G^M. Message passing allows information to propagate and constraints to be respected: the number of message-passing steps required is expected to grow with the complexity of the interactions.
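One such message-passing step can be sketched with numpy (the real edge and node update functions are MLPs; here they are toy callables, and all names are illustrative):

```python
import numpy as np

def message_passing_step(node_feats, edge_feats, senders, receivers,
                         edge_fn, node_fn):
    """One GN-style step: update each edge from its endpoint nodes,
    sum incoming messages per receiver node, then update the nodes.
    Residual connections are added to both node and edge latents."""
    new_edges = edge_fn(np.concatenate(
        [edge_feats, node_feats[senders], node_feats[receivers]], axis=-1))
    # Sum incoming messages at each receiver node.
    aggregated = np.zeros_like(node_feats)
    np.add.at(aggregated, receivers, new_edges)
    new_nodes = node_fn(np.concatenate([node_feats, aggregated], axis=-1))
    return node_feats + new_nodes, edge_feats + new_edges  # residuals

# Toy "MLPs": just slice back down to the latent size (here 2).
edge_fn = lambda x: x[:, :2]
node_fn = lambda x: x[:, :2]
nodes = np.ones((3, 2))
edges = np.zeros((2, 2))
senders, receivers = np.array([0, 1]), np.array([1, 2])
nodes2, edges2 = message_passing_step(nodes, edges, senders, receivers,
                                      edge_fn, node_fn)
```

Stacking M such steps, with shared or unshared parameters, gives the full processor.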

Decoder definition. The decoder extracts the dynamics information y_i from the nodes of the final latent graph. Learning should cause these representations to reflect relevant dynamics information, such as acceleration, so as to be semantically meaningful to the update procedure.

Input and output representations. Each particle's input state vector represents its position together with a sequence of its most recent velocities, plus features capturing static material properties (e.g., water, sand, rigid, deformable, boundary particle). Where applicable, the global properties g of the system include external forces and global material properties. The prediction targets for supervised learning are the per-particle average accelerations. Note that our datasets only require position vectors: the velocities and accelerations are computed from them by finite differences.
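The finite-difference construction of input velocities and target acceleration from raw positions can be sketched like this (function name is ours; the time step is folded into the units):

```python
import numpy as np

def finite_difference_features(positions):
    """Given a single particle's position sequence p^{t-C}, ..., p^t,
    compute input velocities as first differences and the target
    acceleration as the second difference at the final step."""
    velocities = positions[1:] - positions[:-1]
    acceleration = velocities[-1] - velocities[-2]
    return velocities, acceleration

pos = np.array([0.0, 1.0, 3.0, 6.0])  # toy 1-D position sequence
vels, acc = finite_difference_features(pos)
```

This is why only raw position trajectories need to be stored in the datasets.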

Encoder details. The encoder constructs the graph structure G^0 by assigning a node to each particle and adding edges between particles within a "connectivity radius" R, which reflects the local interactions of particles and is kept constant for all simulations at the same resolution. To generate each rollout, the graph's edges are recomputed at every time step by a nearest-neighbor search to reflect the current particle positions.
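Radius-based edge construction can be sketched as follows (an O(N^2) illustration; a real implementation would use a spatial hash or k-d tree for the neighbor search, and the function name is ours):

```python
import numpy as np

def build_edges(positions, radius):
    """Connect every pair of particles within the connectivity radius R
    with directed edges in both directions (self-edges excluded)."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    within = (dist <= radius) & ~np.eye(len(positions), dtype=bool)
    senders, receivers = np.nonzero(within)
    return senders, receivers

pts = np.array([[0.0, 0.0], [0.5, 0.0], [5.0, 5.0]])
senders, receivers = build_edges(pts, radius=1.0)
```

Recomputing these index arrays at every rollout step keeps the graph consistent with the moving particles.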

The encoder implements the node and edge embedding functions as multilayer perceptrons (MLPs), which encode the node features and edge features into latent vectors of size 128.

We tested two encoder variants, distinguished by whether they use absolute or relative positional information. For the absolute variant, the node encoder's input is the x_i described above, with the global features concatenated to it. The edge input r_ij carries no useful information in this case and is discarded, with the edge embedding set to a trainable fixed bias vector. The relative encoder variant is designed to impose an inductive bias of invariance to absolute spatial position: the node encoder is forced to ignore the position information p_i within x_i by masking it out, and the edge encoder is instead provided with the relative positional displacement r_ij = p_i - p_j and its magnitude ||r_ij||. Both variants concatenate the global properties g onto each x_i before passing it to the node encoder.
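The relative variant's edge features can be sketched as follows (a minimal illustration; the normalization by the connectivity radius is an assumption of this sketch, and the function name is ours):

```python
import numpy as np

def relative_edge_features(positions, senders, receivers, radius):
    """Relative encoder variant: each edge carries the displacement
    between its two particles (scaled by the connectivity radius) and
    its magnitude; absolute positions never enter the network."""
    displacement = (positions[senders] - positions[receivers]) / radius
    distance = np.linalg.norm(displacement, axis=-1, keepdims=True)
    return np.concatenate([displacement, distance], axis=-1)

pos = np.array([[0.0, 0.0], [0.3, 0.4]])
feats = relative_edge_features(pos, np.array([0]), np.array([1]), radius=1.0)
```

Because only displacements appear, translating the whole scene leaves every feature, and hence every prediction, unchanged.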

Processor details. Our processor uses a stack of M GNs (where M is a hyperparameter) with identical structure, MLPs as the internal edge and node update functions, and either shared or unshared parameters. We use GNs without global features or global updates (similar to interaction networks), with residual connections between the input and output latent node and edge attributes.

Decoder details. Our decoder's learned function is an MLP. After the decoder, the future position and velocity are updated using an Euler integrator, so y_i corresponds to acceleration, with 2D or 3D dimensionality depending on the physical domain. As mentioned above, the supervised training targets are simply these acceleration vectors.

Neural network parameterization. All MLPs have two hidden layers (with ReLU activations) followed by an unactivated output layer, each layer of size 128. All MLPs except the output decoder are followed by LayerNorm, which we generally found improved training stability.
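That parameterization can be sketched in plain numpy (forward pass only; this LayerNorm sketch omits the learned scale and offset, and all names and initializations are illustrative):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize the last axis to zero mean / unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def mlp(x, weights, biases, final_layer_norm=True):
    """Two ReLU hidden layers plus an unactivated output layer,
    optionally followed by LayerNorm, as described above."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ w + b, 0.0)  # ReLU hidden layers
    x = x @ weights[-1] + biases[-1]    # linear output layer
    return layer_norm(x) if final_layer_norm else x

rng = np.random.default_rng(0)
sizes = [4, 128, 128, 128]  # input -> two hidden -> output, width 128
ws = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(b) for b in sizes[1:]]
out = mlp(rng.normal(size=(2, 4)), ws, bs)
```

The decoder MLP would be called with `final_layer_norm=False`, since its raw outputs are the predicted accelerations.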

Software. We implemented the model using TensorFlow 1, Sonnet 1, and the "Graph Nets" library.

Training noise. Modeling complex, chaotic simulation systems requires the model to mitigate error accumulation over long rollouts. Because we train on ground-truth one-step data, the model is never presented with inputs corrupted by such accumulated error. This means that when we generate rollouts by feeding the model its own noisy previous predictions as input, the fact that those inputs lie outside the training distribution can lead to larger errors, which in turn accumulate rapidly. We use a simple approach to make the model more robust to noisy inputs: during training, we corrupt the model's input positions and velocities with random-walk noise, so that the training distribution more closely resembles the distribution encountered during rollouts.
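Random-walk noise over the input sequence can be sketched as follows (the per-step standard deviation here is an illustrative value, not the paper's schedule; the function name is ours):

```python
import numpy as np

def random_walk_noise(num_steps, num_particles, dim, std, rng):
    """Accumulating (random-walk) noise for the input velocity sequence:
    independent Gaussian steps are cumulatively summed, so later inputs
    in the sequence are perturbed more, mimicking how error accumulates
    during a rollout."""
    steps = rng.normal(0.0, std, size=(num_steps, num_particles, dim))
    return np.cumsum(steps, axis=0)

rng = np.random.default_rng(0)
noise = random_walk_noise(5, 10, 2, std=3e-4, rng=rng)
noisy_velocities = np.zeros((5, 10, 2)) + noise  # corrupt the clean inputs
```

The cumulative sum is what makes this a random walk rather than independent per-step jitter.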

Normalization. We normalize all elements of the input and target vectors to zero mean and unit variance, using statistics computed online during training. Preliminary experiments showed that normalization sped up training, though converged performance was not noticeably improved.
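Online statistics of this kind are commonly accumulated with Welford's algorithm; a minimal sketch (the class name and exact accumulation scheme are our assumptions):

```python
import numpy as np

class OnlineNormalizer:
    """Running mean/variance accumulated during training, used to
    normalize inputs and targets to zero mean and unit variance."""
    def __init__(self, dim, eps=1e-8):
        self.count = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)  # sum of squared deviations
        self.eps = eps

    def update(self, batch):
        # Welford's algorithm: numerically stable one-pass updates.
        for x in batch:
            self.count += 1
            delta = x - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        std = np.sqrt(self.m2 / max(self.count, 1) + self.eps)
        return (x - self.mean) / std

norm = OnlineNormalizer(dim=1)
norm.update(np.array([[1.0], [2.0], [3.0]]))
z = norm.normalize(np.array([[2.0]]))
```

Accumulating online avoids a separate pass over the whole dataset before training starts.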

Loss function and optimization procedure. We randomly sample particle state pairs from the training trajectories, compute target accelerations, and compute an L2 loss on the predicted per-particle accelerations. We optimize the model parameters on this loss with the Adam optimizer, using a minibatch size of 2. We perform at most 20M gradient update steps, with the learning rate decaying exponentially. Although models can be trained in fewer steps, we avoid aggressively high learning rates to reduce variance across datasets and make comparisons between settings fairer.
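The loss and an exponentially decaying learning-rate schedule can be sketched as follows (the start/end rates and decay horizon are illustrative assumptions, since the notes elide the exact values; function names are ours):

```python
import numpy as np

def l2_loss(predicted_acc, target_acc):
    """Mean over particles of the squared error on accelerations."""
    return np.mean(np.sum((predicted_acc - target_acc) ** 2, axis=-1))

def exponential_lr(step, start=1e-4, end=1e-6, decay_steps=5_000_000):
    """Learning rate decaying exponentially from `start` toward `end`
    (endpoint values here are illustrative, not the paper's)."""
    return end + (start - end) * (0.1 ** (step / decay_steps))

# Toy check: zero predictions against unit targets in 2-D.
loss0 = l2_loss(np.zeros((4, 2)), np.ones((4, 2)))
```

In practice `l2_loss` would be minimized with Adam, with `exponential_lr(step)` supplying the per-step learning rate.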

During training, we periodically evaluated each model by performing full-length rollouts on five held-out validation trajectories, and recorded the model parameters that achieved the best rollout MSE. We stopped training when the decrease in MSE became negligible. On GPU/TPU hardware, training typically took a few hours for the smaller, simpler datasets, and up to a week for the larger, more complex ones.

Model rollouts and animations: /their/bb7cfd1d-20a8-4f08-8a2b-a64dd04e 37b6