Insertion Based Sequence Generation with Learnable Order Dynamics
arXiv:2602.18695v1 Announce Type: new Abstract: In many domains, generating variable-length sequences through insertions provides greater flexibility than autoregressive models. However, the action space of insertion models is much larger than that of autoregressive models (ARMs), making learning challenging. To address this, we incorporate trainable order dynamics into the target rates for discrete flow matching, and show that with suitable choices of parameterization, joint training of the target order dynamics and the generator is tractable without the need for numerical simulation. As the generative insertion model, we use a variable-length masked diffusion model, which generates by inserting and filling mask tokens. On graph traversal tasks for which a locally optimal insertion order is known, we explore parameterization choices empirically and demonstrate the trade-offs between flexibility, training stability, and generation quality. On de novo small molecule generation, we find that the learned order dynamics lead to an increase in the number of valid molecules generated and improved quality compared to uniform order dynamics.
Executive Summary
This article proposes a novel approach to sequence generation using learnable order dynamics in insertion models. By incorporating trainable order dynamics into the target rates for discrete flow matching, the authors show that the order dynamics and the generator can be trained jointly without numerical simulation. The generative insertion model, a variable-length masked diffusion model, generates sequences by inserting and filling mask tokens. The authors empirically explore parameterization choices on graph traversal tasks, revealing trade-offs between flexibility, training stability, and generation quality, and show that learned order dynamics improve the validity and quality of generated molecules in de novo small molecule generation. The study offers a promising direction for variable-length sequence generation.
Key Points
- ▸ The article introduces a novel approach to sequence generation using learnable order dynamics in insertion models.
- ▸ Trainable order dynamics are incorporated into the target rates for discrete flow matching, enabling joint training without numerical simulation.
- ▸ The generative insertion model is a variable-length masked diffusion model, which generates sequences by inserting and filling mask tokens.
- ▸ The study explores parameterization choices on graph traversal tasks, revealing trade-offs between flexibility, training stability, and generation quality, and shows improved validity and quality on de novo small molecule generation.
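To make the insert-and-fill generation process concrete, the following is a minimal toy sketch of how a variable-length masked insertion model operates. The position sampler and token filler here are uniform stand-ins (the paper learns the order dynamics and trains a masked diffusion denoiser); all names and distributions are illustrative assumptions, not the authors' implementation.

```python
import random

MASK = "[MASK]"
VOCAB = ["A", "B", "C", "D"]

def sample_insert_position(seq):
    # Toy "order dynamics": uniform over the len(seq) + 1 gap positions.
    # The paper learns these dynamics; here they are fixed for illustration.
    return random.randrange(len(seq) + 1)

def fill_token(seq, pos):
    # Toy denoiser: uniform over the vocabulary. A trained masked
    # diffusion model would condition on the surrounding context.
    return random.choice(VOCAB)

def generate(target_len):
    # Grow a variable-length sequence by repeatedly inserting a mask
    # token at a sampled gap and then filling it in.
    seq = []
    while len(seq) < target_len:
        pos = sample_insert_position(seq)  # choose a gap to insert into
        seq.insert(pos, MASK)              # insert a mask token
        seq[pos] = fill_token(seq, pos)    # fill the mask with a token
    return seq

random.seed(0)
print(generate(6))
```

The key contrast with an autoregressive model is the `sample_insert_position` step: an ARM always appends at the end, while an insertion model chooses among all gaps, which is exactly the enlarged action space the paper addresses with learned order dynamics.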
Merits
Improved Generation Quality
The learnable order dynamics lead to improved generation quality, particularly in de novo small molecule generation.
Increased Flexibility
The variable length masked diffusion model provides greater flexibility in sequence generation compared to autoregressive models.
No Numerical Simulation Required
Joint training of the target order dynamics and the generator is tractable without numerical simulation, reducing computational cost.
Demerits
Increased Action Space
The action space of insertion models is larger than that of autoregressive models, making learning challenging.
Limited Exploration
The study only explores parameterization choices on graph traversal tasks and de novo small molecule generation, limiting the generalizability of the findings.
Dependence on Parameterization
The performance of the generative insertion model is highly dependent on the choice of parameterization, which may not generalize to other tasks.
Expert Commentary
The article presents a promising approach to sequence generation using learnable order dynamics in insertion models. The incorporation of trainable order dynamics into the target rates for discrete flow matching is a significant contribution, enabling joint training without numerical simulation. However, the study's limitations, such as dependence on parameterization and the limited range of evaluation tasks, highlight the need for further research in this area. The generative insertion model performs well on graph traversal tasks and de novo small molecule generation, but its applicability to other tasks requires further investigation. The findings have significant implications for the development of more effective sequence generation models in domains such as molecular design.
Recommendations
- ✓ Future studies should explore the applicability of the generative insertion model to other tasks, such as text summarization and machine translation.
- ✓ The dependence on parameterization should be addressed through more extensive exploration of parameterization choices and their impact on model performance.