DISCO-TAB: A Hierarchical Reinforcement Learning Framework for Privacy-Preserving Synthesis of Complex Clinical Data

Arshia Ilaty, Hossein Shirazi, Amir Rahmani, Hajar Homayouni · April 3, 2026

#cs.LG #cs.AI

arXiv:2604.01481v1

Abstract: The development of robust clinical decision support systems is frequently impeded by the scarcity of high-fidelity, privacy-preserving biomedical data. While Generative Large Language Models (LLMs) offer a promising avenue for synthetic data generation, they often struggle to capture the complex, non-linear dependencies and severe class imbalances inherent in Electronic Health Records (EHR), leading to statistically plausible but clinically invalid records. To bridge this gap, we introduce DISCO-TAB (DIScriminator-guided COntrol for TABular synthesis), a novel framework that orchestrates a fine-tuned LLM with a multi-objective discriminator system optimized via Reinforcement Learning. Unlike prior methods relying on scalar feedback, DISCO-TAB evaluates synthesis at four granularities (token, sentence, feature, and row) while integrating Automated Constraint Discovery and Inverse-Frequency Reward Shaping to autonomously preserve latent medical logic and resolve minority-class collapse. We rigorously validate our framework across diverse benchmarks, including high-dimensional, small-sample medical datasets (e.g., Heart Failure, Parkinson's). Our results demonstrate that hierarchical feedback yields state-of-the-art performance, achieving up to 38.2% improvement in downstream clinical classifier utility compared to GAN and Diffusion baselines, while ensuring exceptional statistical fidelity (JSD < 0.01) and robust resistance to membership inference attacks. This work establishes a new standard for generating trustworthy, utility-preserving synthetic tabular data for sensitive healthcare applications.

Executive Summary

DISCO-TAB is a hierarchical reinforcement learning framework for synthesizing complex clinical data that pairs a fine-tuned LLM with a multi-objective discriminator system. Unlike methods that rely on scalar feedback, it evaluates synthesis at four granularities (token, sentence, feature, and row) and adds Automated Constraint Discovery and Inverse-Frequency Reward Shaping to preserve latent medical logic and mitigate minority-class collapse. The authors report up to a 38.2% improvement in downstream clinical classifier utility relative to GAN and Diffusion baselines, along with high statistical fidelity (JSD < 0.01) and resistance to membership inference attacks. This represents a meaningful advance in generating trustworthy, clinically valid synthetic data for healthcare applications.

Key Points

▸ Hierarchical reinforcement learning framework for clinical data synthesis
▸ Multi-granularity evaluation (token, sentence, feature, row)
▸ Integration of Automated Constraint Discovery and Inverse-Frequency Reward Shaping
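
As a rough illustration of how feedback at the four granularities might be folded into a single training signal, the sketch below takes a weighted average of per-level discriminator scores. The names `GranularityScores` and `combined_reward`, the score range, and the weights are all hypothetical; the abstract does not say how DISCO-TAB combines the levels.

```python
from dataclasses import dataclass


@dataclass
class GranularityScores:
    """Hypothetical discriminator scores in [0, 1], one per granularity."""
    token: float
    sentence: float
    feature: float
    row: float


# Assumed weights for illustration only; the source does not report any.
DEFAULT_WEIGHTS = {"token": 0.1, "sentence": 0.2, "feature": 0.3, "row": 0.4}


def combined_reward(scores, weights=DEFAULT_WEIGHTS):
    """Weighted average of the four per-granularity scores."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * getattr(scores, name) for name, w in weights.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    # A record that looks fine token by token but fails the row-level check.
    scores = GranularityScores(token=0.9, sentence=0.8, feature=0.7, row=0.2)
    print(round(combined_reward(scores), 3))
```

With these assumed weights, a record that fails only the row-level check is still penalized heavily, which is the intuition behind scoring more than one level.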

Merits

Performance Enhancement

The authors report up to a 38.2% improvement in downstream clinical classifier utility compared to GAN and Diffusion baselines, a substantial gain if it generalizes beyond the benchmarks tested.

Statistical Fidelity

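The reported Jensen-Shannon divergence is low (JSD < 0.01), indicating that the synthetic data's distributions closely match those of the real data. Low JSD measures statistical alignment only; clinical validity is a separate property, which the framework targets through Automated Constraint Discovery.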
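
For readers unfamiliar with the metric, the following minimal sketch computes the Jensen-Shannon divergence between the empirical distributions of one categorical column in a real and a synthetic table. The function name `jensen_shannon_divergence` and the sample values are illustrative, and the base-2 logarithm (which bounds the result to [0, 1]) is an assumption; the source does not state which log base or aggregation across columns the authors use.

```python
import math
from collections import Counter


def jensen_shannon_divergence(real_values, synthetic_values):
    """JSD (base 2) between the empirical distributions of two samples."""
    if not real_values or not synthetic_values:
        raise ValueError("both samples must be non-empty")

    categories = set(real_values) | set(synthetic_values)
    real_counts = Counter(real_values)
    synth_counts = Counter(synthetic_values)

    # Empirical probabilities over the union of observed categories.
    p = {c: real_counts[c] / len(real_values) for c in categories}
    q = {c: synth_counts[c] / len(synthetic_values) for c in categories}
    # Mixture distribution M = (P + Q) / 2.
    m = {c: 0.5 * (p[c] + q[c]) for c in categories}

    def kl(a, b):
        # KL(a || b), skipping zero-probability terms (0 * log 0 = 0).
        return sum(a[c] * math.log2(a[c] / b[c]) for c in categories if a[c] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    real = ["yes"] * 30 + ["no"] * 70
    synthetic = ["yes"] * 28 + ["no"] * 72
    print(jensen_shannon_divergence(real, synthetic))
```

Identical distributions give 0 and fully disjoint ones give 1, so a value below 0.01 means the two samples are nearly indistinguishable on that column.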

Demerits

Complexity and Implementation Barrier

The hierarchical structure and multiple evaluation layers may increase computational complexity and present challenges for deployment or replication in resource-constrained environments.

Expert Commentary

DISCO-TAB represents a significant methodological step in privacy-preserving synthetic data generation. Its use of hierarchical reinforcement learning to address complex dependencies and class imbalances in EHR data is particularly noteworthy. By scoring the synthesis process at multiple granularities and applying targeted constraint discovery, the authors aim to mitigate a common pitfall of LLM-based generators: the tendency to produce statistically plausible yet clinically invalid records. Inverse-Frequency Reward Shaping addresses minority-class collapse, a persistent problem in generative models. The validation on high-dimensional, small-sample datasets strengthens the credibility of the results. The computational overhead of multi-granularity evaluation may pose a hurdle, but the trade-off appears justified by the reported gains in utility and clinical reliability. This work sets a benchmark for evaluating synthetic data quality in healthcare, and it is likely to influence future research in both clinical AI and data privacy.
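
To make the idea of inverse-frequency reward shaping concrete, here is a minimal sketch in which the reward for a generated record is scaled by a weight inversely proportional to how common its class is in the real data. The functions `inverse_frequency_weights` and `shaped_reward`, the class labels, and the "balanced" formula n / (k * count) are assumptions for illustration; the source does not give the authors' exact formulation.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weight n / (k * count): rarer classes get larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def shaped_reward(base_reward, label, weights):
    """Scale a base reward by the weight of the generated record's class."""
    return base_reward * weights[label]


if __name__ == "__main__":
    # Hypothetical imbalanced outcome column: 90 majority, 10 minority records.
    real_labels = ["survived"] * 90 + ["died"] * 10
    weights = inverse_frequency_weights(real_labels)
    print(weights)
    print(shaped_reward(0.8, "died", weights))
    print(shaped_reward(0.8, "survived", weights))
```

Under this assumed scheme, a correct minority-class record earns several times the reward of an equally good majority-class record, which discourages a generator from ignoring rare outcomes.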

Recommendations

✓ Researchers should consider adapting DISCO-TAB’s multi-granularity evaluation framework for other domains beyond EHR, such as finance or education, where sensitive data synthesis is critical.
✓ Healthcare institutions and policymakers should evaluate DISCO-TAB as a viable tool for generating compliant synthetic datasets for training AI models without compromising patient privacy.

Sources

Original: arXiv - cs.LG

