Tracking Cancer Through Text: Longitudinal Extraction From Radiology Reports Using Open-Source Large Language Models

Luc Builtjes, Alessa Hering

arXiv:2603.09638v1. Abstract: Radiology reports capture crucial longitudinal information on tumor burden, treatment response, and disease progression, yet their unstructured narrative format complicates automated analysis. While large language models (LLMs) have advanced clinical text processing, most state-of-the-art systems remain proprietary, limiting their applicability in privacy-sensitive healthcare environments. We present a fully open-source, locally deployable pipeline for longitudinal information extraction from radiology reports, implemented using the llm_extractinator framework. The system applies the qwen2.5-72b model to extract and link target, non-target, and new lesion data across time points in accordance with RECIST criteria. Evaluation on 50 Dutch CT Thorax/Abdomen report pairs yielded high extraction performance, with attribute-level accuracies of 93.7% for target lesions, 94.9% for non-target lesions, and 94.0% for new lesions. The approach demonstrates that open-source LLMs can achieve clinically meaningful performance in multi-timepoint oncology tasks while ensuring data privacy and reproducibility. These results highlight the potential of locally deployable LLMs for scalable extraction of structured longitudinal data from routine clinical text.

Executive Summary

This study presents an open-source, locally deployable pipeline for extracting longitudinal information from radiology reports using large language models (LLMs). The pipeline, implemented using the llm_extractinator framework, applies the qwen2.5-72b model to extract and link target, non-target, and new lesion data across time points in accordance with RECIST criteria. Evaluation on 50 Dutch CT Thorax/Abdomen report pairs yielded high extraction performance, with attribute-level accuracies of 93.7% for target lesions, 94.9% for non-target lesions, and 94.0% for new lesions. This approach achieves clinically meaningful performance in multi-timepoint oncology tasks while ensuring data privacy and reproducibility, demonstrating the potential of locally deployable LLMs for scalable extraction of structured longitudinal data from routine clinical text.
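To make "extract and link ... across time points in accordance with RECIST criteria" concrete, the following is a minimal sketch of what a linked target-lesion record and a downstream response check might look like. The TargetLesion schema, its field names, and the target_response function are hypothetical and are not the llm_extractinator API; the source does not say whether the pipeline computes response categories at all. The thresholds are the standard RECIST 1.1 ones (at least a 30% decrease in the sum of diameters for partial response; at least a 20% increase plus at least 5 mm absolute increase for progression), simplified for two time points.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetLesion:
    """One target lesion linked across two reports (hypothetical schema)."""
    location: str
    baseline_mm: float            # diameter in the earlier report
    followup_mm: Optional[float]  # diameter in the later report; None if not reported


def target_response(lesions: list[TargetLesion]) -> str:
    """Classify target-lesion response using simplified RECIST 1.1 thresholds.

    Simplifications (assumptions of this sketch): with only two time points the
    baseline sum also serves as the nadir, and the lymph-node short-axis rule
    for complete response is ignored.
    """
    if not lesions or any(l.followup_mm is None for l in lesions):
        return "NE"  # not evaluable: a measurement is missing
    baseline = sum(l.baseline_mm for l in lesions)
    followup = sum(l.followup_mm for l in lesions)
    if followup == 0:
        return "CR"  # all target lesions have disappeared
    change = followup - baseline
    if change >= 0.20 * baseline and change >= 5:
        return "PD"  # >= 20% relative and >= 5 mm absolute increase
    if change <= -0.30 * baseline:
        return "PR"  # >= 30% decrease from baseline
    return "SD"


# Usage with made-up measurements (not data from the study):
pair = [TargetLesion("liver segment 7", 30, 20), TargetLesion("right lower lobe", 20, 14)]
print(target_response(pair))  # PR
```

Keeping both measurements on one record is what makes the linking step explicit: a lesion that cannot be matched in the later report shows up as a missing value rather than silently dropping out of the sum.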

Key Points

  • The study presents an open-source pipeline for extracting longitudinal information from radiology reports using LLMs.
  • The pipeline applies the qwen2.5-72b model to extract and link target, non-target, and new lesion data across time points.
  • Evaluation on 50 Dutch CT Thorax/Abdomen report pairs yielded attribute-level accuracies of 93.7% for target lesions, 94.9% for non-target lesions, and 94.0% for new lesions.
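The summary does not define how attribute-level accuracy is computed. One plausible reading, sketched below under that assumption, is the fraction of annotated attributes whose extracted value exactly matches the reference. The function name, the dictionary layout, and the example values are all hypothetical, and the sketch assumes predicted and reference lesions are already aligned in the same order.

```python
def attribute_accuracy(predicted: list[dict], reference: list[dict]) -> float:
    """Fraction of reference attributes whose predicted value matches exactly.

    Assumes predicted[i] and reference[i] describe the same lesion.
    """
    correct = 0
    total = 0
    for pred, ref in zip(predicted, reference):
        for key, ref_value in ref.items():
            total += 1
            if pred.get(key) == ref_value:
                correct += 1
    return correct / total if total else 0.0


# Made-up example: 3 of 4 attributes match.
predicted = [{"location": "liver", "size_mm": 12}, {"location": "lung", "size_mm": 8}]
reference = [{"location": "liver", "size_mm": 12}, {"location": "lung", "size_mm": 9}]
print(attribute_accuracy(predicted, reference))  # 0.75
```

Scoring per attribute rather than per lesion or per report gives partial credit when, for example, the location is right but the size is wrong, which is worth keeping in mind when reading the reported percentages.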

Merits

Clinically meaningful performance

Achieves clinically meaningful performance in multi-timepoint oncology tasks while ensuring data privacy and reproducibility.

Advances clinical text processing

Applies an LLM to a multi-timepoint task: lesions are not only extracted from each report but also linked across reports, a step that single-document extraction does not cover.

Open-source and locally deployable

Provides a freely available and deployable pipeline for extracting longitudinal information from radiology reports.

Demerits

Small, narrow evaluation set

The study's evaluation was limited to 50 Dutch CT Thorax/Abdomen report pairs, and the generalizability of the results to other types of radiology reports and patient populations is uncertain.

Dependence on LLMs

The pipeline's performance relies on the accuracy of the underlying large language model, which may be subject to errors or biases.

Expert Commentary

This study shows that an open-source LLM running locally can extract structured longitudinal data from routine radiology reports, with attribute-level accuracies of roughly 94% across target, non-target, and new lesions. Because the model is deployed locally, the approach suits privacy-sensitive healthcare settings and is easier to reproduce than pipelines built on proprietary systems. The main open question is generalizability: the evaluation covers only 50 Dutch CT Thorax/Abdomen report pairs, so larger and more diverse datasets are needed before the results can be extended to other report types and patient populations. Even so, the work is a useful step toward using routine clinical text to track disease over time.

Recommendations

  • Future studies should evaluate the pipeline's performance on larger and more diverse datasets to determine its generalizability.
  • Researchers should continue to explore open-source, locally deployable LLMs for clinical text analysis, where data privacy and reproducibility are central requirements.
