BLUEPRINT Rebuilding a Legacy: Multimodal Retrieval for Complex Engineering Drawings and Documents
arXiv:2602.13345v1. Abstract: Decades of engineering drawings and technical records remain locked in legacy archives with inconsistent or missing metadata, making retrieval difficult and often manual. We present Blueprint, a layout-aware multimodal retrieval system designed for large-scale engineering repositories. Blueprint detects canonical drawing regions, applies region-restricted VLM-based OCR, normalizes identifiers (e.g., DWG, part, facility), and fuses lexical and dense retrieval with a lightweight region-level reranker. Deployed on ~770k unlabeled files, it automatically produces structured metadata suitable for cross-facility search. We evaluate Blueprint on a 5k-file benchmark with 350 expert-curated queries using pooled, graded (0/1/2) relevance judgments. Blueprint delivers a 10.1% absolute gain in Success@3 and an 18.9% relative improvement in nDCG@3 over the strongest vision-language baseline, consistently outperforming across vision, text, and multimodal intents. Oracle ablations reveal substantial headroom under perfect region detection and OCR. We release all queries, runs, annotations, and code to facilitate reproducible evaluation on legacy engineering archives.
Executive Summary
The article 'BLUEPRINT Rebuilding a Legacy: Multimodal Retrieval for Complex Engineering Drawings and Documents' introduces Blueprint, a layout-aware multimodal retrieval system for engineering drawings and technical records held in legacy archives with inconsistent or missing metadata. Blueprint detects canonical drawing regions, applies region-restricted VLM-based OCR, normalizes identifiers (e.g., DWG, part, facility), and fuses lexical and dense retrieval with a lightweight region-level reranker. Deployed on approximately 770,000 unlabeled files, it automatically produces structured metadata suitable for cross-facility search. On a 5,000-file benchmark with 350 expert-curated queries and pooled, graded (0/1/2) relevance judgments, Blueprint achieves a 10.1% absolute gain in Success@3 and an 18.9% relative improvement in nDCG@3 over the strongest vision-language baseline. Oracle ablations show substantial headroom under perfect region detection and OCR. The authors release all queries, runs, annotations, and code to support reproducible evaluation.
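The abstract says Blueprint fuses lexical and dense retrieval but does not name the fusion method. As a rough illustration of what such a fusion step can look like, the sketch below uses reciprocal rank fusion (RRF), a common technique swapped in here as an assumption and not necessarily what the authors use. The file IDs and the function name `reciprocal_rank_fusion` are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical file IDs, for illustration only.
lexical = ["dwg_a", "dwg_b", "dwg_c"]  # e.g. a keyword ranking over OCR text
dense = ["dwg_b", "dwg_d", "dwg_a"]    # e.g. an embedding-similarity ranking
fused = reciprocal_rank_fusion([lexical, dense])
print(fused)
```

Files that both retrievers rank highly rise to the top, while files found by only one retriever still appear in the fused list. A reranker, such as the region-level one the abstract mentions, would then reorder the head of this list.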
Key Points
- ▸ Introduction of Blueprint, a multimodal retrieval system for engineering drawings and documents.
- ▸ Use of layout-aware techniques and region-restricted VLM-based OCR.
- ▸ Deployment on a large-scale dataset of approximately 770,000 unlabeled files.
- ▸ A 10.1% absolute gain in Success@3 and an 18.9% relative improvement in nDCG@3 over the strongest vision-language baseline.
- ▸ Release of all queries, runs, annotations, and code for reproducible evaluation.
Merits
Innovative Approach
Blueprint detects canonical drawing regions, restricts VLM-based OCR to those regions, and normalizes the extracted identifiers (e.g., DWG, part, facility). This layout-aware design addresses a real gap: retrieving complex engineering drawings from archives whose metadata is inconsistent or missing.
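The abstract does not describe how identifiers are normalized. The sketch below shows one plausible shape for that step: mapping OCR'd variants of a drawing number to a single canonical key so that lexical search can match them. The prefix variants, the separator rules, and the `DWG-` output format are all assumptions made for illustration, not the authors' scheme.

```python
import re

# Hypothetical prefix variants: "DWG", "DWG. NO.", "Drawing No:", "dwg#", ...
_PREFIX = re.compile(
    r"^\s*(?:DWG|DRAWING)\.?\s*(?:NO\b|NUMBER\b|#)?\.?\s*[:#]?\s*",
    re.IGNORECASE,
)
# Runs of whitespace, dots, underscores, slashes, or dashes count as one separator.
_SEPARATORS = re.compile(r"[\s._/\-]+")


def normalize_drawing_id(raw: str) -> str:
    """Map OCR'd drawing-number variants to one canonical key (assumed format)."""
    text = _PREFIX.sub("", raw.strip())
    text = _SEPARATORS.sub("-", text.upper()).strip("-")
    return f"DWG-{text}" if text else ""


# Made-up examples of the same drawing number as it might appear on different sheets.
for variant in ["DWG. NO. 12-345 A", "dwg#12.345/a", "Drawing No: 12 345 A"]:
    print(variant, "->", normalize_drawing_id(variant))
```

All three variants collapse to the same key, which is what lets a query written one way match a drawing labeled another way. A production normalizer would also need rules for part and facility identifiers and for OCR character confusions, none of which are specified in the abstract.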
Large-Scale Deployment
Blueprint has been deployed on approximately 770,000 unlabeled files, where it automatically produces structured metadata for cross-facility search. This shows that the pipeline runs at archive scale and not only on a curated benchmark, although the abstract reports no throughput or cost figures for the deployment.
Performance Improvements
The substantial improvements in retrieval performance metrics, including a 10.1% absolute gain in Success@3 and an 18.9% relative improvement in nDCG@3, highlight the effectiveness of Blueprint. These performance gains are particularly notable given the complexity of the engineering documents being retrieved.
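To make the two reported metrics concrete, the sketch below computes Success@3 and nDCG@3 from graded (0/1/2) judgments and shows the difference between an absolute and a relative gain. It rests on assumptions the abstract does not settle: a result counts as a success at grade 1 or higher, and nDCG uses linear gain with a log2 discount (an exponential gain of 2^grade - 1 is an equally common choice). All grades and scores here are made up.

```python
import math


def success_at_k(grades, k=3, min_grade=1):
    """1.0 if any of the top-k results has grade >= min_grade, else 0.0."""
    return float(any(g >= min_grade for g in grades[:k]))


def dcg_at_k(grades, k=3):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades[:k], start=1))


def ndcg_at_k(grades, judged_grades, k=3):
    """DCG of the ranking divided by the DCG of the ideal ordering of judged grades."""
    ideal = dcg_at_k(sorted(judged_grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0


def absolute_gain(system, baseline):
    """Difference in score, usually quoted in percentage points."""
    return system - baseline


def relative_gain(system, baseline):
    """Difference in score as a fraction of the baseline score."""
    return (system - baseline) / baseline


# Made-up grades for one query: the ranked results, then all judged files.
ranked = [0, 2, 1]
judged = [2, 1, 1, 0, 0]
print(success_at_k(ranked), round(ndcg_at_k(ranked, judged), 4))

# Made-up scores: 0.50 -> 0.60 is a 10-point absolute gain but a 20% relative gain.
print(round(absolute_gain(0.60, 0.50), 2), round(relative_gain(0.60, 0.50), 2))
```

The distinction matters when reading the headline numbers: the Success@3 figure is an absolute gain, while the nDCG@3 figure is relative to the baseline's score, so the two percentages are not directly comparable.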
Demerits
Dependency on Region Detection and OCR
Blueprint's retrieval quality depends on the accuracy of region detection and OCR. The oracle ablations reveal substantial headroom when both components are perfect, which implies that errors in these upstream stages limit the current system's performance.
Limited Evaluation Scope
The evaluation of Blueprint is based on a 5,000-file benchmark with 350 expert-curated queries. While this evaluation provides valuable insights, it may not fully capture the diversity and complexity of all engineering documents. A more comprehensive evaluation across a broader range of documents could provide a more complete picture of the system's capabilities.
Potential Bias in Expert-Curated Queries
The use of expert-curated queries introduces the potential for bias in the evaluation results. The queries may reflect the expertise and preferences of the curators, which could limit the generalizability of the findings to other contexts or user groups.
Expert Commentary
The article makes a credible case for layout-aware multimodal retrieval in engineering archives. By combining region detection with region-restricted OCR, Blueprint addresses a longstanding difficulty in retrieving complex engineering drawings, and the reported gains over the strongest vision-language baseline suggest the approach could change how such documents are managed and searched. However, the dependency on region detection and OCR accuracy, together with the limited evaluation scope, marks clear areas for further research and development. The release of all queries, runs, annotations, and code is a commendable step towards reproducible evaluation. Overall, Blueprint is a useful contribution at the intersection of AI and engineering, with direct implications for practical archive search.
Recommendations
- ✓ Further research should focus on improving the accuracy of region detection and OCR to enhance the performance of Blueprint and similar systems.
- ✓ Expanding the evaluation scope to include a broader range of engineering documents and user groups could provide a more comprehensive assessment of the system's capabilities and limitations.