Abstract
Image segmentation is attracting increasing attention in the field of medical image analysis. Given its widespread use across medical applications, ensuring and improving segmentation accuracy has become a crucial research topic. With advances in deep learning, researchers have developed numerous methods that combine Transformers and convolutional neural networks (CNNs) to create highly accurate models for medical image segmentation. However, efforts to further enhance accuracy by developing larger and more complex models, or by training on more extensive datasets, significantly increase computational resource consumption. To address this problem, we propose BiCLIP-nnFormer, a virtual multimodal instrument that leverages CLIP models to enhance the segmentation performance of nnFormer (a medical segmentation model). Because the two CLIP models (PMC-CLIP and CoCa-CLIP) are pre-trained on large datasets, they do not require additional training, which conserves computational resources. These models are used offline to extract image and text embeddings from medical images. The embeddings are then processed by the proposed 3D CLIP adapter, which adapts the CLIP knowledge to segmentation tasks through fine-tuning. Finally, the adapted embeddings are fused with feature maps extracted from the nnFormer encoder to generate the predicted masks. This process enriches the representation capabilities of the feature maps by integrating global multimodal information, leading to more precise segmentation predictions. We demonstrate the superiority of BiCLIP-nnFormer and the effectiveness of using CLIP models to enhance nnFormer through experiments on two public datasets, namely the Synapse multi-organ segmentation dataset (Synapse) and the Automatic Cardiac Diagnosis Challenge dataset (ACDC), as well as a self-annotated lung multi-category segmentation dataset (LMCS).
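
The pipeline described above (offline embeddings, a trainable adapter, fusion with encoder feature maps) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the adapter architecture or the fusion operator, so the two-layer MLP adapter, the concatenation of image and text embeddings, and fusion by broadcast addition are all assumptions, and every function name and dimension below is hypothetical. Random vectors stand in for the precomputed PMC-CLIP and CoCa-CLIP embeddings and for the nnFormer encoder output.

```python
import random


def linear(vec, weight, bias):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def relu(vec):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in vec]


def adapt_embedding(embedding, w1, b1, w2, b2):
    """Hypothetical adapter: a two-layer MLP that maps a frozen CLIP
    embedding to the channel dimension of the encoder feature map."""
    return linear(relu(linear(embedding, w1, b1)), w2, b2)


def fuse(feature_map, adapted):
    """Assumed fusion: add the adapted global vector to the channel
    vector of every voxel (broadcast addition)."""
    return [[f + a for f, a in zip(voxel, adapted)] for voxel in feature_map]


def random_matrix(rows, cols, rng):
    """Small random weights; in practice these would be learned."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
embed_dim, hidden_dim, channels, num_voxels = 8, 4, 3, 5  # illustrative sizes

# Stand-ins for embeddings extracted offline by the frozen CLIP models.
image_emb = [rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
text_emb = [rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
clip_emb = image_emb + text_emb  # assumption: simple concatenation

# Adapter parameters: the only part that would be fine-tuned here.
w1 = random_matrix(hidden_dim, 2 * embed_dim, rng)
b1 = [0.0] * hidden_dim
w2 = random_matrix(channels, hidden_dim, rng)
b2 = [0.0] * channels

# Stand-in for an encoder feature map, flattened to (voxels, channels).
feature_map = [[rng.uniform(-1.0, 1.0) for _ in range(channels)]
               for _ in range(num_voxels)]

adapted = adapt_embedding(clip_emb, w1, b1, w2, b2)
fused = fuse(feature_map, adapted)
```

In this sketch the CLIP models never appear in the training loop; only the small adapter weights would receive gradients, which mirrors the resource-saving argument in the abstract.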

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2025 Bo Wang, Yue Yan, Mengyuan Xu, Yuqun Yang, Xu Tang, Kechen Shu, Jingyang Ai, Zheng You
- Academic society: China Instrument and Control Society
- Publisher: China Instrument and Control Society