AsyCMST: Asymmetric cross-modal spatio-temporal learning for multimodal ultrasound nodule recognition
Jul 1, 2026
HAN Hongcheng
Zhiqian Tian
Minghao Wang
Yutong Zhang
Dong Zhang
Qinbo Guo
Hui Guo
Jue Jiang
Shaoyi Du
Corresponding author
Juan Wang
Equal contribution
Abstract
Multimodal ultrasound combining B-mode ultrasound (BUS) and contrast-enhanced ultrasound (CEUS) has become a powerful tool for diagnosing superficial nodules in the thyroid and breast, leveraging the complementary strengths of BUS spatial structure and CEUS temporal hemodynamics. However, existing fusion methods typically treat both modalities symmetrically or focus solely on modality-specific features, overlooking the inherent asymmetric bidirectional guidance between BUS spatial context and CEUS perfusion dynamics. To address this limitation, we propose AsyCMST, an asymmetric cross-modal spatio-temporal network for multimodal ultrasound nodule diagnosis. First, we design a multi-task learning module to enhance modality-specific representations, where frame self-sorting distills canonical contrast perfusion patterns in CEUS, while nodule segmentation reinforces precise lesion localization in BUS. Second, we propose an asymmetric cross-modal spatio-temporal attention mechanism to enable clinically meaningful directional interaction: BUS spatial cues guide CEUS temporal modeling toward lesion-relevant regions, and CEUS hemodynamic evolution refines ambiguous structural patterns in BUS. This design effectively captures the asymmetric interdependency between structure and function. Experiments on thyroid and breast datasets demonstrate that AsyCMST significantly outperforms state-of-the-art video understanding and multimodal ultrasound fusion methods in accuracy, F1-score, AUC, and cross-dataset generalization. These results validate the effectiveness of knowledge-driven asymmetric fusion and highlight its potential to advance clinical adoption of multimodal ultrasound analysis.
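The directional interaction described in the abstract can be illustrated with a toy sketch. This is not the AsyCMST implementation: the function names (`bus_guided_ceus`, `ceus_refined_bus`), the mask-weighted pooling, the parameter-free attention without learned projections, and all numeric values are assumptions chosen only to show how the two directions can use different mechanisms. Here a BUS-derived lesion mask focuses each CEUS frame on lesion regions, and BUS region tokens then query the resulting CEUS temporal tokens.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries, keys_values):
    """Scaled dot-product attention; keys double as values (no learned weights)."""
    dim = len(keys_values[0])
    scale = math.sqrt(dim)
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys_values])
        attended.append([
            sum(w * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return attended


def bus_guided_ceus(ceus_frames, lesion_mask):
    """BUS -> CEUS (hypothetical): pool each frame's region features with a
    BUS-derived lesion mask, yielding one lesion-focused token per frame."""
    total = sum(lesion_mask) or 1.0  # avoid division by zero for an empty mask
    tokens = []
    for frame in ceus_frames:  # frame: list of per-region feature vectors
        dim = len(frame[0])
        tokens.append([
            sum(m * region[d] for m, region in zip(lesion_mask, frame)) / total
            for d in range(dim)
        ])
    return tokens


def ceus_refined_bus(bus_regions, ceus_tokens):
    """CEUS -> BUS (hypothetical): BUS region tokens attend over the CEUS frame
    tokens and add the attended temporal context back as a residual."""
    context = cross_attend(bus_regions, ceus_tokens)
    return [
        [b + c for b, c in zip(region, ctx)]
        for region, ctx in zip(bus_regions, context)
    ]


# Made-up toy inputs: 2 BUS regions, 3 CEUS frames, 2-dimensional features.
bus_regions = [[1.0, 0.0], [0.0, 1.0]]
lesion_mask = [0.9, 0.1]  # stand-in for a segmentation probability per region
ceus_frames = [
    [[0.2, 0.1], [0.0, 0.0]],
    [[0.8, 0.3], [0.1, 0.0]],
    [[0.5, 0.6], [0.1, 0.1]],
]

ceus_tokens = bus_guided_ceus(ceus_frames, lesion_mask)
refined_bus = ceus_refined_bus(bus_regions, ceus_tokens)
```

The asymmetry in this sketch is that the BUS-to-CEUS direction is a spatial gating step, while the CEUS-to-BUS direction is a temporal attention step. A trained model would replace both with learned layers.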
Type
Publication
Medical Image Analysis, 112
License
CC-BY-4.0

Authors
HAN Hongcheng
(he/him)
PhD Candidate in Control Science and Engineering
Han Hongcheng (韩泓丞) received the B.Eng. degree from the School of Energy and Power Engineering, Xi’an Jiaotong University, Xi’an, China, in 2020.
Since then, he has been pursuing a Ph.D. degree at the Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University.
His research interests include intelligent transportation and medical image analysis, with a specialization in multimodal data fusion and image synthesis.