JoDiffusion3D is a joint diffusion framework for the simultaneous generation of volumetric (3D) medical images and their segmentation masks. It employs a dual-branch latent diffusion model with cross-modal attention, enforcing anatomical coherence through a cross-consistency loss. The method achieves state-of-the-art performance in both image quality and segmentation accuracy.
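The dual-branch coupling can be illustrated with a minimal sketch: each branch's latent tokens attend to the other branch's tokens, which is what enables bidirectional information flow between image and mask. All names, shapes, and the shared projection weights below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context, w_q, w_k, w_v):
    """One branch's tokens (queries) attend to the other branch's tokens (context)."""
    q = queries @ w_q                      # (n_q, d)
    k = context @ w_k                      # (n_c, d)
    v = context @ w_v                      # (n_c, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v    # (n_q, d): context summarized per query

# Hypothetical sizes: 8 image latent tokens, 4 mask latent tokens, width 16
d = 16
img_tokens  = rng.normal(size=(8, d))
mask_tokens = rng.normal(size=(4, d))
w = {name: rng.normal(size=(d, d)) / np.sqrt(d) for name in ("q", "k", "v")}

# Bidirectional flow: each branch is conditioned on the other's current state
img_updated  = img_tokens  + cross_attention(img_tokens,  mask_tokens, w["q"], w["k"], w["v"])
mask_updated = mask_tokens + cross_attention(mask_tokens, img_tokens,  w["q"], w["k"], w["v"])
```

In a real latent diffusion model these updates would sit inside each denoising block, with separate learned projections per branch; the residual form keeps each branch's own signal while mixing in the other modality.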
Key findings
JoDiffusion3D models the joint distribution of volumetric data and anatomical structures.
A dual-branch latent diffusion architecture enables bidirectional information flow.
A cross-consistency loss ensures anatomical coherence throughout the diffusion process.
Achieves state-of-the-art performance in image quality and segmentation accuracy.
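The summary does not define the cross-consistency loss, but one plausible form penalizes disagreement between the two branches' mask estimates, e.g. via a soft Dice overlap. The sketch below is a hypothetical reconstruction under that assumption; the function names and toy volumes are not from the paper.

```python
import numpy as np

def soft_dice(p, q, eps=1e-6):
    """Soft Dice overlap between two probability maps with values in [0, 1]."""
    inter = 2.0 * (p * q).sum()
    return inter / (p.sum() + q.sum() + eps)

def cross_consistency_loss(mask_from_image_branch, mask_from_mask_branch):
    """Assumed form: penalize disagreement between the branches' mask estimates."""
    return 1.0 - soft_dice(mask_from_image_branch, mask_from_mask_branch)

# Toy 4x4x4 volumes of foreground probabilities
a = np.zeros((4, 4, 4))
a[1:3, 1:3, 1:3] = 1.0
identical = cross_consistency_loss(a, a.copy())   # near 0: perfect agreement
disjoint  = cross_consistency_loss(a, 1.0 - a)    # 1: no overlap at all
```

A loss of this shape is differentiable in both inputs, so gradients flow into both branches, which is consistent with the bidirectional coupling described above.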
Limitations & open questions
The computational cost of 3D diffusion processes remains high.
Further work is needed to scale the model to larger datasets and more complex anatomical structures.