Case ID: M21-199L

Published: 2023-04-19 15:02:19

Last Updated: 2024-12-18 12:41:34


Inventor(s)

Zongwei Zhou
Jae Shin
Jianming Liang

Technology categories

Computing & Information Technology
Imaging
Life Science (All LS Techs)
Medical Imaging

Licensing Contacts

Jovan Heusser
Director of Licensing and Business Development
[email protected]

Jianming Liang Portfolio

M11-093L: Development of a Highly Efficient and User-Friendly Software System for Carotid Intima-Media Thickness- Researchers at Arizona State University have developed a highly user-friendly system for semiautomatic CIMT image interpretation. Their contribution is the application of active contour models (snake models) with hard constraints, leading to an accurate, adaptive and user-friendly border detection algorithm. Please see Zhu – SPIE 2011 for additional information.
 
M11-103L: Automatic diagnosis of pulmonary embolism by machine learning-based detection of pulmonary trunk- Researchers at Arizona State University have developed a machine learning-based approach for automatically detecting the pulmonary trunk. By using a cascaded Adaptive Boosting machine learning algorithm with a large number of digital image object recognition features, this method automatically identifies the pulmonary trunk by sequentially scanning the CTPA images and classifying each encountered sub-image with the trained classifier.
 
M12-031L: Automated Detection of Major Thoracic Structures with a Novel Online Learning Method- Researchers at Arizona State University have developed a novel online learning method for automatically detecting anatomic structures in medical images, which continually updates a linear classifier. Given a set of training samples, it dynamically updates a pool containing M features and returns a subset of N best features along with their corresponding voting weights.
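
As a rough illustration of the online selection idea described above (the actual update rule is not disclosed in this summary, so the boosting-style re-weighting below is only an assumption), a pool of M candidate features can be re-scored after each training sample and the N best returned along with voting weights:

```python
import numpy as np

def online_feature_update(pool_scores, sample_correct, n_best, decay=0.99):
    """Illustrative sketch: `pool_scores` holds a running accuracy estimate
    for each of the M pooled features, `sample_correct` is a 0/1 vector saying
    which features classified the latest training sample correctly.
    The AdaBoost-style weight formula is an assumption, not the patented rule."""
    pool_scores = decay * pool_scores + (1.0 - decay) * sample_correct
    error = np.clip(1.0 - pool_scores, 1e-6, 1.0 - 1e-6)
    voting_weights = 0.5 * np.log((1.0 - error) / error)
    best = np.argsort(-voting_weights)[:n_best]   # indices of the N best features
    return pool_scores, voting_weights, best
```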
 
M12-112L: Self-Adaptive Asymmetric On-line Boosting for Detecting Anatomical Structures- Researchers at Arizona State University have developed a novel self-adaptive, asymmetric on-line boosting (SAAOB) method for detecting anatomical structures in CT pulmonary angiography. This method utilizes a new asymmetric loss criterion with self-adaptability according to the ratio of exposed positive and negative samples. Moreover, the method applies dedicated update formulas to different situations, adjusting a sample's importance weight according to whether it is a true positive, false positive, true negative, or false negative.
 
M12-113L: Shape-based analysis of right ventricular dysfunction associated with acute pulmonary embolism- Researchers at Arizona State University have developed a method of detecting early stage APE using measured biomechanical changes to the cardiac right ventricle. It was found that RV dysfunction due to APE exhibits several characteristic signs including (1) waving paradoxical motion of the RV inner boundary, (2) decrease in local curvature of the septum, (3) lower positive correlation between the movement of inner boundaries of the septal and free walls of the RV, (4) slower blood ejection by the RV, and (5) discontinuous movement observed particularly in the middle of the RV septal wall.
 
M13-026L: Computer-Aided Detection & Visualization of Pulmonary Embolism- Researchers at Arizona State University have developed novel approaches for automated computer-aided detection of emboli in CTPA. One technique automatically registers the vessel orientation in a display, providing compelling demonstration of arterial filling defects, if present, and allowing the radiologist to thoroughly inspect the vessel lumen from multiple perspectives and report any filling defects with high confidence. Another uses deep neural networks and vessel-aligned multi-planar representations to eliminate false positives. A third technique automatically and robustly detects and marks central emboli in CTPA using a rule-based approach for simplicity and low computational cost. Yet another technique creates and presents vessel-oriented images that provide a consistent, compact and discriminative representation to enable a radiologist to distinguish PE from PE mimics. It also supports multi-view visualization to maximally reveal filling defects. Please see Liang – MICCAI 2015 for additional information.
 
M13-122LC: Polyp Detection in Optical Colonoscopy- Researchers at Arizona State University in collaboration with Dr. Gurudu of the Mayo Clinic have developed two novel systems for computer-aided detection of polyps in optical colonoscopy images. The first system detects polyps by using boundary classifiers and a voting scheme to automatically identify the boundary or edge of polyps. This method was evaluated on 300 images containing 300 colorectal polyps with different shapes and scales and it detected 260 out of 300 polyps with 40 false detections. The second system uses a shape-based method and voting scheme to detect polyp boundaries in optical colonoscopy images. It is based on image appearance variation between polyps and their surrounding tissue. The second system was also evaluated on 300 images containing 300 colorectal polyps and detected 267 out of 300 polyps.
 
M13-234L: Diagnosing Pulmonary Embolism by Integrating Patient-level Diagnosis and Embolus-level Detection- Prof. Jianming Liang of Arizona State University has developed an innovative computer-aided diagnosis system for PE detection. By using advanced algorithms and classifiers that combine patient-level diagnosis with embolus-level detection, non-PE patients can be excluded without overlooking PE patients. This improves system performance because only 5-10% of CTPA exams are truly positive and the treatment for PE is usually systemic, so false positives (FPs) would otherwise impose an extra burden on the radiologist, who must evaluate and reject FPs in nearly all negative patients.
 
M14-115LC: Polyp Detection in Optical Colonoscopy- Researchers at Arizona State University in collaboration with Dr. Gurudu of the Mayo Clinic have developed two novel systems for computer-aided detection of polyps in optical colonoscopy images. The first system detects polyps by using boundary classifiers and a voting scheme to automatically identify the boundary or edge of polyps. This method was evaluated on 300 images containing 300 colorectal polyps with different shapes and scales and it detected 260 out of 300 polyps with 40 false detections. The second system uses a shape-based method and voting scheme to detect polyp boundaries in optical colonoscopy images. It is based on image appearance variation between polyps and their surrounding tissue. The second system was also evaluated on 300 images containing 300 colorectal polyps and detected 267 out of 300 polyps.
 
M14-124LC: Automatic Video Quality Assessment for Colonoscopy- Researchers at Arizona State University in collaboration with Dr. Gurudu of the Mayo Clinic have developed a system for automatic, objective quality assessment of colonoscopy videos. The overall quality of a colonoscopy is calculated as the average score of each frame in the video. This system can identify hasty or non-informative colon examination shots to help assess colonoscopy video quality. Compared to gray-level co-occurrence matrix (GLCM) and discrete wavelet transform (DWT) methods, which achieve 70% and 75% sensitivity respectively, this system achieves 93% sensitivity with a 10% false positive rate.
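
To make the scoring scheme concrete, a minimal sketch is shown below; the per-frame scorer here is a hypothetical stand-in (a simple gradient-energy measure), since the actual classifier and features are not specified in this summary:

```python
import numpy as np

def frame_informativeness(frame: np.ndarray) -> float:
    """Hypothetical stand-in for the per-frame scorer: gradient energy
    squashed to [0, 1]. Blurry, hasty shots score low."""
    gy, gx = np.gradient(frame.astype(float))
    energy = float(np.mean(gx ** 2 + gy ** 2))
    return energy / (1.0 + energy)

def video_quality(frames) -> float:
    """Overall colonoscopy quality = average score of every frame in the video."""
    return float(np.mean([frame_informativeness(f) for f in frames]))
```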
 
M15-018LC: Automatic Polyp Detection Using Global Geometric Constraints and Local Intensity Variation Patterns- Researchers at Arizona State University present a new method for detecting polyps in colonoscopy.  Its novelty lies in integrating the global geometric constraints of polyps with the local patterns of intensity variation across polyp boundaries: the former drives the detector towards the objects with curvy boundaries, while the latter minimizes the misleading effects of polyp-like structures. Please see Tajbakhsh – MICCAI 2014 for additional information.
 
M15-121LC: Automated Polyp Detection Systems- Researchers at Arizona State University in collaboration with Dr. Gurudu of the Mayo Clinic have developed several novel systems for computer-aided detection of polyps in optical colonoscopy images. The systems use a variety of tools to enable better and more sensitive polyp detection, including learning existing features, evaluating polyp edges/boundaries to automatically monitor video quality, voting and classification schemes, neural networks, etc. Experimental results based on the PI’s collection of videos show remarkable performance improvements with each system over current methods, with high sensitivity and dramatically fewer false positives.
 
M15-185LC: Chance-Constrained Optimization for Treatment of Prostate and Other Cancers in Intensity-Modulated Proton Therapy- Researchers at Arizona State University in collaboration with researchers at the Mayo Clinic have developed a novel treatment planning method for use in radiation therapy for cancer patients. They applied a probabilistic framework to IMPT planning subject to range and patient setup uncertainties. The framework hedges against the influence of uncertainties and improves the robustness of treatment plans. Results from this method were compared with the conventional PTV-based method and demonstrated enhanced effectiveness. The total deviation between the actual and prescribed dose is minimized under the nominal scenario to provide a convenient framework for treatment planners.
 
M15-186L: Computer-Aided Detection & Visualization of Pulmonary Embolism- Researchers at Arizona State University have developed novel approaches for automated computer-aided detection of emboli in CTPA. One technique automatically registers the vessel orientation in a display, providing compelling demonstration of arterial filling defects, if present, and allowing the radiologist to thoroughly inspect the vessel lumen from multiple perspectives and report any filling defects with high confidence. Another uses deep neural networks and vessel-aligned multi-planar representations to eliminate false positives. A third technique automatically and robustly detects and marks central emboli at CTPA using a rule-based approach for simplicity and low computational cost. Yet another technique creates and presents vessel-oriented images that provide consistent, compact and discriminative representation to enable a radiologist to distinguish PE from PE mimics. It also supports multi-view visualization to maximally reveal and fill defects. Please see Liang – MICCAI 2015  for additional information.
 
M15-212L: Methods for Rapidly Interpreting Carotid Intima-Media Thickness Videos- Researchers at Arizona State University have developed a novel software system for rapidly interpreting and measuring CIMT. This system automatically selects end-diastolic ultrasound frames (EUFs), determines regions of interest (ROIs), and performs the CIMT measurement in ultrasound videos, significantly cutting down the time required to determine CIMT. The system automates the entire CIMT interpretation process.
 
M15-217L: Computer-Aided Detection & Visualization of Pulmonary Embolism- Researchers at Arizona State University have developed novel approaches for automated computer-aided detection of emboli in CTPA. One technique automatically registers the vessel orientation in a display, providing compelling demonstration of arterial filling defects, if present, and allowing the radiologist to thoroughly inspect the vessel lumen from multiple perspectives and report any filling defects with high confidence. Another uses deep neural networks and vessel-aligned multi-planar representations to eliminate false positives. A third technique automatically and robustly detects and marks central emboli in CTPA using a rule-based approach for simplicity and low computational cost. Yet another technique creates and presents vessel-oriented images that provide a consistent, compact and discriminative representation to enable a radiologist to distinguish PE from PE mimics. It also supports multi-view visualization to maximally reveal filling defects. Please see Liang – MICCAI 2015 for additional information.
 
M15-230LC: Automated Polyp Detection Systems- Researchers at Arizona State University in collaboration with Dr. Gurudu of the Mayo Clinic have developed several novel systems for computer-aided detection of polyps in optical colonoscopy images. The systems use a variety of tools to enable better and more sensitive polyp detection, including learning existing features, evaluating polyp edges/boundaries to automatically monitor video quality, voting and classification schemes, neural networks, etc. Experimental results based on the PI’s collection of videos show remarkable performance improvements with each system over current methods, with high sensitivity and dramatically fewer false positives.
 
M16-036L: Matlab Software (c) for Ultrasound Carotid Intima-Media Thickness Image Interpretation- Researchers at Arizona State University present methods, systems, and media for determining carotid intima-media thickness. A method for determining the carotid intima-media thickness of a carotid artery is provided, the method comprising: receiving a frame from a plurality of images, wherein each of the plurality of images includes a portion of the carotid artery; receiving a user selection of a location within the frame; setting a region of interest based on the received user selection; detecting a first border and a second border within the region of interest; applying one or more active contour models to the first border and the second border to generate a smoothed first border and a smoothed second border; and calculating the intima-media thickness based at least in part on the smoothed first border and the smoothed second border. Please see Zhu – SPIE 2011 for additional information.
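
A minimal sketch of the claimed measurement flow is given below, assuming grayscale ultrasound frames and a known pixel-spacing calibration; the border detector and the smoothing step are simplified placeholders, not the patented active-contour algorithm:

```python
import numpy as np

def measure_cimt(frame: np.ndarray, click_row_col, roi_half_width=40,
                 mm_per_pixel=0.06):
    """Sketch: ROI around the user-selected location, two border estimates,
    border smoothing, and thickness in millimeters."""
    _, c = click_row_col
    roi = frame[:, max(0, c - roi_half_width): c + roi_half_width].astype(float)

    # Placeholder border detection: strongest vertical intensity transitions.
    grad = np.gradient(roi, axis=0)
    first_border = grad.argmax(axis=0)       # e.g. lumen-intima interface
    second_border = grad.argmin(axis=0)      # e.g. media-adventitia interface

    # Placeholder for the active contour ("snake") smoothing: moving average.
    kernel = np.ones(5) / 5.0
    smooth = lambda b: np.convolve(b, kernel, mode="same")
    b1, b2 = smooth(first_border), smooth(second_border)

    # Intima-media thickness = mean distance between the smoothed borders.
    return float(np.mean(np.abs(b2 - b1)) * mm_per_pixel)
```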
 
M16-037L: Java Software (c) for Ultrasound Carotid Intima-Media Thickness Image Interpretation- Researchers at Arizona State University present a Java software system for semiautomatic CIMT image interpretation. Assessment of carotid intima-media thickness (CIMT) by B-mode ultrasound is a technically mature and reproducible technology [1, 2]. Given the high morbidity, mortality, and large societal burden associated with cardiovascular (CV) diseases, CIMT, as a safe yet inexpensive tool, is increasingly utilized for CV risk stratification [1]. However, CIMT requires a precise measurement of the thickness of the intima and media layers of the carotid artery, which can be tedious and time-consuming and demands specialized expertise and experience. To this end, the researchers have developed a highly user-friendly system for semiautomatic CIMT image interpretation. Please see Zhu – SPIE 2011 for additional information.
 
M17-129L: Fine Tuning of Convolutional Neural Networks for Biomedical Image Analysis- Researchers at Arizona State University have developed methods to reduce annotation costs by utilizing active learning and transfer learning in CNNs for medical image analysis. These methods work with a pre-trained CNN to find worthy samples from the unannotated data for annotation. They have been evaluated in four different biomedical imaging applications (pulmonary embolism detection, polyp detection, colonoscopy frame classification, and carotid intima-media thickness measurement) and have shown that the cost of annotation can be cut by at least half.
 
M18-196L: Convolutional Neural Networks for Medical Image Segmentation- Researchers at Arizona State University have developed a new architecture to bridge the information gaps that are observed in skip connections in U-Net and other CNN architectures used in medical image segmentation. This architecture forms new paths with different depths, some of which focus on localization and coarse segmentation, while others focus on fine-detailed segmentation. This novel architecture was tested on multiple different segmentation tasks with results demonstrating significantly increased performance over the original U-Nets and their variants. Please see Zhou et al – DLMIA Workshop – 2018 for additional information.
 
M18-197L: Computer-Aided Detection & Visualization of Pulmonary Embolism- Researchers at Arizona State University have developed novel approaches for automated computer-aided detection of emboli in CTPA. One technique automatically registers the vessel orientation in a display, providing compelling demonstration of arterial filling defects, if present, and allowing the radiologist to thoroughly inspect the vessel lumen from multiple perspectives and report any filling defects with high confidence. Another uses deep neural networks and vessel-aligned multi-planar representations to eliminate false positives. A third technique automatically and robustly detects and marks central emboli in CTPA using a rule-based approach for simplicity and low computational cost. Yet another technique creates and presents vessel-oriented images that provide a consistent, compact and discriminative representation to enable a radiologist to distinguish PE from PE mimics. It also supports multi-view visualization to maximally reveal filling defects. Please see Liang – MICCAI 2015 for additional information.
 
M19-117L: Fixed-Point Generative Adversarial Networks- Researchers at Arizona State University have proposed a new GAN, called Fixed-Point GAN, which introduces fixed-point translation and proposes a new method for disease detection and localization. This new GAN is trained by (1) supervising same-domain translation through a conditional identity loss, and (2) regularizing cross-domain translation through revised adversarial, domain classification, and cycle consistency losses. Qualitative and quantitative evaluations demonstrate that the proposed method outperforms the state of the art in multi-domain image-to-image translation and surpasses predominant weakly-supervised localization methods in both disease detection and localization. Please see Rahman Siddiquee et al – ICCV – 2019, GitHub – 2019 for additional information.
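
A hedged sketch of two of the generator-side objectives named above is shown below; G(x, d) is assumed to translate image x into domain d, and the revised adversarial and domain-classification terms (plus their weights) are omitted:

```python
import torch.nn.functional as F

def fixed_point_generator_losses(G, x, src_domain, tgt_domain):
    """Sketch of (1) the conditional identity loss and (2) cycle consistency."""
    # (1) Conditional identity loss: translating x into its OWN domain
    #     should behave as a fixed point, i.e. reproduce x.
    loss_identity = F.l1_loss(G(x, src_domain), x)

    # (2) Cycle consistency: x -> target domain -> back to source recovers x.
    x_cross = G(x, tgt_domain)
    loss_cycle = F.l1_loss(G(x_cross, src_domain), x)

    return loss_identity, loss_cycle
```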
 
M19-189LC: UNet++: A Novel Architecture for Medical Imaging Segmentation- Researchers at Arizona State University have developed a new neural architecture, UNet++, for semantic and instance segmentation. UNet++ alleviates the unknown network depth with an efficient ensemble of U-Nets of varying depths, redesigns skip connections to aggregate features of varying semantic scales at the decoder sub-networks, and devises a pruning scheme to accelerate the inference speed of UNet++. This architecture has been extensively evaluated using six different medical image segmentation datasets covering multiple imaging modalities and is shown to outperform the baseline models in semantic segmentation and to enhance segmentation quality for objects of varying sizes, whereas a fixed-depth U-Net performs well only for objects of certain sizes. Please see Zhou et al – ArXiv.org – 2019, Zhou et al – Poster, Zhou et al – Github for additional information.
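
The redesigned skip connections can be pictured with a small sketch (a three-level toy, not the released implementation): each intermediate node X[i][j] aggregates all earlier nodes at its level plus an upsampled feature map from the level below:

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Two 3x3 convolutions; stands in for the block at each UNet++ node."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.body(x)

class TinyUNetPlusPlus(nn.Module):
    """Toy 3-level illustration of the nested, dense skip pathways."""
    def __init__(self, ch=(32, 64, 128), n_classes=1):
        super().__init__()
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.x00 = ConvBlock(1, ch[0])
        self.x10 = ConvBlock(ch[0], ch[1])
        self.x20 = ConvBlock(ch[1], ch[2])
        self.x01 = ConvBlock(ch[0] + ch[1], ch[0])
        self.x11 = ConvBlock(ch[1] + ch[2], ch[1])
        self.x02 = ConvBlock(ch[0] * 2 + ch[1], ch[0])
        self.head = nn.Conv2d(ch[0], n_classes, 1)

    def forward(self, x):
        x00 = self.x00(x)
        x10 = self.x10(self.pool(x00))
        x20 = self.x20(self.pool(x10))
        x01 = self.x01(torch.cat([x00, self.up(x10)], dim=1))
        x11 = self.x11(torch.cat([x10, self.up(x20)], dim=1))
        x02 = self.x02(torch.cat([x00, x01, self.up(x11)], dim=1))
        # Deep supervision would attach heads to x01 and x02; pruning the
        # deeper path at inference keeps only the x01 branch for speed.
        return self.head(x02)
```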
 
M19-194L: Fine Tuning of Convolutional Neural Networks for Biomedical Image Analysis- Researchers at Arizona State University have developed methods to reduce annotation costs by integrating active learning and transfer learning into a single framework for using CNNs in medical image analysis. These methods work with a pre-trained CNN to find worthy samples for annotation and gradually enhance the CNN via continuous fine-tuning. They have been evaluated in several different biomedical imaging applications (pulmonary embolism detection, polyp detection, colonoscopy frame classification, and carotid intima-media thickness measurement) and have shown that the cost of annotation can be cut by at least half compared with random selection. Please see Zhou et al – CVPR 2017, Zhou et al – CVPR Poster 2017, Tajbakhsh et al – IEEE Trans Med Imaging 2016 for additional information.
 
M19-252LC: Models Genesis: Autodidactic Models for 3D Medical Image Analysis- Researchers at Arizona State University have developed a set of pre-trained models which may serve as a primary source of transfer learning for 3D medical imaging applications. These models were created ex nihilo (without manual labeling), were self-taught (learned by self-supervision), and were made generic so that they may serve as source models for generating application-specific target models. With the ability to learn from scratch on unlabeled images, these models yield a common visual representation that is generalizable and transferable across diseases, organs and imaging modalities. These models preserve the rich 3D anatomical information often found in medical images. These novel models consistently outperform 2D/2.5D approaches and outperform learning from scratch in all five target 3D applications, making them ideal source models for transfer learning in 3D medical imaging applications. Please see Zhou et al – arXiv – 2019, ModelsGenesis – Github for additional information.
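
A minimal pretraining step in the spirit described above might look like the sketch below; the distortions are crude stand-ins (the published recipe uses non-linear intensity transformation, local pixel shuffling, and in/out-painting), and `model` is assumed to be any 3D encoder-decoder:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distort(x: torch.Tensor) -> torch.Tensor:
    """Illustrative corruption: a random monotonic intensity warp plus a
    cut-out region (stand-ins for the published transformations)."""
    x = x.clamp(0, 1) ** torch.empty(1).uniform_(0.5, 2.0).item()
    if torch.rand(1).item() < 0.5:
        d, h, w = x.shape[-3:]
        x[..., d // 4: d // 2, h // 4: h // 2, w // 4: w // 2] = torch.rand(1).item()
    return x

def pretrain_step(model: nn.Module, subvolumes: torch.Tensor, optimizer) -> float:
    """One self-supervised step: restore each original sub-volume from its
    corrupted version; no manual labels are involved."""
    restored = model(distort(subvolumes.clone()))
    loss = F.mse_loss(restored, subvolumes)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```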
 
M20-069L: Transferable Visual Words: Chest X-ray Image Analysis- Researchers at Arizona State University have developed a novel method, called Transferable Visual Words (TransVW), for chest X-ray image analysis. This method integrates CNNs and bag-of-visual-words (BoVW) models to amplify their strengths and overcome some of their limitations. TransVW couples the transfer learning capability of CNNs with the unsupervised nature of BoVW in extracting visual words, resulting in a new self-supervised method. When TransVW was evaluated on an NIH hospital-scale chest X-ray dataset, it outperformed all of the state-of-the-art approaches, including fine-tuning pre-trained ImageNet models, which is a significant accomplishment.
 
M20-126L: Transferable Visual Words: Chest X-ray Image Analysis- Researchers at Arizona State University have developed a novel method, called Transferable Visual Words (TransVW), for chest X-ray image analysis. This method integrates CNNs and bag-of-visual-words (BoVW) models to amplify their strengths and overcome some of their limitations. TransVW couples the transfer learning capability of CNNs with the unsupervised nature of BoVW in extracting visual words, resulting in a new self-supervised method. When TransVW was evaluated on an NIH hospital-scale chest X-ray dataset, it outperformed all of the state-of-the-art approaches, including fine-tuning pre-trained ImageNet models, which is a significant accomplishment.
 
M20-127L: Semantic Genesis- Researchers at Arizona State University have developed a self-supervised learning framework that enables the capture of semantics-enriched representation from unlabeled medical image data, resulting in a set of powerful pre-trained models, called Semantic Genesis.  These models learn visual representation by self-discovery, self-classification, and self-restoration of the anatomy underneath medical images.  Semantic Genesis yields a generic and transferable visual representation that can improve the performance of various medical tasks across diseases, organs, and imaging modalities. Another unique property of Semantic Genesis is that it can readily serve as an add-on to dramatically boost the performance of existing self-supervised learning approaches. Please see Haghighi et al – MICCAI – 2020, SemanticGenesis – Github for additional information.
 
M20-225L: Models Genesis: Autodidactic Models for 3D Medical Image Analysis- Researchers at Arizona State University have developed a set of pre-trained models which may serve as a primary source of transfer learning for 3D medical imaging applications. These models were created ex nihilo (without manual labeling), were self-taught (learned by self-supervision), and were made generic so that they may serve as source models for generating application-specific target models. With the ability to learn from scratch on unlabeled images, these models yield a common visual representation that is generalizable and transferable across diseases, organs and imaging modalities. These models preserve the rich 3D anatomical information often found in medical images. These novel models consistently outperform 2D/2.5D approaches and outperform learning from scratch in all five target 3D applications, making them ideal source models for transfer learning in 3D medical imaging applications. Please see Zhou et al – arXiv – 2019, ModelsGenesis – Github for additional information.
 
M20-239L: Semantic Genesis- Researchers at Arizona State University have developed a self-supervised learning framework that enables the capture of semantics-enriched representation from unlabeled medical image data, resulting in a set of powerful pre-trained models, called Semantic Genesis.  These models learn visual representation by self-discovery, self-classification, and self-restoration of the anatomy underneath medical images.  Semantic Genesis yields a generic and transferable visual representation that can improve the performance of various medical tasks across diseases, organs, and imaging modalities. Another unique property of Semantic Genesis is that it can readily serve as an add-on to dramatically boost the performance of existing self-supervised learning approaches. Please see Haghighi et al – MICCAI – 2020, SemanticGenesis – Github for additional information.
 
M20-240L: Self-supervised Learning: From Parts to Whole- Researchers at Arizona State University developed a novel algorithm to learn contrastive representation in 3D medical imaging. This framework for self-supervised contrastive learning via reconstruction is called Parts2Whole, because it exploits the universal and intrinsic part-whole relationship to learn contrastive representation without using contrastive loss. This self-supervised learning framework brings greater efficiency and computational capability for processing 3D medical images than has previously been achievable. Please see Feng et al – MICCAI – 2020 for additional information.
 
M21-048L: Self-supervised Learning: From Parts to Whole- Researchers at Arizona State University developed a novel algorithm to learn contrastive representation in 3D medical imaging. This framework for self-supervised contrastive learning via reconstruction is called Parts2Whole, because it exploits the universal and intrinsic part-whole relationship to learn contrastive representation without using contrastive loss. This self-supervised learning framework brings greater efficiency and computational capability for processing 3D medical images than has previously been achievable. Please see Feng et al – MICCAI – 2020 for additional information.
 
M21-064L: Fixed-Point Image-to-Image Translation- Researchers at Arizona State University have created a computer model that can be used for disease detection and localization, called Fixed-Point GAN (generative adversarial network). This technology has the potential to “virtually heal” a patient with unknown health issues by revealing diseased regions through the analysis of their medical images, which is possible through the GAN’s ability to remove an object from an image while preserving the image content. Ultimately, this can be used as a tool to help medical practitioners detect illnesses more accurately and effectively. Please see Rahman Siddiquee et al – ICCV – 2019, GitHub – 2019 for additional information.
 
M21-169L: Medical Image Segmentation with Interactive Refinement- Researchers at Arizona State University have developed a novel interactive training strategy for medical image segmentation. This strategy interactively refines the segmentation map through several iterations of training for continuous improvement and prediction. A convolutional neural network is trained with user simulated inputs to edit the segmentation and improve segmentation accuracy. When tested on different datasets, this strategy showed superior performance in comparison to other strategies. Please see Goyal – Thesis – 2021 for additional information.
 
M21-170L: Transferable Visual Words (TransVW)- Researchers at Arizona State University have developed an annotation-efficient deep learning framework, minimizing the human annotation effort required to develop high-performance CAD systems. This framework, called TransVW, is established on the self-supervised learning paradigm, gleaning medical knowledge from images without human annotations. TransVW is prominent for its capacity to develop a collection of base models that can be used as a starting point for training application-specific models, resulting in rapid progress and improved performance for various medical tasks. Moreover, TransVW boasts a unique add-on capability, boosting the performance of existing self-supervised learning approaches. Please see Haghighi et al – IEEE TMI – 2021, TransVW – Github for additional information.
 
M21-199L: Active, Continual Fine Tuning of Convolutional Neural Networks for Reducing Annotation Efforts- Researchers at Arizona State University have created an innovative method, active, continual fine-tuning (ACFT), that drastically reduces the amount of annotated data needed to train convolutional neural networks (CNNs) for medical imaging analysis. Instead of annotating an entire dataset up front, ACFT actively selects the most informative samples for expert annotation and continually fine-tunes the current CNN as new annotations arrive, autonomously improving the model. In computer-aided diagnosis (CAD) applications, this active selection consistently reduces annotation effort across varying CNN architectures while enabling predictive modeling. The technology has potential applications not only in pre-training CNN-based software but also in later incremental enhancements via ACFT, reducing annotation time and overall cost.
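
A hedged sketch of one ACFT round is given below; candidates are ranked purely by predictive entropy here (the published selection criteria are richer), `annotate` stands in for the human expert, and `fine_tune` is assumed to resume training from the current weights rather than re-initializing:

```python
import torch
import torch.nn.functional as F

def acft_round(model, unlabeled_loader, labeled_set, annotate, fine_tune, budget=100):
    """One round: score unlabeled samples, pick the most informative ones
    for annotation, then continually fine-tune the SAME model."""
    model.eval()
    scores = []
    with torch.no_grad():
        for indices, images in unlabeled_loader:          # (sample ids, batch)
            probs = F.softmax(model(images), dim=1)
            entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
            scores.extend(zip(indices.tolist(), entropy.tolist()))

    # Actively select the most uncertain ("worthy") samples for annotation.
    selected = [i for i, _ in sorted(scores, key=lambda s: -s[1])[:budget]]
    labeled_set.extend(annotate(selected))                # expert labels them

    # Continual fine-tuning: keep training from the current weights.
    fine_tune(model, labeled_set)
    return model
```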
 
M21-229L: Towards Annotation-Efficient Deep Learning in Computer-Aided Diagnosis- Researchers at ASU have created efficient and effective deep learning algorithms for medical applications that lack the large, annotated datasets needed for computer-aided diagnosis (CAD). Existing deep learning models require large, high-quality annotated datasets; otherwise they perform poorly and lack generalizability on new datasets. This model architecture aims to deliver deep models that approximate or outperform existing deep learning models while annotating only a fraction of the dataset required by traditional models. Ultimately, this work in medical imaging interpretation can have an impact on disease detection, classification, and segmentation. Please see Zhou – Dissertation 2021 for additional information.
 
M21-287L: Pre-trained Models for nnUNet- Researchers at ASU have enhanced nnU-Net (‘no-new-net’), an adaptation of the U-Net architecture for medical image segmentation, and stabilized it by integrating transfer learning. Image segmentation divides an image into meaningful segments, and such segments play a crucial role in clinical decision-making from medical images. However, nnU-Net suffers from instability because it trains from scratch and formulates numerous architectures due to its dependency on a specific dataset. By employing transfer learning for nnU-Net, the inventors are able to use unlabeled data from the source task through a self-supervised learning method called Models Genesis. Further, an advanced segmentation architecture such as UNet++ improves the segmentation task by eliminating the need for numerous specialized architectures. Please see Bajpai – Dissertation 2021 for additional information.
 
M21-298L: Annotation-Efficient Deep Learning for Medical Imaging- Researchers at ASU worked to overcome issues in deep learning neural networks (NNs) by improving the efficiency with which NNs train. Deep learning NNs are powerful tools for analyzing complex and large quantities of data. However, for these tools to be effective, large manually labeled datasets are typically required. In the field of medical imaging, the examples are complex and expensive to label, and thus medical imaging datasets are often insufficient to properly train an NN. Please see Tajbakhsh – IEEE 2021 for additional information.
 
M22-036L: Transfer learning from supervised ImageNet models has been frequently used in medical image analysis. Yet, no large-scale evaluation has been conducted to benchmark the efficacy of newly-developed pre-training techniques for medical image analysis, leaving several important questions unanswered. This technology presents a practical approach to bridge the domain gap between natural and medical images by continually (pre-)training supervised ImageNet models on medical images. It yields new insights: (1) pre-trained models on fine-grained data yield distinctive local representations that are more suitable for medical segmentation tasks, (2) self-supervised ImageNet models learn holistic features more effectively than supervised ImageNet models, and (3) continual pre-training can bridge the domain gap between natural and medical images. Please see Hosseinzadeh Taher – DART 2021 for additional information.
 
M22-048L: An approach to diagnosing pulmonary embolism via CT pulmonary angiography using self-supervised learning with convolutional neural networks. In comparing conventional classification (CC) with multiple instance learning (MIL), this approach consistently shows that: (1) transfer learning boosts performance despite differences between natural images and CT scans; (2) transfer learning with SSL surpasses its supervised counterparts; (3) CNNs outperform vision transformers, which otherwise show satisfactory performance; and (4) CC is, surprisingly, superior to MIL. Compared with the state of the art, the optimal approach provides an AUC gain of 0.2% and 1.05% at the image level and exam level, respectively. Please see Islam – PMC 2022 for additional information.
 
M22-158L: DiRA: Discriminative, Restorative, and Adversarial Learning for Self-supervised Medical Image Analysis- Researchers at ASU have created DiRA, a novel framework that unites discriminative, restorative, and adversarial learning in a unified manner to collaboratively glean complementary visual information from unlabeled medical images for fine-grained semantic representation learning. Current technologies are unable to benefit from the collaborative effect of all these three self-supervised learning schemes. DiRA has demonstrated superior performance in learning generalizable and transferable representations across organs, diseases and modalities in medical imaging.
 
M22-164L: CAiD: A Self-supervised Learning Framework for Empowering Instance Discrimination in Medical Imaging- Researchers at Arizona State University have created a simple yet effective self-supervised framework, called Context-Aware instance Discrimination (CAiD), which improves instance discrimination learning by providing finer and more discriminative information encoded from the diverse local context of unlabeled medical images. Please see Taher – MIDL 2022 for additional information.
 
M22-209L: Self-supervised Visual Representation Learning by Recovering Order and Appearance on Vision Transformer- Medical imaging follows protocols for specific therapeutic purposes, resulting in consistent and recurring anatomical features across scans, which can serve as robust, organically occurring supervisory signals, leading to more powerful models. The Vision Transformer (ViT) architecture has been successfully applied to a variety of natural image computer vision (CV) tasks, indicating that it holds great potential in medical image analysis. Researchers at ASU propose an approach that outperforms ImageNet-21K supervised pretraining and all transformer-based self-supervised pretraining methods.
 
M22-210L: Benchmarking Vision Transformers for Chest X-ray Classification- Vision Transformers (ViTs) produce outstanding results with regard to scalability and transferability for images. ASU researchers examined the transferability of pretrained ViTs and their use on medical images, particularly chest X-rays. Results indicate that, regardless of the parameter initialization approach, ViTs can perform much better at classification tasks.
 
M23-028L: Stepwise incremental pretraining method for use in machine learning to better and more effectively perform recognition and diagnosis in medical imaging. The three self-supervised learning (SSL) components (discriminative, restorative, and adversarial learning) are leveraged in combination with redesigned versions of five of the most prominent SSL methods under a united model. This pretraining method overcomes the complexity and the traditionally difficult unification of multiple learning components. The five redesigned methods are Rotation, Jigsaw, Rubik’s Cube, Deep Clustering, and TransVW, which, when used with stepwise incremental pretraining of the three learning components (i.e., (((D)+R)+A)), produce stable results and enhance the encoders for all three learning components. The united model with these methods is trained component by component in a stepwise manner, yielding three learned transferable components: discriminative encoders, restorative decoders, and adversarial encoders. Please see Guo – MICCAI 2022 for additional information.
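
The stepwise schedule can be sketched at a high level as follows; the three `train_*` callbacks are placeholders for the discriminative, discriminative+restorative, and full discriminative+restorative+adversarial objectives, and each stage continues from the weights of the previous one:

```python
def stepwise_incremental_pretraining(encoder, decoder, adv_encoder, data,
                                     train_d, train_dr, train_dra):
    """Sketch of the (((D)+R)+A) schedule: train component by component."""
    # Stage 1 (D): discriminative learning only.
    train_d(encoder, data)

    # Stage 2 (D+R): add restorative learning, continuing from stage 1.
    train_dr(encoder, decoder, data)

    # Stage 3 (D+R+A): add adversarial learning, continuing from stage 2.
    train_dra(encoder, decoder, adv_encoder, data)

    # The three transferable components named in the summary.
    return encoder, decoder, adv_encoder
```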
 
M23-029L: A novel vision transformer-based self-supervised learning framework, POPAR (patch order prediction and appearance recovery), for chest X-ray images. POPAR leverages the benefits of vision transformers and the unique properties of medical imaging to simultaneously learn patch-wise high-level contextual features by correcting shuffled patch orders and fine-grained features by recovering patch appearance. POPAR outperforms self-supervised models with vision transformer backbones, achieves significantly better performance than all three state-of-the-art contrastive learning models, and outperforms fully-supervised pretrained models across architectures. Please see Pang – MICCAI 2022 for additional information.
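
The two pretext tasks can be pictured with the sketch below, assuming a ViT backbone that returns one token embedding per (shuffled) input patch; dimensions and head shapes are illustrative, not the released POPAR code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class POPARStyleHeads(nn.Module):
    """Two heads on top of patch tokens: patch order prediction and
    patch appearance recovery."""
    def __init__(self, dim=768, num_patches=196, patch_pixels=16 * 16):
        super().__init__()
        self.order_head = nn.Linear(dim, num_patches)        # which position?
        self.appearance_head = nn.Linear(dim, patch_pixels)  # restore pixels

    def forward(self, patch_tokens):
        # patch_tokens: (batch, num_patches, dim), computed from shuffled patches
        return self.order_head(patch_tokens), self.appearance_head(patch_tokens)

def popar_style_loss(order_logits, true_positions, recovered, true_patches):
    """Cross-entropy for the shuffled-order prediction + L1 for appearance."""
    ce = F.cross_entropy(order_logits.flatten(0, 1), true_positions.flatten())
    return ce + F.l1_loss(recovered, true_patches)
```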
 
M23-030L: Vision transformers have recently gained popularity in the computer vision community as they have begun to outrank convolutional neural networks (CNNs) in one representative visual benchmark after another. However, the competition between vision transformers and CNNs in medical imaging is rarely studied, leaving many important questions unanswered. This technology is directed to a method for benchmarking and boosting transformers for medical imaging analysis. A practical approach is also presented for bridging the domain gap between photographic and medical imaging by utilizing unlabeled large-scale in-domain data. Please see Ma – MICCAI 2022 for additional information.
 
M23-207L: Deep learning nowadays offers expert-level and sometimes even super-expert-level performance, but achieving such performance demands massive annotated data for training. Numerous datasets are available in medical imaging, but each is individually small and heterogeneous in expert annotations. We envision that a powerful and robust deep model can be trained by aggregating numerous small datasets. To realize this vision, we have developed Ark, a framework that accrues and reuses knowledge from heterogeneous expert annotations in various datasets. To demonstrate its capability, we have trained two Ark models on 335,484 and 704,363 chest X-rays, respectively, by merging several datasets including ChestX-ray14, CheXpert, MIMIC-CXR, and VinDr-CXR, evaluated them on a wide range of imaging tasks covering both classification and segmentation via fine-tuning, linear-probing, and gender-bias analysis, and demonstrated Ark’s superior and robust performance over the state-of-the-art fully/self-supervised baselines and Google’s proprietary CXR Foundation Model trained on 821,544 chest X-rays. This performance is attributed to our simple yet powerful observation that aggregating numerous datasets diversifies patient populations and accrues knowledge from manifold experts, yielding unprecedented performance yet saving annotation costs. Ark is not limited to chest X-rays and can be trained with images beyond chest X-rays; thereby, Ark is expected to exert an important impact on medical image interpretation in general, thanks to the potential that accruing and reusing knowledge from heterogeneous annotations by diverse experts associated with various (small) datasets can surpass the performance of proprietary models trained on unusually large data. More importantly, Ark can be adapted to local environments via fine-tuning with local image data, further boosting its performance. We are looking for industrial partners to commercialize Ark to realize its clinical value for assisting radiologists in interpreting chest X-rays in particular and medical images in general. Please see Ma – MICCAI 2023 for additional information.
 
M23-208L: A self-supervised learning (SSL) framework to analyze human anatomy, particularly chest anatomy, and its hierarchy. This strategy is hierarchical, autodidactic, and coarse-to-fine, resulting in a versatile pretrained model which is dense and semantics-meaningful. The structure and segmentation of different organs within the chest can be analyzed and divided into smaller segments to encode their relationships with each other. Then false-negative samples are pruned so the system can learn from the positive samples to analyze the structure as a whole. This model allows a machine learning framework to analyze human anatomy by compartmentalization. Please see Hosseinzadeh Taher – DART 2023 for additional information.
 
M23-225L: A novel system for implementing improved generalizability, transferability and robustness through modality unification, function integration and annotation aggregation for medical image analysis. This system utilizes a deep learning method to analyze medical images from X-rays, colonoscopies and CT scans and to annotate, classify and segment these images for model training. Once trained, the model performs in a way that limits bias related to gender and other variables.
 
M23-260L: System for implementing improved self-supervised learning techniques through relating-based learning for medical image analysis. This system enforces local consistency as well as hierarchical consistency of embeddings. It generalizes the trained AI model to produce a generalized pretrained AI model.
 
M23-274L: Swin-Unet+ – Researchers at Arizona State University have developed a generic unified multi-task model to aid in the detection, classification, and segmentation of intestinal polyps. The model combines the Swin Transformer with the widely accepted medical imaging U-Net architecture. The integration of the two components provides the capacity to handle multiple tasks, such as high-level feature extraction and accurate segmentation and localization. Likewise, the joint learning of polyp-specific features across tasks results in overall improved performance. This can have a significant impact on the early detection and accurate diagnosis of intestinal polyps for the prevention of colorectal cancer.
 
M23-285L: Researchers at Arizona State University have developed a unified collaborative learning framework for performance gains and annotation cost reduction in 3D medical imaging. This framework employs the collaboration of three self-supervised learning (SSL) elements (discriminative, restorative, and adversarial learning), enabling collaborative learning to yield a discriminative encoder, a restorative decoder, and an adversarial encoder. Building upon six prominent self-supervised methods (Rotation, Jigsaw, Rubik’s Cube, Deep Clustering, TransVW, and MoCo), the developed united framework integrates enhancements for each method, formulating them together into a cohesive framework. To address model complexity, a stepwise incremental pretraining approach was developed to unify the pretraining process. Performance gains were established in five target tasks, encompassing both classification and segmentation, across diseases, organs, datasets, and modalities. Please see Guo et al – Springer – 2022 for additional information.
 
M23-301L: PEAC – Researchers at Arizona State University have developed a novel self-supervised learning (SSL) approach for visual representations in medical image analysis. The approach, titled PEAC (patch embedding of anatomical consistency), builds upon the principles of previous SSL approaches but distinguishes itself by exploiting the consistency of medical imaging protocols to establish highly consistent anatomical features across images. PEAC has demonstrated higher performance than existing state-of-the-art self-supervised methods, due to its ability to capture anatomical structure consistency across different views of the same patient and across patients of different gender, weight, and health status. Please see Zhou et al – 2023, PEAC – GitHub for additional information.
 
M23-311L: DiRA – Researchers at Arizona State University have developed a novel unified framework for deep semantic representation learning. This framework, called DiRA, integrates the self-supervised learning (SSL) elements of discriminative, restorative, and adversarial learning to extract visual information from medical images for fine-grained semantic representation learning. The collaboration of these SSL elements establishes a more generalizable representation across organs, diseases, and modalities in medical imaging. Additionally, DiRA outperforms fully supervised ImageNet models, reducing annotation costs and enabling accurate lesion localization with minimal annotation. Please see Haghighi et al – Medical Image Analysis – 2024, DiRA – GitHub for additional information.
 
M23-118L: Learning Foundation Models from Anatomy in Medical Imaging – Researchers at Arizona State University have developed a novel training strategy to understand anatomy via hierarchical self-supervised contrastive learning. This SSL-pretrained model (Adam) exploits the hierarchical nature of human anatomy and progressively learns anatomy in a coarse-to-fine manner using hierarchical contrastive learning. Adam generalizes to myriad tasks and also preserves intrinsic properties of anatomical structures (locality and compositionality), which is crucial to understanding anatomy. Please see Hosseinzadeh Taher – MICCAI Workshop for additional information.
 
M24-078L: Computer-Aided Diagnosis of Pulmonary Embolism – Researchers at Arizona State University have developed deep learning methods that can be used to diagnose pulmonary embolism. Using convolutional neural networks (CNNs), they have been able to classify and segment different regions that are imaged. Their architecture, the Embedding-based ViT (E-ViT), takes a sequence of slice-level embeddings rather than image patches as the input. This framework works for CT exams with varying numbers of slices. Second, class tokens are used to address the multi-label classification task. Third, E-ViT discards the position embedding for each CT slice because neighboring slices in a CT scan appear similar, and retaining the position embedding can lead to performance degradation. E-ViT integrates class embeddings with exam-level embeddings to exploit the feature representations extracted from individual sequences with varying numbers of slices. Please see Islam – Mach Learn Med Imaging for additional information.
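
A hedged sketch of the exam-level classifier described above is shown below; it consumes a variable-length sequence of slice-level embeddings, prepends learnable class tokens for multi-label prediction, and uses no positional embedding. All dimensions and layer counts are illustrative, not the published configuration:

```python
import torch
import torch.nn as nn

class EViTSketch(nn.Module):
    """Transformer over slice-level embeddings with class tokens and
    no positional embedding (illustrative configuration)."""
    def __init__(self, emb_dim=512, num_labels=9, depth=4, heads=8):
        super().__init__()
        self.class_tokens = nn.Parameter(torch.zeros(1, num_labels, emb_dim))
        layer = nn.TransformerEncoderLayer(emb_dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.classifier = nn.Linear(emb_dim, 1)       # one logit per class token

    def forward(self, slice_embeddings):
        # slice_embeddings: (batch, num_slices, emb_dim); num_slices may vary.
        tokens = self.class_tokens.expand(slice_embeddings.size(0), -1, -1)
        x = torch.cat([tokens, slice_embeddings], dim=1)   # no position embedding
        x = self.encoder(x)
        class_out = x[:, : self.class_tokens.size(1)]      # exam-level class tokens
        return self.classifier(class_out).squeeze(-1)      # (batch, num_labels) logits
```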
 
M24-139L: Towards Hierarchical Embeddings with Localizability, Composability, and Decomposability Learned from Anatomy – Researchers at Arizona State University have developed a self-supervised learning strategy that explicitly incorporates part-whole hierarchies into its learning objectives and may be used in medical image analysis. By using three key branches (localizability, composability, and decomposability), this learning strategy overcomes the limitation of lacking explicit coding of part-whole relations. It allows the model to learn a semantically structured embedding space by discriminating between different anatomical structures, empowers the model to learn part-whole relations by constructing each anatomical structure through the integration of its constituent parts, and decomposes each anatomical structure into its constituent parts. The unification of these three branches enables the system to preserve harmony in embeddings of semantically similar anatomical structures. Please see Hosseinzadeh Taher – IEEE/CVF for additional information.
 
M24-140L: Anatomically Consistent Embeddings in Composition and Decomposition – Researchers at Arizona State University have developed an approach to overcome the limitation of self-supervised learning methods appreciating hierarchical structure attributes inherent to medical images. This approach seeks to capture hierarchical features consistent across varying scales, from subtle disease textures to structural anatomical patterns. It leverages some intrinsic properties of the medical image, and bridges the semantic gap across scales from high-level pathologies to low-level tissue anomalies, ensuring a seamless integration of fine-grained details with global anatomical structures. This shows significant promise for advancing explainable AI applications in medical image analysis.
 
M24-141L: Learning Anatomical Consistency, Sub-volume Spatial Relationships and Fine-grained Appearance for CT – Researchers at Arizona State University have developed a self-supervised learning approach that allows for the exploration of medical images within high-level context, spatial relationships in anatomy, and fine-grained features in anatomical structures. Called ASA, it is designed to learn anatomical consistency, sub-volume spatial relationships, and fine-grained appearance for 3D computed tomography images. This method acquires anatomical knowledge of intra-volume spatial relationships by sub-volume order prediction and volume-wise fine-grained features by volume appearance recovery. It also employs the student-teacher learning paradigm to automatically capture high-level features by optimizing the agreement between features from two spatially related crops. The framework encompasses two learning perspectives: acquiring high-level global features by optimizing the agreement between two spatially related views and acquiring sub-volume relationships and fine-grained features through sub-volume order prediction and volume appearance recovery. Please see Pang – MICCAI for additional information.
 
M24-179L: Large-Scale Benchmarking and Boosting Transfer Learning for Medical Image Analysis – Researchers at Arizona State University have developed a novel approach focusing on (i) benchmarking numerous conventional and modern convolutional neural network (ConvNet) and vision transformer architectures across various medical tasks; (ii) investigating the impact of fine-tuning data size on the performance of ConvNets compared with vision transformers in medical imaging; (iii) examining the impact of pretraining data granularity on transfer learning performance; (iv) evaluating the transferability of a wide range of self-supervised methods with diverse training objectives to a variety of medical tasks across different modalities; and (v) delving into the efficacy of domain-adaptive pretraining on both photographic and medical datasets to develop high-performance models for medical tasks. In this benchmarking, ConvNets demonstrated higher transferability than vision transformers and proved more annotation-efficient when fine-tuned for medical tasks.