Seminars

The group hosts a monthly seminar series (usually at noon on the first Tuesday of the month), inviting both local and external speakers to present their work on deep learning-related projects. Most meetings are hybrid: Arnold Building M3-A805 and Zoom.

Upcoming Schedule

  • Tuesday, May 6, 2025, 12:10-1pm
    Hybrid: Arnold Building M3-A805 and Zoom
    Speaker: William Hsu, UCLA
    Title: To come
  • Tuesday, June 3, 2025, 12:10-1pm
    Hybrid: Arnold Building M3-A805 and Zoom
    Speaker: Jean Feng, UCSF
    Title: To come
  • Tuesday, June 17, 2025, 12:10-1pm
    Hybrid: Arnold Building M3-A805 and Zoom
    Speaker: Yoga Balagurunathan, Moffitt Cancer Center
    Title: To come
  • Tuesday, September 2, 2025, 12:10-1pm
    Hybrid: Arnold Building M3-A805 and Zoom
    Speaker: Matheus Viana, Allen Institute
    Title: To come
  • Tuesday, October 7, 2025, 12:10-1pm
    Hybrid: Arnold Building M3-A805 and Zoom
    Speaker: Guanghua Xiao, UT Southwestern
    Title: To come
  • Tuesday, November 4, 2025, 12:10-1pm
    Hybrid: Arnold Building M3-A805 and Zoom
    Speaker: To come
    Title: To come
  • Tuesday, December 2, 2025, 12:10-1pm
    Hybrid: Arnold Building M3-A805 and Zoom
    Speaker: Anton Arkhipov, Allen Institute
    Title: To come

Past Speakers

  • April 1, 2025
    Speaker: Kyle Lafata, Duke
    Title: Computational Tumor Phenotyping and Multiscale Mathematical Modeling to Study Radiation Resistance and Immune Dysregulation
    Abstract: Cancer heterogeneity spans multiple length-scales of biological organization, including tissue, cellular, and molecular levels. Characterization of these domains is essential to overcoming therapeutic resistance and guiding personalized treatment strategies. This talk will focus on computational tumor phenotyping strategies and multiscale mathematical modeling of treatment resistance and immune dysregulation. I will demonstrate how imaging, digital pathology, and spatial transcriptomics enable a multiscale representation of tumor appearance and behavior. By integrating physics-informed, mathematical tumor models (theory) with image-based, data-driven solutions (observables), I will demonstrate that these techniques can capture both clinically relevant and biologically sound phenomena. Illustrative examples will include radiation-induced changes in tumor dynamics, single-cell evaluation of the tumor immune microenvironment and immune response, molecular insight into tumor heterogeneity, and biologically guided adaptive treatment strategies.
  • February 4, 2025
    Speaker: Hoifung Poon, Microsoft Research
    Title: Multimodal Generative AI for Precision Health
    Abstract: The dream of precision health is to develop a data-driven, continuous learning system where new health information is instantly incorporated to optimize care delivery and accelerate biomedical discovery. The confluence of technological advances and social policies has led to rapid digitization of multimodal, longitudinal patient journeys, such as electronic medical records (EMRs), imaging, and multiomics. Our overarching research agenda lies in advancing multimodal generative AI for precision health, where we harness real-world data to pretrain powerful multimodal patient embeddings, which can serve as digital twins for patients. This enables us to synthesize multimodal, longitudinal information for millions of cancer patients, and to apply the population-scale real-world evidence to advance precision oncology in deep partnerships with real-world stakeholders such as large health systems and life sciences companies.

  • January 7, 2025
    Speaker: Gary Zhao, FHCC Translational Science and Therapeutics
    Title: Detection of Mutant Blood Cells by Trans-species Morphological Learning
    Abstract: Detection of sparse mutant cells in blood samples has broad implications in precision medicine and cancer prevention. The intrinsic technical limitations of DNA sequencing-based mutation detection methods have been limiting the sensitivity, cost effectiveness, and turnaround time of such tasks. We broke away from the “ball and chain” of DNA sequencing and developed a trans-species single-cell morphology learning system that allows detection of mutant blood cells with improved sensitivity, lower cost, and faster turnaround, making population-wide screening and dense-time scale analysis possible. 
    Gary Zhao is a translational medical research scientist, with specialties in machine learning and data-driven experimental approaches. He is an investigator (Early Diagnosis and Early Intervention in Hematology) with the Hadland Group at FHCC and Acting Helmholtz HiDA Machine Learning Scientist, DKF.
  • November 5, 2024
    Speaker: Adam Visokay, University of Washington
    Title: Inference on Predicted Data: Examples from Verbal Autopsies and the BMI
    Abstract: As AI and ML tools become more accessible, and scientists face new obstacles to data collection (e.g., rising costs, declining survey response rates), researchers increasingly use relatively cheap predictions from pre-trained algorithms in place of more expensive "ground truth" data. Standard tools for inference can misrepresent the association between independent variables and the outcome of interest when the true, unobserved outcome is replaced by a predicted value. In this talk, I present an overview of how to perform valid inference when working with predicted data. I will share two examples of this method in practice: one in the context of global public health, working with Verbal Autopsy data, and the other in the context of medicine, working with BMI data. (A toy simulation of the bias from naively regressing on predicted outcomes appears at the end of this page.)
    Bonus talk! 
    Speaker: Ben McGough, FHCC Scientific Computing
    Title: Hutch Scientific Computing HPC Cluster Roadmap 
  • October 1, 2024
    Speaker: Pang Wei Koh, University of Washington
    Title: Reliable data use: Synthesis, retrieval, and interaction
    Abstract: How can we better use our data to build more reliable and responsible AI models? I will first discuss when it might be useful to train on synthetic image data derived, in turn, from a generative model trained on the available real data. Next, I will describe how scaling up the datastore for retrieval-based language models can significantly improve performance, indicating that the amount of data used at inference time—and not just at training time—should be considered as a new dimension of scaling language models. Finally, I will discuss how the static nature of most of our training data leads to language model failures in interactive settings.
  • June 18, 2024
    Speaker: James Zou, Stanford University
    Title: How Generative AI Can Transform Biomedical Research (recording not available)
    Abstract: This talk explores how we can develop and use generative AI to help researchers and clinicians. I will first discuss how generative AI can act as research co-advisors. Then I will present how we developed visual-language AI to help clinicians aggregate and interpret noisy data. Finally, I will explore the role of language as the foundational data modality for biomedicine.
  • May 7, 2024
    Speaker: Zheng Wei, FHCC Comp Bio
    Title: Deciphering Gene Regulatory Logic Through Deep Learning Interpretation
    Abstract: Discovering DNA/RNA regulatory sequence motifs and their relative positions is fundamental to understanding the mechanisms of gene expression regulation across development, tumors, and various diseases. Although deep convolutional neural networks (CNNs) have achieved great success in predicting different cis-regulatory elements, the discovery of motifs and their combinatorial patterns from these CNN models has remained difficult. We show that the main difficulty is due to the problem of multifaceted neurons that respond to multiple types of sequence patterns. To overcome this problem, we propose the NeuronMotif algorithm to interpret such neurons. NeuronMotif can output the sequence motifs, and the syntax rules governing their combinations are depicted by position weight matrices organized in tree structures, which are supported by multiple sources including existing knowledge databases, ATAC-seq data, and the literature. Currently, we are developing CNN-transformer hybrid models and corresponding interpretation algorithms to decode the regulatory logic involving long-distance (> 1 Mb) interactions in a networked manner. Our long-term goal is to use the regulatory logic to understand how non-coding mutations drive cancer risk and to develop accurate risk prediction scores for precision oncology. (A toy illustration of motif-scanning CNN filters appears at the end of this page.)
  • April 16, 2024
    Speaker: Dominik Otto, FHCC Setty Lab
    Title: Quantifying Cell-State Densities using Mellon and Inferring Differentiation Dynamics
    Abstract: Cell-state density characterizes the distribution of cells along phenotypic landscapes and is crucial for unraveling the mechanisms that drive cellular differentiation, regeneration, and disease. We present Mellon, a novel computational algorithm for estimation of cell-state densities from high-dimensional representations of single-cell data. We demonstrate Mellon's efficacy by dissecting the density landscape of various differentiating systems, revealing a consistent pattern of high-density regions corresponding to major cell types intertwined with low-density, rare transitory states. Mellon offers the flexibility to perform temporal interpolation of time-series data, providing a detailed view of cell-state dynamics during the inherently continuous developmental processes. I will elaborate on how to use these cell-state density estimates for comprehensive training of cell-differentiation models based on deep neural networks and Gaussian processes. I will showcase an existing publication [1] and further advancements we are working on with Mellon.
    [1] Sha, Yutong, Yuchi Qiu, Peijie Zhou, and Qing Nie. “Reconstructing Growth and Dynamic Trajectories from Single-Cell Transcriptomics Data.” Nature Machine Intelligence 6, no. 1 (January 2024): 25–39. https://doi.org/10.1038/s42256-023-00763-w.
  • March 5, 2024
    Speaker: Rachel Thomas, fast.ai
    Title: Medical AI Needs You
    Abstract: It is not only possible, but also crucial, for you to get involved with medical AI. Advances in AI are having a transformative impact on many fields, including medicine. However, AI also creates and amplifies a number of ethical risks. Having researchers and practitioners from a variety of backgrounds helps to mitigate these risks and to take advantage of opportunities. Through my work co-founding fast.ai, I helped to create the longest-running deep learning course in the world and reached a diverse group of students. Fast.ai alumni from a range of unconventional backgrounds have been able to make a positive impact. This talk will cover the particular risks and challenges impacting medical AI, as well as the benefits when people from unconventional backgrounds get involved.
    Speaker's erratum: Lauren Oakden-Rayner is not a fast.ai alum. Two radiologists who are fast.ai alums are Alexandre Cadrin-Chênevert and Judy Gichoya.
  • February 6, 2024 
    Speaker: Harsha Nori and Rich Caruana, Microsoft Research
    Title: Large Language Models (LLMs), Healthcare, and Interpretable Machine Learning
    Abstract: Recent language models like GPT-4 and MedPaLM have shown remarkable capabilities in various aspects of medicine, including clinical reasoning, diagnostics, and even augmenting clinician-patient interactions. These models have achieved expert-level performance on competency exams like the USMLE and a battery of specialty board exams. We'll begin with a discussion on how we've been assessing these models at Microsoft, including a dive into how to elicit maximal performance from these models and a reflection on translating benchmark performance to the real world. We'll then share work we're doing to bring the power of LLMs into more traditional machine learning for healthcare tasks, like predicting 30-day readmission on structured, tabular datasets. We'll show how LLMs and interpretable machine learning models commonly used in healthcare can work surprisingly well together, especially on tasks that LLMs alone are not naturally suited for. Finally, we'll discuss promising trends on the horizon for the future of ML for healthcare.
  • January 9, 2024
    Speaker: Daniel Jones, Newell Lab, Vaccine and Infectious Disease Division
    Title: Cell Simulation as Cell Segmentation (recording not available)
    Abstract: Single-cell spatial transcriptomics promises a highly detailed view of a cell's transcriptional state and microenvironment, yet inaccurate cell segmentation can render this data murky by misattributing large numbers of transcripts to nearby cells or conjuring nonexistent cells. We adopt methods from ab initio cell simulation to rapidly infer morphologically plausible cell boundaries that preserve cell type heterogeneity.
  • November 7, 2023
    Speaker: Michael Haffner, Human Biology Division and Clinical Research Division, Fred Hutchinson Cancer Center
    Title: Machine Learning-Based Morphologic Characterization of Genitourinary Malignancies
    Abstract: In recent years, the field of medicine has witnessed a remarkable transformation with the advent of cutting-edge computer-based image and pattern analyses. These approaches have proven exceptionally adept at efficiently detecting and classifying objects, including the identification of cancer from large and complex medical images. This is particularly invaluable, as such tasks can be exceedingly time-consuming for healthcare professionals. Moreover, the remarkable power of these methods lies in their capacity to delineate patterns that were previously not recognized by physicians and researchers. In this seminar, I will share our efforts in harnessing machine learning-based approaches to define morphologic features of advanced metastatic prostate cancer and bladder cancer from standard pathology images, with a focus on developing biomarkers that can guide clinical decision making. 
  • October 3, 2023
    Speaker: Lucas Liu, Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Research Center
    Title: Deep Learning for Data Representation in Temporal Electronic Health Records
    Abstract: Analyzing and mining Electronic Health Records (EHR) data is essential for improving patient care and reducing healthcare costs. Sequential deep learning (DL) methods have become increasingly popular for analyzing temporal EHR data in various healthcare applications. Despite their promising performance, DL methods face significant challenges in being adopted in real-world healthcare settings. These challenges include handling irregular temporal scales and asynchronous multi-variable EHR inputs, ensuring model interpretability, and addressing fairness and performance disparities. In this talk, we will introduce DL algorithms for learning data representation in temporal EHR data and propose solutions to overcome these challenges.
  • September 5, 2023
    Speaker: John Kang, Department of Radiation Oncology, University of Washington
    Title: Unlocking Information Extraction in Oncology using NLP
    Abstract: Being able to detect and predict patient outcomes is an aspirational goal for AI in oncology. In this talk, we discuss how deeper and improved representation methods can unlock better methods for detection and prediction. We begin with current applications of AI in clinical trials using structured data, and then discuss how improved representation using NLP can improve performance and decrease labeling burden.
  • June 15, 2023
    Speaker: Young Hwan Chang, Computational Biology Program, OHSU
    Title: Representation Learning and its Application on Multiplex Tissue Imaging Data
    Abstract: Multiplexed Tissue Imaging (MTI) techniques have revolutionized tissue sample analysis by enabling simultaneous measurement of numerous biomarkers. However, challenges such as technical artifacts, tissue loss, long acquisition times, and limitations in current MTI analyses hinder its full potential. In this talk, we will address these challenges and propose solutions to enhance the capabilities and accessibility of MTI, with a focus on representation learning. Our emphasis lies in the need for comprehensive representations of multiplexed single-cell images, encompassing morphology, cell shape, and texture beyond mean intensity features. We will also explore techniques like image-to-image translation and image-to-omics integration to obtain transferable multimodal representations, facilitating a holistic interpretation of cellular data. Through the utilization of representation learning on MTI, we uncover diagnostically significant features in standard histopathology images, advancing our understanding of tumor biology and improving cancer diagnosis and treatment. Furthermore, we will discuss strategies to overcome cost and time challenges associated with MTI, making it more accessible in cancer research and clinical settings. These advancements propel the field forward, unlocking the potential of MTI in cancer diagnosis and treatment, driving scientific discoveries, and ultimately improving patient outcomes.
  • May 2, 2023
    Speaker: William Stafford Noble, Department of Genome Science, University of Washington
    Title: Deep Learning for Mass Spectrometry Proteomics
    Abstract: In this talk, I will describe several recent and ongoing projects that apply deep neural networks to the analysis of protein tandem mass spectrometry data. We first show how a Siamese architecture can be trained in a supervised fashion to embed individual mass spectra into a 32-dimensional space, yielding a compact representation that enables large-scale, highly accurate clustering of the spectra and significantly enhances our ability to assign observed spectra to their corresponding peptide sequences. The second project uses a model with a transformer architecture to perform de novo peptide sequencing, by translating directly from a mass spectrum (a sequence of peaks) to a peptide (a sequence of amino acids). The resulting model, trained on 30 million spectra, outperforms existing methods and enhances our ability to interpret various types of mass spectrometry data. (A minimal sketch of the spectrum-embedding idea appears below.)
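
The Siamese-embedding idea from the May 2, 2023 talk can be illustrated with a short, self-contained sketch. This is not the speakers' code: the spectrum binning, layer sizes, and triplet-loss training signal below are assumptions made purely for illustration.

```python
# Minimal sketch (assumed architecture, not the authors' implementation): an encoder
# that maps a binned mass spectrum to a 32-dimensional embedding, trained with a
# triplet loss so spectra of the same peptide end up close together.
import torch
import torch.nn as nn

class SpectrumEncoder(nn.Module):
    def __init__(self, n_bins: int = 2000, dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_bins, 512), nn.ReLU(),
            nn.Linear(512, 128), nn.ReLU(),
            nn.Linear(128, dim),
        )

    def forward(self, spectrum: torch.Tensor) -> torch.Tensor:
        # L2-normalize so embedding distances are comparable across spectra
        return nn.functional.normalize(self.net(spectrum), dim=-1)

encoder = SpectrumEncoder()
loss_fn = nn.TripletMarginLoss(margin=0.2)

# Fake batch: anchor/positive are spectra of the same peptide, negative is not.
anchor, positive, negative = (torch.rand(8, 2000) for _ in range(3))
loss = loss_fn(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
print(float(loss))
```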
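The central caution in the November 5, 2024 talk on inference with predicted data can be seen in a toy simulation: regressing on a machine-predicted outcome instead of the true outcome can distort the estimated association. The data-generating process and the form of the prediction error below are invented solely for illustration.

```python
# Toy simulation (not from the talk): naive OLS on predicted outcomes vs. ground truth.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)                               # covariate of interest
y = 2.0 * x + rng.normal(scale=2.0, size=n)          # true ("ground truth") outcome

# Pretend an imperfect pre-trained model supplies predictions in place of y:
y_hat = 0.6 * y + rng.normal(scale=1.0, size=n)      # systematically shrunken predictions

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)

print("slope using true y:      ", round(ols_slope(x, y), 3))      # ~2.0
print("slope using predicted y: ", round(ols_slope(x, y_hat), 3))  # ~1.2, biased
```

The talk itself is about how to recover valid inference in settings like this, where only the predicted outcome is available at scale; the snippet only demonstrates why a correction is needed.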
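The May 7, 2024 abstract discusses interpreting convolutional filters as sequence motifs. The snippet below illustrates the general idea (scanning one-hot DNA with a Conv1d layer and averaging top-activating windows into a crude position frequency matrix); it is not NeuronMotif, and all layer sizes and sequences are made up.

```python
# Illustrative sketch (assumptions throughout, not NeuronMotif): a one-layer CNN
# scanning one-hot encoded DNA, plus the common trick of reading a first-layer
# filter back out as a motif-like matrix from its top-activating subsequences.
import torch
import torch.nn as nn

ALPHABET = "ACGT"

def one_hot(seq: str) -> torch.Tensor:
    """Encode a DNA string as a (4, length) one-hot tensor."""
    idx = torch.tensor([ALPHABET.index(b) for b in seq])
    return nn.functional.one_hot(idx, num_classes=4).T.float()

conv = nn.Conv1d(in_channels=4, out_channels=32, kernel_size=8)  # 32 motif scanners, width 8

seqs = ["ACGTACGTACGTACGT", "TTTTACGTACGTAAAA"]
x = torch.stack([one_hot(s) for s in seqs])        # (batch, 4, length)
activations = torch.relu(conv(x))                  # (batch, 32, positions)

# For one filter, take the 8-bp window with the strongest activation in each
# sequence and average the one-hot encodings into a crude position frequency matrix.
f = 0
best_pos = activations[:, f, :].argmax(dim=1)
windows = torch.stack([x[i, :, int(p):int(p) + 8] for i, p in enumerate(best_pos)])
pfm = windows.mean(dim=0)                          # (4, 8) base frequencies per position
print(pfm)
```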