Workshops on Automated Science

Automated science uses AI to plan, execute and interpret experiments without human intervention. Automated science platforms combine three technologies: laboratory robots that perform physical experiments, a machine learning model that predicts results and an AI agent that plans future experiments. The fast-growing field of automated science includes self-driving labs, robot scientists, and AI systems that enhance human experimenters.

Paul Jensen (University of Michigan) and Mark Hendricks (Whitman College) teach a workshop that introduces automated science and shows how automation, modeling, and AI fit together into a closed-loop, autonomous system. The workshop also emphasizes the interfaces between technologies, e.g., how to modify machine learning models for use with a planning agent or the liquid handling challenges that arise from AI-planned experiments. Case studies in biology and chemistry highlight state-of-the-art automated science systems.

Module 1: Introduction to Automated Science. (4 hours) The first module introduces the "plan -> experiment -> model" strategy and the Bayesian optimization framework for identifying optimal experiments. Participants will develop a conceptual understanding of the explore vs. exploit tradeoff and the constraints on optimal learning. This module is designed to get automation engineers, scientists, and modelers on the same page and assumes no prior knowledge of AI.
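
As a rough illustration of the explore vs. exploit tradeoff, the sketch below scores candidate experiments with expected improvement, a common acquisition function in Bayesian optimization (it also appears in the short course outline). The candidate names and their predicted means and standard deviations are made up for illustration; this is not code from the workshop.

```python
import math

def expected_improvement(mean, std, best_so_far):
    """Expected improvement over the best observed value (maximization)."""
    if std <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mean - best_so_far) * cdf + std * pdf

# Hypothetical model predictions (mean, standard deviation) for three
# candidate experiments. These numbers are assumptions for illustration.
candidates = {
    "exploit": (1.05, 0.01),  # slightly better than the best so far, very certain
    "explore": (0.90, 0.50),  # worse on average, but very uncertain
    "poor": (0.50, 0.05),     # clearly worse and certain
}
best_so_far = 1.0

scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in candidates.items()
}
next_experiment = max(scores, key=scores.get)
```

With these made-up numbers the uncertain candidate scores highest: its large standard deviation leaves room for a big improvement even though its predicted mean is below the current best.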

Module 2: Hands-on Automated Science. (4 hours) Participants will apply the techniques from Module 1 to build an AI system that learns to aim a tabletop catapult. Teams will design a learning strategy, collect data, and train a simple machine learning model using a Python software library. Ideally, each team of 2-4 participants should include at least one member with basic Python programming skills.

Module 3: Experiment Design and Automated Quality Control. (4 hours) The third module explains how classical design of experiments (DOE) can enhance modern automated science systems. Topics include factor screening, pooling designs, hybrid designs, and response surface methodology (RSM). The module also introduces techniques from statistical quality control that can be deployed autonomously to monitor AI-driven labs.
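
To give a flavor of factor screening, here is a minimal sketch of a two-level full factorial design with main-effect estimates. The factor names and the response values are hypothetical stand-ins for real measurements, and the workshop may use different designs or software.

```python
from itertools import product

def full_factorial(factor_names):
    """Return every combination of coded levels (-1, +1) for the named factors."""
    return [
        dict(zip(factor_names, levels))
        for levels in product((-1, 1), repeat=len(factor_names))
    ]

def main_effects(runs, responses):
    """Main effect = mean response at the high level minus mean at the low level."""
    effects = {}
    for name in runs[0]:
        high = [y for run, y in zip(runs, responses) if run[name] == 1]
        low = [y for run, y in zip(runs, responses) if run[name] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects

# Hypothetical factors; a real screen would use measured responses.
factor_names = ["temperature", "pH", "stir_rate"]
design = full_factorial(factor_names)  # 2^3 = 8 runs

# Made-up response: temperature matters a lot, pH a little, stir_rate not at all.
responses = [10 + 3 * run["temperature"] + 0.5 * run["pH"] for run in design]

effects = main_effects(design, responses)
```

Factors whose estimated effects are near zero (here, stir_rate) are candidates to drop before spending runs on sequential optimization.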

Module 4: Modeling for Automated Science. (4 hours) The final module dives deeper into the machine learning methods used to implement automated science systems. Topics include Gaussian Process Regression, Bayesian machine learning, optimization-based policies, batching/parallel experiments, hybrid experiments, and interpretation. This module is designed for participants with a computational background and assumes a basic understanding of machine learning concepts.
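
For readers new to Gaussian Process Regression, the following is a minimal sketch of the posterior mean and standard deviation for a zero-mean GP with a squared-exponential kernel, written with only the Python standard library. The kernel choice, hyperparameters, and training data are assumptions for illustration; a real system would normally rely on a maintained library and fit the hyperparameters to data.

```python
import math

def rbf(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel for scalar inputs."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def solve_lower(low, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y

def solve_upper_t(low, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def gp_predict(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k = [
        [rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(x_train)]
        for i, a in enumerate(x_train)
    ]
    low = cholesky(k)
    alpha = solve_upper_t(low, solve_lower(low, y_train))  # (K + noise*I)^{-1} y
    k_star = [rbf(a, x_new) for a in x_train]
    mean = sum(ks * al for ks, al in zip(k_star, alpha))
    v = solve_lower(low, k_star)
    var = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))

# Made-up training data: four noiseless samples of sin(x).
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [math.sin(x) for x in x_train]

mean_near, std_near = gp_predict(x_train, y_train, 1.0)   # at a training point
mean_far, std_far = gp_predict(x_train, y_train, 10.0)    # far from all data
```

The predicted standard deviation is small near the training points and returns to the prior value far from them, which is the uncertainty signal that planning policies use to decide where to explore.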

Audience and Background: Attendees will learn from instructors who have assembled automated science pipelines in biology and chemistry. The course requires only a basic understanding of statistics and laboratory automation (except Module 4, which is designed for participants with a computational background). Practitioners with expertise in one technology (e.g., machine learning or lab automation) are encouraged to attend and learn how to combine their knowledge with other parts of an automated science platform.

Format: Workshops last 4-16 hours (depending on the number of modules), split over 1-2 days. Workshops can be held at a single company/institution, or multiple groups can work together to host an open workshop. Workshops can also be taught at conferences or sponsored by government/industry partnerships. Please contact Paul Jensen (pjens@umich.edu) if you are interested in hosting a workshop.

Materials from Previous Short Courses

The following resources are from the Introduction to Automated Science short course presented at SLAS 2023 in San Diego, CA, USA. The same topics are covered by Module 1 in our Automated Science workshops. Permission is granted to use any of these materials for educational purposes (with attribution). Questions, comments, suggestions, and corrections are appreciated.

  1. Introduction [slides]
    • Lab automation vs. automated science
    • The automated science cycle
    • Definitions
    • The case studies presented in the short course are:
      • Martin KN, Rubsamen MS, Kaplan NP, Hendricks MP.
        Method for Interfacing a Plate Reader Spectrometer Directly with an OT-2 Liquid Handling Robot.
        ChemRxiv. 2022. [link]
      • Dama AC, Kim KS, Leyva DM, Lunkes AP, Schmid NS, Jijakli K, Jensen PA.
        BacterAI maps microbial metabolism without prior knowledge.
        Nature Microbiology. 2023; 8:1018-1025. [link]
  2. Planning -- Exploiting [slides]
    • Initial designs
    • Searching for next runs
    • Sequential optimization
  3. Models [slides]
    • Mechanistic vs. black-box
    • Fixed parameter vs. Bayesian
    • Types of models
    • Bagging
    • Calibration vs. accuracy
  4. Planning -- Exploring [slides]
    • Sequential model improvement
    • Explore vs. exploit tradeoffs
    • Expected improvement
    • Optimality of sequential designs
  5. Putting it all together [slides]
    • Batched experiments
    • Automation challenges
    • The automated science checklist

Suggested Reading

Bayesian optimization (modeling & policies)

Bayesian modeling

Other Resources

Paul taught a semester-long course on experiment design at the University of Illinois. The course website has lecture slides, links to textbooks, and assignments. Part 1 of the course discusses classical Design of Experiments, which can be used for factor screening and initial designs. Part 2 covers sequential optimization with Bayesian surrogates. Part 3 introduces reinforcement learning for applications where experiments are inexpensive and fast.