Workshops on Automated Science
Automated science uses AI to plan, execute, and interpret experiments without human intervention. Automated science platforms combine three technologies: laboratory robots that perform physical experiments, a machine learning model that predicts their results, and an AI agent that plans future experiments. The fast-growing field includes self-driving labs, robot scientists, and AI systems that enhance human experimenters.
Paul Jensen (University of Michigan) and Mark Hendricks (Whitman College) teach a workshop that introduces automated science and shows how automation, modeling, and AI fit together into a closed-loop, autonomous system. The workshop also emphasizes the interfaces between technologies, e.g., how to modify machine learning models for use with a planning agent or the liquid handling challenges that arise from AI-planned experiments. Case studies in biology and chemistry highlight state-of-the-art automated science systems.
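The closed loop described above can be sketched in a few lines of code. This is a minimal illustration, not part of the workshop materials: the function names (`plan_experiment`, `run_experiment`) and the distance-based uncertainty heuristic are invented for this sketch, and a real platform would replace them with a planning agent, a lab robot, and a trained model.

```python
# Minimal sketch of the "plan, execute, interpret" closed loop.
# All names and the uncertainty heuristic are illustrative placeholders.

def uncertainty(data, x):
    """Distance to the nearest observed point as a crude uncertainty proxy."""
    return min((abs(x - xi) for xi, _ in data), default=float("inf"))

def plan_experiment(data, candidates):
    """Planning agent: choose the least-explored candidate."""
    return max(candidates, key=lambda x: uncertainty(data, x))

def run_experiment(x):
    """Stand-in for the lab robot: a hidden response measured in silico."""
    return (x - 3) ** 2

data = []                       # the interpreted record of all experiments
candidates = list(range(0, 7))  # the experiment space
for _ in range(4):              # four turns of the closed loop
    x = plan_experiment(data, candidates)   # plan
    y = run_experiment(x)                   # execute
    data.append((x, y))                     # interpret
```

The loop spreads its first experiments across the space before filling in the gaps, the same explore-then-refine behavior the workshop builds up with real models and robots.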
Module 1: Introduction to Automated Science. (4 hours) The first module introduces the "plan, execute, interpret" cycle of automated science.
Module 2: Hands-on Automated Science. (4 hours) Participants will apply the techniques from Module 1 to build an AI system that learns to aim a tabletop catapult. Teams will design a learning strategy, collect data, and train a simple machine learning model using a Python software library. Ideally, each team of 2-4 participants would have one member with basic Python programming skills.
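As a rough sketch of the Module 2 exercise, a team might fit a simple model mapping a catapult setting to landing distance and then invert it to aim at a target. The calibration numbers below are invented for illustration, and a real catapult's response is nonlinear; a line is assumed only over a narrow operating range.

```python
# Hedged sketch of the catapult exercise: fit a line to setting-vs-distance
# data, then invert it to choose a setting. Data values are invented.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

# invented calibration data: pull-back setting (degrees) -> distance (cm)
settings = [20, 30, 40, 50]
distances = [38, 61, 79, 102]

a, b = fit_line(settings, distances)
target = 90.0
aim = (target - b) / a   # invert the fitted model to aim at the target
```

In the workshop the team would close the loop: fire at the predicted setting, measure the miss, add the new data point, and refit.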
Module 3: Experiment Design and Automated Quality Control. (4 hours) The third module explains how classical design of experiments (DOE) can enhance modern automated science systems. Topics include factor screening, pooling designs, hybrid designs, and response surface methodology (RSM). The module also introduces techniques from statistical quality control that can be deployed autonomously to monitor AI-driven labs.
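A small example of the classical DOE ideas in Module 3 is a two-level full factorial design, where every combination of low (-1) and high (+1) factor levels becomes a run. The factor names here are hypothetical, not taken from the workshop materials.

```python
# Two-level full factorial design for three hypothetical factors,
# using the standard -1/+1 coding from classical DOE.
from itertools import product

factors = ["temperature", "pH", "substrate"]
design = [dict(zip(factors, levels))
          for levels in product([-1, +1], repeat=len(factors))]

# 2^3 = 8 runs; each run assigns every factor a low or high level.
# In a screening study, each run would be executed by the lab robot.
```

Fractional factorial and pooling designs shrink this table when the number of factors makes 2^k runs impractical.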
Module 4: Modeling for Automated Science. (4 hours) The final module dives deeper into the machine learning methods used to implement automated science systems. Topics include Gaussian Process Regression, Bayesian machine learning, optimization-based policies, batching/parallel experiments, hybrid experiments, and interpretation. This module is designed for participants with a computational background and assumes a basic understanding of machine learning concepts.
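Expected improvement (EI), one of the optimization-based policies covered in Module 4, can be computed from a model's predictive mean and standard deviation at a candidate point. The sketch below uses the standard closed-form EI for maximization; the example values are invented.

```python
# Expected improvement: how much a candidate is expected to beat the best
# observation so far, given the model's predictive mean and std. deviation.
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def expected_improvement(mu, sigma, best):
    """EI = (mu - best) * Phi(z) + sigma * phi(z), with z = (mu - best) / sigma."""
    if sigma <= 0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)

# A confident but mediocre candidate scores below an uncertain, promising one:
ei_safe = expected_improvement(mu=1.0, sigma=0.1, best=1.0)
ei_risky = expected_improvement(mu=1.0, sigma=1.0, best=1.0)
```

This is the explore/exploit tradeoff in one formula: sigma rewards exploration while (mu - best) rewards exploitation.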
Audience and Background: Attendees will learn from instructors who have assembled automated science pipelines in biology and chemistry. The course requires only a basic understanding of statistics and laboratory automation (except Module 4, which is designed for participants with a computational background). Practitioners with expertise in one technology (e.g., machine learning or lab automation) are encouraged to attend and learn how to combine their knowledge with other parts of an automated science platform.
Format: Workshops last 4-16 hours (depending on the number of modules), split over 1-2 days. Workshops can be held at a single company/institution, or multiple groups can work together to host an open workshop. Workshops can also be taught at conferences or sponsored by government/industry partnerships. Please contact Paul Jensen (pjens@umich.edu) if you are interested in hosting a workshop.
Materials from Previous Short Courses
The following resources are from the Introduction to Automated Science short course presented at SLAS 2023 in San Diego, CA, USA. The same topics are covered by Module 1 in our Automated Science workshops. Permission is granted to use any of these materials for educational purposes (with attribution). Questions, comments, suggestions, and corrections are appreciated.
- Introduction [slides]
- Lab automation vs. automated science
- The automated science cycle
- Definitions
- The case studies presented in the short course are:
- Martin KN, Rubsamen MS, Kaplan NP, Hendricks MP. Method for Interfacing a Plate Reader Spectrometer Directly with an OT-2 Liquid Handling Robot. chemRxiv, 2022. [link]
- Dama AC, Kim KS, Leyva DM, Lunkes AP, Schmid NS, Jijakli K, Jensen PA. BacterAI maps microbial metabolism without prior knowledge. Nature Microbiology. 2023; 8:1018-1025. [link]
- Planning -- Exploiting [slides]
- Initial designs
- Searching for next runs
- Sequential optimization
- Models [slides]
- Mechanistic vs. black-box
- Fixed parameter vs. Bayesian
- Types of models
- Bagging
- Calibration vs. accuracy
- Planning -- Exploring [slides]
- Sequential model improvement
- Explore vs. exploit tradeoffs
- Expected improvement
- Optimality of sequential designs
- Putting it all together [slides]
- Batched experiments
- Automation challenges
- The automated science checklist
Suggested Reading
Bayesian optimization (modeling & policies)
- Surrogates by Robert Gramacy. A statistically oriented tutorial on surrogate modeling, including space-filling designs and optimization. The practical, conversational style includes R code for every figure. Gramacy connects Bayesian optimization with DOE/RSM techniques to show applications in process improvement. [Website]
- Bayesian Optimization by Roman Garnett. A comprehensive treatment of Gaussian Processes and Bayesian policies. More theoretical than other books, great for those wanting a mathematically-rigorous reference. [Website]
- Bayesian Optimization in Action by Quan Nguyen. A guided tutorial of Gaussian Process modeling and Bayesian optimization policies. A good starting point for implementing these approaches in Python (GPyTorch/BoTorch). [Publisher]
- The Design and Analysis of Computer Experiments by Santner, Williams, & Notz. This is a classic reference on Bayesian optimization, primarily for surrogate modeling of computer simulations. Readers interested in wet-lab experiments may need to translate some of the techniques. [Publisher]
- Experiments by Wu & Hamada. Chapter 14 is a brief introduction to Bayesian optimization for those coming from a DOE/RSM background. [Publisher]
Bayesian modeling
- Hands-on Bayesian Neural Networks -- A Tutorial for Deep Learning Users by Jospin, et al. A comprehensive review of Bayesian neural networks, including an appendix with Python code. [arXiv]
- Variational Methods for Machine Learning with Applications to Deep Networks by Cinelli, et al. Shows how Variational Inference (VI) can be used to convert traditional deep neural networks into Bayesian models. [Publisher]
- Probabilistic Machine Learning: Advanced Topics by Murphy. A series of reference books on machine learning with a probabilistic view. Topics in Book 2 include Bayesian neural networks (chapter 17), Gaussian processes (chapter 18), and policies (chapter 34). Python code included for each example. [GitHub]
Other Resources
Paul taught a semester-long course on experiment design at the University of Illinois. The course website has lecture slides, links to textbooks, and assignments. Part 1 of the course discusses classical Design of Experiments, which can be used for factor screening and initial designs. Part 2 covers sequential optimization with Bayesian surrogates. Part 3 introduces reinforcement learning for applications where experiments are inexpensive and fast.