Investigations in Bioscience and Biotechnology: Big Data in Biology

The data science craze is sweeping the world. In the past decade, the amount of available biological data has exploded, driven by improved technologies and by consumer genetics companies like 23andMe. How can effective use of this data lead to a better understanding of human disease, as well as more effective diagnoses and treatments? In this course, we will learn how computers learn from different kinds of biological data (genomics, imaging, single-cell measurements) to answer critical medical and research questions: How do we automatically identify a tumor in an X-ray image? How can we find predictors of disease in routine blood samples across many patients? Can we find robust patterns in a tumor sample that tell us more about how cancer works?

Students will practice applying core machine learning concepts, including bias, bootstrapping, nonlinearity, and model interpretation, to different biological datasets. We will also learn basic R programming and practice quantitative, hands-on problem solving. Students will work in teams to complete a final project of their own design, culminating in an informal poster presentation to the class.

Session Two
Accepting Waitlist Applications

Prerequisite: Completion of one year of high school biology.