Robust Learning from Big and Messy Data

The Department of Ecology and Evolution has coordinated a seminar with guest speaker Daniel Pimentel-Alarcón, Ph.D., Assistant Professor, Department of Computer Science, Georgia State University.

The seminar will be held Monday, May 20, 2019, at 10:30 a.m. in the Gordon Center for Integrative Science (GCIS) Room W301, 929 East 57th Street, Chicago, IL.

The seminar title is “Robust Learning from Big and Messy Data.”

Abstract: Big data is only getting bigger. For example, the upcoming Square Kilometre Array alone will generate each day twice the amount of data sent around the Internet in a day, and 100 times more than the CERN Large Hadron Collider, which already generates so much data that scientists must discard the overwhelming majority of it, hoping they didn’t throw away anything useful. Big data is also getting messier: incomplete, sparse, noisy, biased, and with outliers. Exploitation of these big and messy data increasingly depends on our ability to identify patterns that summarize these datasets.

In this talk I will present our recent theoretical findings on learning linear and non-linear patterns from big and messy data. I will also discuss the main ideas behind our practical algorithms, which are guaranteed to succeed even in cases where traditional methods are guaranteed to fail. Finally, I will discuss applications of our findings in areas as diverse as astronomy, computer vision, metagenomics, and more.