BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Department of Statistics - ECPv4.9.9//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Department of Statistics
X-ORIGINAL-URL:https://stat.wisc.edu
X-WR-CALDESC:Events for Department of Statistics
BEGIN:VTIMEZONE
TZID:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:20190310T080000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:20191103T070000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Chicago:20191211T160000
DTEND;TZID=America/Chicago:20191211T170000
DTSTAMP:20191213T090322
CREATED:20190807T144314Z
LAST-MODIFIED:20191203T142121Z
UID:1813-1576080000-1576083600@stat.wisc.edu
SUMMARY:Statistics Seminar
DESCRIPTION:Title: Statistical Inference for Large-Scale Data with Incomplete Labels\nPresenter: Hyebin Song\nAbstract: In many real-world problems\, the available data have partially observed or contaminated labels. One example is datasets from deep mutational scanning (DMS) experiments in proteomics\, which typically do not contain non-functional sequences. This talk addresses statistical inference procedures for analyzing noisy\, high-dimensional binary data. In the first part of the talk\, I will discuss variable selection for positive-unlabeled data when the number of features p is large. I will present the PUlasso algorithm for variable selection and classification with positive and unlabeled responses\, which scales to large data sets and comes with a minimax-optimal mean-squared-error guarantee. In the second part of the talk\, I will discuss statistical inference procedures for data with noisy labels. Building on the key observation that the noisy-label problem belongs to a special sub-class of generalized linear models\, I will present convex and non-convex approaches for inference with statistical guarantees. Finally\, I will present an application of our methodology to inferring sequence-function relationships and designing highly stabilized enzymes from large-scale DMS data.\n
URL:https://stat.wisc.edu/event/statistics-seminar-2/
LOCATION:140 Bardeen
CATEGORIES:Seminar
END:VEVENT
END:VCALENDAR