Statistics Seminar – Canziani

Title: Unsupervised Deep Learning: Autoencoders and Generative Adversarial Nets
Presenter: Alfredo Canziani
Abstract: The brain has about 10^14 synapses and we only live for about 10^9 seconds. Even just considering binary synapses, a learning algorithm …

Statistics Seminar

Title: Statistical analysis and spectral methods for signal-plus-noise matrix models
Presenter: Joshua Cape
Abstract: Estimating eigenvectors and principal subspaces is of fundamental importance for numerous problems in statistics, data science, and network analysis, including covariance matrix …

Proper Extraction and Representation of Low Rank Modules in Gene Expression Data Studies

Abstract: Reliable biological interpretation of gene expression
data collected from cancer tissue samples is often challenged by
two aspects: (1) the multiple signals coming from diverse cell
components within each tissue; and (2) the heterogeneous patient
sub-groups. For (1), the decomposition of convoluted signals
requires careful extraction and representations of the low-rank
modules of high-dimensional data matrices. We developed a semi-supervised
approach that works in synergy with constrained non-negative
matrix decomposition to identify the diverse signal
intensities contributed by the cell components in the tissue. For (2),
considering the marked heterogeneity among samples, each
measured with a high-dimensional feature set, we developed a
biclustering-based subspace clustering (SSC) algorithm in which,
unlike traditional SSC algorithms, the samples are clustered
many times, and each time the clustering is done on a subset of
attributes weighted differently for each cluster. Our analysis
identified novel cell-type-specific functions and cell-cell interactions
in different cancer types. Experimental validation using CRISPR
knockout has demonstrated that key genes expressed by cancer and
other cells contribute to immune evasion and drug
resistance in colorectal cancer.