Department Colloquium – Chamont Wang, Jana Gevertz, and Sudhir Nayak (Biology) Will Present Their Joint Work

The Future of Indirect Evidence: Finding Important Traits in Non-experimental Data

SPEAKERS:
Chamont Wang, Mathematics & Statistics Department
Jana Gevertz, Mathematics & Statistics Department
Sudhir Nayak, Biology Department

February 29, 3:15 – 4:15 pm
Science Complex P101
The College of New Jersey

Abstract

Over the past decade, statisticians and machine-learning researchers have developed thousands of new tools for reducing high-dimensional data in order to identify the most important factors that determine a particular trait. These tools have applications in many settings, including the analysis of data in business, education, forensics, and biology (microarray, proteomics, and brain imaging). In the present work, we focus on data collected from microarray experiments, where the t-test, its modifications, and other statistical models are often used to help identify genes related to a disease or an ailment.

Specifically, we investigated the limitations and potential misuses of current techniques. We found that models reporting 100% accuracy often select different sets of genes, and that certain widely used models yield 100% prediction accuracy even with totally irrelevant genes. An alternative methodology, TreeNet (a.k.a. stochastic gradient boosting), will be shown to be a superior model for gene selection. We will showcase this technology and discuss the implications of our findings for other areas of statistical application.

*Joint work with Chaur-Chin Chen (National Tsing-Hua University, Taiwan) and Leonardo Auslender (TD Bank, New Jersey, USA)
