Fall 2007

October 3, 2007
Title: Talking Stats with Fireman: Highlights from StatCom’s first year
Speakers: David Lockhart and Julian Wolfson
Abstract: Statistics in the Community (StatCom) is a student-run initiative at the University of Washington that provides free statistical consulting to non-profit community and governmental organizations. Started at the UW in 2005 based on a concept developed at Purdue University, StatCom draws on the expertise of students in Statistics, Biostatistics, Genome Sciences, and beyond. In this presentation, we will talk about how StatCom works and describe some of the projects StatCom is currently involved in. Students interested in learning more about StatCom (we’re always looking for more members!), either before or after the seminar, can visit http://www.stat.washington.edu/statcom.

October 10, 2007
Title: Methods to Estimate the Distribution of the Failure Time Under Misclassification for Current Status Data
Speaker: Giancarlo Sal y Rosas
Abstract: A common study design in epidemiology and clinical
research is the follow-up study in which a fixed number of participants are
followed for a period of time in order to observe some event such as death,
disease, development of a tumor, etc. In some cases (e.g. tumor
development, asymptomatic disease), the exact time of the event may not be
observable. Instead, the participant is tested once at some pre-determined
time and the outcome is observed to have occurred or not occurred. Such data
are referred to as current status data or type I interval-censored data.

Groeneboom and Wellner (1992) proposed two methods to estimate the
cumulative distribution function of the failure times in the absence of
covariate effects. The first is based on the EM algorithm, which arises
naturally because we can consider current status data as an example of a
missing data problem. The second method is based on isotonic regression.
Huang (1996) extended the idea to the Cox proportional hazards model with
current status data.
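
The isotonic regression approach mentioned above can be illustrated with a minimal sketch. Without covariates, the NPMLE of the failure time distribution at the observed exam times is the nondecreasing least-squares fit to the event indicators ordered by exam time, which the pool-adjacent-violators algorithm computes. The function names `pava` and `current_status_npmle` are hypothetical and chosen for illustration; this is not code from the talk.

```python
from collections import defaultdict


def pava(values, weights):
    """Weighted pool-adjacent-violators: nondecreasing least-squares fit."""
    blocks = []  # each block is [mean, total weight, number of points pooled]
    for value, weight in zip(values, weights):
        blocks.append([value, weight, 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def current_status_npmle(exam_times, event_seen):
    """Estimate F at each distinct exam time from (exam time, 0/1 status) pairs."""
    counts = defaultdict(int)
    events = defaultdict(int)
    for c, d in zip(exam_times, event_seen):
        counts[c] += 1
        events[c] += int(d)
    times = sorted(counts)
    # Tied exam times are pooled first: proportion positive, weighted by count.
    proportions = [events[c] / counts[c] for c in times]
    weights = [float(counts[c]) for c in times]
    return times, pava(proportions, weights)


if __name__ == "__main__":
    times, F_hat = current_status_npmle([3, 1, 2, 5, 4], [0, 0, 1, 1, 1])
    for t, f in zip(times, F_hat):
        print(t, f)
```

The estimate is only defined at the observed exam times; between them any nondecreasing interpolation has the same likelihood.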

We extend these methods to the situation where the outcome is based on an
imperfect test (sensitivity and specificity less than one), so that we
expect some false positive and false negative outcomes in our data. In
particular, we discuss the estimation of the NPMLE using the EM algorithm
and isotonic techniques in the case of no covariate effect. We also discuss
the case of covariate effects (proportional hazards model).
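
To make the imperfect-test setting concrete, here is one way to write the observation model. The notation is an assumption introduced for illustration and is not taken from the talk: Y is the test result at exam time C, F is the failure time distribution, α is the sensitivity and β is the specificity, with both assumed not to depend on time.

```latex
\[
  P(Y = 1 \mid C = c) \;=\; \alpha\,F(c) \;+\; (1-\beta)\,\bigl(1 - F(c)\bigr)
\]
```

When α = β = 1 this reduces to F(c), the usual current status model.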


April 11, 2007
Title: An Overview of ROC curves in the context of Survival Model Predictive Accuracy
Speaker: Paramita Saha
Abstract: ROC curves are a popular method for displaying sensitivity and specificity of a continuous marker, X, for a binary disease variable, D. However, many disease outcomes are time-dependent, D(t), rather than D, and ROC curves that vary as a function of time may be more appropriate. In this talk, I will give an overview of how ROC curve methodology is extended in the context of time-dependent disease status. I will talk about different ways to extend the ideas, introduce various estimation methods, and discuss their advantages and disadvantages.
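
As a rough illustration of the idea, the sketch below computes an empirical ROC curve and then makes it time-dependent by defining cases at time t as subjects whose event occurred by t. This is one common definition, not necessarily the one used in the talk. It assumes no censoring and that larger marker values indicate disease, and the function names are hypothetical. Handling censored event times requires the estimation methods the abstract refers to.

```python
def empirical_roc(marker, diseased):
    """Return (FPR, TPR) points, calling a subject positive when marker >= c."""
    n_pos = sum(diseased)
    n_neg = len(diseased) - n_pos  # both groups must be non-empty
    points = [(0.0, 0.0)]
    for c in sorted(set(marker), reverse=True):
        tp = sum(1 for x, d in zip(marker, diseased) if x >= c and d)
        fp = sum(1 for x, d in zip(marker, diseased) if x >= c and not d)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def roc_at_time(marker, event_time, t):
    """ROC at time t with D(t) = 1 if the event occurred by t (no censoring)."""
    diseased = [1 if T <= t else 0 for T in event_time]
    return empirical_roc(marker, diseased)


if __name__ == "__main__":
    marker = [0.9, 0.8, 0.3, 0.1]
    event_time = [1, 5, 2, 8]
    for t in (3, 6):
        print(t, auc(roc_at_time(marker, event_time, t)))
```

The same marker can discriminate differently at different times, because the set of cases and controls changes with t.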

April 18, 2007
Title: Maternal Neutralizing Antibodies in Mother-to-Child Transmission of HIV
Speaker: Katie Davis
Abstract: This is a practice talk for my Biology Exam. I will be describing the assays used to assess neutralization as well as discussing the relevance of neutralizing antibodies to current research in the field of mother-to-child HIV transmission and to vaccine development.

April 25, 2007
Title: Knowing When to Quit: An Exploration of Monte Carlo Error
Speaker: Elizabeth Koehler
Abstract: Have you ever run a simulation or performed a bootstrap analysis? If so, how did you decide how many replicates to use? Past literature has suggested that anywhere from 25 (Efron 1987) to 13,573 (Serlin 2000) replications are sufficient. This seminar aims to introduce and explore Monte Carlo Error (a.k.a. the phenomenon that happens when you run a simulation, get the results, lose your jump drive with your report and have to run them again, find the jump drive and realize the results are different, and have a moment of panic). More specifically, Monte Carlo Error is the variance associated with an estimate achieved via simulation. The way to reduce this variance is to increase the number of replications you use. Since it’s not always practical to run infinite replications or even 13,573, I will introduce some methods for estimating Monte Carlo Error with the aim of guiding the decision about whether you have used enough replicates. This way you can know when to quit.
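
A minimal sketch of one simple way to gauge Monte Carlo Error follows. It is not necessarily one of the methods presented in the talk. When the simulation estimate is an average over independent replicates, its Monte Carlo standard error (the square root of the variance described above) can be estimated as the replicate standard deviation divided by the square root of the number of replicates. The toy quantity (the mean of the sample median of 30 standard normal draws) and the function names are assumptions made for illustration.

```python
import math
import random
import statistics


def simulate_once(rng, n=30):
    """One replicate of a toy simulation: the median of n standard normal draws."""
    return statistics.median([rng.gauss(0.0, 1.0) for _ in range(n)])


def monte_carlo_estimate(n_reps, seed):
    """Return the simulation estimate and its estimated Monte Carlo standard error."""
    rng = random.Random(seed)
    draws = [simulate_once(rng) for _ in range(n_reps)]
    estimate = statistics.mean(draws)
    mc_error = statistics.stdev(draws) / math.sqrt(n_reps)
    return estimate, mc_error


if __name__ == "__main__":
    for n_reps in (250, 1000, 4000):
        estimate, mc_error = monte_carlo_estimate(n_reps, seed=1)
        print(n_reps, round(estimate, 4), round(mc_error, 4))
```

Because the error shrinks with the square root of the number of replicates, quadrupling the replicates only halves it, which is one way to judge whether further replicates are worth the cost.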


May 2, 2007
Title: Developing an understanding and motivation for Dirichlet Process Priors in a nonparametric Bayesian framework
Speaker: Mark Giganti
Abstract: Heagerty and Kurland (2001) showed that estimates will be biased if the true distribution of individual random effects is misspecified in a conditional model for repeated measures binomial data. To better model the individual random effects, I propose a Bayesian framework that uses a prior distribution which avoids strong distributional assumptions. One possible prior distribution is a Dirichlet Process prior. The goal of this talk is to develop the motivation behind this choice for a prior distribution and to enhance the understanding of its parameters.
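
To give a feel for the parameters of a Dirichlet Process prior, the sketch below draws one random distribution from DP(alpha, G0) using the stick-breaking construction, truncated once the leftover mass is negligible. The concentration parameter `alpha`, the base distribution sampler `base_sampler`, and the tolerance `tol` are names chosen for illustration, and the standard normal base distribution is an assumption, not something stated in the abstract.

```python
import random


def stick_breaking_draw(alpha, base_sampler, rng, tol=1e-8):
    """Truncated stick-breaking draw from DP(alpha, G0).

    Returns the atoms, their weights, and the leftover (unassigned) mass.
    """
    atoms, weights = [], []
    remaining = 1.0
    while remaining > tol:
        v = rng.betavariate(1.0, alpha)   # fraction of the remaining stick
        weights.append(remaining * v)
        atoms.append(base_sampler(rng))   # atom location drawn from G0
        remaining *= 1.0 - v
    return atoms, weights, remaining


if __name__ == "__main__":
    rng = random.Random(0)
    for alpha in (0.5, 5.0, 50.0):
        atoms, weights, rest = stick_breaking_draw(
            alpha, lambda r: r.gauss(0.0, 1.0), rng)
        print(alpha, len(atoms), round(max(weights), 3))
```

Small values of alpha tend to put most of the mass on a few atoms, while large values spread it over many atoms so that the draw looks more like the base distribution.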

May 9, 2007
Title: Joint Linkage and Segregation Analysis of Quantitative Traits Allowing for Multiallelic QTLs Using Reversible Jump MCMC
Speaker: Elizabeth Koehler
Abstract: Joint linkage and segregation analysis quantifies the association between inherited alleles (or variants) at a trait locus and a given trait, while at the same time estimating the genetic parameters at the locus. This method has been extended to allow for multiple quantitative trait loci (QTLs) using reversible jump MCMC in the program Loki. Ordinarily, researchers assume that each QTL has 2 alleles. However, nearly all trait loci have multiple alleles, each with a different mean effect. I will discuss some possible ways to incorporate a multiallelic model into the analysis implemented by Loki as well as the challenges associated with incorporating multiple alleles.

May 16, 2007
Title: Comparing methods of ROC curve analysis
Speaker: Liz Thomas
Abstract: Receiver Operating Characteristic (ROC) curves are a common tool in medical decision-making and the evaluation of medical tests. Parametric, semiparametric, and nonparametric methods are all in common use for estimating and comparing ROC curves, but it is vital to understand their assumptions and limitations. We compare the parametric binormal, bi-lognormal, semiparametric proportional hazards, and nonparametric methods for evaluating both the strong and weak null hypotheses when comparing ROC curves for two diagnostic tests with various plausible underlying distributions and censoring mechanisms. In the case of semiparametric methods with misspecified models, we illustrate when the tests are still valid for the strong null hypothesis, given appropriate estimation of variance.

May 30, 2007
Title: Application of ODP to time-course study
Speaker: Sangsoon Woo
Abstract: It has become increasingly important to accurately extract biologically relevant signal from thousands of related measurements. The Optimal Discovery Procedure (ODP) has been introduced and theoretically shown to optimally perform multiple significance tests. My talk will include an introduction to the ODP method and will present simulation results of applying the ODP statistic to time-course microarray data.
