| April 5, 2006 | |
| Title: | Semiparametric approaches for joint modeling of longitudinal and survival data with time varying coefficients |
| Speaker: | Xiao Song, Biostatistics faculty |
| Abstract: | We study joint modeling of survival and longitudinal data. There are two regression models of interest. The primary model is for survival outcomes, which are assumed to follow a time varying coefficient proportional hazards model. The second model is for longitudinal data, which are assumed to follow a random effects model. Based on the trajectory of a subject’s longitudinal data, some covariates in the survival model are functions of the unobserved random effects. Estimated random effects are generally different from the unobserved random effects and hence this leads to covariate measurement error. To deal with covariate measurement error, we propose a local corrected score estimator and a local conditional score estimator. Both approaches are semiparametric methods in the sense that there is no distributional assumption needed for the underlying true covariates. The estimators are shown to be consistent and asymptotically normal. Finite sample properties are assessed via simulation. The approaches are demonstrated by an application to data from an HIV clinical trial. |
|
|
|
| April 12, 2006 | |
| Title: | Developing Odds Ratio Estimation under a Multistage Design |
| Speaker: | Judy Zhong, Biostatistics student |
| Abstract: | Genome-wide association is a promising approach to identifying common genetic variants that predispose to human disease. Because of the high cost of genotyping hundreds of thousands of markers on thousands of subjects, genome-wide association studies often follow a staged design in which a proportion of the available samples are genotyped on a large number of markers in stage 1, and a selected proportion of these markers are later followed up by genotyping them on the remaining samples in stage 2. Prentice et al. (2006) proposed that testing under such a multistage design can take place with good power by combining the log-odds ratio test statistics from each design stage in an inverse-variance-weighted fashion. Here we consider whether earlier-stage data can also be used for point estimation and for constructing more precise confidence intervals for the odds ratios that characterize the strength of the relationship between SNPs and disease. However, odds ratios estimated from earlier stages will tend to be overestimated, since the estimates for SNPs meeting the criteria for moving to a subsequent stage will, on average, lie above their respective means. In this talk, I will present the impact of this feature on the bias of the combined odds-ratio estimator and propose a correction procedure that acknowledges the selection. I will then show that the correction procedure also allows us to build a bootstrap confidence interval for the corrected combined estimator from the data across stages. Finally, the behavior of the corrected odds-ratio estimator and its confidence interval will be examined via simulation studies. |
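The inverse-variance weighting mentioned in this abstract can be illustrated with a minimal sketch. It assumes each stage is summarized by a 2x2 case-control table and uses the usual large-sample (Woolf) variance of the log odds ratio; the function names and the counts are hypothetical. It shows only the naive pooled estimate and does not implement the selection correction or the bootstrap interval described in the talk.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Return (log OR, variance) for one 2x2 table.

    a, b: cases with / without the risk allele
    c, d: controls with / without the risk allele
    The variance is the large-sample (Woolf) approximation.
    """
    log_or = math.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, var


def combine_inverse_variance(stage_estimates):
    """Pool per-stage (estimate, variance) pairs with weights 1 / variance."""
    weights = [1 / var for _, var in stage_estimates]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, stage_estimates)) / total_weight
    return pooled, 1 / total_weight


# Hypothetical counts for a SNP that passed the stage 1 threshold.
stage1 = log_odds_ratio(60, 40, 45, 55)
stage2 = log_odds_ratio(110, 90, 95, 105)
pooled_log_or, pooled_var = combine_inverse_variance([stage1, stage2])
print(math.exp(pooled_log_or), pooled_var)
```

Because the stage 1 estimate was selected for being large, a pooled value computed this way inherits some upward bias, which is the problem the abstract sets out to correct.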
|
|
|
| April 19, 2006 | |
| Title: | A Latent Variable Model for Comparing the Sensitivity and Specificity of Two Assays with a Partially Observed and Time-Varying Gold Standard |
| Speaker: | Elizabeth Brown, Biostatistics faculty |
| Abstract: | In a study of mother-to-child HIV transmission, two different assays were used to determine the HIV infection status of infants born to HIV-infected mothers over time. Although there is no gold standard test result to determine the HIV infection status for the majority of infants, there is interest in comparing the sensitivity and specificity of these two assays. Further complicating the analysis, an infant’s HIV infection status may not be constant over time. In this talk, I will present a latent variable modeling approach to compare the sensitivity and specificity of these two tests. We model the assay results over time conditional on the often unobserved disease status. We propose a mixture model for the time to HIV infection that is biologically plausible for mother-to-child transmission of HIV. We apply these methods to data from HPTN 024, a trial to assess the impact of antibiotics on mother-to-child transmission of HIV, and to simulated data. |
|
|
|
| April 26, 2006 | |
| Title: | Extracting Bowhead Whale, Balaena mysticetus, Migration Patterns Using Visual and Acoustic Location Data Gathered off Pt. Barrow, AK |
| Speaker: | Nate Mercaldo, Biostatistics faculty |
| Abstract: | Anthropogenic activities now span nearly the entire globe, and monitoring their environmental implications is warranted. In the oceans, examples include harvesting, oil drilling and exploration, ocean floor or coastline mapping, and military training exercises or weapons testing. Research on the effects of these activities has primarily focused on marine mammals, such as whales or dolphins, and has demonstrated their deleterious effect on individuals and groups, and thus on overall population size. Depending on the status of the species (candidate/proposed, threatened, endangered), the robustness of the population’s overall health is questionable. If a species becomes threatened or endangered, then certain subgroups (cows and calves) become more valuable in terms of population sustainability. Therefore, for a population to coexist with these activities or prosper, cow-calf pairs must be properly identified and, when they are encountered, the man-made endeavors halted or curbed. The focus of today’s seminar is the Bering-Chukchi-Beaufort Seas stock of the bowhead whale, a population that suffered tremendous losses between the mid-19th and early 20th centuries due to commercial whaling. Unlike its North Atlantic counterpart, the North Atlantic right whale, which is the most endangered whale in the world, the stock of bowheads has shown signs of recovery. However, this upswing may be threatened as this geographic area becomes more attractive for petroleum enterprises and as the allotted quotas for aboriginal subsistence whaling increase.
For the past 25 years, ice-based census efforts using both visual and acoustic methods have been conducted to enumerate this stock during its spring migration off the northern coast of Alaska. The primary focus of this seminar will be a review of the census methods and the statistical methodology for linking visual and acoustic data points. Since this is a work in progress, the ultimate goal of this work will not be discussed, but it includes distinguishing cow-calf pairs from non-cow-calf pairs via acoustical measures. |
|
|
|
| May 3, 2006 | |
| Title: | Got Class? An introduction to Taxometric Methods and Latent Class Analysis |
| Speaker: | Elizabeth Koehler, Biostatistics student |
| Abstract: | Social science frequently aims to measure traits that may be better explained by something that is unobservable. For example, how can one accurately measure things like stress, intelligence and social ability? More and more social scientists are relying on latent models to help explain the unobservable. This talk aims to introduce the audience to just two of the methods available: taxometric methods and latent class analysis. We will be closely examining an example problem that looks into the possibility of there being more than one subtype of autism. |
|
|
|
| May 10, 2006 | |
| Title: | Functional ANOVA Normalization of Two-Channel Microarrays |
| Speaker: | Alan Dabney, Biostatistics student |
| Abstract: | We present a new, general method for normalizing two-channel microarray data, partially drawing on ideas from two widely used approaches. Whereas the ANOVA approach carefully distinguishes different sources of signal and bias through explicit terms in its model, the MA-plot based approach takes into account the fact that sources of bias may be intensity-dependent. However, both approaches suffer from serious drawbacks, as we have shown in previous work. The fixed (non-intensity-dependent) coefficients in the ANOVA approach tend to under- or over-fit the data, and the MA-plot based approach assumes that all intensity-dependent trends are due to unwanted bias, each leading to inaccurate normalization in fairly common scenarios. Our proposed approach, called eCADS, captures the strengths of these previous approaches, while avoiding their weaknesses. We replace the fixed coefficients in the ANOVA model with functions of underlying RNA amount, thereby incorporating intensity-dependent relationships like those evident in MA-plots. The normalization method fits this “functional ANOVA” model and subtracts off terms representing bias to retain the biological signal of interest. By requiring a simple balance in experimental design, we show that our method preserves differential expression relationships in expectation. A consequence of this work is the statistical justification of a more efficient dye-swap design that requires only one array per sample pair. We demonstrate our new method on an experiment measuring expression in developing mice. |
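For readers unfamiliar with the MA-plot mentioned in this abstract, the following is a small sketch of the standard transformation behind it: for each spot, M is the log2 ratio of the two channel intensities and A is their average log2 intensity. The function name and the intensities are made up for illustration, and the sketch is not the eCADS method itself.

```python
import math


def ma_values(red, green):
    """Return the lists (M, A) for paired two-channel intensities.

    M = log2(red) - log2(green)         (log ratio)
    A = (log2(red) + log2(green)) / 2   (average log intensity)
    """
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


# Hypothetical background-corrected intensities for three spots.
red = [8.0, 4.0, 1024.0]
green = [2.0, 4.0, 512.0]
m, a = ma_values(red, green)
for m_i, a_i in zip(m, a):
    print(f"A = {a_i:.2f}, M = {m_i:.2f}")
```

Plotting M against A makes any intensity-dependent trend in the log ratios visible, which is the kind of trend the MA-plot based approach removes.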
|
|
|
| May 17, 2006 | |
| Title: | Comparison of Haplotype-based and Tree-based SNP Imputation in Association Studies |
| Speaker: | James Dai, Biostatistics student |
| Abstract: | Missing single nucleotide polymorphisms (SNPs) are quite common in genetic association studies. Subjects with missing SNPs are often discarded in analyses, which may seriously undermine the inference of SNP-disease association. In this article, we compare two haplotype-based imputation approaches and one regression tree-based imputation approach for association studies. The goal is to assess the imputation accuracy, and to evaluate the impact of imputation on parameter estimation. Haplotype-based approaches build on haplotype reconstruction by the expectation-maximization (EM) algorithm or a weighted EM (WEM) algorithm, depending on whether case-control status is taken into account. The tree-based approach uses a Gibbs sampler to iteratively sample from a full conditional distribution, which is obtained from the classification and regression tree (CART) algorithm. We employ a standard multiple imputation procedure to account for the uncertainty of imputation. We apply the methods to simulated data as well as a case-control study on developmental dyslexia. Our results suggest that imputation generally improves over the standard practice of ignoring missing data in terms of bias and efficiency. The haplotype-based approaches slightly outperform the tree-based approach when there are a small number of SNPs in linkage disequilibrium (LD), but the latter has a computational advantage. Finally, we demonstrate that utilizing the disease status in imputation helps to reduce the bias in the subsequent parameter estimation. |
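The "standard multiple imputation procedure" in this abstract is commonly taken to mean Rubin's combining rules, which can be sketched as below. This assumes the analysis has already been run on each imputed data set to give a point estimate and its variance; the function name and the numbers are hypothetical, and the imputation step itself (haplotype-based or tree-based) is not shown.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Combine per-imputation results with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors of those estimates
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log odds ratios from five imputed data sets.
estimates = [0.42, 0.47, 0.39, 0.45, 0.44]
variances = [0.010, 0.011, 0.009, 0.010, 0.012]
print(rubin_combine(estimates, variances))
```

The between-imputation term is what carries the uncertainty due to the missing SNPs into the final standard error.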
|
|
|
| May 24, 2006 | |
| Title: | Bias correction in non-differentiable estimating equations for optimal dynamic regimes |
| Speaker: | Erica Moodie, Biostatistics student |
| Abstract: | A dynamic regime is a function that takes treatment and covariate history as inputs and returns the treatment to be given. Robins (1986) introduced g-estimation, and recently showed that this can be used to make inference about the optimal regime (Robins, 2002; 2004). This method is always consistent, but can be asymptotically biased under a given structural nested mean model for certain longitudinal distribution functions of the treatments and covariates, the so-called exceptional laws. In fact, the null hypothesis constitutes an exceptional law under structural nested mean models which allow for interaction of current treatment with past treatments or covariates. In this talk, we explain the problem of exceptional laws, and provide a new approach to g-estimation that shares all of the asymptotic properties of ordinary g-estimates at non-exceptional laws while providing substantial reduction in the bias at exceptional laws. |
|
|
|
| June 7, 2006 | |
| Title: | Multiple Imputation Methods for Treatment Noncompliance and Missing Data |
| Speaker: | Leslie Taylor, Biostatistics Doctoral Candidate |
| Abstract: | Well-designed randomized clinical trials are a powerful tool for investigating causal treatment effects, but in trials involving human subjects there are often problems of noncompliance, which standard analyses, such as the intention-to-treat or as-treated analysis, either ignore or account for in such a way that the estimand can no longer be considered a causal effect. An alternative to these analyses is the complier average causal effect (CACE), which is the average causal treatment effect among the subpopulation that would comply under any treatment assigned. In this talk we derive multiple imputation estimators for the CACE using data augmentation algorithms in the setting of a randomized clinical trial with crossover treatment noncompliance and missing data. Using simulated data, we investigate the finite sample properties of these estimators, as well as of competing procedures, in a simple setting. Finally, we illustrate our methods using a real randomized encouragement design study on the effectiveness of the influenza vaccine. |
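To make the CACE concrete, here is a minimal sketch of the simple instrumental-variable (ratio) estimator, not the multiple imputation estimators derived in the talk. It assumes complete data, randomized assignment, no effect of assignment on outcome except through treatment received, and no defiers; the function name and the toy data are hypothetical.

```python
def cace_ratio_estimate(assigned, received, outcome):
    """Estimate the CACE as (ITT effect on outcome) / (ITT effect on treatment received).

    assigned: 0/1 randomized assignment for each subject
    received: 0/1 treatment actually taken
    outcome:  numeric outcome
    """
    def arm_mean(values, arm):
        selected = [v for v, z in zip(values, assigned) if z == arm]
        if not selected:
            raise ValueError("both assignment arms must be non-empty")
        return sum(selected) / len(selected)

    itt_outcome = arm_mean(outcome, 1) - arm_mean(outcome, 0)
    itt_uptake = arm_mean(received, 1) - arm_mean(received, 0)
    if itt_uptake == 0:
        raise ValueError("assignment did not change treatment uptake")
    return itt_outcome / itt_uptake


# Toy data with crossover in both arms.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 1, 0, 0, 0, 0, 1]
outcome = [3, 3, 3, 1, 1, 1, 1, 3]
print(cace_ratio_estimate(assigned, received, outcome))
```

The denominator is the estimated proportion of compliers, so the intention-to-treat effect is rescaled to the subpopulation whose treatment actually follows the assignment.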