Biometrika Advance Access

Published: Fri, 03 Nov 2017 00:00:00 GMT

Last Build Date: Sat, 04 Nov 2017 03:47:46 GMT


Two-sample tests of high-dimensional means for compositional data


Compositional data are ubiquitous in many scientific endeavours. Motivated by microbiome and metagenomic research, we consider a two-sample testing problem for high-dimensional compositional data and formulate a testable hypothesis of compositional equivalence for the means of two latent log basis vectors. We propose a test through the centred log-ratio transformation of the compositions. The asymptotic null distribution of the test statistic is derived and its power against sparse alternatives is investigated. A modified test for paired samples is also considered. Simulations show that the proposed tests can be significantly more powerful than tests that are applied to the raw and log-transformed compositions. The usefulness of our tests is illustrated by applications to gut microbiome composition in obesity and Crohn’s disease.
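As a rough illustration of the centred log-ratio transformation mentioned in the abstract, the sketch below applies its standard textbook definition (log of each component minus the mean of the logs) to a single composition. The function name `clr` and the example vector are hypothetical, and strictly positive components are assumed; this shows only the transformation, not the test statistic proposed in the paper.

```python
import math

def clr(composition):
    """Centred log-ratio transform of one strictly positive composition.

    Assumption: every component is > 0 (zeros would need separate handling).
    """
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log is equivalent to dividing by the geometric mean.
    return [v - mean_log for v in logs]

# Hypothetical three-part composition (proportions summing to 1).
x = [0.2, 0.3, 0.5]
z = clr(x)

# The transform is scale invariant: rescaling the composition leaves it unchanged.
z_rescaled = clr([2.0, 3.0, 5.0])
```

The transformed components sum to zero by construction, and the result does not depend on the total to which the composition was normalised.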

A randomization-based perspective on analysis of variance: a test statistic robust to treatment effect heterogeneity


Fisher randomization tests for Neyman’s null hypothesis of no average treatment effect are considered in a finite-population setting associated with completely randomized experiments involving more than two treatments. The consequences of using the $F$ statistic to conduct such a test are examined, and we argue that under treatment effect heterogeneity, use of the $F$ statistic in the Fisher randomization test can severely inflate the Type I error under Neyman’s null hypothesis. We propose to use an alternative test statistic, derive its asymptotic distributions under Fisher’s and Neyman’s null hypotheses, and demonstrate its advantages through simulations.
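For orientation, here is a minimal sketch of a Fisher randomization test that uses the classical one-way $F$ statistic, i.e. the procedure whose behaviour under treatment effect heterogeneity the abstract examines. It is not the alternative statistic the authors propose, which the abstract does not specify. The function names, the data, and the Monte Carlo settings (`draws`, `seed`) are all hypothetical.

```python
import random

def f_statistic(values, labels):
    """Classical one-way ANOVA F statistic for outcomes grouped by treatment label."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    between = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                  for ys in groups.values())
    within = sum((y - sum(ys) / len(ys)) ** 2
                 for ys in groups.values() for y in ys)
    return (between / (k - 1)) / (within / (n - k))

def randomization_p_value(values, labels, draws=2000, seed=0):
    """Monte Carlo approximation of the randomization p-value.

    Treatment labels are reshuffled with group sizes held fixed, mimicking
    re-randomization in a completely randomized experiment.
    """
    rng = random.Random(seed)
    observed = f_statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(draws):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if f_statistic(values, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (draws + 1)

# Hypothetical outcomes for three treatments with three units each.
values = [1.0, 1.2, 0.9, 2.1, 2.3, 1.9, 3.2, 2.8, 3.1]
labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
observed_f = f_statistic(values, labels)
p_value = randomization_p_value(values, labels)
```

Permuting labels corresponds to Fisher's sharp null of no effect for any unit; the abstract's concern is that the same procedure can have inflated Type I error when only Neyman's null of no average effect holds.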