This paper considers the problem of feature selection for composite hypothesis testing: the goal is to select, from m candidate features, the r features that are relevant for distinguishing the null hypothesis from the composite alternative. The training data consist of L sequences of observations, each of which is an n-sample sequence drawn from one distribution in the alternative hypothesis. What is the fundamental limit for successful feature selection? Are there algorithms that achieve this limit? We investigate this problem in a small-sample, high-dimensional setting with n = o(m), and obtain a tight pair of achievability and converse results: (i) there exists a function f(L, n, r, m) such that if f(L, n, r, m) ↓ 0, then no asymptotically consistent feature selection algorithm exists; (ii) we propose a feature selection algorithm that is asymptotically consistent whenever f(L, n, r, m) ↑ ∞.
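As a concrete illustration of the data model described above, the following sketch generates synthetic training data with the stated shape: L sequences, each an n-sample sequence of m features drawn from one member of a composite alternative. The choice of Gaussian distributions with mean shifts on the r relevant features is purely an assumption made here for concreteness; the paper's setting is more general, and this is not the paper's algorithm or generative model.

```python
import numpy as np

rng = np.random.default_rng(0)

m, r, n, L = 100, 5, 10, 20   # n = o(m): small-sample, high-dimensional regime
relevant = rng.choice(m, size=r, replace=False)  # indices of the r relevant features

# Each training sequence: n samples of m features, drawn from one distribution
# in the composite alternative. Here (hypothetically) each alternative applies
# a random mean shift to the relevant features only; the null is N(0, 1) on all.
sequences = []
for _ in range(L):
    mu = np.zeros(m)
    mu[relevant] = rng.uniform(0.5, 1.5, size=r)  # shift only relevant features
    sequences.append(rng.normal(loc=mu, size=(n, m)))

X = np.stack(sequences)   # training tensor of shape (L, n, m)
print(X.shape)
```

A consistent feature selection algorithm, in this notation, maps X to an estimated index set that equals `relevant` with probability tending to one as the problem size grows.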