We present a transformation-based, multi-grained data modeling technique in the context of text-independent speaker recognition, aimed at mitigating difficulties caused by sparse training and test data. Both identification and verification are addressed, where we view the entire population as divided into the target population and its complement, which we refer to as the background population. First, we present our development of maximum likelihood transformation-based recognition with diagonally constrained Gaussian mixture models and show its robustness to data scarcity with results on identification. Then, for each target and background speaker, a multi-grained model is constructed using the transformation-based extension as a building block. The training data is labeled with an HMM-based phone labeler. We then make use of a graduated phone class structure to train the speaker model at various levels of detail. This structure is a tree with the root node containing all the phones. Subsequent levels partition the phones into increasingly finer-grained linguistic classes. This method affords the use of fine detail where possible, i.e., as reflected in the amount of training data distributed to each tree node. We demonstrate the effectiveness of the modeling with verification experiments in matched and mismatched conditions.

Keywords: Speaker recognition, maximum likelihood linear transform, multi-grained modeling.