Training data has a significant influence on the behaviour of an artificial intelligence algorithm developed using machine learning techniques. Consequently, any argument that the trained algorithm is, in some way, fit for purpose ought to consider the data as an entity in its own right. We describe some simple techniques that give domain experts and algorithm developers insight into training data and that can be implemented without specialist computer hardware. Specifically, we consider sampling density, test case generation and monitoring for distribution shift. The techniques are illustrated using example data sets from the University of California, Irvine (UCI) Machine Learning Repository.