In this paper, we describe an approach to computational modelling and autonomous learning of how individuals perceive sensory inputs. We propose a hierarchical process for summarizing heterogeneous raw data. At the lower layer of the hierarchy, the raw data autonomously forms semantically meaningful concepts. At the second layer, instead of clustering by visual or audio similarity, concepts are formed from observed physiological variables (PVs), such as heart rate and skin conductance, and are mapped to the emotional state of the individual. Wearable sensors were used in the experiments. Methodologically, we use the recently introduced Empirical Data Analytics (EDA) computational framework and its data partitioning method to autonomously form data clouds (cluster-like formations with no pre-defined shape), and we use AnYa-type IF-THEN fuzzy rule-based models to describe the mapping from the observable PVs to the emotional states. Multi-modal typicality distributions, which have pdf-like properties and represent the empirical likelihood of the data, can then be derived without the restrictive prior assumptions traditionally required in statistical approaches. The quality of the classifier is evaluated by a confusion matrix and by classification rate/precision. The experimental results are encouraging: for this extremely complicated problem, we obtained 74.58% correct classification without any pre-training.
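To make the notion of empirical typicality concrete, the following is a minimal sketch, not the paper's implementation: it computes a unimodal typicality in the spirit of EDA, assuming Euclidean distance, a density of the form D(x) = 1 / (1 + ||x − μ||² / σ²), and typicality as the normalized density. All function and variable names are illustrative.

```python
def typicality(points):
    """Return the empirical typicality of each point (values sum to 1).

    Density is highest near the data mean and lowest for outliers;
    normalizing the densities yields a discrete, pdf-like distribution.
    """
    n = len(points)
    dim = len(points[0])
    # Empirical mean of the data
    mu = [sum(p[k] for p in points) / n for k in range(dim)]
    # Squared Euclidean distance of each point to the mean
    sq_dists = [sum((p[k] - mu[k]) ** 2 for k in range(dim)) for p in points]
    # sigma^2: mean squared distance to the mean (empirical variance)
    var = sum(sq_dists) / n
    # Cauchy-shaped density, derived from the data with no prior assumptions
    density = [1.0 / (1.0 + d / var) for d in sq_dists]
    total = sum(density)
    return [d / total for d in density]

# Three nearby points and one outlier: the outlier receives the lowest typicality.
tau = typicality([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
```

In a streaming setting, the mean and variance in this sketch would be updated recursively as each sample arrives, which is what makes such quantities attractive for autonomous, online learning from wearable-sensor data.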