This paper introduces entropy quad-trees, which are structures derived from quad-trees by allowing nodes to split only when they correspond to sufficiently complex sub-domains of a data domain. Complexity is evaluated using an information-theoretic measure based on the entropy associated with the sets of objects designated by nodes. An alternative measure related to the concept of box-counting dimension is also explored. Experimental results demonstrate the efficiency of entropy quad-trees in mining complex regions. As an application, we use the proposed technique in the initial stage of a crater detection algorithm applied to digital images of the surface of Mars. Additional experimental results demonstrate the crater detection performance and analyze the effectiveness of entropy quad-trees for detecting high-complexity regions in the pixel space in the presence of significant noise. This work focuses on 2-dimensional image domains, but it can be generalized to higher-dimensional data.
Keywords: Entropy, Box-Counting Dimension, Quad-trees, Circular Hough Transform
Entropy Quad-Trees for High Complexity Regions Detection
The concept of complexity relates to the presence of variation, and science offers many approaches to characterizing it. A variety of scientific fields deal with complex mechanisms, simulations, systems, behavior, and data, since these have always been a part of our environment. In this work, we focus on data complexity, which is studied in information theory. While randomness is not considered complexity in certain areas, such as those related to the study of complex systems, information theory tends to assign high values of complexity to random noise. Many fields benefit from the identification of complex areas related to content or noise. In data hiding, adaptive steganography embeds data by taking advantage of the high concentration of self-information in high-complexity areas originating from both content and noise. The authors of [1] describe the benefits of selective embedding in reducing perceptual degradation for transform domain steganographic techniques. Biodiversity is another area where complexity can be used for the identification and localization of different species. In this case, the complexity originating from content is more important than the complexity originating from noise.
Our goal in this paper is to introduce a variant of quad-trees for mining high-complexity sub-domains of a data domain. A quad-tree is a tree structure defined on a finite set of nodes that either contains no nodes or consists of a root node and 4 quad-subtrees. In a full quad-tree, each node is either a leaf or has degree exactly 4. Our variant of quad-trees requires that each node that has descendants correspond to a region with a sufficient level of diversity, as assessed by the value of an information-theoretic measure. We also present an alternative measure that has its roots in fractal geometry, where the so-called box-counting dimens...