Data federations provide seamless access to multiple heterogeneous and autonomous data sources pertaining to a large organization. As each source database defines its own access control policies for a set of local identities, enforcing such policies across the federation becomes a challenge. In this article, we first consider the problem of translating existing access control policies defined over source databases in a manner that allows the original semantics to be observed while becoming applicable across the entire data federation. We show that such a translation is always possible, and provide an algorithm for automating the translation. We show that verifying whether a translated policy obeys the semantics of the original access control policy defined over a source database is intractable, even under restrictive scenarios. We then describe a practical algorithmic framework for translating relational access control policies into their XML equivalent, expressed in the eXtensible Access Control Markup Language. Finally, we examine the difficulty of minimizing translated policies, and contribute a minimization algorithm applicable to nonrecursive translated policies.
Abstract. The eXtensible Markup Language (XML) provides a powerful and flexible means of encoding and exchanging data. As it turns out, its main advantage as an encoding format (namely, its requirement that all open and close markup tags are present and properly balanced) yield also one of its main disadvantages: verbosity. XML-conscious compression techniques seek to overcome this drawback. Many of these techniques first separate XML structure from the document content, and then compress each independently. Further compression gains can be realized by identifying and compressing together document content that is highly similar, thereby amortizing the storage costs of auxiliary information required by the chosen compression algorithm. Additionally, the proper choice of compression algorithm is an important factor not only for the achievable compression gain, but also for access performance. Hence, choosing a compression configuration that optimizes compression gain requires one to determine (1) a partitioning strategy for document content, and (2) the best available compression algorithm to apply to each set within this partition. In this paper, we show that finding an optimal compression configuration with respect to compression gain is an NP-hard optimization problem. This problem remains intractable even if one considers a single compression algorithm for all content. We also describe an approximation algorithm for selecting a partitioning strategy for document content based on the branch-and-bound paradigm.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.