Open Access Research article

data. Further, dynamic analysis of such huge data requires Hadoop-like MapReduce stacks [3]. For example, Netflix, a subscription-based online video platform, relies on customers' viewing preferences to recommend more relevant titles [4]. Facebook tracks user interactions with news feed items, friends, pages, photos, text posts, etc. to provide better features on its social networking platform.

Data thus plays a very important role in modern web services and applications. This creates the need for Technologically Protective Measures (TPM) to protect a company's data-related assets [5]. One popular technique in this field is watermarking. The core idea is to embed a watermark into the tuples of a database. The watermark is usually derived from a secret string chosen by the owner and from hash functions. However, static watermarking works on snapshots of databases and often uses techniques that are slow and do not scale easily to big data. Big data is dynamic in the sense that its rate of change is very high and its size is very large, which makes static watermarking on snapshots nearly impossible. This makes it challenging to implement TPM on big data with traditional watermarking techniques.

Dynamically growing NoSQL databases with irregular schemas require parallelizable and atomic watermarking techniques that address the following problems:

(a) The technique should not rely on schema-specific details, because NoSQL schemas tend to change very often, and in document stores such as MongoDB, CouchDB, and RethinkDB the schema is flexible and different documents can have different structures.
(b) The technique should not work on a snapshot of the database but should work dynamically, as and when data arrives. This means the implementation must have minimal computational overhead to be worthwhile.
(c) The technique should exploit new features of NoSQL databases, such as support for new data types and sharded/distributed deployments, which cannot support inter-tuple dependencies for watermark embedding.

With these issues in mind, a technique is proposed to protect the integrity of NoSQL databases. In this work, the emphasis is laid on the challenges faced while designing the technique and
Abstract

With the advent of real-time information gathering, the need for Not Only SQL (NoSQL) databases has skyrocketed. No prior work has been done on watermark-based protection of such schema-less databases. Traditional watermarking techniques cannot be applied because NoSQL databases grow dynamically and often have irregular schemas. In this paper, a new approach to embedding watermarks into such databases is proposed. A watermark that acts as a signature is securely prepared for each tuple individually. The proposal addresses these issues by leveraging the flexible schema features provided by NoSQL databases. A new attribute is dynamically injected into the tuple before it is inserted or updated in the database. Experimental results and analyses demonstrate that with even the slightest modifica...
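
The per-tuple idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: it assumes that the signature is an HMAC-SHA256 over a deterministic JSON serialization of the document, and the attribute name `_wm` and the helper names `canonical_bytes`, `embed_watermark`, and `verify_watermark` are hypothetical, chosen only for illustration. Documents are modelled as plain Python dictionaries rather than through any particular database driver.

```python
import hashlib
import hmac
import json

# Hypothetical attribute name for the injected watermark (an assumption,
# not a name taken from the paper).
WATERMARK_FIELD = "_wm"


def canonical_bytes(doc: dict) -> bytes:
    """Serialize a document deterministically, ignoring any existing watermark.

    Keys are sorted so that attribute order does not affect the signature;
    values that JSON cannot encode natively are converted with str().
    """
    payload = {k: v for k, v in doc.items() if k != WATERMARK_FIELD}
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), default=str)
    return text.encode("utf-8")


def embed_watermark(doc: dict, secret: bytes) -> dict:
    """Return a copy of doc with a keyed signature injected as an extra attribute.

    Each document is signed independently, so no inter-tuple dependency
    or fixed schema is needed.
    """
    tag = hmac.new(secret, canonical_bytes(doc), hashlib.sha256).hexdigest()
    return {**doc, WATERMARK_FIELD: tag}


def verify_watermark(doc: dict, secret: bytes) -> bool:
    """Recompute the signature and compare it with the stored attribute."""
    stored = doc.get(WATERMARK_FIELD)
    if not isinstance(stored, str):
        return False
    expected = hmac.new(secret, canonical_bytes(doc), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes.
    return hmac.compare_digest(stored.encode("utf-8"), expected.encode("utf-8"))
```

In this sketch, `embed_watermark` would be called on a document just before it is inserted or updated, and `verify_watermark` returns `False` once any attribute value has been changed or the injected attribute has been removed. Because the signature depends only on the document itself and the owner's secret, the check works for documents with different structures and can run independently on each shard.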