Abstract:Very large data sets -telephone call records, network logs, high-resolution satellite images, or web document repositories -are not easily analyzed using traditional database techniques. They may be simply too large, grow too fast, or may not fit well in a database schema. They tend to span multiple disks and machines. On the other hand, these large data sets often have a flat and regular structure that permits distributed filtering and aggregation. We present a system and language for such analyses*. A filter… Show more
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.