Research on API migration and language conversion can be informed by empirical data about API usage. For instance, such data may help with designing and defending mapping rules for API migration in terms of relevance and applicability. We describe an approach to large-scale API-usage analysis of open-source Java projects, which we also instantiate for the SourceForge open-source repository in a certain way. Our approach covers checkout, building, tagging with metadata, fact extraction, analysis, and synthesis with a large degree of automation. Fact extraction relies on resolved (type-checked) ASTs. We describe a few examples of API-usage analysis; they are motivated by API migration. These examples are concerned with analysing API footprint (such as the numbers of distinct APIs used in a project), API coverage (such as the percentage of methods of an API used in a corpus), and framework-like vs. class-library-like usage.
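The two metrics named in this abstract, API footprint and API coverage, can be illustrated with a minimal sketch. It assumes that fact extraction has already produced `(project, api, method)` triples; the triple layout, the function names, and the sample data are hypothetical and are not taken from the paper's tooling.

```python
from collections import defaultdict

# Hypothetical output of fact extraction: one (project, api, method)
# triple per resolved API method call. The layout is an assumption.
FACTS = [
    ("projA", "java.util", "List.add"),
    ("projA", "java.io", "File.exists"),
    ("projB", "java.util", "List.add"),
    ("projB", "java.util", "Map.get"),
]

# Hypothetical (tiny) inventory of the methods an API offers.
JAVA_UTIL_METHODS = {"List.add", "List.size", "Map.get", "Set.contains"}


def api_footprint(facts):
    """Return the number of distinct APIs used by each project."""
    apis_by_project = defaultdict(set)
    for project, api, _method in facts:
        apis_by_project[project].add(api)
    return {project: len(apis) for project, apis in apis_by_project.items()}


def api_coverage(facts, api, api_methods):
    """Return the percentage of `api_methods` used anywhere in the corpus."""
    if not api_methods:
        return 0.0
    used = {m for _project, a, m in facts if a == api and m in api_methods}
    return 100.0 * len(used) / len(api_methods)


footprint = api_footprint(FACTS)
coverage = api_coverage(FACTS, "java.util", JAVA_UTIL_METHODS)
```

With the sample data, `projA` uses two distinct APIs and `projB` one, and two of the four listed `java.util` methods are used across the corpus.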
The dCache project provides open-source storage software deployed internationally to satisfy ever more demanding scientific storage requirements. Its multifaceted approach provides an integrated way of supporting different use cases with the same storage, from high-throughput data ingest, through wide access and easy integration with existing systems. In supporting new communities, such as photon science and microbiology, dCache is evolving to provide new features and access to new technologies. In this paper, we describe some of these recent features that facilitate the use of storage to maximise the gain from stored data, including quality-of-service management, support for distributed and federated systems, and improvements with support for parallel NFS (pNFS).
The dCache project provides open-source storage software deployed internationally to satisfy ever more demanding scientific storage requirements. Its multifaceted approach provides an integrated way of supporting different use-cases with the same storage, from high throughput data ingest, through wide access and easy integration with existing systems. In this paper, we describe some of the recent features that facilitate the use of storage to maximise the gain from stored data, including quality-of-service management, heterogeneous systems (both through integrated tertiary storage support and geographical locality), the parallel NFS (pNFS) extension, and innovative delegated authorisation schemes.
For over a decade, dCache.ORG has provided robust software, called dCache, that is used at more than 80 universities and research institutes around the world, allowing these sites to provide reliable storage services for the WLCG experiments and many other scientific communities. The flexible architecture of dCache allows it to run in a wide variety of configurations and on a wide variety of platforms, from an all-in-one Raspberry Pi up to hundreds of nodes in multi-petabyte infrastructures. The life cycle of scientific data is well defined: data is collected, processed, archived and finally deleted when it is no longer needed. Moreover, during all those stages the data is never modified: either the original data is used, or new derived data is produced. With this knowledge, dCache was designed to handle immutable files as efficiently as possible. Data replication, HSM connectivity and data-server independent operations are only possible due to the immutable nature of stored data. Nowadays many commercial vendors provide such write-once-read-many (WORM) storage systems, as demand for them grows with the increasing volume of audio, photo and video content on the web. On the other hand, by providing a standard NFSv4.1 interface, dCache is often used as a general-purpose file system, especially by new communities such as photon scientists or microbiologists. Although many users are aware of data immutability, some applications and use cases still require in-place updates of stored files. To satisfy these new requirements, some fundamental changes have to be applied to dCache’s core design. However, new developments must not compromise any aspect of existing functionality. In this presentation we will show new developments in dCache to turn it into a regular file system. We will discuss the challenges of building a distributed storage system, ‘life’ with POSIX compliance, handling of multiple replicas, and backward compatibility by providing WORM and noWORM capabilities within the same storage system.
The dCache project provides open-source software deployed internationally to satisfy ever more demanding storage requirements of various scientific communities. Its multifaceted approach provides an integrated way of supporting different use-cases with the same storage, from high-throughput data ingest, through wide access and easy integration with existing systems, including event-driven workflow management. In this presentation, we will show some of the recent developments that optimize data management and access to maximise the gain from stored data.