Yih-Farn Robin Chen scite author profile

Modern distributed storage systems often use erasure codes to protect against disk and node failures and increase reliability, while trying to meet the latency requirements of applications and clients. Storage systems may have caches at the proxy or client ends in order to reduce latency. In this paper, we consider a novel caching framework with erasure codes called functional caching. Functional caching involves using erasure-coded chunks in the cache such that the chunks in the storage nodes and the cache combined form a maximal-distance-separable (MDS) erasure code. Based on the arrival rates of different files, the placement of file chunks on the servers, and the service time distribution of the storage servers, an optimal functional caching placement and the access probabilities of file requests from different disks are considered. The proposed algorithm gives significant latency improvement in both simulations and a prototyped solution in an open-source cloud storage deployment.
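The combined-code idea can be illustrated with a minimal sketch. This is not the paper's construction: it assumes a Reed-Solomon-style code built from polynomial evaluation over the small prime field GF(257), and the names `encode`, `decode`, `P` and the parameters (n, k, d) = (7, 4, 2) are hypothetical. Storage and cache chunks are evaluations of the same degree-(k-1) polynomial at distinct points, so together they behave as an (n+d, k) MDS code: any k chunks recover the file, and a read that uses the d cached chunks needs only k-d chunks from storage nodes.

```python
P = 257  # small prime field, chosen only for illustration (assumption)

def encode(data, points):
    """Evaluate the polynomial with coefficients `data` at each point (mod P)."""
    return {x: sum(c * pow(x, i, P) for i, c in enumerate(data)) % P
            for x in points}

def decode(chunks, k):
    """Recover the k coefficients from any k chunks via Lagrange interpolation."""
    pts = list(chunks.items())[:k]
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(pts):
        basis, denom = [1], 1  # basis polynomial L_j and its denominator
        for m, (xm, _) in enumerate(pts):
            if m == j:
                continue
            nxt = [0] * (len(basis) + 1)
            for i, b in enumerate(basis):  # multiply basis by (x - xm)
                nxt[i] = (nxt[i] - xm * b) % P
                nxt[i + 1] = (nxt[i + 1] + b) % P
            basis = nxt
            denom = denom * (xj - xm) % P
        scale = yj * pow(denom, -1, P) % P  # modular inverse (Python 3.8+)
        for i, b in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * b) % P
    return coeffs

n, k, d = 7, 4, 2
data = [10, 20, 30, 40]                        # file as k symbols in GF(P)
storage = encode(data, range(1, n + 1))        # n chunks on storage nodes
cache = encode(data, range(n + 1, n + d + 1))  # d extra coded chunks in cache

# Read path: d chunks from the cache plus any k - d chunks from storage.
picked = {**cache, 3: storage[3], 6: storage[6]}
recovered = decode(picked, k)
```

Here `recovered` equals `data` even though only two storage nodes were contacted. Which storage nodes to contact, and with what probability, is the optimization the abstract describes and is not modeled in this sketch.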

Zebroid: using IPTV data to support STB-assisted VoD content delivery

Chen

Jana

Stern

et al. 2010

Multimedia Systems

IPTV, unlike Internet TV, delivers digital TV and multimedia services over IP-based networks with the required level of quality of service (QoS) and quality of experience (QoE). Linear programming channels in IPTV are delivered through multicast, which is highly scalable with the number of subscribers. Video-on-demand (VoD) content, on the other hand, is typically delivered using unicast, which places a heavy load on the VoD servers and all the network components leading to the end-user set-top boxes (STBs) as the demand increases. With the rapid growth of IPTV subscribers and the shift in video viewing habits, the need to efficiently disseminate large volumes of VoD content has prompted IPTV service providers to consider the use of STBs to assist in video content delivery. This paper describes our current research work on Zebroid, a potential VoD solution for fiber-to-the-node (FTTN) networks, which uses IPTV data on a recurring basis to determine how to select, stripe, and preposition popular content in selected STBs during idle hours. A STB requesting VoD content during the peak hours can then receive necessary stripes from participating STBs in the neighborhood. Recent VoD request access patterns, STB availability data, and capacity data on network components are taken into consideration in determining the parameters used in the striping algorithm of Zebroid. We show both by simulation and emulation on a realistic IPTV testbed that the VoD server load can be reduced by more than 70% during peak hours by allocating only 8 GB of storage on each STB. The savings achieved through Zebroid would also allow IPTV service providers to add more linear programming channels without expensive infrastructure upgrades.

Distributed data storage systems with opportunistic repair

Aggarwal

Tian

Vaishampayan

et al. 2014

The reliability of erasure-coded distributed storage systems, as measured by the mean time to data loss (MTTDL), depends on the repair bandwidth of the code. Repair-efficient codes provide reliability values several orders of magnitude better than conventional erasure codes. Current state-of-the-art codes fix the number of helper nodes (nodes participating in repair) a priori. In practice, however, it is desirable to allow the number of helper nodes to be adaptively determined by the network traffic conditions. In this work, we propose an opportunistic repair framework to address this issue. It is shown that there exists a threshold on the storage overhead, below which such an opportunistic approach does not lose any efficiency from the optimal storage-repair-bandwidth tradeoff; i.e., it is possible to construct a code that is simultaneously optimal for different numbers of helper nodes. We further examine the benefits of such opportunistic codes, and derive the MTTDL improvement for two repair models: one with limited total repair bandwidth and the other with limited individual-node repair bandwidth. In both settings, we show orders of magnitude improvement in MTTDL. Finally, the proposed framework is examined in a network setting where a significant improvement in MTTDL is observed.
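To make the link between repair speed and MTTDL concrete, here is a minimal sketch using a textbook birth-death Markov model, which is an assumption and not the model analyzed in the paper. The function name `mttdl` and the rate parameters are hypothetical: state i counts failed nodes, failures arrive at rate (n - i) * fail_rate, repairs complete at a fixed repair_rate (a stand-in for the available repair bandwidth), and data is lost once n - k + 1 nodes have failed.

```python
def mttdl(n, k, fail_rate, repair_rate):
    """Mean time to data loss for an (n, k) code under a birth-death model.

    Assumed model: state i = number of failed nodes; data is lost when
    i reaches n - k + 1. Failures occur at (n - i) * fail_rate and one
    repair at a time completes at repair_rate.
    """
    total, step = 0.0, 0.0
    for i in range(n - k + 1):
        lam = (n - i) * fail_rate
        mu = repair_rate if i > 0 else 0.0  # nothing to repair in state 0
        # Expected time to first move from i failures to i + 1 failures.
        step = 1.0 / lam + (mu / lam) * step
        total += step
    return total

# Two-way mirroring, n = 2 and k = 1: matches (3*lam + mu) / (2*lam**2).
mirror = mttdl(2, 1, fail_rate=0.001, repair_rate=1.0)

# Same (14, 10) code with slow and fast repair.
slow = mttdl(14, 10, fail_rate=0.001, repair_rate=1.0)
fast = mttdl(14, 10, fail_rate=0.001, repair_rate=10.0)
```

In this model `fast` is much larger than `slow`, which is the qualitative effect the abstract attributes to repair-efficient codes: a faster repair leaves less time for additional failures to accumulate.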

Differentiated Latency in Data Center Networks with Erasure Coded Files Through Traffic Engineering

Xiang

Aggarwal

Chen

et al. 2019

IEEE Trans. Cloud Comput.

This paper proposes an algorithm to minimize weighted service latency for different classes of tenants (or service classes) in a data center network where erasure-coded files are stored on distributed disks/racks and access requests are scattered across the network. Because bandwidth is limited at both top-of-the-rack and aggregation switches, and tenants in different service classes need differentiated services, network bandwidth must be apportioned among different intra- and inter-rack data flows for different service classes in line with their traffic statistics. We formulate this problem as weighted queuing and employ a class of probabilistic request scheduling policies to derive a closed-form upper bound on service latency for erasure-coded storage with arbitrary file access patterns and service time distributions. The result enables us to propose a joint weighted latency (over different service classes) optimization over three entangled "control knobs": the bandwidth allocation at top-of-the-rack and aggregation switches for different service classes, dynamic scheduling of file requests, and the placement of encoded file chunks (i.e., data locality). The joint optimization is shown to be a mixed-integer problem. We develop an iterative algorithm that decouples and solves the joint optimization as three sub-problems, which are either convex or solvable via bipartite matching in polynomial time. The proposed algorithm is prototyped in an open-source distributed file system, Tahoe, and evaluated on a cloud testbed with 16 separate physical hosts in an OpenStack cluster using 48-port Cisco Catalyst switches. Experiments validate our theoretical latency analysis and show significant latency reduction for diverse file access patterns. The results provide valuable insights on designing low-latency data center networks with erasure-coded storage.

The Growing Pains of Cloud Storage

Chen

2015

IEEE Internet Comput.

Growth and Challenges of Cloud Storage

Cloud storage is growing at a phenomenal rate and is fueled by multiple forces: Mobile Devices, Social Networks, and Big Data. Content is created anytime and anywhere on billions of smartphones and tablets; high-resolution photos and videos are frequently uploaded to the cloud automatically as soon as they are captured. A Gartner Report predicts that consumer digital storage would grow to 4.11 zettabytes in 2016, with 36% of it in the cloud [4]. All the social interactions and transactions on the Internet are frequently captured for targeted advertising. Besides Social Networks and E-commerce, Big Data analytics are growing in many other sectors as well, including government, health care, media, and education. An IDC forecast suggests that the storage of Big Data is growing at a compound annual growth rate (CAGR) of 53% from 2011 to 2016 [1]. The growth of cloud storage has made it an expensive cost component in many cloud services and in the cloud infrastructure today. While raw storage is cheap, the performance, availability, and data durability requirements of cloud storage frequently dictate sophisticated, multi-tier, geo-distributed storage solutions. Amazon S3 offers 11 nines of data durability (99.999999999%), but some other cloud storage services demand even more stringent requirements due to the sheer number of objects being stored in the cloud these days (1.3 billion Facebook users, uploading 350 million photos each day) and the importance of the data (who can afford to lose a video of your baby's first steps?). Data is frequently replicated or mirrored in multiple data centers to avoid catastrophic data loss, but copying data across data centers is very expensive. The networking cost is frequently proportional to the distance and bandwidth requirements between the data center sites.
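The point about object counts can be checked with back-of-the-envelope arithmetic. The sketch below assumes the 11-nines figure is a per-object annual durability, so each object has a 1e-11 chance of being lost in a year, and applies it to one year of uploads at the rate quoted above. The function name `expected_annual_losses` is hypothetical.

```python
ANNUAL_LOSS_PROB = 1e-11  # 1 - 0.99999999999, assumed per object per year

def expected_annual_losses(num_objects, loss_prob=ANNUAL_LOSS_PROB):
    """Expected number of objects lost in one year at the given loss probability."""
    return num_objects * loss_prob

photos_per_year = 350_000_000 * 365  # upload rate quoted in the text
losses = expected_annual_losses(photos_per_year)
```

Under these assumptions `losses` is about 1.28, so even at 11 nines a service holding a single year of uploads at this rate would expect to lose roughly one photo per year.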
Traditional storage systems use dedicated storage hardware and networking to guarantee that the storage QoS requirements such as throughput, latency, and IOPS (total number of input/output operations per second) are preserved. Unfortunately, these dedicated resources are frequently underutilized. Cloud computing promises efficient resource utilization by allowing multiple tenants to share the underlying networking, computing, and storage infrastructure, but it is difficult to provide end-to-end storage QoS guarantees to individual tenants without mechanisms to avoid interference. Typically, in a cloud
