Pinchao presented the ELVES lab's work “Towards Adaptive Replication for Hot/Cold Blocks in HDFS using MemCached” at the IEEE ICDIS 2019 conference on South Padre Island.
The presentation can be found here.
The paper’s abstract is below:
With the ever-growing advancement of online services, distributed Big Data storage platforms, e.g., Hadoop and Dryad, have gained more attention than ever, and fundamental requirements like fault tolerance and data availability have become a concern for these platforms. Data replication policies in Big Data applications are shifting towards dynamic approaches based on the popularity of files. The formulation of a dynamic replication factor paved the way for solving the issues caused by data contention in hotspots and for ensuring timely data availability. However, empirical observations show that the popularity of files is temporal rather than perpetual in nature: after a certain period, content’s popularity usually ceases, which introduces an I/O bottleneck when replication is updated on disk. To handle such temporally skewed popularity of contents, we propose a dynamic data replication toolset that uses the power of in-memory processing by integrating a MemCached server into Hadoop for improved performance. We compare the proposed algorithm with the traditional infrastructure and a vanilla memory algorithm; the experimental results show that the proposed design performs better in terms of throughput and execution period.