Persistent key-value (KV) stores are mainly built on the Log-Structured Merge-tree (LSM-tree), which suffers from large read and write amplification, especially as KV stores grow in size. Existing design optimizations for LSM-tree-based KV …
In clouds and data centers, GPU servers that consist of multiple GPUs are widely deployed. Current state-of-the-art GPU scheduling algorithms are "static" in assigning applications to different GPUs. These algorithms usually ignore the dynamics of …
Docker containers are widely deployed to provide lightweight virtualization, and they have many desirable features such as ease of deployment and near bare-metal performance. However, both the performance and cache efficiency of containers are still …
In cloud computing, how to meet increasing demands with limited hardware resources has become a major issue. KSM (Kernel Same-page Merging) is a content-based page sharing mechanism used in Linux that merges identical memory pages, thereby …
Because individual commodity components are unreliable, failures are common in large-scale distributed storage systems. Erasure codes are widely deployed in practical storage systems to provide fault tolerance with low storage overhead. However, the …
Random walk is widely applied to sample large-scale graphs due to its simplicity of implementation and solid theoretical foundations of bias analysis. However, its computational efficiency is heavily limited by the slow convergence rate (a.k.a. long …
The increasing capacity of SSDs requires a large amount of built-in DRAM to hold the mapping information of logical-to-physical address translation. Due to the limited size of DRAM, existing FTL schemes selectively keep some active mapping entries in …
Solid-state drives (SSDs) endure only a limited number of program/erase (P/E) cycles and are susceptible to uncorrectable flash errors, and hence achieving high reliability of SSD storage systems is a critical issue. RAID provides a viable option for enhancing …
In the “network-as-a-service” paradigm, network operators have a strong need to know the metrics of critical paths running services to their users/tenants. However, it is usually prohibitive to directly measure the metrics of all such paths due to …