Technology Exploration

A Distributed Block Storage System with Strong Data Protection

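Abstract (translated from the Chinese)

In recent years, as networks have become ubiquitous and the number of end-user devices has grown, large volumes of data are generated every day. This rapidly growing data calls for a storage system with expandable capacity, strong data protection, and high storage efficiency. Many distributed storage systems have emerged in response; their distributed architecture makes capacity expansion easy, which addresses the problem of storing data that grows daily. Most of these systems, however, provide only basic data storage. To protect data fully, users must integrate them with a secondary storage system that provides backup, and that integration is usually time-consuming and complicated. We therefore designed a new storage system, DISCO (Distributed Integrated Storage with Comprehensive Data Protection). DISCO provides block-level storage accessed through virtual disks, together with complete data protection against the various conditions that can damage data: N-way replication prevents data from becoming inaccessible when a single storage node fails, virtual disk snapshots recover from data corruption caused by human error, and remote backup addresses total data loss from natural disasters. DISCO also improves storage efficiency by implementing thin provisioning, data de-duplication, and virtual disk cloning with zero copying of data and metadata. Development and testing of all the features above are complete, and the Cinder driver compatible with the OpenStack Mitaka release has passed official certification, so DISCO can also serve as a storage solution for OpenStack.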

Abstract

With the spread of the internet and the growing number of intelligent user devices, huge amounts of data are generated every day. This fast-growing data demands a storage system with highly scalable capacity, strong data protection, and high storage efficiency. Many distributed storage systems have been developed in recent years to provide scalable capacity, but most of them offer only basic data access. To obtain stronger data protection, users must manually integrate these systems with secondary storage systems that provide backup, and this integration is usually time-consuming and complicated. We have therefore developed a new storage system called DISCO (Distributed Integrated Storage with Comprehensive Data Protection). DISCO provides block-level storage through virtual disk access together with data protection that covers most failure cases: N-way replication prevents data loss when a single storage server fails, snapshots protect data from user errors, and remote backup supports disaster recovery. DISCO also achieves high storage efficiency by implementing thin provisioning, data de-duplication, and virtual disk cloning with zero data copy and zero metadata copy. The first stable version of DISCO has been released, and its Cinder driver for OpenStack Mitaka has been officially certified, so DISCO can serve as the storage solution in OpenStack deployments.
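The three storage-efficiency techniques named above can be illustrated with a small in-memory sketch. This is not DISCO's implementation, which the abstract does not describe; it is a toy model under stated assumptions. The names `BlockPool`, `VirtualDisk`, and `_Layer`, the 4096-byte block size, the SHA-256 fingerprints, and the layered copy-on-write block maps are all hypothetical choices made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the source does not state one


class BlockPool:
    """Content-addressed pool: identical blocks are stored once (de-duplication)."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block bytes

    def put(self, data: bytes) -> str:
        fingerprint = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(fingerprint, data)  # keep the first copy only
        return fingerprint

    def get(self, fingerprint: str) -> bytes:
        return self.blocks[fingerprint]


class _Layer:
    """One level of block mapping; unmapped addresses fall through to the parent."""

    def __init__(self, parent=None):
        self.block_map = {}  # logical block address -> fingerprint
        self.parent = parent


class VirtualDisk:
    def __init__(self, pool: BlockPool, num_blocks: int, base: _Layer = None):
        self.pool = pool
        self.num_blocks = num_blocks
        self.top = _Layer(parent=base)  # writable layer, initially empty

    def _check(self, lba: int) -> None:
        if not 0 <= lba < self.num_blocks:
            raise IndexError("block address out of range")

    def write(self, lba: int, data: bytes) -> None:
        self._check(lba)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        # Thin provisioning: space is consumed only when a block is written.
        self.top.block_map[lba] = self.pool.put(data)

    def read(self, lba: int) -> bytes:
        self._check(lba)
        layer = self.top
        while layer is not None:
            fingerprint = layer.block_map.get(lba)
            if fingerprint is not None:
                return self.pool.get(fingerprint)
            layer = layer.parent
        return bytes(BLOCK_SIZE)  # never-written blocks read as zeros

    def clone(self) -> "VirtualDisk":
        # Freeze the current layer and share it; neither block data nor
        # block-map entries are copied. Both disks get a fresh writable layer.
        frozen = self.top
        self.top = _Layer(parent=frozen)
        return VirtualDisk(self.pool, self.num_blocks, base=frozen)
```

In this sketch a disk of a million blocks consumes no pool space until it is written, writing the same content to two addresses stores one block, and `clone()` creates a second disk that shares the frozen layer, so later writes to either disk stay invisible to the other. A real system would also need persistence, reference counting for block reclamation, and replication across nodes, none of which is modeled here.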

Key Words

Block-level Storage System
Distributed Storage System
Storage Virtualization

Related file: A Distributed Block Storage System with Strong Data Protection (full text)