Adaptive Data Migration Scheme with Facilitator Database

Recent “data explosion” induces the demand for high flexibility of storage extension and data migration. The data amount of LHD plasma diagnostics has grown 4.6 times bigger than that of three years before. Frequent migration or replication between plenty of distributed storage becomes mandatory, and thus increases the human operational costs. To reduce them computationally, a new adaptive migration scheme has been developed on LHD’s multi-tier distributed storage. So-called the HSM (Hierarchical Storage Management) software usually adopts a low-level cache mechanism or simple watermarks for triggering the data stage-in and out between two storage devices. However, the new scheme can deal with a number of distributed storage by the facilitator database that manages the whole data locations with their access histories and retrieval priorities. Not only the inter-tier migration but also the intra-tier replication and moving are even manageable so that it can be a big help in extending or replacing storage equipment. The access history of each data object is also utilized to optimize the volume size of fast and costly RAID, in addition to a normal cache effect for frequently retrieved data. The new scheme has been verified its effectiveness so that LHD multi-tier distributed storage and other next-generation experiments can obtain such the flexible expandability. Recent “data explosion” demands higher potential of distributed storage and mass data migration among them. Acquired data amount for each LHD plasma discharge has grown 4.6 times bigger than that of three years before (Fig. 1). In such circumstances, data migration or replication between a large number of distributed storage devices becomes much frequent, which increases the human operational and maintenance costs

Free download research paper