Using Hadoop Distributed and Deduplicated File System (HD2FS) in Astronomy
Published online by Cambridge University Press: 23 December 2021
Abstract
In recent years, the amount of data has skyrocketed. As a consequence, data has become more expensive to store than to generate. The storage needs for astronomical data are following the same trend. Storage systems in astronomy contain redundant data, either as identical files or as identical regions within files. We propose the use of the Hadoop Distributed and Deduplicated File System (HD2FS) in astronomy. HD2FS is a deduplication storage system that was created to improve data storage capacity and efficiency in distributed file systems without compromising Input/Output performance. HD2FS can be built by modifying existing storage system environments such as the Hadoop Distributed File System. By taking advantage of deduplication technology, we can better manage the underlying redundancy of astronomical data and reduce the space needed to store these files, thus allowing for more capacity per volume.
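To illustrate the general idea of deduplication at the file and sub-file level, the following is a minimal in-memory sketch. It is not the HD2FS implementation: the class name `DedupStore`, the fixed-size chunking, and the use of SHA-256 digests as chunk identifiers are all assumptions made for illustration. Each file is split into chunks, each distinct chunk is stored once, and a file is recorded as a list of chunk digests.

```python
import hashlib


class DedupStore:
    """Hypothetical content-addressed store with fixed-size chunking."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes, each distinct chunk stored once
        self.files = {}   # file name -> ordered list of chunk digests

    def put(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Rebuild a file from its chunk digests."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def logical_bytes(self):
        """Total size of all files as the user sees them."""
        return sum(len(self.chunks[d]) for ds in self.files.values() for d in ds)

    def physical_bytes(self):
        """Total size of the distinct chunks actually stored."""
        return sum(len(c) for c in self.chunks.values())


# Two identical files and one that differs in a single sub-file region.
store = DedupStore(chunk_size=4)
store.put("a.fits", b"AAAABBBBCCCC")
store.put("b.fits", b"AAAABBBBCCCC")
store.put("c.fits", b"AAAAXXXXCCCC")
print(store.logical_bytes(), store.physical_bytes())
```

In this toy example the three files occupy 36 logical bytes but only 16 physical bytes, because the identical file and the shared regions are stored once. A real system would also need persistent metadata, reference counting for deletion, and a chunking strategy suited to the data.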
- Type
- Poster Paper
- Information
- Proceedings of the International Astronomical Union, Volume 15, Symposium S367: Education and Heritage in the Era of Big Data in Astronomy, December 2019, pp. 464-466
- Copyright
- © The Author(s), 2021. Published by Cambridge University Press on behalf of International Astronomical Union