Author: Hariharan Devarajan (Illinois Institute of Technology)
Advisor: Xian-He Sun (Illinois Institute of Technology)
Abstract: Traditionally, scientific discovery was driven by the compute power of a computer system. Hence, I/O was treated as a sparse task, performed only for occasional application checkpoints. This approach led to a growing gap between compute power and storage capabilities. In the era of data explosion, however, where data analysis is essential for scientific discovery, the slow storage system has led to the research conundrum known as the I/O bottleneck. Additionally, the explosion of data has led to a proliferation of applications as well as storage technologies. This has created a complex matching problem between diverse application requirements and storage technology features. In this proposal, we introduce Jal, a dynamic, reconfigurable, and heterogeneity-aware storage system. Jal uses a layered approach consisting of an application model, a data model, and a storage model. The application model uses a source-code-based profiler that identifies the cause of an application's I/O behavior. The data model translates each application's I/O requirements into an underlying storage configuration that extracts maximum performance for that application. Finally, the storage model builds a heterogeneity-aware storage system that can be dynamically reconfigured to different storage configurations. Our evaluations have shown that these models can accelerate application I/O while transparently and efficiently using the diverse storage systems.
Thesis Canvas: pdf