All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications
One-line Summary
This paper studies how persistence properties differ across file systems and how those differences expose crash-consistency vulnerabilities in modern applications. It introduces two tools: BOB, which tests file-system persistence properties, and ALICE, which finds application-level crash vulnerabilities.
Paper Structure Outline
Introduction
Persistence Properties
An Example
Study and Results
Atomicity
Ordering
Summary
The Application-Level Intelligent Crash Explorer (ALICE)
Usage
Crash States and APMs
Logical Operations
Abstract Persistence Models
Constructing Crash States
Finding Application Requirements
Static Vulnerabilities
Implementation
Limitations
Application Vulnerabilities
Workloads and Checkers
Overview
Databases and Key-Value Stores
Version Control Systems
Virtualization and Distributed Systems
Vulnerabilities Found
Common Patterns
Atomicity across System Calls
Atomicity within System Calls
Ordering between System Calls
Durability
Summary
Impact on Current File Systems
Evaluating New File-System Designs
Discussion
Related Work
Conclusion
Background & Motivation
Update-in-place file systems commonly use journaling to provide crash consistency. A high-level overview of journaling:
Intuition
Before updating the file system, write a note describing the update
Make sure the note is safely on disk
Once the note is safe, update the file system
If interrupted, read the note and redo updates
Protocol
Write the data (no pointers to it) - Optional
Write the note: Journal Metadata
Make sure the note is durably written: Journal Commit
Update the in-place metadata: Checkpointing
Replay the note: Recovery
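The protocol steps above can be sketched as a toy simulation. This is a minimal Python sketch under stated assumptions, not how any real file system implements journaling: the disk is an in-memory dictionary, the optional data-write step is omitted, and `ToyDisk`, `journaled_update`, `recover`, and the `crash_after` parameter are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDisk:
    """Hypothetical in-memory 'disk': in-place blocks plus a journal area."""
    blocks: dict = field(default_factory=dict)   # in-place file system state
    journal: list = field(default_factory=list)  # the "note" describing the update
    committed: bool = False                      # has the commit record reached disk?


def journaled_update(disk, updates, crash_after=None):
    """Apply `updates` (block -> value) following the journaling steps.

    `crash_after` names the last step completed before a simulated crash:
    "journal", "commit", "partial_checkpoint", or None for no crash.
    """
    disk.journal = list(updates.items())         # write the note (journal metadata)
    if crash_after == "journal":
        return
    disk.committed = True                        # make the note durable (journal commit)
    if crash_after == "commit":
        return
    for i, (block, value) in enumerate(disk.journal):
        disk.blocks[block] = value               # update in place (checkpointing)
        if crash_after == "partial_checkpoint" and i == 0:
            return
    disk.journal, disk.committed = [], False     # the note is no longer needed


def recover(disk):
    """After a crash, replay a committed note and discard an uncommitted one."""
    if disk.committed:
        for block, value in disk.journal:
            disk.blocks[block] = value
    disk.journal, disk.committed = [], False
```

In this sketch, whichever step the crash interrupts, running `recover` leaves the blocks either entirely in the old state or entirely in the new state, which is the guarantee the note is meant to provide.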
The motivation for this work is that applications may not be aware that consistency guarantees differ across file systems, or even across configurations of the same file system. The authors categorize file system persistence properties and study how they differ among widely deployed file systems (e.g., ext3, ext4, btrfs). They then study application-level crash inconsistency vulnerabilities.
Design and Implementation
BOB (Block Order Breaker)
BOB analyzes syscall persistence properties on a file system. Here's how BOB works:
Runs user-level workloads stressing the property
Records block-level trace of the workload
Reconstructs disk-states possible on a power-loss
All states possible if disk-cache does not re-order
A few states where disk-cache re-orders
Runs FS recovery on each disk-state and verifies the property (atomicity & ordering)
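The state-reconstruction steps above can be illustrated with a small sketch. It is not BOB's actual code: the trace format (a list of `(block, data)` writes) and the function names `prefix_states` and `reordered_states` are assumptions made for illustration, and the file-system recovery step is left out.

```python
def prefix_states(trace, initial=None):
    """All disk states reachable if writes reach the disk in issue order."""
    state = dict(initial or {})
    states = [dict(state)]                 # crash before any write lands
    for block, data in trace:
        state[block] = data
        states.append(dict(state))         # crash right after this write lands
    return states


def reordered_states(trace, initial=None):
    """A few states where one earlier write is lost although later ones landed."""
    states = []
    for skip in range(len(trace) - 1):     # skipping the last write is just a prefix
        state = dict(initial or {})
        for i, (block, data) in enumerate(trace):
            if i != skip:
                state[block] = data
        states.append(state)
    return states


# Hypothetical two-block workload: data is written before the metadata that points to it.
trace = [("data", "D"), ("meta", "M")]


def data_before_meta(state):
    """Ordering property: metadata must never be on disk without its data."""
    return state.get("meta") != "M" or state.get("data") == "D"


in_order_ok = all(data_before_meta(s) for s in prefix_states(trace))
reordered_ok = all(data_before_meta(s) for s in reordered_states(trace))
```

Here the property holds on every in-order state but fails on the reordered state in which the metadata block landed without the data block, which is the kind of difference such a check is meant to surface.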
ALICE (Application-Level Intelligent Crash Explorer)
ALICE analyzes application update protocols and finds crash vulnerabilities (across all file systems). ALICE runs an application and collects its syscall trace, which represents one execution of the application's update protocol. The trace is converted into a sequence of logical operations, from which ALICE generates the intermediate disk states that a crash could leave behind, according to the persistence characteristics of those operations. If any of these disk states violates an application invariant, it is reported as a crash vulnerability.
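The idea of turning logical operations into crash states and checking an invariant can be sketched as follows. This is a simplified illustration, not ALICE's real interface: the three operation kinds, the function names, and the `allow_reordering` flag (a crude stand-in for a weaker persistence model) are all hypothetical.

```python
def apply_op(files, op):
    """Apply one logical operation to a dict mapping path -> content."""
    kind = op[0]
    if kind == "creat":                    # ("creat", path): create an empty file
        files[op[1]] = ""
    elif kind == "write":                  # ("write", path, data): set file content
        files[op[1]] = op[2]
    elif kind == "rename":                 # ("rename", src, dst): replace dst with src
        if op[1] in files:
            files[op[2]] = files.pop(op[1])
    return files


def crash_states(ops, initial, allow_reordering):
    """Yield (description, files) for each crash state considered."""
    for n in range(len(ops) + 1):
        files = dict(initial)
        for op in ops[:n]:
            apply_op(files, op)
        yield f"prefix of {n} ops", files
        if allow_reordering:
            # Also consider states where one earlier operation never persisted.
            for skip in range(n - 1):
                files = dict(initial)
                for i, op in enumerate(ops[:n]):
                    if i != skip:
                        apply_op(files, op)
                yield f"prefix of {n} ops without op {skip}", files


def find_vulnerabilities(ops, initial, invariant, allow_reordering):
    """Return descriptions of crash states that violate the application invariant."""
    return [desc for desc, files in crash_states(ops, initial, allow_reordering)
            if not invariant(files)]


# Hypothetical update protocol: replace "cfg" via a temporary file, with no fsync.
ops = [("creat", "cfg.tmp"), ("write", "cfg.tmp", "new"), ("rename", "cfg.tmp", "cfg")]
initial = {"cfg": "old"}


def cfg_is_old_or_new(files):
    return files.get("cfg") in ("old", "new")


strict = find_vulnerabilities(ops, initial, cfg_is_old_or_new, allow_reordering=False)
weak = find_vulnerabilities(ops, initial, cfg_is_old_or_new, allow_reordering=True)
```

With in-order persistence the protocol looks safe, but once an earlier operation may be lost the sketch reports the state where the rename persisted without the write, leaving `cfg` empty.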
Evaluation
File systems
The conclusion of the file system study is that persistence properties vary widely across file systems and configurations, so applications should not rely on any particular one. It also follows that testing an application on a single file system is not enough.
Applications
Applications from different domains are studied (relational and non-relational databases, version control, distributed services, virtualization). Many of the vulnerabilities found cause problems under some modern file system configurations (e.g., content-atomic appends over no-delayed-allocation file systems).
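As a concrete reference point for what an application update protocol looks like, here is a minimal sketch of a commonly recommended way to replace a file's contents on a POSIX system. It is not taken from the paper or from any of the studied applications; the function name `atomic_replace` and the `.tmp` suffix are assumptions made for illustration.

```python
import os


def atomic_replace(path, data):
    """Replace the contents of `path` with `data` (bytes) using a temporary file."""
    tmp = path + ".tmp"                    # hypothetical naming convention
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())               # persist the new contents before the rename
    os.replace(tmp, path)                  # rename over the old file
    # Persist the directory entry so the rename itself survives a crash (POSIX only).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Dropping either `fsync` call makes the protocol depend on the ordering behavior of the underlying file system, which is the kind of implicit assumption the study examines.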
New Vocabulary
APM: Abstract Persistence Model