At this month's Non-Volatile Memory Workshop[1], Stanford researcher Ana Klimovic presented the results of her work with Heiner Litz in an extended abstract titled ReFlex: Remote Flash ≈ Local Flash[2] and a slide deck[3], which offer a novel solution to the SSD conundrum.
As early as 2010, it was obvious that solid state drives (SSDs), using flash memory, would easily equal the IOPS performance of $100,000+ storage arrays at a fraction of the price. Furthermore, the SSDs would not require a storage area network (SAN), as each server could have its own internal PCIe SSD with latency that no SAN array could match.
In those early days, enterprises were thrilled to get SSD performance, even though a 400GB SSD cost several thousand dollars. But as enterprises and cloud vendors adopted low-cost, shared-nothing, scale-out infrastructures - typified by the Google File System and Hadoop - the stranded, unusable performance and the cost of server SSDs have become major issues.
What is ReFlex?
ReFlex is a software-only system
. . . for remote Flash access that provides nearly identical performance to accessing local Flash. ReFlex uses a dataplane kernel to closely integrate networking and storage processing to achieve low latency and high throughput at low resource requirements. Specifically, ReFlex can serve up to 850K IOPS per core over TCP/IP networking, while adding 21μs over direct access to local Flash.
The performance of ReFlex is due to several key factors:
- ReFlex uses the hardware virtualization capabilities of NICs and SSDs to operate directly on hardware I/O queues without copying.
- The dataplane kernel dramatically reduces I/O overhead compared to library-based I/O calls.
- A novel quality-of-service (QoS) scheduler enforces equitable sharing of remote devices by multiple tenants.
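
To make the idea of equitable sharing among tenants concrete, here is a minimal sketch of a generic token-based scheduler. It is not ReFlex's actual algorithm, which is not described here; the `Tenant` class, the `tokens_per_round` budget, and the burst cap of twice the budget are all hypothetical names and values chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Tenant:
    name: str
    tokens_per_round: int  # hypothetical per-round I/O budget for this tenant
    queue: deque = field(default_factory=deque)  # pending I/O requests
    tokens: int = 0  # tokens currently available to spend


def schedule_round(tenants):
    """Run one scheduling round and return the (tenant, request) pairs submitted."""
    submitted = []
    for t in tenants:
        # Refill the budget, capped (assumed cap: 2x) so an idle tenant
        # cannot accumulate an unbounded burst.
        t.tokens = min(t.tokens + t.tokens_per_round, 2 * t.tokens_per_round)
        # Each submitted request costs one token; stop when out of tokens.
        while t.queue and t.tokens > 0:
            submitted.append((t.name, t.queue.popleft()))
            t.tokens -= 1
    return submitted


# A tenant with a deep queue cannot starve a tenant with a short one.
heavy = Tenant("heavy", tokens_per_round=2, queue=deque(range(100)))
light = Tenant("light", tokens_per_round=2, queue=deque(["a", "b", "c"]))
first = schedule_round([heavy, light])
print(first)
```

In this sketch each tenant gets at most its token budget per round, so the heavy tenant's 100 queued requests do not delay the light tenant beyond a single round. A real scheduler would also need to weigh request cost (reads versus writes, for example), which this illustration ignores.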