Author: Jamie Busic
File-system-based arrays have always held a special place in my heart. I can still remember getting one of the first-generation NetApp Toasters and thinking how easy and elegant the platform was. While at the time it couldn't perform as well as my massive block-based storage arrays, it had them beat hands down in the ease-of-use department. Fast forward only a couple of years, and NetApp was giving Sun and EMC a run for their money by exploiting the initial performance gains from WAFL (Write Anywhere File Layout), the NetApp file system that let the disk drop data wherever the head happened to be, so long as there wasn't data in that sector already. This approach also had the very real benefit of allowing virtually unlimited snapshots that consumed little incremental space. File systems are truly great.
So if WAFL and NetApp were so great in their approach, why are they not the Storage Area Network (SAN) array of choice today for low-latency applications? The answer is in flash and how you work with it. Traditional storage vendors like NetApp, EMC, and HP have approached flash by using it as a tier of storage, where hot blocks/sections of data get moved back and forth between the flash tier and the spinning-disk tier. Other vendors utilize flash by putting a storage cache into the server directly. Both of these approaches come with significant challenges.
The first problem with the storage-tiering approach is that the data is often not in the correct location when you need it. The flash tier has to heat up: blocks are only promoted to flash after they have been accessed enough times. In a highly random workload like the ones virtual machines create, the array does not see enough repeated hits on the same data for the flash tier to heat up effectively.
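To make the heat-up problem concrete, here is a minimal Python sketch (all names and parameters are hypothetical, not any vendor's actual promotion logic) of a tiering engine that promotes a block to flash only after a few repeated hits. Under a uniformly random, VM-style workload, almost no block is touched often enough to be promoted, and the flash hit rate stays near zero.

```python
import random

# Toy tiering model: a block is promoted to flash only after it has
# been read PROMOTE_AFTER times. All parameters are illustrative.

BLOCKS = 1_000_000          # blocks on the backing spinning-disk tier
FLASH_CAPACITY = 50_000     # blocks the flash tier can hold (5%)
PROMOTE_AFTER = 3           # hits required before promotion
IOS = 500_000               # I/Os to simulate

random.seed(42)
hit_counts = {}             # block -> observed hits while still on disk
flash = set()               # blocks currently resident in flash
flash_hits = 0

for _ in range(IOS):
    blk = random.randrange(BLOCKS)        # uniformly random VM-style access
    if blk in flash:
        flash_hits += 1
        continue
    hit_counts[blk] = hit_counts.get(blk, 0) + 1
    if hit_counts[blk] >= PROMOTE_AFTER and len(flash) < FLASH_CAPACITY:
        flash.add(blk)                    # the block finally "heats up"

print(f"flash hit rate: {flash_hits / IOS:.2%}")   # near 0% for random I/O
```

With a million blocks and half a million I/Os, only a tiny fraction of blocks ever accumulate enough hits to be promoted, which is exactly why a random workload leaves the flash tier cold.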
The second issue concerns server-side flash caches like the solutions from VMware (vFlash) or Fusion-io. Server-side cache is good at accelerating sequential read workloads and, depending on the effectiveness of the caching algorithm, random read operations. What server-side flash is not great at is caching writes to the underlying storage, especially in a distributed computing environment like VMware or Hyper-V. Why? Write synchronization and write acknowledgement.

When an application writes to disk, it will often wait for an acknowledgement from the underlying storage subsystem to ensure the data is really there before it moves on. This is one of the ways an application protects itself against data corruption. When an application uses server-side flash, it can get a false sense of security: the flash may return a write acknowledgement before the data has been flushed from the local flash cache to the underlying storage subsystem. On the surface this may not seem like a problem, but introduce distributed computing and the failure can be catastrophic. Suppose the server crashes while data is still in flight to the underlying disk, even though the application believes it has been written. The VM may be moved to a new machine, complete with its RAM contents, and resume on different hardware; the data that was sitting in flash on the original machine is now lost forever.

There are several ways around this. The first and easiest is simply not to do write caching in server-side flash. The second, and exponentially more complicated, method is to perform write synchronization like that done on high-performance compute clusters. Go that route and expect to pay big money for low-latency network infrastructure like InfiniBand, not to mention that systems which can perform write synchronization typically are not designed to run the applications you actually want, like Exchange, SQL, etc. This is why you see vendors like VMware not introducing write caching in their initial server-side flash products.
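The failure window is easiest to see in code. Below is a minimal sketch (the classes and method names are mine, not any vendor's API) contrasting write-through and write-back behavior in a server-side flash cache: in write-back mode the cache acknowledges the write before the array has it, so a host crash before the flush silently loses data the application was told is durable.

```python
# Hypothetical model of a server-side flash cache in front of a shared array.

class BackingArray:
    def __init__(self):
        self.blocks = {}

    def write(self, lba, data):
        self.blocks[lba] = data           # durable once this returns

class ServerFlashCache:
    def __init__(self, array, write_back):
        self.array = array
        self.write_back = write_back
        self.dirty = {}                   # lba -> data not yet on the array

    def write(self, lba, data):
        if self.write_back:
            self.dirty[lba] = data        # ack now, flush "later"
            return "ACK"                  # application believes data is safe
        self.array.write(lba, data)       # write-through: array has it first
        return "ACK"

    def host_crash(self):
        self.dirty.clear()                # in-flight cached writes are gone

array = BackingArray()
cache = ServerFlashCache(array, write_back=True)
cache.write(lba=7, data=b"committed transaction")   # app received its ACK
cache.host_crash()                                  # host dies before the flush
# The VM resumes elsewhere against the shared array: block 7 never arrived.
print(array.blocks.get(7))                          # -> None, silent data loss
```

Flip `write_back` to `False` and the crash loses nothing, because the array acknowledged the data before the application did; that safety is exactly what write-back server-side caching gives up.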
This brings us to what I think is the biggest issue: flash is not a hard disk, it is memory. Memory systems are not managed like hard disks; they are orders of magnitude faster, but they need to be organized differently. SSDs are also not always the best place for writes, though they excel at reads. So when a legacy vendor pushes flash into a tier, it typically doesn't speed everything up, because the vendor is treating that layer like just another, faster, hard disk. This is where the new flash and hybrid-flash storage array vendors really shine. For the sake of brevity I will concentrate on Nimble Storage's approach, as I feel it gives the biggest bang for the buck and is probably the best expression of how to manage flash.
Nimble utilizes an architecture called CASL™, which stands for Cache Accelerated Sequential Layout. CASL does a few things that make performance fast, but I will stick to the highlights: incoming random writes are acknowledged from power-protected NVRAM, then compressed and coalesced into large stripes that are written sequentially to spinning disk, while flash is used purely as a large read cache rather than as a tier sitting in the write path.
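As a rough illustration of the idea, here is a heavily simplified Python sketch of a CASL-style write path (all class and field names are hypothetical, and this is my sketch of the log-structured concept, not Nimble's implementation): writes are acknowledged from an NVRAM buffer, coalesced and compressed into one sequential stripe on disk, and flash serves reads only.

```python
import zlib

STRIPE_SIZE = 4              # writes coalesced per stripe (tiny for the demo)

class CaslLikeArray:
    def __init__(self):
        self.nvram = []          # power-protected buffer: writes are acked from here
        self.stripes = []        # compressed full stripes, written sequentially
        self.index = {}          # lba -> (stripe number, offset, length)
        self.flash_cache = {}    # flash is a read cache only, never in the write path

    def write(self, lba, data):
        self.nvram.append((lba, data))    # ack immediately: data is power-safe
        if len(self.nvram) >= STRIPE_SIZE:
            self._flush_stripe()

    def _flush_stripe(self):
        # Coalesce the buffered random writes, compress them, and lay them
        # down as one large sequential write to spinning disk.
        payload, offset = b"", 0
        for lba, data in self.nvram:
            self.index[lba] = (len(self.stripes), offset, len(data))
            payload += data
            offset += len(data)
        self.stripes.append(zlib.compress(payload))
        self.nvram.clear()

    def read(self, lba):
        if lba in self.flash_cache:
            return self.flash_cache[lba]              # low-latency flash hit
        for buffered_lba, data in self.nvram:         # not yet flushed to disk
            if buffered_lba == lba:
                return data
        stripe_no, offset, length = self.index[lba]
        data = zlib.decompress(self.stripes[stripe_no])[offset:offset + length]
        self.flash_cache[lba] = data                  # warm the read cache
        return data

array = CaslLikeArray()
for lba in (10, 3, 97, 42):                 # random LBAs become one sequential stripe
    array.write(lba, f"block {lba}".encode())
print(array.read(97))                       # disk read, then cached in flash
print(array.read(97))                       # served straight from flash
```

The design point the sketch tries to capture: random writes never touch flash at all, so there is no write-back window to lose, and reads get the full benefit of flash as a cache.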
There are a variety of other features and benefits to this platform's approach to flash management. The bottom line is that you need to know how your vendor implements these technologies to ensure your environment runs the best it can and achieves its goals.
Jamie Busic is a technology entrepreneur who has founded several successful companies, including instantWorkplace, Bluemile Wireless, and Bluemile, and has held roles at major institutions such as Dell, L Brands, and Chase. Jamie focuses on cloud computing, high-performance flash-based storage, campus and datacenter networking, and security.