Becoming Your Own Integrator: Knowing What You’re Getting Into with Software-Defined Storage
Software-defined storage (SDS) is receiving a tremendous amount of vendor and end-user attention these days, promising to deliver vendor independence and dramatically lower storage costs. Conventional wisdom is that SDS is good news for end-users, but bad news for legacy storage vendors used to having tight account control and predictable, recurring sales. But is this really the case? For end-users, SDS sounds too good to be true. Let’s explore some of the hidden costs of SDS and what can be done to minimize such costs.
SDS is an evolving concept where storage operating system software is independent of the underlying storage hardware. The software can be proprietary to a specific vendor or a freely available open source storage solution, such as ZFS, Ceph or OpenStack Cinder and Swift. The underlying hardware is typically an Intel x86-based offering, as such hardware can be acquired from dozens of vendors – keeping costs low. The key software features, for which storage teams are used to paying large sums of money, are usually included in the SDS software or are readily available from a variety of vendors and open source sites. Such features include policy management, thin provisioning, snapshots, replication, deduplication, compression, performance monitoring and backup facilities.
Software-defined storage may be implemented over a traditional Storage Area Network (SAN), as part of a scale-out Network-Attached Storage (NAS) solution, or as the basis of an object-based storage solution. Scalability, in terms of capacity and performance, is claimed to be practically unlimited.
Proprietary storage solutions are bifurcated along vendor lines in today’s enterprise data center because each vendor’s control plane software contains tools that usually don’t play nicely with those of other vendors. Efforts to unify them through Storage Resource Management solutions have met with only modest success, because the proprietary interfaces were never made openly available. SDS benefits from open source collaboration, which promises more cohesive management and lower operating costs.
Some people claim that without the constraints of a physical system, a storage resource can be used more efficiently and its administration can be simplified through automated, policy-based management. Potentially, a single software interface can be used to manage a shared storage pool that runs on commodity hardware.
A major differentiator between “hardware-defined storage” and SDS is that SDS adherents claim big cost savings from using commodity hardware. It’s no secret that supplier gross margins on proprietary storage hardware and software are pretty healthy. If the same data availability, performance and management functionality that the legacy storage vendors have spent decades refining can be obtained while using low-cost commodity hardware with SDS software, the cost savings to IT organizations would be immense. Of course, that’s the challenge.
The total cost of storage ownership isn’t just about commodity vs. proprietary hardware. It includes the value of all the software, plus the integration and testing needed to ensure that the required performance guarantees are met. In the SDS model, you can no longer rely on the legacy vendors to ensure that all components work together and that everything will perform as desired once the SDS solution is deployed into production. These critical tasks now fall on the customer, who must act as his or her own general contractor, or integrator. Storage vendors do a substantial amount of testing before they release new products. They carry out:
- Limits Finding – determining the workload conditions that drive performance below minimally acceptable thresholds, and documenting storage behavior at the point of failure.
- Functional Testing – the validation, under a simulated load, of various features and functions of the storage system (e.g., backup, compression, etc.).
- Error Injection – the evaluation, under simulated load, of specific failure scenarios (e.g., fail-over when a drive or controller fails) to understand how failures affect performance.
- Soak Testing – the observation of the storage system under a load sustained over a significant period, such as a typical business cycle (e.g., 1 day, 1 week, 2 weeks).
- Compatibility Testing – verifying that the storage hardware, software and networking interoperate with other major subsystems (e.g., virtualizers, switches, and database systems).
- Regression Testing – ensuring that new releases don’t break things that worked in prior releases; this can be the single largest QA testing job and requires either massive manual effort or highly automated, well-scripted test beds.
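As an illustration of the first item, limits finding can be sketched as a simple load ramp: step up the offered load, measure latency at each step, and record where the acceptable threshold is first crossed. This is a minimal sketch, not a vendor's actual procedure. The function names, the load steps, the 4 ms threshold and the simulated latency curve are all assumptions made for the example; in practice `measure_latency_ms` would wrap a real load generator.

```python
from typing import Callable, List, Optional, Tuple


def find_limit(
    measure_latency_ms: Callable[[int], float],
    load_steps: List[int],
    max_latency_ms: float,
) -> Tuple[Optional[int], Optional[int]]:
    """Ramp through load_steps and locate the point of failure.

    Returns (last load level within the threshold,
             first load level that exceeded it).
    Either element is None if no such level was observed.
    """
    last_ok: Optional[int] = None
    for load in load_steps:
        latency = measure_latency_ms(load)
        if latency > max_latency_ms:
            return last_ok, load
        last_ok = load
    return last_ok, None


def simulated_latency_ms(iops: int) -> float:
    """Hypothetical stand-in for a real measurement.

    Latency climbs steeply as load approaches an assumed
    saturation point of 50,000 IOPS.
    """
    capacity = 50_000
    if iops >= capacity:
        return float("inf")
    return 1.0 / (1.0 - iops / capacity)


if __name__ == "__main__":
    steps = [10_000, 20_000, 30_000, 40_000, 45_000]
    ok, failed = find_limit(simulated_latency_ms, steps, max_latency_ms=4.0)
    print(f"last acceptable load: {ok} IOPS, first failing load: {failed} IOPS")
```

The same loop, left running at a fixed load for a day or a week instead of ramping, is the skeleton of a soak test.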
Then there are the intangibles, such as resource deployment and allocation. When – not if – a serious problem pops up, storage administrators are used to calling on their local vendor support team to help them triage, find and fix the problem. When the end-user IT staff plays the part of the integrator and support engineer, that role falls to the storage manager. Instead of calling your storage vendor to unsnarl a sticky issue, you’re potentially calling multiple vendors, big and small, who may or may not have any depth of knowledge of your scenario, and posting in a user forum to solicit help from an open source community – with no guarantee of a timely response.
By shifting away from vendor- and hardware-specific storage software, IT organizations are also taking on, sometimes unknowingly, tasks and services traditionally offered by storage vendors. For IT, this means that storage performance and validation tools that test integration and performance under load, before deployment, become the key to successfully adopting SDS and unlocking its benefits.
There are two types of solutions that can be used for storage load testing: open source tools and commercial products. Open source tools, which include Iometer, vdbench, and fio, are freely available. Using these tools properly requires implementing and maintaining multiple servers and many virtual machines to create testing environments with sufficient load-generating capability. Setting up and maintaining meaningful tests and report generation across a number of servers also requires significant learning, setup and ongoing scripting to achieve useful and relevant results. Some commercial products combine load testing, test management, and performance analysis into a single system that requires no scripting and minimal setup. Such products include purpose-built, high-volume load generation hardware that replaces multiple servers. This approach, while having a higher purchase cost, can significantly reduce workload modeling and test administration time and improve the accuracy of simulated workloads.
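To give a sense of the scripting the open source route involves, here is a minimal sketch that renders a job file for fio, which reads INI-style job descriptions. The job name, the target path `/mnt/sds_volume/fio_test.dat` and the 70/30 read/write mix are hypothetical values chosen for illustration; the option names (`ioengine`, `direct`, `rw`, `rwmixread`, `bs`, `iodepth`, `numjobs`, `runtime`, `time_based`, `size`, `filename`, `group_reporting`) are standard fio job parameters.

```python
def render_fio_job(
    name: str,
    filename: str,
    read_pct: int,
    block_size: str = "8k",
    iodepth: int = 32,
    numjobs: int = 4,
    runtime_s: int = 300,
    size: str = "1g",
) -> str:
    """Render a fio job file for a mixed random read/write workload."""
    if not 0 <= read_pct <= 100:
        raise ValueError("read_pct must be between 0 and 100")
    lines = [
        "[global]",
        "ioengine=libaio",      # Linux native asynchronous I/O
        "direct=1",             # bypass the page cache
        "time_based",           # run for the full runtime, looping if needed
        f"runtime={runtime_s}",
        "group_reporting",      # aggregate statistics across numjobs
        "",
        f"[{name}]",
        f"filename={filename}",
        f"size={size}",
        "rw=randrw",            # mixed random reads and writes
        f"rwmixread={read_pct}",
        f"bs={block_size}",
        f"iodepth={iodepth}",
        f"numjobs={numjobs}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical target file on the volume under test.
    job = render_fio_job("oltp_like", "/mnt/sds_volume/fio_test.dat", read_pct=70)
    with open("oltp_like.fio", "w") as fh:
        fh.write(job)
    print(job)
```

The resulting file would be run with `fio oltp_like.fio` on each load-generating host. Coordinating those hosts, collecting their output and turning it into reports is the ongoing scripting effort described above.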
Today’s storage environments are more fluid than ever and are fast becoming a critical component of every major data center deployment. Making the shift to SDS can bring huge benefits. But like every new technology, SDS brings its own set of issues and challenges to your enterprise – technological, financial, and organizational. Fully recognizing these challenges and selecting best practices that prepare your organization for its SDS deployment can be the key to successful adoption.