The New Data Center: Serious Simplicity Masks Crazy Complexity
Times are changing. The days of the backroom engineer sitting alone in a locked room, denying knowledge to the rest of the organization, are over. Today, CIOs and other business and technology leaders seek technology solutions that don’t require Ph.D.-level knowledge to implement and administer. For many, simplicity has become a key factor in decisions regarding new technology procurements. The technology vendor community is listening, and vendors are tripping over themselves to bring customers solutions that are as close to “set it and forget it” as possible.
This trend is manifesting itself through the appearance of all kinds of new products intended to shake up the staid, complex data center paradigm. Nowhere is this more evident than in storage, but the trend has pulled servers into its orbit as well. At Storage Field Day 6, which took place in San Jose on November 4–6, 2014, I had the opportunity to visit with a number of storage vendors, and the theme of simplicity was more than apparent. Further, as you expand just a bit beyond storage, the same push toward simplicity is reshaping servers and the rest of the data center.
The Cost Factor
Time is money. That is a fact. Over the years, IT systems have become increasingly complex and inflexible as IT reacted to new demands placed upon it by the business and by the rapid pace of change in the technology market in general. IT has created massive environments to support rapidly evolving virtualization efforts, but these environments have come at a tremendous cost. In many places, there are swarms of technical people running around to keep these systems up and running at all times. These same people continuously tweak their systems to meet application demands and build out new environments as new business needs arise.
In smaller environments, a few people must each hold a broad range of knowledge to handle the various aspects of the data center. As new needs come up, these same people have to continuously add new skills to their already burgeoning information satchels.
In both large environments and in small ones, all of this effort is a direct cost to the business. These technical resources are being expended on activities that are required to keep the lights on and also deploy new services. However, CIOs and business leaders today are far more interested in directing technical resources to helping solve complex business problems while also reducing the amount of time that it takes to deploy new services.
In all aspects of the data center lifecycle – servers and storage, for the purposes of this article – there are emerging, and even established, solutions that are helping businesses and IT solve these critical challenges.
Procurement and Implementation
Buying servers has become pretty easy, especially with the quick rise of virtualization. Today, virtual hosts are pretty generic commodity systems that just need CPU and RAM to host virtual machines. Sure, processor count, cores, and RAM are configurable, but these systems are pretty cookie-cutter. Further, because they’ve become a commodity, server pricing isn’t too bad.
Storage, on the other hand, hasn’t been easy at all; it’s been a beast. However, there are a whole lot of companies out there that are making it easier to procure and implement storage. Many of these companies have streamlined the procurement process by providing fewer “optional” items – particularly when it comes to software – and making it dead simple to deploy a new storage system.
All-inclusive Software Licensing
At Storage Field Day, multiple vendors – Pure Storage and Tegile, for example – touted their “all inclusive” pricing. You may recall a few years ago that buying storage meant taking a very a la carte approach as you and your vendor ran through myriad optional add-ons that might improve your storage experience. For example, did you need deduplication or thin provisioning? Cha-ching! Add thousands of dollars in licensing costs.
Today, storage startups and hyperconverged infrastructure vendors are including everything except the kitchen sink in their product sales. For example, when you buy a Tegile array, the only optional items are Fibre Channel adapters, additional Ethernet adapters, and various support services. The product sale itself includes all of the hardware and software that anyone would need or want, including the aforementioned deduplication and thin provisioning, but also replication and many, many other services. Pure Storage’s all-inclusive pricing includes their powerful deduplication engine, which is a core part of their platform and their value proposition. What used to be considered enterprise class features are being found in even moderately priced arrays today.
Even better, as new features are added to the product, they are almost always added in software. Existing customers get these software updates as a part of their agreement with the vendor. If there are new features, the customer gets them. It’s that simple. Of course, before that, the vendor has engaged their engineering teams to create these new features, but the customer simply sees these enhancements as a part of their routine software update.
No Professional Services
The “we don’t require professional services” line was uttered by the good folks at Pure Storage during Storage Field Day 6, but even if it wasn’t said by others, the spirit is definitely there, as evidenced by the simplicity being baked into products from many companies. The fact is that many organizations simply don’t want to – or can’t afford to – hire people dedicated to specific resources in the data center. For example, hiring people dedicated to just storage may not be feasible for many. And even for those that already have dedicated people, as CIOs and business leaders look for more ways that technology can transform operations to better enable top-line revenue generation opportunities, they may be interested in retasking existing storage personnel or reducing headcount.
With many of today’s storage technologies, deploying new storage has become really easy: rack, cable, power on, do an initial configuration, and the system is then ready to start serving data. It doesn’t require a consultant to spend five days on site as it did with many legacy systems. Obviously, the deployment experience will vary based on organizational size and complexity, but the point here is that many of today’s storage options make the overall experience far easier than it was in the past.
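To make that concrete, here’s a rough sketch of what “initial configuration” often amounts to on a modern array. The endpoint and field names are purely illustrative – no particular vendor’s API is implied – but the shape of the task is accurate: one small payload, one call.

```python
# Hypothetical sketch of a modern array's day-one setup. The endpoint
# and field names are invented; no particular vendor's API is implied.
import requests

ARRAY = "https://array-mgmt.example.local"  # placeholder address

def initial_setup(name, mgmt_ip, dns_servers, ntp_servers, admin_password):
    """One small payload, one call -- roughly the whole day-one job."""
    payload = {
        "system_name": name,
        "management_ip": mgmt_ip,
        "dns_servers": dns_servers,
        "ntp_servers": ntp_servers,
        "admin_password": admin_password,
    }
    r = requests.post(f"{ARRAY}/api/v1/setup", json=payload, timeout=30)
    r.raise_for_status()
    return r.json()

# Example invocation (values are placeholders):
# initial_setup("array-01", "10.0.0.50", ["10.0.0.2"], ["pool.ntp.org"], "s3cret")
```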
Here’s an Example
Let’s take a look at a specific example. StorMagic is addressing a really critical problem: its customers are massively distributed companies – think retail stores and the like – that are trying to improve the manageability of their applications, but they can’t just move everything to a central site and be done with it. Reliance on the WAN affects the user experience and, as a result, many apps need to remain at the edge. Again, think about retail stores that need local systems for their point-of-sale terminals; those systems need to keep functioning even if the WAN is down.
In StorMagic’s experience, the typical remote site:
- Has 2 TB of data capacity on average;
- Runs 7 or 8 key applications;
- Has no dedicated computer room; systems may be in a closet, under a table, or mounted on a wall;
- Has no local IT staff; everything needs to be set and forget.
These customers want simple. They don’t want bells and whistles; they want plug in and go. They want cost-effective high availability at remote sites – that’s the problem StorMagic is trying to solve – and they need it in a way that makes economic sense. As a result, StorMagic has developed a product that provides a highly available environment using just two servers (competitors require three) running vSphere/ESXi, with easy-to-consume pricing:
- $2,000 for two nodes with 2 TB
- $8,000 for two nodes with 16 TB
- $10,000 for two nodes with unlimited capacity
The product features bare metal recovery. If a system fails, the customer can just ship an ESX server to the location and StorMagic does the rest. A person at the site simply plugs the server into the network, and the StorMagic software handles the hard work of rebuilding the highly available environment. The user has to do nothing else.
Ongoing Administration
Procurement and deployment are generally one-time events in the lifecycle of the storage environment, but making those processes easier does have benefit. It’s in simplifying ongoing administration, however, where organizations can begin to realize major efficiencies with modern storage systems. Administration is generally a series of repeatable tasks. In many storage systems, it’s still necessary – and, depending on needs, might even be desirable – to create RAID groups for data protection, LUNs for presenting volumes to servers (or shares for the NFS folks), and tiers for managing performance, and to do all of the other heavy lifting in the storage environment. If doing this work is necessary or desirable, then simpler storage may not be for you, of course.
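For context, here’s a sketch of that traditional heavy-lifting workflow. The client class below is hypothetical – it stands in for whatever CLI or API a legacy array exposes – but it shows why each deliberate, human-driven step was an opportunity for error.

```python
# A sketch of the traditional provisioning workflow. LegacyArrayClient
# is hypothetical -- it stands in for whatever CLI or API a legacy
# array exposes; the point is how many deliberate manual steps exist.

class LegacyArrayClient:
    def create_raid_group(self, name, disks, level):
        print(f"RAID group {name}: {len(disks)} disks, RAID-{level}")

    def create_lun(self, raid_group, name, size_gb):
        print(f"LUN {name}: {size_gb} GB carved from {raid_group}")

    def map_lun(self, lun, host, protocol):
        print(f"Mapped {lun} to {host} over {protocol}")

array = LegacyArrayClient()

# Each step below used to be a human decision -- and a chance for error.
array.create_raid_group("rg0", disks=[f"disk{i}" for i in range(8)], level=6)
array.create_lun("rg0", "esx_datastore_01", size_gb=2048)
array.map_lun("esx_datastore_01", host="esx-host-01", protocol="FC")
```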
However, for many, storage is a means to an end. It’s critical, but people are tired of having to do these things. Today, CIOs and business leaders want technology systems that don’t require the kind of heavy lifting that older systems may require. They want systems that:
- Meet capacity needs;
- Meet performance needs;
- Meet data protection/recovery needs;
- Are simple to manage.
That’s it. For many, this is the entirety of the storage buying checklist. Obviously, under the hood, there may be a lot of features that enable these checkboxes, and those aspects are important to understand because they can have a dramatic impact on cost. For example, capacity may seem like an easy metric to figure out, but it becomes a bit muddied when you consider the potential data reduction benefits offered by the many vendors that provide inline deduplication capabilities.
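Here’s a quick worked example of that muddiness. The array price and the 4:1 reduction ratio are assumptions for illustration; real-world reduction varies widely by workload.

```python
# Worked example: raw vs. effective capacity. The array price and the
# 4:1 reduction ratio are assumptions; real reduction varies by workload.
price = 30_000         # assumed array price ($)
raw_tb = 10            # usable capacity as sold
reduction_ratio = 4.0  # assumed combined dedupe + compression

effective_tb = raw_tb * reduction_ratio           # 40 TB
print(f"Effective capacity: {effective_tb:.0f} TB")
print(f"$/TB raw:       ${price / raw_tb:,.0f}")        # $3,000
print(f"$/TB effective: ${price / effective_tb:,.0f}")  # $750
```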
Data availability and protection is another area of concern. For years, storage administrators have carefully crafted RAID sets to make sure this need was met, but in some cases, RAID structures have become insufficient. And again, people are tired of having to deal with it. So, they want a storage system that takes the worry out of this stuff and makes the decisions for them.
The same goes for storage performance, which has become a hot topic in recent years. Companies have had to take great pains to carefully manage workloads and storage systems to make sure the business doesn’t suffer from storage performance-induced issues. Today’s arrays can often take the worry out of this as well. By leveraging either all-flash systems or hybrid systems that use flash storage as an accelerator, customers can overcome performance issues without needing to hire a Ph.D.-level storage engineer. Behind the scenes, though, powerful algorithms and fast hardware make all of this work, exposing to administrators just enough capability to meet business needs while masking the really hard stuff happening under the hood. Making things easy for the user is generally really hard work for the vendor.
Simplifying the Support Experience
Most of today’s enterprise arrays – particularly those shipping from relatively young companies such as Nimble Storage and Tegile – have comprehensive phone-home capabilities built in. What I mean by that is this: on a regular basis, these arrays send to the corporate mother ship – the storage array vendor – a plethora of statistics about storage environment usage, performance, and health characteristics. In many cases, such systems are sending hundreds or thousands of data points to centralized vendor databases on an ongoing basis. Some have described this as “crowd-sourced” support; in reality, it’s not quite that, but it does share the spirit of improvement through cooperation that makes many crowd-sourced services work.
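To give a feel for it, here’s a sketch of what a phone-home payload might look like. The field names are invented for illustration; each vendor has its own schema and typically sends far more data points than shown here.

```python
# Invented example of a phone-home payload. Field names are
# illustrative; each vendor has its own schema and typically sends
# far more data points than shown here.
import json
import time

sample = {
    "array_id": "A1B2C3",          # hypothetical identifier
    "timestamp": time.time(),
    "capacity": {"used_tb": 7.2, "free_tb": 2.8, "reduction_ratio": 3.6},
    "performance": {"read_iops": 18500, "write_iops": 9200,
                    "avg_latency_ms": 0.9, "cache_hit_pct": 94.1},
    "health": {"failed_drives": 0, "controller_state": "optimal",
               "firmware": "4.2.1"},
}

# In practice this would be shipped over HTTPS to the vendor's
# collection service on a fixed interval, often every few minutes.
print(json.dumps(sample, indent=2))
```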
At first glance, these kinds of systems might appear to be ways to just streamline the support experience, but their use has far more benefits and there are complex systems that back each of these services. By the way, HP does this kind of stuff for their servers as well, and it’s awesome.
Customer Support and Proactive Issue Correction
Yes – these systems are absolutely used to make the customer support experience far better than the traditional support methodology, which usually goes something like this eight-step process:
- Customer (or worse, an end user) notices a problem
- Customer determines that the storage array has a problem or is running slowly
- Customer calls vendor and provides them with case details
- Customer hangs up and waits for a call back
- Vendor calls back and requests log files to be sent and tells customer they will call back after analysis
- Customer sends log files to vendor
- Vendor analyzes log files
- Vendor calls back to help customer correct whatever issue was being experienced
What if, instead, the support process looked like this for the majority of issues?
- Vendor fixes problem or vendor notifies customer that storage capacity is running low and needs to be upgraded.
With automated support systems that report stats back to the vendor on a regular basis, this one-step process is actually achievable. Of course, there will remain instances in which the first support paradigm still needs to be followed, but those can become the exception rather than the norm. When a vendor sees a problem before the customer does and can fix it easily, the customer gains the ability to focus their efforts very differently… and in a good way.
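As a sketch, a rule engine on the vendor’s side might look something like this. The thresholds and field names are illustrative assumptions, not any vendor’s actual logic.

```python
# A sketch of the kind of rule a vendor's back end might run against
# incoming telemetry. Thresholds and field names are illustrative
# assumptions, not any vendor's actual logic.

def evaluate(stats):
    alerts = []
    cap = stats["capacity"]
    free_pct = cap["free_tb"] / (cap["free_tb"] + cap["used_tb"])
    if free_pct < 0.10:
        alerts.append("Capacity below 10% free: propose an upgrade")
    if stats["health"]["failed_drives"] > 0:
        alerts.append("Drive failure detected: dispatch a replacement")
    if stats["performance"]["avg_latency_ms"] > 5.0:
        alerts.append("Latency anomaly: open a support case automatically")
    return alerts

# Example telemetry snapshot (invented values):
stats = {
    "capacity": {"free_tb": 0.8, "used_tb": 9.2},
    "health": {"failed_drives": 1},
    "performance": {"avg_latency_ms": 1.1},
}
for alert in evaluate(stats):
    print(alert)
```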
This support improvement goes beyond just the vendor, though. It can be a critical enabler when it comes to improving the relationship between a VAR and a customer, as long as that VAR is able to access the statistics.
Issue Avoidance
Support is great, but it’s a reactive service. Support happens when some event takes place that requires the support call. What if the issue that might result in a support call could actually be avoided? When an array vendor has the ability to match your array’s performance, health, and application characteristics against those of thousands of other customers, there are analytics opportunities that make issue avoidance a reality. As customers grow, they can be proactively notified about things like unacceptable cache miss ratios, which would imply that flash or cache storage needs to be increased. These systems can also begin to understand individual application performance characteristics in order to improve the experience for other customers that might be likely to encounter similar issues.
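A minimal sketch of that fleet-comparison idea, with made-up numbers: compare one array’s cache hit rate against the installed base and flag outliers before users feel the pain.

```python
# A sketch of fleet-wide issue avoidance using invented numbers:
# flag an array whose cache hit rate falls into the bottom decile
# of the installed base before users notice slowdowns.
import statistics

fleet_cache_hit_pct = [88.0, 93.5, 94.1, 94.8, 95.9, 96.2, 96.6, 97.1]
this_array = 86.0  # hypothetical customer array

# statistics.quantiles(n=10) returns nine cut points; [0] is the
# 10th percentile of the fleet.
p10 = statistics.quantiles(fleet_cache_hit_pct, n=10)[0]

if this_array <= p10:
    print("Cache hit rate in the bottom decile of the fleet; "
          "recommend adding flash/cache proactively.")
else:
    print("Cache behavior is within fleet norms.")
```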
Product Development
Just as important, with billions of actionable data points at their disposal, vendors can use their massive customer-provided data statistics as a part of their ongoing product development plans. For example, for a hybrid vendor, are customer environments beginning to demonstrate that the time is right to introduce an all-flash system? While that’s a very general example, it does demonstrate how data can be used for product development.
Or, for example, is the vendor seeing particular challenges for organizations running 2,500 Exchange mailboxes? They can take product development steps to address such challenges and further improve the ability to address the market they serve.
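As a sketch of how that mining might work – with invented data and field names – a vendor could group arrays by dominant workload and look for cohorts with poor latency:

```python
# Invented fleet telemetry: each record is one customer array's
# dominant workload and observed latency.
from collections import defaultdict
from statistics import mean

telemetry = [
    {"workload": "exchange", "mailboxes": 2500, "avg_latency_ms": 6.2},
    {"workload": "exchange", "mailboxes": 900,  "avg_latency_ms": 1.1},
    {"workload": "vdi",      "mailboxes": 0,    "avg_latency_ms": 1.4},
    {"workload": "exchange", "mailboxes": 2600, "avg_latency_ms": 5.8},
]

# Group arrays into cohorts (workload, "large" deployment) and flag
# cohorts whose mean latency suggests a product-level problem.
by_cohort = defaultdict(list)
for t in telemetry:
    by_cohort[(t["workload"], t["mailboxes"] > 2000)].append(t["avg_latency_ms"])

for cohort, latencies in sorted(by_cohort.items()):
    flag = "  <-- candidate for product work" if mean(latencies) > 5.0 else ""
    print(cohort, f"mean latency {mean(latencies):.1f} ms{flag}")
```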
In other words, using the data these vendors glean from their customers, they can further improve the product, improve the overall experience, and increase the value proposition. Behind the scenes, though, making these services available takes armies of people, massive cloud-based databases, and awesome analytics engines. What the user sees, though, is a pretty interface with immediately actionable intelligence.
Adding More
At some point, you’ll probably have to add more capacity to your storage environment, whether your needs are described in terms of GB (capacity) or IOPS (performance). Scaling has not always been easy; there were major decisions to make about what was needed – space or speed – and different ways to achieve the expansion goals. As you may guess, though, this has become far easier with the newer storage systems available on the market.
Fast forward to now. Nimble Storage now has a system that can scale both up and out depending on your needs and adding this capacity is a matter of plugging it in. Tegile scales up with any of their products as well, and it’s just a matter of cabling the system and adding it to the storage pool. And, in the hyperconvergence space – which includes storage – companies like Gridstore have taken scaling to whole new levels by providing a choice in how to scale while still making it easy to do so.
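To illustrate the difference between those two dimensions, here’s a toy model with illustrative numbers: scale-up adds capacity behind existing controllers, while scale-out adds nodes that bring both capacity and performance.

```python
# A toy model of the two scaling dimensions. Numbers are illustrative.
cluster = {"nodes": 1, "shelves": 1, "tb_per_shelf": 24, "iops_per_node": 50_000}

def capacity_tb(c):
    return c["nodes"] * c["shelves"] * c["tb_per_shelf"]

def performance_iops(c):
    return c["nodes"] * c["iops_per_node"]

cluster["shelves"] += 1  # scale UP: more capacity, same performance ceiling
print(capacity_tb(cluster), performance_iops(cluster))  # 48 TB, 50000 IOPS

cluster["nodes"] += 1    # scale OUT: capacity AND performance grow together
print(capacity_tb(cluster), performance_iops(cluster))  # 96 TB, 100000 IOPS
```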
Summary
I didn’t start out to write a book on this topic, but it ended up that way. Simplicity was a huge theme at Storage Field Day 6, and that theme starts with the buying process and continues throughout the product lifecycle. The need for simplicity is one that is – or needs to be – pervasive throughout the entire IT paradigm, but starting with complex items such as servers and storage is a good start. Under the hood, all kinds of complex software does a lot of hard work. To users, though, these products present simple interfaces and tools that help them meet business goals.