Almost every data center in the world will undertake some form of digital transformation, and there is significant industry discussion about the importance of data as a stepping stone to business disruption through new and different revenue opportunities. In many cases, collecting and analyzing large amounts of data will be the foundation of these opportunities. Analytics and data science will be core drivers of these new frontiers, and they will also be a source of agility, allowing opportunities to be identified, tested, and accepted or rejected quickly.
Can data science help ferret out the best opportunities? The answer is yes, but even more importantly, data science can help enterprises fail fast.
As a quick review, the fail fast philosophy supports incremental development and extensive customer testing to determine whether an opportunity or idea has value. The key to this philosophy is to cut losses when testing reveals that something isn’t working, and to try something else. One of the key changes that enables enterprises to fail fast when building enterprise infrastructures is hiring data scientists rather than only those who build, which represents a seismic shift from how organizations have staffed their teams in the past. Infrastructure teams typically build long-term enterprise infrastructures, often with a rearview-mirror focus, and once that infrastructure is in place, it can be three to five years before it is changed again.
Why focus on data scientists? The size of a data center, and how quickly one can access the data within it, doesn’t matter if users don’t understand what that data means, its potential impact, or value. Unlike those who build infrastructures, data scientists can map the statistical significance of key problems, and visualize and translate data quickly. The resulting data analysis may suggest radical changes to the business that may drive very different infrastructure investments, so to understand data and fail fast, data science and data scientists are critical.
Let’s consider an example based on a large internet retailer that wants to see whether a new idea can achieve a three percent savings or gain (the “three percent rule”). The retailer uses big data analytics to identify areas of cost savings as well as new revenue streams. Once an idea is identified as an opportunity, it is implemented as a three-month pilot program that could require infrastructure acquisitions, cloud usage, prototypes and other investments.
Data scientists use data science programs to track improvements and changes to the idea over the three-month period. If the changes they see allow them to project a three percent savings or growth in revenue upon full implementation, then the retailer will move to development and rollout. If not, the pilot will be shut down, and the retailer will move on to testing the next new opportunity. This fail fast model allows the retailer to quickly find the real winners among their many opportunities, and quickly determine whether they will achieve that three percent savings or gain in order to move forward with implementation.
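The go/no-go decision at the end of a pilot can be sketched in a few lines of Python. This is only an illustration of the three percent rule as described above, not the retailer's actual tooling: the function names, the use of a simple average, and the sample cost figures are all assumptions made for the example.

```python
from statistics import mean

# The "three percent rule" from the example, as a fraction.
THRESHOLD = 0.03


def projected_savings(baseline_costs, pilot_costs):
    """Relative savings of the pilot versus the baseline.

    Both arguments are hypothetical per-period cost figures.
    A positive result means the pilot was cheaper on average.
    """
    baseline = mean(baseline_costs)
    pilot = mean(pilot_costs)
    return (baseline - pilot) / baseline


def pilot_decision(baseline_costs, pilot_costs, threshold=THRESHOLD):
    """Return "roll out" if projected savings meet the threshold, else "shut down"."""
    savings = projected_savings(baseline_costs, pilot_costs)
    return "roll out" if savings >= threshold else "shut down"


if __name__ == "__main__":
    # Made-up monthly costs for a three-month pilot.
    baseline = [100.0, 102.0, 98.0]
    pilot = [96.0, 95.0, 97.0]
    print(pilot_decision(baseline, pilot))
```

A real projection would also have to account for noise, seasonality and the cost of the pilot itself. The point of the sketch is only that the threshold is fixed in advance, so the decision to continue or stop is mechanical once the pilot data is in.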
There are some basic rules and assumptions that data scientists follow as part of their mode of operation:
Data is highly valuable and holds hidden value that can be mined for competitive advantage
In the internet retailer example, the business has millions of customers and trillions of transactions. The retailer believes that the trends and patterns in this data are meaningful and can be used to drive the business, and as such, the company developed a big data infrastructure (as an early implementer of Hadoop clusters) and a wide range of batch analytics. If asked what its core competency is, the retailer would probably answer data science rather than online retailing.
More data is better — all data is the best
In our example, the internet retailer believes in the value of its data and keeps a record of every transaction that has ever occurred on its platform. This accounts for trillions of transactions over the last ten years. The retailer understands that machine learning is more effective when applied to larger datasets. It has a deep understanding of usage trends on its platform and a complete history of all of its customers and what those customers prefer, all of which provides a unique competitive advantage.
Ideas can be tried, tracked and measured quickly, resulting in a low cost of failure
Also in our example, the retailer has learned that it can apply big data analytics to its internal IT decisions, as it is always looking for opportunities to reduce costs or increase capabilities. It applies the “three percent rule” in IT to optimize all types of infrastructure choices, it has a mindset of agile infrastructures, and its investments are all data driven and focused on continuous improvement. A data scientist can shorten learning cycles to days, enabling the business to be improved based on current trends instead of taking years to shift the IT spend and associated resources.
For data science, the investment must be made in the infrastructure first
If a business is based on mining data, then investments in data are critical. This includes investments in the infrastructure and object storage, and more importantly in resources such as data scientists. In our example, the retailer has already deployed one of the largest Hadoop clusters in the world to support their data mining activities, and they know this investment will help them drive revenue growth and new market opportunities that will far outweigh their investment in data.
For companies born in the digital age, the use of big data and associated analytics is a very natural mindset, as they collect a lot of data and use it for business intelligence and decision-making. For other enterprises, the shift is more difficult, as they may not have built-in ways to properly or accurately capture data across their businesses. Or, if they are capable of capturing the data, they often lack the data scientists who can interpret that data and turn its value into new opportunities. Using all of that data analysis for a project that might be shut down in a few months seems crazy, but failing fast means getting to success quicker, and the combination of data science and data scientists is the key to this success.