Enterprise networks are struggling with growing complexity. The number of devices involved has increased, and users’ devices and IoT devices have changed the attack surface, complicating network operations. The typical enterprise is quite diverse, with routers, switches, firewalls, and other devices from many different vendors, each with its own configuration and its own troubleshooting procedures. These networks are constantly changing, with hundreds or thousands of changes per month. And managing those changes is still largely manual. In a rapidly changing, complex environment, human factors often come into play, raising the risk of outages and unnoticed vulnerabilities.

This is the part of the story where automation — software-defined networking and other orchestration technologies — is supposed to save the day.  These technologies are indeed helping to transform enterprises to become more software-driven, with the ability to define a high-level intent and realize it across many physical devices.

But today’s network automation technologies have focused on provisioning resources, and in some ways have actually made the network more opaque and more risky. Software-defined data center solutions typically use an overlay, which means there are now two networks to manage, virtual and physical, often with poor visibility into how they correlate. Some services can move to public clouds, but cloud network components (such as virtual firewalls) need configuration as well and may need to be coordinated with on-premises gear. Even if network provisioning has been automated, human factors can still cause misconfiguration, systems can still integrate (or fail to integrate) in unexpected ways, and occasional bugs in vendor software are inevitable. Most concerning, network automation will dutifully and efficiently amplify any mistakes that do occur.

It’s perhaps no surprise, then, that in a recent survey from Dimensional Research, 59% of networking professionals said that increasing complexity had led to more frequent outages (and only 7% said their network’s complexity had not grown). All kinds of networks, from small enterprises to the largest cloud providers, have experienced outages costing many millions of dollars. Network router issues at multiple major airlines, including Southwest and United Airlines, have recently grounded thousands of flights, leaving customers stranded at airports around the country. And human error in the network can also open the door to data breaches, posing a high risk to any business carrying user data. The same survey showed that ensuring a secure network was the top overall concern of network professionals.

The key to the problem might be that we’ve been missing half the automation picture.

When we build a complex system like an enterprise network, if we hope to have any assurance that it is secure and resilient, we need to automate the whole “control loop”: not only control, but also understanding. In fact, recent progress in automated control of network provisioning, along with the complexity and risk of error magnification that it entails, makes automated visibility and verifiability all the more urgent. The Force must be in balance, if you will.

Automated understanding and analytics will come in a variety of forms. Monitoring events and traffic flow is a start, but monitoring can only find problems after they’ve already begun. Predictive technology could pinpoint security vulnerabilities and resilience problems before they cause outages or breaches, giving network operators assurance that their changes are correct. Interactive querying of network behavior could avoid manual device-specific tasks and help resolve incidents rapidly. And more advanced automation technology may eventually even determine how to automatically repair the network.
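
To make the predictive idea concrete, here is a minimal sketch in Python. It assumes a deliberately simplified model in which the network is just a set of directed links between named nodes, and intent is a list of "A should (or should not) reach B" statements. The node names, the `intents` format, and the helper functions are all hypothetical and chosen for illustration; a real verification tool would model device configurations, routing, and firewall rules in far more detail.

```python
from collections import deque


def reachable(links, src, dst):
    """Return True if dst can be reached from src over the directed links."""
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in links.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def apply_change(links, change):
    """Return a copy of the model with one (action, src, dst) change applied."""
    action, a, b = change
    new_links = {node: set(peers) for node, peers in links.items()}
    if action == "add":
        new_links.setdefault(a, set()).add(b)
    elif action == "remove":
        new_links.get(a, set()).discard(b)
    else:
        raise ValueError(f"unknown action: {action}")
    return new_links


def violations(links, intents):
    """List the (src, dst, should_reach) intents that the model does not satisfy."""
    return [i for i in intents if reachable(links, i[0], i[1]) != i[2]]


# Hypothetical network model: node -> set of nodes it can forward to.
network = {
    "guest_wifi": {"edge_fw"},
    "web": {"core"},
    "edge_fw": {"internet"},
    "core": {"db"},
}

# Hypothetical high-level intent.
intents = [
    ("web", "db", True),          # the web tier must reach the database
    ("guest_wifi", "db", False),  # guest devices must never reach it
]

# A proposed change, checked against the model before it touches any device.
proposed = ("add", "edge_fw", "core")
problems = violations(apply_change(network, proposed), intents)
if problems:
    print("Change rejected; violated intents:", problems)
else:
    print("Change is consistent with stated intent")
```

In this toy model the proposed link opens a path from the guest network to the database, so the check flags the change before it is deployed. The same `reachable` function doubles as a simple interactive query ("can A reach B right now?") that does not require logging in to each device along the path.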

Advanced analytics and verification, then, may be key to the next generation of network automation, taking us a leap ahead towards secure, resilient software-driven networks.