Maximizing VPN Remote Access for Business Continuity with

 

 

Remote and mobile workforces have been commonplace for years, but relatively few companies planned for the need to suddenly support a wholesale switch to an employee base working primarily from home offices. However, the COVID-19 outbreak has led governments and organizations worldwide to institute social distancing and self-quarantine measures to ensure the safety and well-being of constituents and employees.

This has resulted in a rapid, wholesale transition to a remote workforce, distance learning, and virtual events for many organizations and significant populations of their employees worldwide.

It is therefore critically important to ensure the performance and availability of remote access infrastructure such as Virtual Private Network (VPN) gateways. Poorly performing network infrastructure can result in lost productivity for remote and home users and in decreased revenue and profits.

Issue

The current situation with VPN usage illustrates why managing such infrastructure is a business continuity decision and why monitoring this part of the network is so compelling. In many cases, VPNs were not designed to support an entire workforce working remotely, but rather were built to support the needs of well-defined employee populations, such as sales and pre-sales engineers. Two weeks ago, before the spread of COVID-19, VPN volumes were completely different.

The NETSCOUT nGeniusONE® Service Dashboard example provided in Figure 1 shows the immediate rise and peak utilization of today’s fully remote workforce compared with normal traffic.

Figure 1: This nGeniusONE Service Dashboard view shows a clear spike in VPN utilization occurring in a three-day span, corresponding to the COVID-19 progression timeline.

Impact

For enterprise, commercial, government, and service providers alike, the influx of remote workers will inevitably lead to points of congestion, as users of certain systems (e.g., Contact Center and Customer Relationship Management solutions) must employ VPN for access. If the VPN resources are overwhelmed, then remote worker productivity will dramatically decline.

Similarly, Cisco WebEx, Microsoft, and Zoom video conferencing application usage rates are on the rise, with even first-time users now turning to these meeting apps to replace collaborative, face-to-face sessions formerly held in the workplace. These video conferencing apps fill a critical void, but they now consume VPN bandwidth as well. Likewise, employees are now likely to join conference calls using VPN-connected laptops, which also consume bandwidth in a manner not envisioned in network designs preceding COVID-19.

Companies will therefore need end-to-end visibility across their network and real-time performance monitoring to analyze the impact of increased competition for these resources.

With the nGeniusONE platform, traffic monitoring metrics can be viewed by a range of keys, such as locations, community of users, servers, users, and applications, providing both holistic and granular data that allows teams to accurately diagnose issues and better allocate bandwidth or build specific services to alleviate the issue.

Triage / Troubleshooting

The nGeniusONE platform provides IT teams with real-time monitoring and trending analysis required for preventive service assurance and effective troubleshooting.
In the Service Dashboard provided in Figure 2, we see a relationship between the rise in usage and sessions aligning to the timeouts and delayed response times. New session timeouts rose by 18 percent – as the volume of VPN usage increases, so does the problem.

Figure 2: nGeniusONE Service Dashboard analytics providing single-pane views into error code distribution as well as degraded and slow response times.
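The 18 percent rise in new session timeouts is a relative change against a baseline period. As a minimal sketch of that arithmetic, the Python snippet below computes a percent change between two counts. The function name `percent_change` and the sample counts are hypothetical illustrations, not values taken from the dashboard or from any nGeniusONE API.

```python
def percent_change(before, after):
    """Return the relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("baseline count must be non-zero")
    return (after - before) * 100 / before


# Hypothetical counts of new session timeouts in two comparable periods.
baseline_timeouts = 1000
current_timeouts = 1180

print(f"New session timeouts changed by {percent_change(baseline_timeouts, current_timeouts):.0f}%")
```

Comparing like-for-like periods (for example, the same weekday and hours) keeps the baseline meaningful when usage patterns shift quickly.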

By combining these performance metrics into a single Service Dashboard customized for VPN monitoring, nGeniusONE provides a range of diagnostic views that allows IT teams (e.g., NetOps, SecOps) to recommend changes and institute best practices to alleviate performance issues.

With nGeniusONE, IT can pinpoint mission-critical applications and identify those having performance issues over VPN by looking into timeouts, slow response times, and retransmissions. IT can then see which application is affected by these issues and look for the root cause of the problem. For example, by using nGeniusONE to analyze session timeout instances, IT can see the boundaries of response time, which may show that front-end servers rather than bandwidth are the problem. Similarly, nGeniusONE contextual analysis can show that retransmissions are linked to out-of-order packets or oversaturation of network links.

Figure 3: nGeniusONE Traffic Monitor view showing Top 10 applications running on the aggregated VPN. This view helps IT visualize how employees are consuming VPN resources, differentiating business application use from nonessential internet streaming services that consume valuable bandwidth.

By contextually drilling down from the VPN Service Dashboard view, IT users can then access a corresponding nGeniusONE Traffic Monitor providing information regarding the Top 10 applications running on VPN.
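The idea behind a Top 10 applications view can be sketched independently of any product: sum the bytes observed per application and rank the totals. The Python example below is only an illustration under assumed inputs. The `(application, bytes)` record shape, the function name `top_applications`, and the sample values are all hypothetical and do not reflect how nGeniusONE stores or exports its data.

```python
from collections import Counter


def top_applications(flows, n=10):
    """Return the n applications with the most bytes, largest first."""
    totals = Counter()
    for app, byte_count in flows:
        totals[app] += byte_count
    return totals.most_common(n)


# Hypothetical flow records: (application name, bytes transferred).
sample_flows = [
    ("CRM", 4_000),
    ("Video conferencing", 9_000),
    ("Streaming video", 7_500),
    ("CRM", 2_500),
    ("Email", 1_200),
]

for app, total in top_applications(sample_flows, n=3):
    print(f"{app}: {total} bytes")
```

A ranking like this makes it easy to spot when a nonessential service sits above business applications in VPN consumption.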

For IT teams looking to monitor VPN user experience by location or communities, the Service Dashboard provides a single-pane, real-time view into overall performance.

Remediation / Restoration

Using nGeniusONE traffic metrics allows organizations to make better-informed decisions on adding VPN capacity or configuring technology such as split-tunnel VPN, which sends general internet traffic directly over users' local home networks rather than through the VPN.
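Conceptually, split tunneling comes down to a per-destination routing decision: traffic bound for corporate address ranges uses the VPN, and everything else goes out over the local connection. The Python sketch below illustrates that decision with the standard `ipaddress` module. The address ranges in `CORPORATE_NETWORKS` and the function name `route_for` are assumptions made for illustration; real VPN clients implement this through routing tables and vendor-specific policy, not application code.

```python
import ipaddress

# Hypothetical corporate address ranges that must stay on the VPN.
CORPORATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.50.0/24"),
]


def route_for(destination):
    """Return "vpn" for corporate destinations, otherwise "local"."""
    address = ipaddress.ip_address(destination)
    if any(address in network for network in CORPORATE_NETWORKS):
        return "vpn"
    return "local"


for dest in ("10.1.2.3", "93.184.216.34"):
    print(dest, "->", route_for(dest))
```

Traffic metrics help decide which ranges belong in such a list: applications that genuinely need corporate access stay on the tunnel, while bandwidth-heavy internet services can be kept off it.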

It also helps companies hone and articulate remote access policies. For example, something as simple as constant communication about what applications require access via VPN and which do not can have a positive effect.

Summary

Unplanned events such as the COVID-19 pandemic show the importance of capacity planning from a business continuity standpoint. While most companies do encourage and accommodate remote work for some of their employees, most did not factor in the capacity constraints that would result from such a sudden, urgent increase in remote work. Organizations that can quickly understand the impact using validated metrics will shorten both the time to diagnosis and the time to resolution.
