The use of multiple technologies and an exponential rise in network elements creates new challenges in gaining end-to-end visibility of IMS and LTE network performance and availability.
By Matt Goldberg, Systems Engineer, SevOne, Inc.
LTE (Long Term Evolution) has been confirmed as the fastest developing mobile system technology ever, with at least 64 LTE networks expected to be in commercial service by the end of 2012. All existing 3G technologies can harmonize to LTE, which provides a natural migration path for both CDMA and GSM/HSPA wireless carriers.
For the foreseeable future, however, multi-technology hybrid networks will co-exist, meaning wireless carriers will have to interwork a multitude of technologies, protocols and network resources. Most significantly, the number of network elements and interfaces will increase by an order of magnitude, while many core components will be software-based rather than hardware-based. Legacy network performance management solutions have not evolved in step, which affects both provisioning times for new devices and the ability of wireless carriers to measure and track performance of the infrastructure end-to-end.
Management and monitoring limitations
LTE standardization is complete, and is overseen by 3GPP (Third Generation Partnership Project), with LTE Release 8 providing the basis for initial deployments. Carriers, network equipment vendors and handset developers have also committed to 3GPP specifications for IMS (IP Multimedia Subsystem) under the GSM Association’s VoLTE (Voice over LTE) initiative. Originally conceived as part of the core network evolution from circuit-switching to packet-switching, IMS provides an access-independent platform for delivery of real-time interactive IP-based voice, video and multimedia services, and employs SIP (Session Initiation Protocol) as a key enabler.
With LTE/IMS, wireless carriers face a unique set of challenges from a performance management perspective:
• Scaling their infrastructure and implementing new services while maintaining required performance and availability levels
• Ensuring quick ‘time-to-value’ with a low-touch deployment and management model
Traditional distributed performance management solutions rely on a centralized PMDB (performance management database) surrounded by collection, aggregation, and reporting components. This means the PMDB tends to become a major bottleneck as a network grows and evolves, slowing the generation of reports and dashboards to a crawl. This is compounded in cellular network topologies, which tend to be highly distributed given the regional variations in subscriber numbers, traffic volumes, and cell-site loading.
These limitations become even more pronounced because LTE/IMS architecture collapses what were previously two separate networks (i.e. voice and data) into a single all-IP system. Although this simplifies the network, it also introduces many new network elements and increases the number of interfaces by an order of magnitude. There are, however, five core components [see also Figure 1]:
• LTE RAN – based on eNodeBs, which are the LTE equivalent of BTSs (base transceiver stations)
• Aggregation and backhaul – whereby routers are deployed at the base of cell-site towers, and at key points of aggregation across the RAN, in order to support the large increases in data traffic volumes
• Evolved Packet Core (EPC) – provides LTE’s packet domain, and is a flat, all-IP system comprising three key nodes:
o MME (Mobility Management Entity) – the control-node for the LTE access network
o SGW (Serving Gateway) – routes and forwards user data packets
o PGW (PDN Gateway) – provides connectivity from the user equipment to external packet data networks
• Transport/backbone – typically based on IP/MPLS, and a mixture of fibre-optic cable and microwave transmission systems
• IMS core – effectively an authentication system that includes nodes such as the PCRF (Policy and Charging Rules Function), HSS (Home Subscriber Server)/3GPP AAA, and CSCF (Call Session Control Function)
Today, vendor-specific Element Management Systems (EMSs) allow wireless carriers to monitor network elements within a specific section of the infrastructure (i.e. core/distribution, access, or edge). There are also solutions enabling synthetic IP SLA tests across these sections of the network. However, these systems are not able to address performance and service delivery end-to-end across the multiple technologies introduced by IMS/LTE migration.
Software-based solutions, and open source ones in particular, are severely limited in how easily they can be deployed, maintained, and scaled. When performance monitoring agents are deployed at a specific device, the device effectively becomes a server, which must be re-configured, patched or upgraded each time there is a change to the operating system or network infrastructure. Similarly, network management protocols such as SNMP are designed to extract performance metrics from networking equipment, yet many LTE/IMS components do not support SNMP. Some equipment vendors and wireless carriers are also moving away from SNMP due to concerns over deployment complexity and security.
Figure 1: The core components of an IMS/LTE infrastructure
Source: SevOne, Inc.
Unifying the elements
An IMS/LTE architecture calls for a next-generation network performance management solution able to provide a real-time, end-to-end view of the infrastructure, independent of equipment type, so that wireless carriers can see not only how their network is performing, but how individual services on the network are performing. Given the limited scalability of legacy performance management tools, an appliance-based solution is recommended. A full-featured integrated appliance eliminates the need for additional software, hardware, or external databases, and can be used in standalone or peered configurations in order to quickly provide reports on any indicator, device, or application to be monitored.
Integrating with the vendor-specific EMS for each IMS/LTE component is a necessary and efficient method for the bulk collection of performance statistics. This approach avoids the overhead of directly polling or collecting performance data from network elements, and it also reduces the overhead of moves and changes as the number of network elements grows.
Standard network performance protocols can also be employed to collect, report and graph performance statistics directly from routers, switches, servers, and firewalls, as well as every application, operating system or process that is generating network traffic. In addition, NetFlow is a push technology that can be employed to gain crucial insight into bandwidth and performance. With NetFlow, the router sends a record for each flow that crosses it. Each record contains header and volume information about the flow, enabling users to look beyond simple network utilization statistics, and identify who is talking to whom, and who is using the most network resources.
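As a rough illustration of what "who is talking to whom" means in practice, the sketch below aggregates a handful of flow records by source address and by source/destination pair. The FlowRecord type and its field names are hypothetical simplifications, not the actual NetFlow export format, and the sample data is invented using documentation address ranges.

```python
from collections import Counter
from typing import NamedTuple


class FlowRecord(NamedTuple):
    """Simplified stand-in for an exported flow record (hypothetical fields)."""
    src_addr: str
    dst_addr: str
    dst_port: int
    protocol: int   # IP protocol number, e.g. 6 = TCP, 17 = UDP
    octets: int     # bytes carried by the flow


def top_talkers(records, n=3):
    """Return the n source addresses that sent the most bytes."""
    totals = Counter()
    for rec in records:
        totals[rec.src_addr] += rec.octets
    return totals.most_common(n)


def conversations(records):
    """Sum bytes per (source, destination) pair: who is talking to whom."""
    totals = Counter()
    for rec in records:
        totals[(rec.src_addr, rec.dst_addr)] += rec.octets
    return totals


# Invented sample data (addresses taken from the RFC 5737 documentation ranges).
sample = [
    FlowRecord("192.0.2.10", "198.51.100.5", 5060, 17, 1_200),
    FlowRecord("192.0.2.10", "198.51.100.5", 5060, 17, 800),
    FlowRecord("192.0.2.20", "203.0.113.7", 443, 6, 50_000),
    FlowRecord("192.0.2.30", "198.51.100.5", 53, 17, 300),
]

print(top_talkers(sample, n=2))
print(conversations(sample).most_common(1))
```

A real collector would first decode the exported packets into records like these; the aggregation step itself stays much the same.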
Next-generation performance management
Most tier-one wireless carriers have anywhere between 20,000 and 50,000 cell sites, each one paired with a backhaul router, aggregated to larger backhaul routers numbering in the hundreds, and connected to a core/distribution network comprising a further 25,000 devices. Scalability and the length of time it takes to run reports are therefore key issues for performance management and monitoring. Without a central view, it becomes difficult to identify the source of a spike or drop. Indeed, with LTE/IMS networks, a minor configuration change can have a major impact in terms of traffic flows, dropped connections, and failed authentications. Even something as innocuous as a minor alteration to a load balancer can result in a complete DNS outage.
High-level visibility into how the different components/devices and applications are performing within the LTE/IMS infrastructure must therefore be available, in near real time, throughout the enterprise. A non-hierarchical, peer-to-peer configuration of performance management appliances [see Figure 2] creates a PMDB providing a central reporting view of data across the multiple silos of core/distribution, access and edge networks, and at regional and/or national level. Collected data can be available for alerts and reports in seconds, rather than many minutes, and stored locally in the event of an outage to provide ‘back-fill’ performance metrics in monthly reports.
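To put those figures in perspective, the following back-of-the-envelope sketch totals the monitored devices and the number of samples collected per day. Only the cell-site range and the 25,000 core devices come from the text above. The aggregation router count (500), the number of metrics per device (50), and the five-minute polling interval are illustrative assumptions.

```python
def monitored_devices(cell_sites, aggregation_routers, core_devices):
    """One backhaul router per cell site, plus aggregation and core devices."""
    return cell_sites + aggregation_routers + core_devices


def samples_per_day(devices, metrics_per_device, poll_interval_s):
    """Number of metric samples collected across all devices in 24 hours."""
    polls_per_day = 86_400 // poll_interval_s
    return devices * metrics_per_device * polls_per_day


CORE_DEVICES = 25_000        # from the text
AGGREGATION_ROUTERS = 500    # assumption: the text only says "hundreds"
METRICS_PER_DEVICE = 50      # assumption, for illustration only
POLL_INTERVAL_S = 300        # assumption: five-minute polling

for cell_sites in (20_000, 50_000):
    devices = monitored_devices(cell_sites, AGGREGATION_ROUTERS, CORE_DEVICES)
    samples = samples_per_day(devices, METRICS_PER_DEVICE, POLL_INTERVAL_S)
    print(f"{cell_sites:,} cell sites -> {devices:,} devices, {samples:,} samples/day")
```

Under these assumptions the totals run from roughly 45,000 to 75,000 devices and from several hundred million to over a billion samples per day, which is why report run time becomes a central concern.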
Figure 2: Scalable Peer-to-Peer Solution
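The general idea behind such a peer-to-peer arrangement can be sketched as follows: each peer keeps the samples it collects locally, and a report sends the same query to every peer in parallel and merges the partial answers. This is an illustration of the concept only, not any vendor's implementation. The Peer class, its query method, and the device names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


class Peer:
    """Hypothetical appliance that stores the samples it collects locally."""

    def __init__(self, name, samples):
        self.name = name
        # samples: {device_name: [utilization readings in percent]}
        self._samples = samples

    def query(self, threshold):
        """Return (device, average) for local devices averaging above threshold."""
        return [
            (device, mean(values))
            for device, values in self._samples.items()
            if mean(values) > threshold
        ]


def run_report(peers, threshold):
    """Send the same query to every peer in parallel and merge the answers."""
    with ThreadPoolExecutor(max_workers=len(peers)) as pool:
        partials = list(pool.map(lambda peer: peer.query(threshold), peers))
    merged = [row for partial in partials for row in partial]
    # Busiest devices first, regardless of which peer holds their data.
    return sorted(merged, key=lambda row: row[1], reverse=True)


# Invented sample data: two regional peers, each holding its own devices.
peers = [
    Peer("east", {"bh-router-001": [72.0, 88.0], "bh-router-002": [20.0, 30.0]}),
    Peer("west", {"agg-router-01": [91.0, 95.0]}),
]

print(run_report(peers, threshold=75.0))
```

Because each peer only scans its own data, the work per report is spread across the peers instead of landing on one central database.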
Since there could be upwards of 500 users needing to run reports at any one time, a web-based console and intuitive GUI should be available to deliver this ‘single pane’ view. Such a console also allows reports to be customized and created quickly and easily. Security can be assured via login and authentication mechanisms such as RADIUS, TACACS+ or LDAP (for Active Directory), and managed via user-defined policies meeting the needs of the carrier’s individual business units.
As wireless networks grow, an appliance-based and distributed peer-to-peer platform is able to scale in linear fashion. It means carriers have the ability to provision a new device, such as a router or switch, without adding an entirely new traffic route, SNMP community string, or access list. Access to a device from the network performance management system can be achieved within an hour or two of the device coming online, compared with a couple of weeks in the case of legacy platforms.
Most importantly, such a model provides the real-time visibility required to proactively monitor and troubleshoot any issue before it adversely affects service delivery.
About SevOne
SevOne, Inc. is the new leader in network performance management. Working with leading mobile carriers to implement the collection of bulk performance statistics from the element management systems of their IMS and LTE equipment providers, and supporting more than 15 performance metric collection methodologies, SevOne monitors all of the components in the data path to build an end-to-end view of the IMS/LTE infrastructure, independent of equipment type. This allows carriers to see not only how their network is performing, but how individual services on the network are performing, for proactive monitoring and troubleshooting of any issue before it becomes service impacting.
SevOne provides a fast, easy, and scalable network appliance that combines granular performance data from NetFlow, SNMP, VoIP, IP SLA, NBAR and WMI into a single view. This enables IT staff to lower their total cost of ownership, assure service levels and increase productivity. For more information go to www.sevone.com