Generative AI for Network Operations Centers

Plan, build, and operate telecommunications networks securely with generative AI.

Workloads

Generative AI

Industries

Telecommunications

Business Goal

Risk Mitigation

Products

NVIDIA AI Enterprise
NVIDIA NIM Microservices
NVIDIA NeMo

Generative AI Accelerates Network Configuration, Deployment, and Operation

Telecom companies spent an estimated $295 billion in capital expenditures (CapEx) and over $1 trillion in operating expenditures (OpEx) in 2024, including spending on manual processes for network planning and maintenance. In a telecom network, configuration and optimization involve managing a vast number of interdependent parameters that directly affect network performance, user experience, and spectrum efficiency for millions of customers and end users. These settings need constant tuning from telecom network engineers based on time of day, user behavior, mobility, interference, and service types. 

Generative AI, powering large telco models (LTMs) and AI agents, enables the next generation of AI in network operations, helping telecom companies optimize their OpEx, use their CapEx efficiently, and unlock new monetization opportunities. NVIDIA developed an agentic AI solution to bring autonomy into this dynamic environment by observing real-time network KPIs, making data-driven decisions, and automatically adjusting parameters. 

Unlike traditional rule-based systems, an AI agent can perceive, reason through complex trade-offs, learn from feedback loops, and adapt to new conditions with human-in-the-loop feedback added as needed. It can also orchestrate changes across multiple layers and multiple vendors, enabling coordinated actions like load balancing, inter-cell interference coordination, or power saving in lightly loaded areas. This level of autonomous control not only improves efficiency and quality of service (QoS), but also reduces operational complexity and time-to-resolution for issues in dense, high-demand environments. 

 

Boosting Network Performance and Efficiency With Accelerated Computing

Global telecommunications companies are exploring how to cost-effectively deliver new AI applications to the edge over 5G and upcoming 6G networks. With NVIDIA accelerated computing and AI, telcos, cloud service providers (CSPs), and enterprises can build high-performance cloud-native networks—both fixed and wireless—with improved energy efficiency and security. 

The NVIDIA AI Foundry for Generative AI

The NVIDIA AI foundry—which includes NVIDIA AI Foundation models, the NVIDIA NeMo™ framework and tools, and NVIDIA DGX™ Cloud—gives enterprises an end-to-end solution for developing custom generative AI. 

Amdocs, a leading software and services provider, plans to build custom large language models (LLMs) for the $1.7 trillion global telecommunications industry using the NVIDIA AI foundry service on Microsoft Azure. In network operations, Amdocs and NVIDIA are exploring ways to generate solutions that address configuration, coverage, and performance issues as they arise, including:  

  • Building a generative AI assistant to answer questions around network planning
  • Providing insights and prioritization for network outages and performance degradations
  • Optimizing operations by using generative AI to monitor, predict, and resolve network issues, manage resources in real time​, monitor network diagnostics, analyze service and user impact, prioritize impact-based recommendations, and execute orchestration activation

 

ServiceNow is integrating generative AI capabilities into their Now Platform and enriching all workflows with Now Assist, their generative AI assistant. ServiceNow leverages NeMo and NVIDIA Triton™ Inference Server (both part of NVIDIA AI Enterprise), NVIDIA AI Foundation models, and DGX systems to build, customize, and deploy generative AI models for telecom customers. These include use cases in network operations:

  • Automated service assurance: Distill and act on volumes of complex technical data generated from network incidents​ and summarized by generative AI.
  • Streamlined service delivery​: Dynamically create order tasks with generative AI to reduce human errors, ensure accurate service delivery, and improve customer satisfaction and loyalty.
  • Optimized network design: Manage diverse network services, local configurations, and rules to improve network design.

 

NeMo provides an end-to-end solution—including enterprise-grade support, security, and stability—across the LLM pipeline, from data processing to training to inference of generative AI models. It allows telcos to quickly train, customize, and deploy LLMs at scale, reducing time to solution while increasing return on investment.


Once generative AI models are built, fine-tuned, and trained, NeMo enables seamless deployment through optimized inference on virtually any data center or cloud. NeMo Retriever, a collection of generative AI microservices, provides world-class information retrieval with the lowest latency, highest throughput, and maximum data privacy, enabling organizations to generate insights in real time. NeMo Retriever enhances generative AI applications with enterprise-grade retrieval-augmented generation (RAG), which can be connected to business data wherever it resides.

NVIDIA DGX Cloud is an AI-training-as-a-service platform, offering a serverless experience for enterprise developers that’s optimized for generative AI. Enterprises can experience performance-optimized, enterprise-grade NVIDIA AI Foundation models directly from a browser and customize them using proprietary data with NeMo on DGX Cloud.

NVIDIA AI Enterprise for Accelerated Data Science and Logistics Optimization

The NVIDIA AI Enterprise software suite enables quicker time to results for AI and machine learning initiatives, while improving cost-effectiveness. Using analytics and machine learning, telecom operators can maximize the number of completed jobs per field technician​, dispatch the right personnel for each job, dynamically optimize routing based on real-time weather conditions​, scale to thousands of locations​, and save billions of dollars in maintenance.

AT&T is transforming their operations and enhancing sustainability by using NVIDIA-powered AI for processing data, optimizing fleet routing, and building digital avatars for employee support and training. AT&T first adopted the NVIDIA RAPIDS™ Accelerator for Apache Spark to capitalize on energy-efficient GPUs across their AI and data science pipelines. Across the data and AI pipelines targeted with Spark RAPIDS, AT&T saves about half of their cloud computing spend and sees faster performance, while reducing their carbon footprint.

AT&T, which operates one of the largest field dispatch teams, is currently testing NVIDIA® cuOpt™ software to handle more complex technician routing and optimization challenges. In early trials, cuOpt delivered solutions in 10 seconds, while the same computation on x86 CPUs took 1,000 seconds. The results yielded a 90 percent reduction in cloud costs and allowed technicians to complete more service calls each day.

Quantiphi, an innovative AI-first digital engineering company, is working with leading telcos to build custom LLMs to support field technicians​. Through LLM-powered virtual assistants acting as copilots, Quantiphi is helping field technicians resolve network-related issues and manage service tickets raised by end customers.

“Ask AT&T was originally built on OpenAI’s ChatGPT functionality. But Ask AT&T is also interoperable with other LLMs, including Meta’s LLaMA 2 and the open-source Falcon transformers. We’re working closely with NVIDIA to build and customize LLMs. Different LLMs are suited for different applications and have different cost structures, and we’re building that flexibility and efficiency in from the ground floor.”

Andy Markus, Chief Data Officer, AT&T

Getting Started With Generative AI for Network Operations

NVIDIA AI Blueprints enable scalable automation by providing a workflow that developers can use to create their own AI agents. With these, developers can build and deploy custom AI agents that can reason, plan, and take action to quickly analyze large amounts of data, summarize, and distill real-time insights. 

The NVIDIA AI Blueprint for telecom network configuration delivers validated building blocks for network operations across multiple domains. This AI Blueprint enables developers, network engineers, telecom companies, and vendors to automate configuration of radio access network (RAN) parameters using an agentic LLM-driven framework. 

Autonomous networks provide an opportunity to better manage OpEx. The AI Blueprint for telco network configuration facilitates this by providing a modular AI architecture and automation workflows needed for consistent, scalable deployments. Powered by generative AI, this AI Blueprint enables network engineers to add adaptive intelligence by predicting issues, optimizing performance, and automating decisions.

BubbleRAN and Telenor Adopt NVIDIA AI Blueprint for Telco Network Configuration

The AI Blueprint for telco network configuration is powered by BubbleRAN’s cloud-native software and multi-agent RAN intelligence platform, which can be used to build autonomous networks at scale. 

Telenor Group, which serves over 200 million customers globally, plans to deploy the AI Blueprint for telco network configuration to address configuration challenges and enhance QoS during network installation.

Implementation Details

This agentic LLM-driven framework utilizes Llama 3.1-70B-Instruct as the foundational AI model, due to its robust performance in natural language understanding, reasoning, and tool calling. 

Customers have the flexibility to deploy this Blueprint via: 

  • NVIDIA's hosted NIM™ microservices API endpoints at build.nvidia.com
  • On-premises NIM microservices to meet privacy and latency requirements
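
For the hosted path, NIM endpoints expose an OpenAI-compatible chat completions API. Below is a minimal sketch of building such a request with only the Python standard library; the base URL and model name follow build.nvidia.com conventions but should be verified against the current catalog, and `build_request` is an illustrative helper, not part of any NVIDIA SDK:

```python
import json
import os
import urllib.request

# Assumed endpoint and model id; confirm both on build.nvidia.com.
NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "meta/llama-3.1-70b-instruct"

def build_request(question: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for a hosted NIM endpoint."""
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a RAN configuration assistant."},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    req = build_request(
        "Which RAN parameters most affect downlink throughput?",
        os.environ.get("NVIDIA_API_KEY", ""),
    )
    # urllib.request.urlopen(req) would send the request; a valid API key is required.
```

The same request shape works against an on-premises NIM container by swapping the base URL for the local service address.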

End users interact through a Streamlit-based user interface (UI) to submit their queries or initiate network operations. These queries are processed by a LangGraph agentic framework, which orchestrates the specialized LLM agents. 

The LLM agents are equipped with specialized tools that allow them to generate and execute SQL queries on both real-time and historical KPI data, calculate weighted average gains of the collected data, apply configuration changes, and handle the BubbleRAN environment. 
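
As a concrete illustration of one such tool, the sketch below computes a weighted average gain over the two most recent samples of each KPI in a SQL store. The table schema, KPI names, and weights are illustrative assumptions, not the Blueprint's actual schema:

```python
import sqlite3

def weighted_average_gain(conn, weights):
    """Compare the latest sample of each KPI against the previous one and
    return the weighted average of the per-KPI relative gains."""
    total, weight_sum = 0.0, 0.0
    for kpi, w in weights.items():
        rows = conn.execute(
            "SELECT value FROM kpi_samples WHERE kpi = ? ORDER BY ts DESC LIMIT 2",
            (kpi,),
        ).fetchall()
        if len(rows) < 2 or rows[1][0] == 0:
            continue  # not enough history, or a zero baseline
        latest, previous = rows[0][0], rows[1][0]
        total += w * (latest - previous) / previous
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kpi_samples (ts INTEGER, kpi TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO kpi_samples VALUES (?, ?, ?)",
        [(1, "dl_throughput", 100.0), (2, "dl_throughput", 110.0),
         (1, "latency_score", 50.0), (2, "latency_score", 45.0)],
    )
    print(weighted_average_gain(conn, {"dl_throughput": 0.7, "latency_score": 0.3}))
```

An agent would expose such a function as a callable tool, letting the LLM request a gain calculation instead of reasoning over raw rows.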

We leverage prompt-tuning to inject contextual knowledge about the BubbleRAN network architecture, including the setup details and the interdependencies between various KPIs and the logic for balancing trade-offs to optimize weighted average gains. 
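
In practice, that contextual knowledge can be assembled into the system prompt each agent receives. The sketch below shows the idea; the architecture notes, parameter names, and trade-off wording are illustrative placeholders, not the Blueprint's actual prompts:

```python
# Illustrative network context; the real Blueprint injects its own
# BubbleRAN setup details and KPI interdependencies.
NETWORK_CONTEXT = """\
Architecture: BubbleRAN 5G standalone, one gNB, three connected UEs.
KPI trade-offs: raising dl_max_mcs improves throughput but can raise
the block error rate; weigh gains by the operator-supplied KPI weights."""

def system_prompt(kpi_weights):
    """Compose a system prompt that grounds the agent in network context."""
    weights = ", ".join(f"{k}={w}" for k, w in sorted(kpi_weights.items()))
    return (
        "You are a RAN configuration agent.\n"
        f"{NETWORK_CONTEXT}\n"
        f"KPI weights: {weights}\n"
        "Recommend parameter changes only with data-backed reasoning."
    )
```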

The LangGraph-powered agentic framework orchestrates three specialized agents, each with distinct responsibilities that work together to close the loop of monitoring, configuration, and validation. Once the user initializes the network with selected parameters, they can choose between a monitoring session with a monitoring agent or directly query the configuration agent to understand parameter impacts and network status.

Below is a breakdown of each agent and their functionality: 

1. Monitoring Agent
This agent continuously tracks the average weighted gain of preselected parameters at user-defined time intervals (default: 10 seconds) against a real-time BubbleRAN KPI database. When it detects performance degradation due to a reduction in the weighted average gain of a specific parameter, it raises the issue to the user for authorization of the next step.

2. Configuration Agent
The configuration agent can be activated by the monitoring agent’s hand-off or direct user queries about parameter optimization or network health. It analyzes historical data, then reasons through the analyzed trends and domain-specific knowledge of parameter interdependencies and trade-offs. Based on its analysis, it suggests improved parameter values to the user and waits for user confirmation.

3. Validation Agent
Once parameter adjustments are confirmed, the validation agent restarts the network with the new parameter configuration. It evaluates the updated parameters over a user-configurable validation period and calculates the resulting average weighted gain. If the real-time average weighted gain deteriorates further, it automatically rolls back to the previous stable configuration. Otherwise, it confirms success and updates the UI with the new settings. 
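
The three-agent loop above can be summarized in a few lines of plain Python. This is a hedged sketch of the control flow only, with LangGraph, the BubbleRAN environment, and the real gain calculation replaced by illustrative callables:

```python
def closed_loop_step(network, measure_gain, propose_config, confirm):
    """One pass of the agentic loop: detect degradation, propose a new
    configuration, validate it, and roll back if the gain worsens."""
    baseline = measure_gain(network)
    if baseline >= 0:                        # monitoring agent: no degradation
        return network, "stable"
    candidate = propose_config(network)      # configuration agent proposes values
    if not confirm(candidate):               # human-in-the-loop approval
        return network, "rejected"
    if measure_gain(candidate) < baseline:   # validation agent: worse than before
        return network, "rolled_back"        # keep the last stable configuration
    return candidate, "applied"

if __name__ == "__main__":
    net = {"p0_nominal": -86, "gain": -0.2}      # degraded baseline (illustrative)
    better = {"p0_nominal": -90, "gain": 0.1}
    cfg, status = closed_loop_step(
        net,
        measure_gain=lambda n: n["gain"],
        propose_config=lambda n: better,
        confirm=lambda c: True,                  # user authorizes the change
    )
    print(status)  # "applied"
```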

In summary, our framework enables continuous, intelligent network optimization through an agentic loop, where specialized LLM agents work together to monitor, analyze, and validate parameter changes in real time. Equipped with tools to analyze real-time and historical KPI data, and with domain-specific knowledge of network parameters and trade-offs, these agents provide data-backed recommendations and explainable reasoning. This closed-loop design ensures that network performance remains autonomous yet user-controllable, empowering users to maintain optimal performance while retaining control on every decision point.

For more technical details, explore the Blueprint card.

NVIDIA NIM

NVIDIA NIM, part of NVIDIA AI Enterprise, is an easy-to-use runtime designed to accelerate the deployment of generative AI across your enterprise. This versatile microservice supports open community models and NVIDIA AI Foundation models from the NVIDIA API catalog, as well as custom AI models. NIM builds on NVIDIA Triton™ Inference Server, a powerful and scalable open source platform for deploying AI models, and is optimized for large language model (LLM) inference on NVIDIA GPUs with NVIDIA TensorRT-LLM. NIM is engineered to facilitate seamless AI inferencing with the highest throughput and lowest latency, while preserving the accuracy of predictions. You can now deploy AI applications anywhere with confidence, whether on-premises or in the cloud.

NVIDIA NeMo Retriever

NeMo Retriever is a collection of CUDA-X microservices enabling semantic search of enterprise data to deliver highly accurate responses using retrieval augmentation. Developers can use these GPU-accelerated microservices for specific tasks including ingesting, encoding, and storing large volumes of data, interacting with existing relational databases, and searching for relevant pieces of information to answer business questions.
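
To make the retrieval step concrete, the sketch below ranks documents against a query using a toy bag-of-words embedding in place of NeMo Retriever's GPU-accelerated encoders; the documents, query, and scoring are illustrative:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: token counts stand in for a learned dense vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    """Rank documents by similarity to the query and return the top k."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

if __name__ == "__main__":
    docs = [
        "Ticket 4211: cell outage reported in sector 7 after firmware update",
        "Quarterly invoice schedule for enterprise billing accounts",
    ]
    print(retrieve("cell outage in sector 7", docs))  # the outage ticket ranks first
```

In a RAG pipeline, the retrieved passages are then appended to the LLM prompt so answers are grounded in enterprise data.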

Generative AI can analyze large volumes of data from equipment sensors to predict potential failures or issues. This helps technicians anticipate problems before they occur, allowing for timely maintenance and minimizing downtime.

Generative AI-driven analytics provide technicians with actionable insights and recommendations based on real-time data. This allows them to make informed decisions regarding repairs, upgrades, and network optimization.

Generative AI can automate repetitive and routine tasks, such as generating work orders, scheduling appointments, and creating documentation. This allows technicians to focus more on complex issues and customer service.

Optimize Network Operations With Generative AI

By leveraging NVIDIA AI, telecommunications companies can reduce network downtime, increase field technician productivity, and deliver better quality of service to customers. Get started by reaching out to our team of experts or exploring additional resources.

Resources

Generative AI in Practice: Examples of Successful Enterprise Deployments

Learn how telcos built mission-critical LLMs, powered by NVIDIA DGX systems and the NeMo framework, to simplify their business, increase customer satisfaction, and achieve the fastest and highest return.

Part 1: A Beginner's Guide to Large Language Models

Get an introduction to LLMs and how enterprises can benefit from them.

Part 2: How LLMs Are Unlocking New Opportunities for Enterprises

Explore how traditional natural language processing tasks are performed by LLMs, including content generation, summarization, translation, classification, and chatbot support.

Architecture diagram of NVIDIA AI Blueprint for telco network configuration.