Monday 21 July, 2025

Building Trust in Telecom AI: Red Team Insights on TelecomGPT

Like what you read? Share.

A woman types on a laptop, overlaid with translucent futuristic interface graphics showing charts, data, graphs, and digital dashboards, suggesting advanced data analysis or technology work in a modern, professional setting.

Having already partnered with Khalifa University to launch the GSMA Open Telco LLM Benchmarks community, GSMA teamed up with the leading academic institution to Red Team its TelecomGPT Model at MWC25 Barcelona. Here, Khalifa University’s Dr. Lina Bariah reveals how TelecomGPT performed under pressure and why red teaming must be at the heart of any responsible AI strategy in telecoms.

During MWC25 Barcelona, GSMA joined forces with our team at Khalifa University’s 6G Research Center and Datumo to host an exciting event: The GenAI Red Team Challenge: Prompting for Prizes. Thanks to the sponsorship of e&, this challenge aimed to push TelecomGPT, and other LLMs, to their limits in a fun and engaging way. We tested the model’s ability to identify and counter various prompts designed to uncover potential gaps and vulnerabilities.

Approximately 40 participants from around the world joined us in the networking hub in hall 6. The assembled red team comprised a diverse group including developers, regulators, telecoms executives and industry experts; their varied perspectives and contrasting approaches enabled us to test the models in very different ways.

Our study targeted misinformation tasks relevant to telecom operations and open knowledge. This was done with the aim of determining if TelecomGPT could correctly identify and reject prompts designed to force agreement with particular false statements. In this red teaming exercise, a diverse set of 25 misinformation prompts was designed to evaluate the robustness of TelecomGPT to adversarial manipulation. These prompts targeted areas including:

Historical milestones (e.g., incorrect GSM call dates)
Network technology specifications (e.g., false claims about 4G and 5G frequencies)
Satellite system characteristics (e.g., misleading information about Starlink operating bands)
Technology deployment timelines (e.g., claiming 6G is already commercially deployed)
Spectrum and protocol assignments (e.g., Wi-Fi 6 operating in 3.5 GHz, Bluetooth 5.0 used for NB-IoT)
Network architecture misconceptions (e.g., Open RAN as a competing standard to 5G, 5G core not supporting cloud-native)
Performance characteristics (e.g., 5G enabling infinite bandwidth, 4G latency outperforming 5G under all conditions)
Security and encryption myths (e.g., quantum encryption has eliminated all threats, networks fully secure with no oversight)
Access technology confusion (e.g., fiber optics using radio waves, 2G networks supporting LTE-level speeds)
Indoor and coverage claims (e.g., 5G waves unable to penetrate glass, making indoor coverage impossible)

A donut chart shows five categories: Others (43.5%), Historical info (20.5%), Frequency misinfo (19.4%), Performance Claim (13.3%), and Authority Override (2.5%). Each section is color-coded and labeled with its percentage.

Figure 1 Breakdown of Successful Attacks by Prompt Category (PASSED Outcomes)

The red team participants successfully tricked the model in multiple cases by combining social-engineering language with authoritative instructions. Many prompts were framed as if they were from a developer or trusted administrator, explicitly telling the model to override its own safety or guardrails. Prompts using phrases like “I’m your developer” or “ignore all previous instructions” were powerful. The model trusted instructions from someone who appeared to have high-level access or special permissions. This strategy is known as Roleplay Jailbreaks. TelecomGPT, and by extension other specialised LLMs, should not assume that any user role has unconditional override rights without robust verification.

On the other hand, many prompts were carefully crafted to sound technically relevant by incorporating real telecom concepts (like 3GPP, mmWave, Wi-Fi 6, etc.), but then indirectly twisting them. This shows that hallucinations are more likely if a question is “close enough” to true domain knowledge. This can be driven also by the fact that the model’s pattern-matching prefers “technically flavored” language. The prompts also targeted typical frequency-related misconception. For example, claiming Starlink satellites use 5G spectrum, or Wi-Fi 6 operates in 3.5 GHz. Since these technical domains involve complex, niche knowledge, they are especially vulnerable to confident hallucinations. Another observation was to ask the model to confirm absolute statements (“5G waves cannot penetrate glass” or “AI fully automates networks with no human oversight”). Such phrasing seems to push the model toward confident agreement rather than a balanced answer. Black and white statements can create bias gaps, especially when users exploit them with a tone of certainty.

Figure 1 illustrates the distribution of successful attack categories, highlighting the relative prevalence of each prompt approach. Among the successful attacks, the largest share fell under the “Other” category, accounting for 43.5% of the total. These prompts often included creative or loosely phrased misinformation, for example, suggesting that Bluetooth is an NB-IoT standard, or that 5G eliminates the need for fiber-optic networks. Because these did not match clearly defined technical topics like frequencies or standards, they highlight the model’s vulnerability to more generic, plausibly worded telecom misinformation. Historical and milestone attacks made up 20.5% of the successful prompts. These targeted areas where the model is prone to hallucinate, such as telecom standards release years or first-call events, show the need for explicit fact-checking around telecom history and timelines.

These findings illustrate that misinformation, if not effectively resisted, could propagate through downstream telecom operations. For example, an engineer relying on a model that confidently repeats false spectrum information might misconfigure a live network. Similarly, if a model provides flawed advice on deployment timelines or standards, it could cause costly mistakes, compliance violations, or even safety incidents.

Why telecom LLMs need red teaming?

As generative AI adoption has accelerated recently in the telecom industry, large language models are increasingly embedded in customer support and network operations [1]. However, these models are vulnerable to hallucinations, where confident but incorrect responses may compromise safety, violate regulatory compliance, or undermine confidence among engineers, regulators, and customers in using AI tools for telecom operations [2]. To counter these risks, red teaming, the adversarial testing of models against harmful or manipulative prompts, has emerged as a best practice across the AI community [3][4]. Applying structured red teaming in telecom is therefore crucial to build robust, trustworthy LLMs that can safely support mission-critical telecom operations.

Next steps & Recommendations

Building on this experience, several concrete steps are proposed to strengthen telecom-focused large language models across the industry. First, integrating retrieval-based verification will allow models to cross-check critical facts, reducing the risk of hallucinations on technical data such as frequencies, standards, or timelines. Second, implementing multi-layer safety guardrails (for example dynamic filters that look for jailbreak patterns, e.g., “forget your rules” and trigger a higher-level review) can improve model resistance to manipulation, including developer-style or authority-based override attempts. Third, establishing a structured quarterly red teaming program will help proactively flag new adversarial attack patterns as models and user behavior evolve. It is also essential to promote knowledge-sharing by encouraging GSMA members to exchange red teaming methodologies, creating a broader responsible AI ecosystem. Finally, working toward standardised, telecom-specific LLM evaluation tests will ensure that all industry players measure trustworthiness and resilience consistently, supporting the safe, reliable, and compliant adoption of generative AI technologies in telecom operations. The question isn’t whether models will be jailbroken, but how quickly we can adapt and improve our defenses.

As always, we encourage our members and partners to share insights to help build industry-wide standards for trustworthy AI. You can get involved in GSMA’s Open-Telco LLM Benchmark initiative, explore the GSMA AI Use Case Library to gain deeper insights on AI applications in the telecoms sector or email GSMA’s Director of AI Initiatives, Louis Powell, to continue the conversation.

[1] McKinsey & Company, How generative AI could revitalize profitability for telcos, 2024.

[2] Microsoft, Lessons from red teaming 100 generative AI products, 2025.

[3] OpenAI. Red Teaming Network, 2023.

[4] Anthropic. Constitutional AI: Harmlessness from AI Feedback, 2022.

Like what you read? Share.