Britain Is Running 21st-Century Voice AI Over a 1972 Phone Line

Cloudax explores why the UK’s PSTN switch-off may not solve underlying voice quality challenges, and what that could mean for the future performance of voice AI applications.

The PSTN switch-off is not the end of Britain’s narrowband problem. It is barely the beginning. The UK is rebuilding its telephony backbone around a 1972 codec – and every voice AI system in the country will pay the cost.

Three Numbers From a Network Designed for Rotary Phones

Britain is building world-class voice AI and routing all of it through a frequency band engineered for a copper exchange in the year The Godfather was released.

The cost is paid by every business and every citizen who will speak to a machine on the phone in the next decade.

1972

YEAR THE UK’S DEFAULT VOICE CODEC WAS SPECIFIED

G.711 — written for copper telephone exchanges and Bakelite handsets — is still the lowest common denominator on UK telephony in 2026, even after the All-IP migration.

3.4 kHz

AUDIO CUTOFF NARROWBAND TELEPHONY ENFORCES

Most consonant energy that modern speech models are trained to hear sits between 1.5 and 8 kHz. A PSTN-grade pipe deletes most of it before any AI ever sees the signal.

10–20%

INCREASE IN WORD ERROR RATE ON NARROWBAND

Industry benchmarking shows G.711 telephony audio drives a 10–20% rise in word error rate compared with wideband audio of the same speech.

A Network Designed for Bakelite Handsets

The codec is called G.711. It samples the human voice 8,000 times per second, and the telephone channel it was built for discards everything above roughly 3.4 kHz.
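
The arithmetic behind that sampling rate can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a bit-exact G.711 encoder: it uses the continuous A-law companding curve (the variant used in Europe) with the standard parameter A = 87.6 and one byte per sample, neither of which is stated above, and the helper names are hypothetical.

```python
import math

SAMPLE_RATE_HZ = 8_000   # from the article: 8,000 samples per second
BITS_PER_SAMPLE = 8      # assumption: G.711 stores one byte per sample
A = 87.6                 # assumption: standard A-law compression parameter

nyquist_hz = SAMPLE_RATE_HZ / 2                 # highest frequency the sampling can represent
bitrate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # raw bit rate of the stream


def a_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] to [-1, 1], giving quiet samples finer resolution."""
    ax = abs(x)
    if ax < 1 / A:
        y = A * ax / (1 + math.log(A))
    else:
        y = (1 + math.log(A * ax)) / (1 + math.log(A))
    return math.copysign(y, x)


def a_law_expand(y: float) -> float:
    """Inverse of a_law_compress."""
    ay = abs(y)
    if ay < 1 / (1 + math.log(A)):
        x = ay * (1 + math.log(A)) / A
    else:
        x = math.exp(ay * (1 + math.log(A)) - 1) / A
    return math.copysign(x, y)


def roundtrip(x: float, levels: int = 127) -> float:
    """Compress, quantise to a signed 8-bit grid, then expand again."""
    code = round(a_law_compress(x) * levels)
    return a_law_expand(code / levels)


def uniform_roundtrip(x: float, levels: int = 127) -> float:
    """Plain 8-bit quantisation without companding, for comparison."""
    return round(x * levels) / levels


print(nyquist_hz, bitrate_bps)
print(abs(roundtrip(0.01) - 0.01), abs(uniform_roundtrip(0.01) - 0.01))
```

The companding step is why eight bits were enough for intelligible speech in 1972: quiet samples are reproduced more accurately than plain 8-bit quantisation would manage. It does nothing for bandwidth, which the 8 kHz sampling rate caps at 4 kHz.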

That standard was written when the average British home had a single rotary phone wired into a wooden box on the wall.

It is still, in 2026, the de facto codec of the Public Switched Telephone Network and the lowest common denominator of UK telephony.

By Openreach’s own data, it is the format your call will most likely be transcoded into the moment it touches a carrier interconnect — even if your softphone, your headset and your handset are all capable of something dramatically better.

Britain is now doing something genuinely strange. We are building world-class voice artificial intelligence — large language models that reason in real time, transcribers trained on hundreds of thousands of hours of high-fidelity speech, neural text-to-speech that breathes — and then routing all of it through a frequency band engineered for the bandwidth of a copper telephone exchange.

The PSTN Switch-Off Solves the Wrong Problem

The headline reform is simple. The copper plant is genuinely failing — Ofcom logged a 45% rise in significant PSTN resilience incidents in 2024. The trouble is what the migration is actually delivering.

After multiple delays, the industry-led deadline is now the end of January 2027. Openreach has confirmed the date is “locked in.”

Stop-sell on legacy lines has been in force across most exchanges since 2023, and price escalators of 20%, then 40%, are pushing remaining customers off the network through 2026.

The replacement for PSTN is not a wholesale upgrade in voice quality. It is the same call, in the same narrowband 8 kHz format, transported over IP instead of copper.

Most All-IP voice services in the UK still default to G.711 — the same 1972 codec — wrapped in a SIP packet. The wires change. The audio does not.

Wideband alternatives exist. G.722 doubles the sampling rate to 16 kHz. EVS carries audio bandwidth up to 20 kHz, covering the full range of human hearing.

Opus — used in WhatsApp, Teams, Google Meet — adapts dynamically up to 48 kHz. None of them is the default on UK SIP trunks.

The instant a call traverses a PSTN gateway, an interconnect or a number-translation service, it is transcoded back down to G.711 and the upgrade evaporates.
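
What "default" means in practice is the order of codecs in the SDP offer a SIP endpoint sends. The Python sketch below is a simplified illustration, not a SIP stack: the `Codec` class, the helper names and the port number are hypothetical, and the answerer policy shown (take the first offered codec that is also supported) is one common choice rather than a requirement. The payload types for PCMU, PCMA and G722 are the static assignments from RFC 3551; 111 for Opus is an arbitrary dynamic value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Codec:
    payload_type: int
    rtpmap: str
    wideband: bool


# Static payload types per RFC 3551; G722 is advertised with an 8000 Hz RTP
# clock for historical reasons even though its audio is sampled at 16 kHz.
OPUS = Codec(111, "opus/48000/2", True)   # dynamic payload type, value chosen arbitrarily
G722 = Codec(9, "G722/8000", True)
PCMA = Codec(8, "PCMA/8000", False)       # G.711 A-law
PCMU = Codec(0, "PCMU/8000", False)       # G.711 mu-law


def audio_offer(codecs, port=49170):
    """Build the audio media lines of an SDP offer, most preferred codec first."""
    payload_types = " ".join(str(c.payload_type) for c in codecs)
    lines = [f"m=audio {port} RTP/AVP {payload_types}"]
    lines += [f"a=rtpmap:{c.payload_type} {c.rtpmap}" for c in codecs]
    return lines


def pick_codec(offered, supported):
    """Answerer policy: first offered codec that the answerer also supports."""
    for codec in offered:
        if codec in supported:
            return codec
    return None


wideband_first = [OPUS, G722, PCMA]
print("\n".join(audio_offer(wideband_first)))
print(pick_codec(wideband_first, {G722, PCMA}).rtpmap)  # wideband-capable peer
print(pick_codec(wideband_first, {PCMA}).rtpmap)        # legacy gateway
```

With a wideband-first offer, a capable peer settles on G.722 and a legacy gateway still falls back to G.711. Reversing the list is all it takes to make narrowband the outcome of every call.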

The Codecs the Rest of the World Moved To

None of these are theoretical. Every one is standardised, deployed in production, and supported by every modern endpoint. The UK simply does not negotiate them as the default on the network that replaces PSTN.

CODEC	YEAR	SAMPLE RATE	AUDIO BANDWIDTH	IN PRODUCTION
G.711	1972	8 kHz	≤ 3.4 kHz	UK PSTN default. Still the de facto codec on All-IP voice.
G.722	1988	16 kHz	≤ 7 kHz	Wideband. Available on every modern SIP endpoint — rarely the default.
AMR-WB	2001	16 kHz	≤ 7 kHz	Mobile HD Voice. Standard on 3G+ networks since the early 2010s.
EVS	2014	up to 48 kHz	≤ 20 kHz	Superwideband for VoLTE / VoNR. Native on 5G voice.
Opus	2012	up to 48 kHz	Full hearing range	Default on WhatsApp, Microsoft Teams, Google Meet, and most web voice.

What 8 kHz Actually Costs You

The technical case against narrowband telephony is not aesthetic. It is mathematical — and it is brutal on the machine listeners now sitting on the other end of every line.

Human speech contains useful acoustic information well above the 3.4 kHz cutoff that G.711 enforces. Fundamental frequencies — the pitch that lets you recognise your mother on the phone — sit between 85 and 250 Hz. Vowels carry the energy from 350 Hz to 2 kHz. So far, so PSTN-compatible.

The consonants — the sibilants, fricatives and unvoiced stops — live between 1.5 kHz and 4 kHz, with significant energy extending to 8 kHz and beyond.

These are the sounds that distinguish “fifteen” from “sixteen,” “Smith” from “Smyth,” “F” from “S” from “Th.” They contain about 5% of the power of the voice and contribute roughly 60% of its intelligibility.

Modern speech recognition models — Whisper, Deepgram Nova, Google’s Chirp, the in-house models running inside every major voice AI platform — are trained almost entirely on 16 kHz audio. Whisper requires 16 kHz input by design.

Feed it 8 kHz telephony and you are upsampling lossy data and presenting the model with a frequency distribution it has never seen during training.
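
The point that upsampling cannot restore what the channel removed can be checked numerically. The following Python sketch makes several simplifying assumptions: an idealised brick-wall filter implemented with a naive DFT (real resamplers are less sharp), a 3.4 kHz telephony cutoff, and two pure tones standing in for a vowel (1 kHz) and a sibilant (6 kHz). All function names are hypothetical.

```python
import cmath
import math

FS_WIDE = 16_000  # Hz, the rate the speech models above expect
N = 160           # 10 ms frame, so DFT bins fall every 100 Hz


def dft(x):
    """Naive discrete Fourier transform (fine for a 160-sample frame)."""
    n = len(x)
    return [sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
            for k in range(n)]


def low_pass(x, cutoff_hz, fs):
    """Idealised brick-wall low-pass: zero every bin above the cutoff."""
    n = len(x)
    kept = [c if min(k, n - k) * fs / n <= cutoff_hz else 0
            for k, c in enumerate(dft(x))]
    return [sum(c * cmath.exp(2j * math.pi * k * i / n) for k, c in enumerate(kept)).real / n
            for i in range(n)]


def tone_amplitude(x, freq_hz, fs):
    """Amplitude of a single frequency component in the frame."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq_hz * i / fs) for i, v in enumerate(x))
    return 2 * abs(acc) / len(x)


# Wideband "speech": a 1 kHz vowel-like tone plus a quieter 6 kHz sibilant-like tone.
wide = [math.sin(2 * math.pi * 1000 * i / FS_WIDE)
        + 0.5 * math.sin(2 * math.pi * 6000 * i / FS_WIDE) for i in range(N)]

# Telephony leg: band-limit to 3.4 kHz, then keep every second sample (8 kHz).
narrow = low_pass(wide, 3_400, FS_WIDE)[::2]

# Upsample back to 16 kHz: zero-stuff, restore gain, remove the spectral image.
stuffed = [0.0] * N
stuffed[::2] = [2 * v for v in narrow]
restored = low_pass(stuffed, 4_000, FS_WIDE)

for label, signal in (("wideband", wide), ("after 8 kHz round trip", restored)):
    print(label, tone_amplitude(signal, 1000, FS_WIDE), tone_amplitude(signal, 6000, FS_WIDE))
```

The 1 kHz component survives the round trip at essentially full amplitude, while the 6 kHz component is gone. The upsampled signal has the sample rate the model expects but none of the high-band content it was trained on.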

Engineers building voice agents on Twilio routinely report 50% accuracy on otherwise-clear speech the moment the call hits an 8 kHz codec, when the same speech transcribes flawlessly through a browser microphone. This is not a problem a smarter prompt or a bigger model will solve. It is a physics problem.

Easy to Miss in a Slide Deck. Ruinous in Production

Narrowband audio degrades voice AI on at least six axes — individually annoying, jointly catastrophic. The economics of these failures is not theoretical; it is paid every minute, on every misrouted call, in every regulated industry trying to deploy a voice agent on the existing network.

Hallucinations – Lossy codecs introduce metallic, bubbly noise that AI models can interpret as either background or actual speech. A speaker says nothing; the model invents a word. Every transcription error then feeds the LLM, which has to reason about a corrupted input.
Turn-taking collapse – Voice agents detect when a caller has stopped talking using high-frequency acoustic cues. Strip those out and VAD becomes unreliable — the agent interrupts, waits too long, or talks over the caller.
Speaker ID degrades – Wideband audio is roughly twice as effective as narrowband for speaker recognition systems, which matters for fraud detection, voice biometrics and consent verification.
Transcoding loss – An AI agent generates Opus at 48 kHz. The audio is transcoded to G.711 at the SIP gateway, transcoded again to AMR-WB or EVS at the mobile handover, and arrives as a multiply-degraded approximation.
Multilingual & accented discrimination – Narrowband filtering compounds existing model weaknesses on non-native English, regional dialects and tonal languages. The high-frequency cues that disambiguate similar phonemes are exactly what gets discarded first.
The economic tail – Inaccurate transcription drives misrouted IVR calls, 3–5 minute handle-time inflation, ~30% callback rates in voicemail-to-text, and ~60% false-positive rates in sentiment pipelines. When a contact centre quotes a 12% AI deflection rate where their competitor quotes 35%, the answer is very often not the model — it is the codec.
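
The transcoding item above reduces to a simple rule: a call can carry no more bandwidth than its narrowest leg. The Python sketch below models only that band limit, using the bandwidth figures from the codec table earlier in this article. The 20 kHz value for Opus is an assumption standing in for "full hearing range", the function names are hypothetical, and the extra quantisation loss from each lossy re-encode is not modelled.

```python
# Upper audio bandwidth per codec in kHz, taken from the codec table above.
# Opus is listed there as "full hearing range"; 20 kHz is an assumed stand-in.
AUDIO_BANDWIDTH_KHZ = {
    "G.711": 3.4,
    "G.722": 7.0,
    "AMR-WB": 7.0,
    "EVS": 20.0,
    "Opus": 20.0,
}


def surviving_bandwidth_khz(chain):
    """Bandwidth left after audio passes through every codec in the chain."""
    return min(AUDIO_BANDWIDTH_KHZ[codec] for codec in chain)


def bottleneck(chain):
    """The codec in the chain that sets the ceiling."""
    return min(chain, key=AUDIO_BANDWIDTH_KHZ.__getitem__)


agent_to_mobile = ["Opus", "G.711", "AMR-WB"]
print(surviving_bandwidth_khz(agent_to_mobile), bottleneck(agent_to_mobile))
```

In the chain described in the list, the wideband codecs at either end make no difference: the G.711 hop in the middle fixes the result at 3.4 kHz.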

Why the UK Position Is Uniquely Bad

Britain is not alone in running narrowband PSTN. We are, however, in a worse position than most of our peers, for reasons that are entirely a function of policy choices.

The UK migration has been delayed twice, from December 2025 to January 2027, primarily because of unresolved questions about vulnerable users on telecare devices.

Those concerns are legitimate. The 1.8 million Britons relying on personal alarms over copper lines deserve to be migrated safely.

But the delay has been used by the industry as an excuse to sequence the cheapest possible upgrade rather than the right one.

The new digital service is a like-for-like replacement of voice quality, not an upgrade — and the regulatory framework treats audio quality as essentially a consumer-preference matter.

UK vs. Peer Countries

Country	Position	Status
Estonia	PSTN switched off in 2017.	AHEAD OF UK
Netherlands	Switched off shortly after Estonia.	AHEAD OF UK
France	AMR-WB on mobile networks since 2010.	AHEAD OF UK
Germany	Measurably ahead of the UK on All-IP migration.	AHEAD OF UK
United States	Wideband mobile largely complete; EVS rolling out on 5G.	AHEAD OF UK
United Kingdom	Switch-off delayed to Jan 2027. Replacement is like-for-like on audio.	BEHIND

The Strategic Error

A telephony backbone built around a 1972 codec spec, in the same year the UK names AI a national priority.

Voice AI is the single most consequential application of language models for the British economy. Contact centres employ around 1.3 million people in the UK.

Public services handle hundreds of millions of inbound calls per year. The opportunity is enormous — and it is bounded above by the fidelity of the audio.

Three Things Need to Happen, and the Window Is Closing

The 2027 switch-off is being managed as a network engineering project. It should be managed as a national infrastructure upgrade with explicit voice-quality minimums — because what gets specified now will be the baseline for the next thirty years.

Mandate wideband as the All-IP default – Ofcom should require every IP-based voice service marketed as a PSTN replacement to negotiate G.722 or better as its preferred codec, with G.711 permitted only as a legacy interconnect fallback. Every modern SIP endpoint already supports it. The barrier is commercial inertia, not engineering difficulty.
Set a roadmap to superwideband – EVS is already standardised, deployed in mobile networks worldwide, and represents the natural endpoint for voice quality on IP. The UK should set a target — 2030 is achievable — for EVS to be the default codec on all new business voice services.
Treat telephony audio as critical infrastructure – When an NHS triage line, a 101 service or a benefits hotline is bottlenecked by a 1972 codec preventing modern voice AI from understanding callers, that is the same kind of infrastructure failure as a fire-alarm line going down — just slower-moving and more pervasive.

A Ceiling on the Whole UK Voice AI Sector

The audio that arrives at any speech model is set at the network edge, not at the application layer. Every voice AI product running on UK telephony shares the same ceiling — and lifting it is a national infrastructure question, not a vendor one.

Whichever voice AI sits on the other end of the line — Cloudax, a hyperscaler’s voice agent, another specialist platform, or a contact-centre incumbent’s AI module — every one of them inherits the same audio. The carrier decides what survives the codec; the application layer inherits whatever is left.

That makes this a uniquely tractable problem. A wideband-default UK network would lift the floor of every voice AI deployment in the country at the same time, with no coordination required at the application layer.

The same speech-to-text model that runs at 88% accuracy on G.711 routinely runs at 95%+ on G.722 — for free, the moment the network stops downgrading the call.
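
For readers who want to measure this on their own traffic, word error rate is straightforward to compute. The Python sketch below uses the usual definition (word-level edit distance divided by the number of reference words); the function name and the sample sentences are illustrative, and treating "accuracy" as one minus WER is an assumption, since the figures above do not define it.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


spoken = "my account number ends in fifteen"
heard = "my account number ends in sixteen"
print(word_error_rate(spoken, heard))
```

One misheard word in a six-word utterance is already a WER of about 17%. On that reading, moving from 88% to 95% accuracy means cutting WER from 12% to 5%, so fewer than half as many words are wrong.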

Voice quality is a network-level property. It is determined by what carriers offer, what regulators require, and what defaults are written into the standards Britain adopts during the PSTN migration.

The decisions being made in 2026 will set that ceiling for the next generation — for every voice AI product built in the UK, every public service that buys one, and every citizen who speaks to one.

Out of the 20th Century of Copper. Into the 20th Century of IP.

The PSTN switch-off should have been a chance to move Britain into the 21st century of voice. Right now, it is a chance to move us out of the 20th century of copper and into the 20th century of IP, with the audio quality unchanged. That is not enough. It has never been enough.

As voice AI becomes the primary interface between citizens, public services and the businesses they rely on, the cost of pretending otherwise will be measured in misunderstood callers, failed automation, lost productivity, and a country that built world-class AI and then crippled it at the wire.

Author: Cloudax
Reviewed by: Jo Robinson

Published On: 16th Jun 2026 - Last modified: 17th Jun 2026