
Whispeak, a Lille-based software publisher specializing in the security of voice interactions, is presenting two technologies for combating voice fraud using artificial intelligence: a synthetic voice detection engine and a voice biometrics authentication solution, both of which return a verdict in less than a second. The publisher is highlighting them as VivaTech approaches, at a time when voice is establishing itself as an attack surface for financial services.
Reproducing a voice requires only a short audio sample. A few seconds are enough to generate an imitation realistic enough to fool a human listener or an automated authentication system. Call centers, voicemail and telephone authentication flows thus become entry points for identity theft, circumvention of access controls and disinformation campaigns, extending the rise of deepfakes as a social engineering weapon.
Whispeak quantifies the scale of the phenomenon based on its own observations. According to the publisher, fraud exploiting synthetic voices in European financial services increased by 475% in 2025. The company, founded in Lille, points to a first-place worldwide ranking of its models by the Hugging Face platform in 2025 and a distinction at the DGA Cyber Challenge 2024. These claims come from the publisher's own communications and have not been independently verified. They nevertheless illustrate the positioning of a French player in a segment where detection must work in real-world conditions.
A detection engine for generated voices
The first component analyzes an audio signal to distinguish a human voice from a synthetic one. The processing relies on a deep learning model that produces a representation of the signal, optimized to separate the two categories. The publisher indicates that the model has 98.9 million parameters and that it remains capable of recognizing voices produced by generators that appeared after its training phase, a critical point in a market where synthesis models are constantly evolving.
The result is returned in less than a second, along with a confidence score. The model is designed to process any type of audio, including the telephone channel, a degraded format that is the real-world terrain of production calls. This telephone constraint distinguishes operational use from a laboratory demonstration, because reduced signal quality makes the artifacts of a synthetic voice harder to detect. An engine calibrated on studio audio sees its performance drop when it processes the compressed stream of an incoming call, a situation that nevertheless accounts for almost all of the fraud observed in customer contact centers.
Voiceprint authentication
The second technology concerns authentication. On first use, the user speaks for a few seconds and the model generates a unique voiceprint. With each subsequent interaction, the voice is compared in real time with the reference voiceprint and the system returns a verdict, verified or not, in less than a second. The system also recognizes fraud attempts, whether they involve a synthetic voice or a replayed recording rather than a person speaking live.
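The comparison step described above can be illustrated with a minimal sketch. The article does not say how Whispeak represents or compares voiceprints. The example assumes, purely for illustration, that a voiceprint is a fixed-length numeric vector and that two voiceprints are compared by cosine similarity against a threshold, a common approach in speaker verification. The function names and the 0.8 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def verify_speaker(reference, candidate, threshold=0.8):
    """Return (verified, score) for a candidate voiceprint.

    Assumption: voiceprints are plain lists of floats and the
    threshold value is illustrative, not a vendor setting.
    """
    score = cosine_similarity(reference, candidate)
    return score >= threshold, score


# Toy vectors standing in for enrolled and live voiceprints.
enrolled = [0.2, 0.9, 0.4]
live_call = [0.25, 0.85, 0.45]
verified, score = verify_speaker(enrolled, live_call)
print(verified, round(score, 3))
```

In a real deployment the vectors would come from the vendor's model and the threshold would be tuned on labeled calls. The sketch only shows the shape of the verdict: a score plus a yes-or-no decision.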
For a security department, combining the two components forms a two-stage voice defense: authentication verifies the identity of the claimed speaker, and detection flags the synthetic nature of a signal. The scope of such a system, however, remains limited by dynamics specific to the field. Detection trained on known generators faces a continuous flow of new models, and the advantage is measured over time rather than at a given moment. The confidence score, which is not a binary certainty, also requires calibrating a decision threshold suited to the risk level of each customer journey.
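The idea of a risk-dependent threshold can be sketched as follows. It is an illustration under stated assumptions, not the publisher's implementation. The journey names, threshold values and the three outcomes ("allow", "deny", "escalate") are all hypothetical, and the synthetic score is assumed to lie between 0 and 1, with higher values meaning "more likely synthetic".

```python
# Hypothetical per-journey thresholds on the synthetic-voice score.
# A riskier journey tolerates less doubt, so its threshold is lower.
SYNTHETIC_THRESHOLDS = {
    "balance_inquiry": 0.90,
    "wire_transfer": 0.50,
}


def decide(journey, speaker_verified, synthetic_score):
    """Combine the two signals into an access decision.

    Assumptions: synthetic_score is in [0, 1]; speaker_verified is
    the verified/not-verified verdict of the authentication stage.
    """
    if not 0.0 <= synthetic_score <= 1.0:
        raise ValueError("synthetic_score must be between 0 and 1")
    threshold = SYNTHETIC_THRESHOLDS[journey]
    if synthetic_score >= threshold:
        # Suspected synthetic voice: hand over to a human or a
        # stronger check rather than deciding automatically.
        return "escalate"
    if not speaker_verified:
        return "deny"
    return "allow"


# The same score leads to different outcomes depending on the journey.
print(decide("balance_inquiry", True, 0.6))
print(decide("wire_transfer", True, 0.6))
```

The design choice illustrated here is that the detection score never grants access by itself. It can only block or escalate, while the final policy remains a table the organization owns and adjusts.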
An openly sovereign positioning
The deployment mode is the differentiating argument for sensitive environments. Beyond its programming interface (API) and hosted offering, Whispeak offers on-premises installation and deployment fully isolated from any external network. This option meets the requirements of sectors where voice data cannot leave the controlled perimeter, notably banking, insurance and defense, and where dependence on a third-party service hosted outside the organization's control raises reservations.
The publisher's profile extends this argument. As a French player, Whispeak does not expose its clients to the extraterritorial access regimes that weigh on service providers subject to American law, a dimension that compliance managers now include when evaluating a security component. For financial institutions subject to the DORA regulation, which requires control over operational resilience and the service provider chain, the option of a deployment controlled end to end by a European supplier adds a favorable criterion. Voice biometrics also remains sensitive personal data under the GDPR: its collection and retention require a legal basis and specific safeguards, which the organization must define regardless of the supplier chosen.
Whispeak will present its solutions at the VivaTech show on June 17 and 19. The issue it highlights goes beyond the performance of a model. Restoring trust in voice, while it still serves as proof of identity in many customer journeys, requires tools that can be maintained in production and governance that defines the place of detection in the chain of control. Technology provides a signal; the decision to grant or deny access remains the organization's responsibility.





