Tue Apr 15 2025

Google presents an artificial intelligence to communicate with dolphins.

DolphinGemma, developed by Google, is an open-source model created to interpret the sounds made by dolphins.

Scientists at Google DeepMind have created an innovative artificial intelligence (AI) model that could advance research on dolphin communication and, in the future, support tools for interactive communication with these cetaceans. The scientific community has worked for years to understand how dolphins relate to one another, as they are considered among the most intelligent animals in the world. Previous research has shown that these marine mammals communicate through clicks, whistles, gestures, and skin vibrations, a repertoire whose variety is not so different from human communication.

One notable case involved a bottlenose dolphin that settled in a remote part of the Baltic Sea, far from any group of its kind. Scientists in Denmark suggested that its whistles might be attempts to call out to a pod, although this hypothesis remains debatable. The Google team posed a crucial question: what would happen if researchers could not only listen to dolphin sounds but also interpret them precisely enough to generate natural, understandable responses?

With this idea in mind, DolphinGemma was born, an open-source AI model designed to analyze dolphin vocalizations and eventually produce sounds that facilitate meaningful interspecies interaction. This model was developed in close collaboration with researchers from the Georgia Institute of Technology and uses recordings collected by the Wild Dolphin Project (WDP), which has documented the natural communication and social interactions of Atlantic spotted dolphins (Stenella frontalis) in Bahamian waters since 1985.

The Google team explains that underwater recordings have allowed them to directly link sounds with specific behaviors, something that could not have been achieved through surface observation alone. This non-invasive approach has generated a unique database, which includes decades of video and audio recordings, associated with the identity of the dolphins, their life histories, and observed behaviors.

DolphinGemma was trained on the WDP's vast acoustic database. It is built on optimized versions of Gemma, Google's family of lightweight open models that share technology with its Gemini systems. With roughly 400 million parameters, the model is small enough to run on mobile devices such as Google Pixel smartphones.
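To put that figure in perspective, a back-of-the-envelope calculation shows why roughly 400 million parameters fits comfortably on a phone. The numeric precision DolphinGemma actually uses on-device is not stated in the article, so the figures below are purely illustrative:

```python
# Rough memory footprint of a model with ~400 million parameters at common
# numeric precisions; only the weights are counted, activations are ignored.
PARAMS = 400_000_000
for name, bytes_per_param in [("float32", 4), ("float16", 2), ("int8", 1)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{name}: ~{gib:.2f} GiB of weights")
# float32: ~1.49 GiB   float16: ~0.75 GiB   int8: ~0.37 GiB
```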

According to the company, DolphinGemma processes natural dolphin sounds to identify patterns, structure the information, and predict subsequent vocalizations, much as large language models anticipate the next word in a sentence. Starting with the next fieldwork season, it will be used to detect recurring sound patterns and coherent sequences in dolphin communication.
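Google has not published the internals of this pipeline, but the underlying idea can be sketched: discretize the audio into a sequence of symbols and learn which symbol tends to follow which. The minimal Python sketch below uses a crude frequency-binning tokenizer and a bigram counter purely for illustration; DolphinGemma's real tokenizer and transformer are far more sophisticated, and none of these names or parameters come from Google.

```python
import numpy as np
from collections import Counter, defaultdict

def tokenize_audio(waveform, sr=48_000, frame_ms=20, n_bins=16):
    """Crude stand-in for an audio tokenizer: split the signal into short
    frames and map each frame's dominant frequency to one of n_bins symbols."""
    frame_len = int(sr * frame_ms / 1000)
    tokens = []
    for i in range(len(waveform) // frame_len):
        frame = waveform[i * frame_len:(i + 1) * frame_len]
        spectrum = np.abs(np.fft.rfft(frame))
        peak_hz = np.argmax(spectrum) * sr / frame_len
        tokens.append(min(int(peak_hz / (sr / 2) * n_bins), n_bins - 1))
    return tokens

def train_bigram(token_sequences):
    """Count which token tends to follow which -- the simplest possible
    'predict the next vocalization unit' model."""
    counts = defaultdict(Counter)
    for seq in token_sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_token):
    """Return the most likely next token given the previous one."""
    if prev_token not in counts:
        return None
    return counts[prev_token].most_common(1)[0][0]

# Toy usage with a synthetic rising chirp standing in for a recorded whistle.
sr = 48_000
t = np.linspace(0, 1.0, sr, endpoint=False)
chirp = np.sin(2 * np.pi * (3_000 + 4_000 * t) * t)
tokens = tokenize_audio(chirp, sr=sr)
model = train_bigram([tokens])
print("next token after", tokens[0], "->", predict_next(model, tokens[0]))
```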

Scientists believe this model can accelerate the understanding of structures and meanings in dolphin communication, which traditionally required extensive human time and effort. In the long run, it is expected that the identified patterns, along with synthetic sounds created to represent common objects in dolphins’ lives, will serve as a basis for developing a shared vocabulary that facilitates bidirectional interaction with humans.

In parallel, a system called CHAT (Cetacean Hearing Augmentation Telemetry) is being developed, also with the support of the Georgia Institute of Technology. Rather than attempting to decode the dolphins' natural language, the device aims to establish a simpler, shared communication system. The goal is to associate synthetic whistles with objects of interest to dolphins, such as sargassum or seagrass, encouraging them to imitate those sounds to request the objects.
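Conceptually, that amounts to a small lookup table pairing each object with a reference whistle, plus a way of deciding which reference an observed whistle is closest to. The sketch below is a minimal, assumed illustration: the contours, the distance measure, and the threshold are invented for the example and are not CHAT's actual design.

```python
import numpy as np

# Hypothetical vocabulary: each object of interest is paired with a synthetic
# whistle, represented here by its frequency contour in Hz over time.
WHISTLE_VOCABULARY = {
    "sargassum": np.linspace(6_000, 12_000, 50),   # rising contour (illustrative)
    "seagrass":  np.linspace(12_000, 6_000, 50),   # falling contour (illustrative)
}

def match_whistle(observed_contour, vocabulary, max_distance=1_500.0):
    """Return the object whose reference whistle is closest to the observed
    contour (mean absolute frequency difference), or None if nothing is close."""
    best_object, best_distance = None, float("inf")
    for obj, reference in vocabulary.items():
        # Resample the observed contour to the reference length before comparing.
        resampled = np.interp(
            np.linspace(0, 1, len(reference)),
            np.linspace(0, 1, len(observed_contour)),
            observed_contour,
        )
        distance = np.mean(np.abs(resampled - reference))
        if distance < best_distance:
            best_object, best_distance = obj, distance
    return best_object if best_distance <= max_distance else None

# A dolphin's imperfect imitation of the "sargassum" whistle: rising, but noisy.
imitation = np.linspace(6_200, 11_500, 40) + np.random.normal(0, 200, 40)
print(match_whistle(imitation, WHISTLE_VOCABULARY))   # expected: "sargassum"
```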

For smooth communication between dolphins and humans, CHAT must handle demanding tasks in real time, such as hearing a dolphin's vocalization and responding quickly by identifying what has been requested. DolphinGemma strengthens CHAT by anticipating and identifying these imitations before they are complete, making interactions faster and more natural.
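The sketch below illustrates what "identifying an imitation before it concludes" could look like in the simplest terms: compare the portion of the whistle heard so far against the beginning of each reference contour, and commit to an answer only once one candidate is clearly ahead of the rest. The contours, thresholds, and decision rule are illustrative assumptions, not CHAT's real parameters.

```python
import numpy as np

# Illustrative reference contours (Hz over time); not CHAT's actual vocabulary.
REFERENCES = {
    "sargassum": np.linspace(6_000, 12_000, 50),
    "seagrass":  np.linspace(12_000, 6_000, 50),
}

def early_identify(heard_so_far, references, min_fraction=0.4, margin=2.0):
    """Attempt to identify a whistle from only its beginning.
    Returns an object name once the best match is clearly ahead of the
    runner-up, or None to signal 'keep listening'."""
    distances = {}
    for obj, ref in references.items():
        if len(heard_so_far) < int(len(ref) * min_fraction):
            return None                          # too little audio to decide
        prefix = ref[:min(len(heard_so_far), len(ref))]
        resampled = np.interp(
            np.linspace(0, 1, len(prefix)),
            np.linspace(0, 1, len(heard_so_far)),
            heard_so_far,
        )
        distances[obj] = np.mean(np.abs(resampled - prefix))
    (best, d1), (_, d2) = sorted(distances.items(), key=lambda kv: kv[1])[:2]
    return best if d2 >= margin * max(d1, 1e-9) else None

# Simulate hearing only the first 60% of a rising "sargassum"-like whistle.
partial = np.linspace(6_000, 9_600, 30) + np.random.normal(0, 150, 30)
print(early_identify(partial, REFERENCES))       # expected: "sargassum"
```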

Much of this work is carried out using a Google Pixel 6, which enables high-fidelity acoustic analysis. This device simplifies maintenance and reduces energy consumption, which is crucial for operating in challenging marine environments. The next generation of the project is expected to launch in the summer, using the new Google Pixel 9, which will feature improved speakers and microphones, as well as greater processing capabilities.

Although DolphinGemma has been trained on vocalizations from Atlantic spotted dolphins, its developers foresee that the model could be adapted to study other species, such as bottlenose or spinner dolphins, by tuning its parameters to the acoustic characteristics of each. Despite the long road ahead to fully understand dolphin communication, the collaboration between the WDP, the Georgia Institute of Technology, and Google is opening up new and exciting opportunities to bridge the communicative gap between humans and dolphins.