1. Introduction
In 2024, a new AI-based technique was developed to help people who struggle with
speech, such as those recovering from tongue cancer or neurological conditions. The
primary aim of this research is to create Personalized Synthetic Voices: AI-generated
voices that closely match each person’s natural voice.
For people who have lost their original voices, the system offers a way to
regain their speech and identity.
2. Problem
Many individuals who have undergone surgery (e.g., glossectomy, where part of the
tongue is removed) or experienced neurological events such as strokes find it difficult to
speak clearly. Traditional text-to-speech systems generate robotic, mechanical voices
that do not carry the person’s emotions or identity. This lack of personal connection
can leave patients feeling isolated and disconnected.
3. Methodology
Researchers have developed an advanced AI model that employs two specialized
techniques to improve the quality of synthetic speech:
• Regularization Loss: This method prevents the AI model from learning incorrect
speech patterns during training. Essentially, it helps the AI understand which
speech elements are correct and which are not.
• Consistency Loss: This technique encourages the AI to generate speech that is
clear and articulate, even if the original audio data is distorted or unclear.
The model was tested using audio data from patients who had undergone tongue
cancer surgery. The system learned the tonal qualities of the patients’ original
voices, corrected articulation errors, and produced a synthetic voice that sounded natural
and familiar.
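To make the role of the two loss terms more concrete, the sketch below shows one common way such terms are combined into a single training objective. It is an illustration only: the summary does not give the researchers' formulas, so the function names, the use of mean squared error, the idea of comparing outputs for clean and distorted inputs, and the weights reg_weight and cons_weight are all assumptions made for this example.

```python
from typing import Sequence

Vector = Sequence[float]


def mean_squared_error(a: Vector, b: Vector) -> float:
    """Average squared difference between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def regularization_loss(predicted: Vector, reference: Vector) -> float:
    # ASSUMPTION: "incorrect speech patterns" are modeled here as drift away
    # from reference features of clearly articulated speech.
    return mean_squared_error(predicted, reference)


def consistency_loss(output_from_clean: Vector, output_from_distorted: Vector) -> float:
    # ASSUMPTION: consistency means the model should produce similar output
    # whether its input audio is clean or distorted.
    return mean_squared_error(output_from_clean, output_from_distorted)


def total_loss(task_loss: float, reg: float, cons: float,
               reg_weight: float = 0.1, cons_weight: float = 0.1) -> float:
    # Weighted sum of the main synthesis loss and the two auxiliary terms.
    # The default weights are hypothetical placeholders, not reported values.
    return task_loss + reg_weight * reg + cons_weight * cons


if __name__ == "__main__":
    reg = regularization_loss([1.0, 2.0], [1.0, 2.0])
    cons = consistency_loss([0.0, 0.0], [1.0, 1.0])
    print(total_loss(task_loss=1.0, reg=reg, cons=cons))
```

In a sketch like this, raising cons_weight would push the model harder toward producing the same clear output regardless of input quality, at the cost of fitting the main synthesis objective less closely.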
4. Results
When the results of the test were analyzed, several positive findings emerged:
• The synthetic voice generated by the AI closely matched the tone and
characteristics of the patient’s original voice.