Research Scientist - Speech & Audio Understanding (Speech Generation)
full-time · United States · $122k - $229k

Summary

Location

United States

Salary

$122k - $229k

Type

full-time

About this role

Business Unit

What the Role Entails

Job Responsibilities:
1. Track the latest research in speech generation algorithms, explore next-generation paradigms for speech/audio generation, and push the boundaries of speech generation capabilities.  
2. Investigate cutting-edge multimodal voice foundation model technologies to enhance voice interaction experiences by integrating text, speech, and vision.  
3. Lead the technical R&D of voice foundation models, driving model performance improvements and innovative applications.  

Who We Look For

Job Requirements:
1. Master’s or Ph.D. in Computer Science, Artificial Intelligence, Electronic Engineering, Signal Processing, or related fields.  
2. Research or development experience in one or more areas: voice foundation models, speech synthesis, speech recognition, audio generation, voice conversion, or speech codec.  
3. Familiarity with mainstream voice-enabled large models (e.g., GPT-4o, GLM-4-Voice, Qwen2.5-Omni, Voila). Prior project experience is preferred.  
4. Proficiency in deep learning frameworks (e.g., PyTorch). Experience with large-scale model training frameworks (e.g., Megatron, DeepSpeed) is a plus.  
5. Solid understanding of large model architectures and principles. Experience in large-scale pretraining or post-training is preferred.  

Location State(s)

US-Washington-Bellevue

The expected base pay range for this position in the location(s) listed above is $122,500.00 to $229,700.00 per year. Actual pay may vary depending on job-related knowledge, skills, and experience. Employees hired for this position may be eligible for a sign on payment, relocation package, and restricted stock units, which will be evaluated on a case-by-case basis. Subject to the terms and conditions of the plans in effect, hired applicants are also eligible for medical, dental, vision, life and disability benefits, and participation in the Company’s 401(k) plan. The Employee is also eligible for up to 15 to 25 days of vacation per year (depending on the employee’s tenure), up to 13 days of holidays throughout the calendar year, and up to 10 days of paid sick leave per year. Your benefits may be adjusted to reflect your location, employment status, duration of employment with the company, and position level. Benefits may also be pro-rated for those who start working during the calendar year.

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.

Other facts

Tech stack
Speech Generation, Multimodal Voice Foundation Models, Voice Interaction, Deep Learning, Speech Synthesis, Speech Recognition, Audio Generation, Voice Conversion, Speech Codec, Large Models, PyTorch, Model Training, Large-Scale Pretraining, Signal Processing, Artificial Intelligence, Computer Science

About Tencent

Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life of people around the world.

Founded in 1998 with its headquarters in Shenzhen, China, Tencent's guiding principle is to use technology for good. Our communication and social services connect more than one billion people around the world, helping them to keep in touch with friends and family, access transportation, pay for daily necessities, and even be entertained.

Tencent also publishes some of the world's most popular video games and other high-quality digital content, enriching interactive entertainment experiences for people around the globe.

Tencent also offers a range of services such as cloud computing, advertising, FinTech, and other enterprise services to support our clients' digital transformation and business growth.

Tencent has been listed on the Stock Exchange of Hong Kong since 2004.

Team size: 10,001+ employees
Industry: Software Development
Founding Year: 1998

What you'll do

  • The role involves tracking the latest research in speech generation algorithms and leading the technical R&D of voice foundation models. The candidate will also investigate multimodal voice technologies to enhance voice interaction experiences.

Frequently Asked Questions

What does Tencent pay for a Research Scientist - Speech & Audio Understanding (Speech Generation)?

Tencent offers a competitive compensation package for the Research Scientist - Speech & Audio Understanding (Speech Generation) role. The expected base pay range is USD 122,500 to 229,700 per year. Apply through Clera to learn more about the full compensation details.

What does a Research Scientist - Speech & Audio Understanding (Speech Generation) do at Tencent?

As a Research Scientist - Speech & Audio Understanding (Speech Generation) at Tencent, you will track the latest research in speech generation algorithms, lead the technical R&D of voice foundation models, and investigate multimodal voice technologies to enhance voice interaction experiences.

Why join Tencent as a Research Scientist - Speech & Audio Understanding (Speech Generation)?

Tencent is a leading software development company. The Research Scientist - Speech & Audio Understanding (Speech Generation) role offers competitive compensation.

Is the Research Scientist - Speech & Audio Understanding (Speech Generation) position at Tencent remote?

The Research Scientist - Speech & Audio Understanding (Speech Generation) position at Tencent is based in Bellevue, Washington, United States. Contact the company through Clera for specific work arrangement details.

How do I apply for the Research Scientist - Speech & Audio Understanding (Speech Generation) position at Tencent?

You can apply for the Research Scientist - Speech & Audio Understanding (Speech Generation) position at Tencent directly through Clera. Click the "Apply Now" button above to start your application. Clera's AI-powered platform will help match your profile with this opportunity and guide you through the application process. You can also learn more about Tencent on their website.