Shaip

High-Quality TTS Datasets

Enhance your ASR, NLP, and speech synthesis projects with diverse, multilingual TTS datasets.

Speech Datasets
  • Arabic Dataset: General Conversation, TTS. No. Hours: 2,239
  • Burmese Dataset: General Conversation, TTS. No. Hours: 1,000
  • Canadian French Dataset: TTS. No. Hours: 1,222
  • Chinese Simplified Dataset: TTS. No. Hours: 2,762
  • Chinese Traditional Dataset: TTS. No. Hours: 1,028
  • Chittagonian Dataset: General Conversation, TTS. No. Hours: 900
  • Danish Dataset: General Conversation, Podcast, TTS. No. Hours: 3,615
  • Dari Dataset: General Conversation, TTS. No. Hours: 700
  • Dogri Dataset: General Conversation, TTS. No. Hours: 250
  • Dutch Dataset: TTS. No. Hours: 1,205
  • Gojri Dataset: General Conversation, TTS. No. Hours: 250
  • Hindi Dataset: General Conversation, Podcast, TTS. No. Hours: 3,126
  • Japanese Dataset: TTS. No. Hours: 2,335
  • Kashmiri Dataset: General Conversation, TTS. No. Hours: 1,000
  • Korean Dataset: Call-Center, Podcast, TTS. No. Hours: 2,266
  • Nagamese Dataset: General Conversation, TTS. No. Hours: 850
  • Polish Dataset: Podcast, TTS. No. Hours: 1,751
  • Russian Dataset: TTS. No. Hours: 2,398
  • Sinhalese Dataset: General Conversation, TTS. No. Hours: 1,000
  • Spanish (Mexico) Dataset: TTS. No. Hours: 1,492
  • Turkish (Turkey) Dataset: TTS. No. Hours: 2,027

Comprehensive Speech Data Solutions: Fast, Flexible, and Best-in-Class Quality

Comprehensive Voice Data Solutions

End-to-end service: Full-service delivery backed by expert domain knowledge and fast turnaround.

Flexible: Choose custom, semi-custom, or off-the-shelf voice datasets with flexible ownership.

Domain Expert: Hire a specialized domain expert for fast, high-quality AI datasets.

Quality: Get quality checks from industry experts.

Licensing: Get a license tailored to your needs.

Ethical Data: We ensure contributors are informed and consent to data use.

Contact Us

(US): (866) 473-5655

marketing@shaip.com
vendorcolab@shaip.com
career@shaip.com


© 2018 – 2025 Shaip | All Rights Reserved
