Shaip


Wake Word and Keyword Spotting Datasets

Train voice-activated AI models with high-quality wake word and keyword spotting datasets for accurate speech recognition.

Speech Datasets

All datasets below are of type Wake Word / Keyphrase.

  • Wake Word Arabic Dataset. Speakers: 200
  • Wake Word Brazilian Portuguese Dataset. No. Hours: 10,000
  • Wake Word Canadian French Dataset. No. Hours: 10,000
  • Wake Word Cantonese Dataset. No. Hours: 2,000
  • Wake Word Czech Dataset. No. Hours: (not listed)
  • Wake Word Danish Dataset. No. Hours: 2,000
  • Wake Word Danish Dataset. Speakers: 200
  • Wake Word English Dataset. Speakers: 200
  • Wake Word France French Dataset. No. Hours: 10,000
  • Wake Word French Dataset. No. Hours: 2,000
  • Wake Word French Dataset. Speakers: 200
  • Wake Word German Dataset. No. Hours: 2,000
  • Wake Word Hebrew Dataset. Speakers: 200
  • Wake Word Indian English Dataset. No. Hours: 40,000
  • Wake Word Indian English Dataset. No. Hours: 2,000
  • Wake Word Italian Dataset. No. Hours: (not listed)
  • Wake Word Italian Dataset. No. Hours: 2,000
  • Wake Word Korean Dataset. No. Hours: 2,000
  • Wake Word Mandarin Dataset. No. Hours: (not listed)
  • Wake Word Mexican Spanish Dataset. No. Hours: 10,000
  • Wake Word Norwegian Dataset. Speakers: 200
  • Wake Word Polish Dataset. Speakers: 200
  • Wake Word Spain Spanish Dataset. No. Hours: 10,000
  • Wake Word Spain Spanish Dataset. No. Hours: 2,000
  • Wake Word Swedish Dataset. No. Hours: 2,000
  • Wake Word Swedish Dataset. Speakers: 200
  • Wake Word Turkish Dataset. Speakers: 200
  • Wake Word UK English Dataset. No. Hours: 2,000
  • Wake Word UK English Dataset. Speakers: 200
  • Wake Word US English Dataset. No. Hours: 10,000
  • Wake Word US English Dataset. No. Hours: 2,000
  • Wake Word US Spanish Dataset. No. Hours: 10,000

Comprehensive Speech Data Solutions: Fast, Flexible, and Best-in-Class Quality


End-to-end service: Complete service with expert domain knowledge and fast delivery.

Flexible: Choose custom, semi-custom, or off-the-shelf voice datasets with flexible ownership.

Domain Expert: Work with a specialized domain expert for fast, high-quality AI datasets.

Quality: Get quality checks from industry experts.

Licensing: Get a license tailored to your needs.

Ethical Data: We ensure contributors are informed and consent to data use.
