Indian English Dataset

Overview

Title

Indian English Language Dataset

Dataset Type

Wake Word

Description

Collection of wake word / voice command / trigger word / keyphrase audio recordings

  • 50 speakers
  • 4 unique keyphrases per speaker
  • 10 audio files per unique keyphrase
  • 40 total recorded utterances per speaker
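The counts above multiply out to the per-speaker and overall totals. The short sketch below only restates that arithmetic; the constant and function names are illustrative and not part of any dataset tooling.

```python
# Figures taken from the description above; the names are hypothetical.
SPEAKERS = 50
KEYPHRASES_PER_SPEAKER = 4
RECORDINGS_PER_KEYPHRASE = 10


def utterances_per_speaker(keyphrases=KEYPHRASES_PER_SPEAKER,
                           recordings=RECORDINGS_PER_KEYPHRASE):
    """Each speaker records every keyphrase the same number of times."""
    return keyphrases * recordings


def total_utterances(speakers=SPEAKERS):
    """Total recordings across all speakers."""
    return speakers * utterances_per_speaker()


if __name__ == "__main__":
    print(utterances_per_speaker())  # recordings per speaker
    print(total_utterances())        # recordings in the whole dataset
```

With the listed figures this gives 40 utterances per speaker and 2,000 recordings overall, matching the audio count given in the dataset details.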

Dataset Details

Total Audio Files

2,000

Sample Rate

16 kHz

Audio Channel

1 channel

Recording Platform

Mobile App

Audio Format

.wav
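A delivered file can be checked against the stated sample rate and channel count with Python's standard `wave` module. This is a minimal sketch, assuming the files are uncompressed PCM WAV that `wave` can open; the function name `check_wav` and the example path are hypothetical.

```python
import wave

# Values from the listing above.
EXPECTED_SAMPLE_RATE = 16_000  # 16 kHz
EXPECTED_CHANNELS = 1          # single channel (mono)


def check_wav(path):
    """Return (problems, duration_seconds) for one WAV file.

    `problems` is an empty list when the header matches the listed format.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()

    problems = []
    if rate != EXPECTED_SAMPLE_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_SAMPLE_RATE}")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels, expected {EXPECTED_CHANNELS}")

    duration = frames / rate if rate else 0.0
    return problems, duration


if __name__ == "__main__":
    import sys

    # Usage: python check_wav.py some_recording.wav  (path is a placeholder)
    for wav_path in sys.argv[1:]:
        issues, seconds = check_wav(wav_path)
        print(wav_path, f"{seconds:.2f}s", issues or "OK")
```

Summing the returned durations over all files is one way to obtain a total-hours figure for the collection.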

Transcription Format

.json

WER (%)

5
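The listing does not say how the WER figure was measured. For reference, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below implements that standard definition; the example phrases are made up and are not taken from the dataset.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substituted word out of four.
    print(word_error_rate("turn on the light", "turn off the light"))
```

Multiplying the result by 100 gives the percentage form used in the listing.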

Dataset Demographics

Country

India

Language

Indian English

Gender

Female 50%, Male 50%, Unknown 10%

Number of Speakers

50

Age

18-50

