High-quality Audio / Speech / Voice Datasets to Train Your Conversational AI Model
Off-the-shelf Voice / Speech / Audio Datasets in multiple languages to jump-start your automatic speech recognition (ASR) models
Plug in the audio data catalog you’ve been missing today
Keyword | Language Dataset | Sample Rate | Dataset Type | Total Audio Hours | Total Speech Hours | Dataset Description | Audio Channel | Recording Platform | WER (%) | Audio Format | Transcription Format | Use Case | CTA | Details |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
African American | African American Vernacular | 8 kHz | Call-Center | 214 | 211 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
African American | African American Vernacular | 16 kHz | Media Audio | 159 | 149 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Afrikaans | Afrikaans | 8 kHz | General Conversation | 368 | 404 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, Afrikaans spoken in Africa | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Afrikaans | Afrikaans | 16 kHz | Media Audio | 658 | 615 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Arabic | Arabic | 8 kHz | General Conversation | 293 | 297 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, Arabic from Gulf countries | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Boston | Boston | 8 kHz | Call-Center | 177 | 175 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Boston | Boston | 8 kHz | General Conversation | 32 | 32 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Boston | Boston | 16 kHz | Media Audio | 93 | 93 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Chinese English | Chinese English | 8 kHz | Call-Center | 169 | 130 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Chinese English | Chinese English | 16 kHz | Media Audio | 249 | 236 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Danish | Danish | 8 kHz | General Conversation | 372 | 395 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Danish | Danish | 16 kHz | Media Audio | 664 | 603 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
English | English | 16 kHz | Media Audio | 10 | 9 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
English Deep South | English Deep South | 8 kHz | Call-Center | 151 | 149 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
English Deep South | English Deep South | 8 kHz | General Conversation | 56 | 56 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
English Deep South | English Deep South | 16 kHz | Media Audio | 266 | 248 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hebrew | Hebrew | 8 kHz | General Conversation | 399 | 397 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, Hebrew in Israel | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hebrew | Hebrew | 16 kHz | Media Audio | 427 | 400 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hinglish | Hinglish | 8 kHz | Call-Center | 208 | 185 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hinglish | Hinglish | 16 kHz | Media Audio | 216 | 219 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hispanic English | Hispanic English | 8 kHz | Call-Center | 212 | 209 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hispanic English | Hispanic English | 16 kHz | Media Audio | 155 | 150 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Indian English | Indian English | 16 kHz | Media Audio | 137 | 87 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Indonesian | Indonesian | 8 kHz | General Conversation | 496 | 598 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, Bahasa Indonesian | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Indonesian | Indonesian | 16 kHz | Media Audio | 643 | 610 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Irish | Irish | 8 kHz | General Conversation | 192 | 180 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Korean | Korean | 8 kHz | Call-Center | 107 | 103 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Korean | Korean | 16 kHz | Media Audio | 204 | 197 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Malay | Malay | 8 kHz | General Conversation | 266 | 302 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, Malay in Malaysia | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Malay | Malay | 16 kHz | Media Audio | 344 | 305 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
New Zealand English | New Zealand English | 8 kHz | General Conversation | 148 | 142 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
New Zealand English | New Zealand English | 16 kHz | Media Audio | 400 | 400 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
New York English | New York English | 8 kHz | Call-Center | 103 | 103 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
New York English | New York English | 8 kHz | General Conversation | 107 | 106 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
New York English | New York English | 16 kHz | Media Audio | 140 | 140 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Polish | Polish | 16 kHz | Media Audio | 269 | 255 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Scottish | Scottish | 8 kHz | General Conversation | 292 | 267 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Singapore English | Singapore English | 8 kHz | Call-Center | 218 | 194 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Singapore English | Singapore English | 16 kHz | Media Audio | 247 | 240 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
South African English | South African English | 8 kHz | Call-Center | 261 | 204 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
South African English | South African English | 16 kHz | Media Audio | 251 | 245 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Spanish | Spanish | 16 kHz | Media Audio | 3 | 2 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Swahili | Swahili | 8 kHz | Call-Center | 184 | 165 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Swahili | Swahili | 8 kHz | Call-Center | 46 | 44 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Swahili | Swahili | 16 kHz | Media Audio | 203 | 191 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Swahili | Swahili | 16 kHz | Media Audio | 62 | 58 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Swedish | Swedish | 8 kHz | Call-Center | 250 | 224 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Swedish | Swedish | 16 kHz | Media Audio | 278 | 255 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Telugu | Telugu | 8 kHz | General Conversation | 553 | 582 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Telugu | Telugu | 16 kHz | Media Audio | 648 | 599 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Thai | Thai | 8 kHz | General Conversation | 183 | 201 | Unscripted telephonic conversation between two people. Approx. Audio Duration (Range) - 15-60 minutes, An informal register used between friends | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Thai | Thai | 16 kHz | Media Audio | 173 | 167 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Vietnamese | Vietnamese | 8 kHz | General Conversation | 295 | 293 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, Northern (e.g.,Hanoi), Central, and Southern (e.g., Ho Chi Minh City). | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Vietnamese | Vietnamese | 16 kHz | Media Audio | 257 | 248 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Welsh | Welsh | 8 kHz | General Conversation | 278 | 299 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Indian English | Indian English | 8 kHz | Call-Center | 200 | 200 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Telugu | Telugu | NA | Call-Center | 30 | 30 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Tamil | Tamil | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Kannada | Kannada | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Malayalam | Malayalam | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Bengali | Bengali | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Gujarati | Gujarati | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Marathi | Marathi | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Assamese | Assamese | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Oriya | Oriya | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Punjabi | Punjabi | NA | Call-Center | 60 | 60 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Telugu | Telugu | NA | General Conversation | 50 | 50 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Tamil | Tamil | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Kannada | Kannada | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Malayalam | Malayalam | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Bengali | Bengali | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Gujarati | Gujarati | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Marathi | Marathi | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Assamese | Assamese | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Oriya | Oriya | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Punjabi | Punjabi | NA | General Conversation | 100 | 100 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Telugu | Telugu | NA | Media Audio | 20 | 20 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Tamil | Tamil | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Kannada | Kannada | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Malayalam | Malayalam | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Bengali | Bengali | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Gujarati | Gujarati | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Marathi | Marathi | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Assamese | Assamese | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Oriya | Oriya | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Punjabi | Punjabi | NA | Media Audio | 40 | 40 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | NA | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
English US | English US | 48 kHz | Scripted Monologue | 5 | 4 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Spanish Spain | Spanish Spain | 48 kHz | Scripted Monologue | 10 | 8 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Mexican | Mexican | 48 kHz | Scripted Monologue | 1,492 | 1,228 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Canadian | Canadian | 48 kHz | Scripted Monologue | 1,222 | 1,049 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Netherlands | Netherlands | 48 kHz | Scripted Monologue | 1,205 | 1,021 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Polish Poland | Polish Poland | 48 kHz | Scripted Monologue | 1,482 | 1,266 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Turkish Turkey | Turkish Turkey | 48 kHz | Scripted Monologue | 2,027 | 1,735 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Chinese Traditional | Chinese Traditional | 48 kHz | Scripted Monologue | 1,028 | 891 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Arabic | Arabic | 48 kHz | Scripted Monologue | 1,947 | 1,594 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Danish | Danish | 48 kHz | Scripted Monologue | 2,579 | 2,041 | Single-utterance recordings, which tend to fall in the 5 to 30 second range, Danish from Denmark | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hindi | Hindi | 8 kHz | Call-Center | 122 | 131 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hindi | Hindi | 16 kHz | Media Audio | 219 | 202 | Licensable Public domain audio/video files such as interviews, podcasts etc - 1 to 5 people. Approx. Audio Duration (Range) 15-60 minutes | Mono | Desktop | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Hindi | Hindi | 48 kHz | Scripted Monologue | 2,867 | 2,105 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Japanese | Japanese | 48 kHz | Scripted Monologue | 2,335 | 2,029 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Korean | Korean | 48 kHz | Scripted Monologue | 1,955 | 1,548 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Russian | Russian | 48 kHz | Scripted Monologue | 2,398 | 2,046 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
Chinese Simplified | Chinese Simplified | 48 kHz | Scripted Monologue | 2,762 | 2,181 | Single-utterance recordings, which tend to fall in the 5 to 30 second range | Mono | Mobile App | 5 | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
German | German | 8 kHz | Call-Center | 64 | 0 | Unscripted, synthetic telephonic conversation between "agent" and "customer", Approx. Audio Duration (Range) 5-15 Minutes, | Dual | Desktop | | .wav | .json | ASR, Virtual Assistant, Chatbot, Conversational AI, Speech Analytics, TTS, Language Modelling | Contact | |
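
The catalog lists a WER (%) value for each dataset but does not define how it is measured. Assuming it refers to the standard word error rate (substitutions, deletions, and insertions divided by the number of reference words), the following minimal sketch shows how such a figure could be computed. The function name `word_error_rate` is hypothetical and is not part of any tool mentioned on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate as a percentage.

    Assumption: WER = (substitutions + deletions + insertions) / reference words,
    computed with a word-level Levenshtein (edit) distance.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word in a 20-word reference corresponds to a WER of 5%.
    reference = " ".join(f"word{n}" for n in range(20))
    hypothesis = reference.replace("word7", "wrong")
    print(word_error_rate(reference, hypothesis))
```

Under this assumption, a WER of 5 would mean roughly one word-level error per 20 reference words. Real evaluations usually normalize case and punctuation before comparing, which this sketch leaves out.
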
Description
Call Center Conversations (8 kHz): Unscripted, synthetic telephone conversations between an “agent” and a “customer”
Generic Conversations (8 kHz): Unscripted telephone conversations between 2 people
Media & Podcasts (16 kHz): Public domain audio/video interviews, podcasts etc. with 1-5 people
Utterance/Scripted Monologue (48 kHz, as listed in the table): Single-utterance recordings based on a prompt
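
Since each dataset type is advertised with a sample rate and channel layout, a buyer may want to confirm that delivered .wav files match. The sketch below uses Python's standard `wave` module to do that. The `EXPECTED` mapping and the `check_wav` function are hypothetical names. Their values are read from the Sample Rate and Audio Channel columns of the table, on the assumption that "Dual" means two channels in a single file. Some rows deviate from these defaults (for example, rows marked NA), so treat the mapping as illustrative only.

```python
import wave

# Hypothetical expectations based on the catalog's typical rows.
# Assumption: "Dual" = 2 channels in one file, "Mono" = 1 channel.
EXPECTED = {
    "Call-Center": {"sample_rate": 8000, "channels": 2},
    "General Conversation": {"sample_rate": 8000, "channels": 2},
    "Media Audio": {"sample_rate": 16000, "channels": 1},
    "Scripted Monologue": {"sample_rate": 48000, "channels": 1},
}


def check_wav(path: str, dataset_type: str) -> list[str]:
    """Return a list of mismatches between a WAV file and the expected format.

    An empty list means the sample rate and channel count both match.
    """
    expected = EXPECTED[dataset_type]
    with wave.open(path, "rb") as wav:
        actual = {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
        }
    return [
        f"{key}: expected {expected[key]}, found {actual[key]}"
        for key in expected
        if actual[key] != expected[key]
    ]
```

Calling `check_wav("sample.wav", "Call-Center")` on an 8 kHz two-channel file would return an empty list; the file name here is a placeholder.
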
Can’t find what you are looking for?
New off-the-shelf audio & speech datasets are being collected across all data types
Contact us now to let go of your audio/speech training data collection worries