Electronic Health Records (EHR) Datasets for AI & ML Projects

Off-the-shelf Electronic Health Records (EHR) datasets to jumpstart your healthcare AI project.

Electronic Health Records (EHR) Data

Plug in the medical data you’ve been missing, today

Find the right Electronic Health Records (EHR) Data For Your Healthcare AI

Improve your machine learning models with best-in-class training data. Our off-the-shelf data catalog makes it easy to get medical training data you can trust.

EHR Data by Location
Location    Text Documents
Northeast   4,473,573
South       1,801,716
Midwest     781,701
West        1,509,109

EHR Data by Major Diagnosis Category
Major Diagnosis Category                                                        Text Documents
Circulatory System                                                              589,730
Infectious & Parasitic Diseases                                                 559,244
Respiratory System                                                              561,983
Musculoskeletal System & Connective Tissue                                      329,344
Digestive System                                                                346,369
Nervous System                                                                  316,243
Mental Diseases & Disorders                                                     282,501
Kidney & Urinary Tract                                                          209,561
Pregnancy, Childbirth & the Puerperium                                          165,303
Newborns & Other Neonates with Conditions Originating in the Perinatal Period   163,605
Endocrine, Nutritional & Metabolic Diseases & Disorders                         142,808
Hepatobiliary System & Pancreas                                                 127,172
Skin, Subcutaneous Tissue & Breast                                              89,577
Injuries, Poisonings & Toxic Effects of Drugs                                   64,097
Blood, Blood Forming Organs, Immunologic Disorders                              48,990
Alcohol/Drug Use & Alcohol/Drug-Induced Organic Mental Disorders                48,717
Multiple Significant Trauma                                                     27,902
Ear, Nose, Mouth & Throat                                                       22,987
Female Reproductive System                                                      17,010
Factors Influencing Health Status & Other Contacts with Health Services         21,294
Myeloproliferative Diseases & Disorders, Poorly Differentiated Neoplasms        15,620
Human Immunodeficiency Virus Infections                                         12,422
Male Reproductive System                                                        9,230
Eye                                                                             3,549
Burns                                                                           444
Total with MDC                                                                  4,175,702
Cases using a specialty grouper such as 3M (MDC not specified)                  1,619,682
Outpatient Cases (MDC not specified)                                            1,980,606
Cases without reimbursement generated (MDC not specified)                       790,697
Total including everything (Cases with & without MDC category)                  8,566,687

We license all types of data: text, audio, video, and image. Our medical datasets for ML include the Physician Dictation Dataset, Physician Clinical Notes, Medical Conversation Dataset, Medical Transcription Dataset, Doctor-Patient Conversation, Medical Text Data, and Medical Images: CT Scan, MRI, Ultrasound (collected based on custom requirements).


Can’t find what you are looking for?

New off-the-shelf medical datasets are being collected across all data types.

Contact us now to let go of your healthcare training data collection worries.
