A Prompt Library for Efficient Clinical Entity Recognition Using Large Language Models

Original paper: This folder includes all 70 articles in PDF format.

Extracted prompt: This folder includes the prompts built from information extracted from the 70 articles.

Table 1. Summary of the collected articles, categorized by disease focus, with references

Disease Category	Included Diseases	Included Clinical Note Types	Included Entities	Number of Papers
Neurological Diseases	Subdural hematoma (SDH)¹, Acute Stroke², Stroke³, Brain Injuries⁴, Alzheimer's Disease⁵	Clinical Letters, Radiology Reports, Clinical trial notes	Stroke Attributes, Brain Injury Characteristics, Alzheimer's Disease Factors, Clinical and Diagnostic Parameters	5
Cardiovascular Diseases	Heart Disease⁶⁻⁹, Cardiac Function¹⁰, Peripheral Arterial Disease¹¹	Clinical Reports, Discharge Documents, Radiology Notes, CT reports	Cardiovascular Risk Factors, ARDS Treatment and Management, PAD Status, Cardiac Function Measures	6
Cancers	Lung Cancer¹², Melanoma¹³, Breast Cancer¹⁴,¹⁵, Prostate Cancer¹⁶, Liver cancer¹⁷⁻¹⁹, Metastatic Disease²⁰, Gastroesophageal cancer²¹, Multi-cancer²², General cancer²³,²⁴	Pathology reports, Drug Notes, Electronic Health Records (EHR), Radiology Reports, Operation Notes	Cancer Type and Location, Diagnostic and Prognostic Markers, Treatment and Genetic Information, Metastasis and Histological Details	13
Respiratory Diseases	Acute Respiratory Distress Syndrome (ARDS)²⁵, Pulmonary nodules²⁶	CT reports, Discharge documents	Pulmonary Nodule, ARDS, Mechanical Ventilation, ICU Admission, PICS Symptoms	2
Substance-Related Disorders	Substance use²⁷⁻²⁹, Smoking status³⁰	Clinical Reports, Medical discharge records	Alcohol Use, Drug Use, Nicotine Use, Smoking Status, Non-Smoker, Current Smoker, Unknown	4
Injuries and Related Conditions	Bone Fractures³¹, Wound Information³², Fall-related information³³	Radiology Reports, Clinical Notes	Fracture Specifics, Wound Care Attributes, Fall Prevention Measures	3
Social and Economic Issues	Social Determinants of Health (SDoH)³⁴,³⁵	Clinical Notes, Social history sections	Alcohol, Drug, Tobacco, Substance Use, Employment, Living Status	2
Blood and Circulatory System Disorders	Venous thromboembolisms (VTE)³⁶, Bleeding events³⁷	Narrative radiology reports, EHR	Thrombosis and Pulmonary Embolism Details, Bleeding Event Characteristics	2
Eye Diseases	Diabetic Retinopathy³⁸, Glaucoma³⁹,⁴⁰	Radiological Reports, Clinical Progress Notes, Ophthalmology Notes	Eye Disease Indicators, Treatment and Medication Adherence	3
Infectious Diseases	COVID-19⁴¹, Invasive fungal infection⁴²,⁴³	Radiological Reports	COVID-19 diagnostics and symptoms, infection, infection risk factors, abnormalities	3
Radiation and Related Treatments	Radiation Therapy⁵¹	Clinical Texts	Radiation Therapy Parameters	1
Other Specific Conditions	Preterm birth risk⁴⁴, Craniofacial and oral phenotypes⁴⁵, Colonoscopy⁴⁶,⁴⁷, Thyroid Nodules⁴⁸, Geriatric syndromes⁴⁹, Cartilage diseases⁵⁰	Medical notes, Clinical narratives, Colonoscopy Reports, Radiology Reports, Electronic Medical Records, Knee MRI	Pregnancy Risks, Craniofacial and Oral Health, Colonoscopy Results, Thyroid and Cartilage Condition Details, Geriatric Syndromes	7
Others	General⁵²⁻⁷⁰	Clinical Reports, Radiology Reports, EHR, Hospital Discharge Summaries, Operative Notes, Physician’s free-text notes	Identifications, Disorder, Medicine related, Observation uncertainty, Problems, Treatments, Tests, Drugs	19
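
As a rough illustration of how one row of Table 1 could be turned into an entity-recognition prompt, here is a minimal Python sketch. The function name `build_ner_prompt`, the prompt wording, and the sample note text are all hypothetical; they are not taken from the "Extracted prompt" folder, whose actual prompt format may differ.

```python
def build_ner_prompt(disease: str, note_type: str, entities: list[str], note_text: str) -> str:
    """Assemble a plain-text entity-recognition prompt (hypothetical template)."""
    # One bullet per entity type the model is asked to look for.
    entity_lines = "\n".join(f"- {name}" for name in entities)
    return (
        f"Task: extract clinical entities related to {disease} "
        f"from the {note_type} below.\n"
        f"Entity types:\n{entity_lines}\n"
        "Return one line per entity in the form <entity type>: <text span>.\n\n"
        f"Note:\n{note_text}"
    )


# Entity names come from the Respiratory Diseases row of Table 1;
# the note text is invented for illustration only.
prompt = build_ner_prompt(
    disease="pulmonary nodules",
    note_type="CT report",
    entities=["Pulmonary Nodule", "ARDS", "Mechanical Ventilation"],
    note_text="A 6 mm nodule is seen in the right upper lobe.",
)
print(prompt)
```

The resulting string can be sent to any large language model; keeping the entity list as a parameter means the same template can be reused across disease categories.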

References

1. Pruitt P, Naidech A, Van Ornam J, Borczuk P, Thompson W. A natural language processing algorithm to extract characteristics of subdural hematoma from head CT reports. Emergency Radiology. 2019;26:301-6.
2. Cutforth M, Watson H, Brown C, Wang C, Thomson S, Fell D, et al. Acute stroke CDS: automatic retrieval of thrombolysis contraindications from unstructured clinical letters. Frontiers in Digital Health. 2023;5:1186516.
3. Alex B, Grover C, Tobin R, Sudlow C, Mair G, Whiteley W. Text mining brain imaging reports. Journal of biomedical semantics. 2019;10:1-11.
4. Torres-Lopez VM, Rovenolt GE, Olcese AJ, Garcia GE, Chacko SM, Robinson A, et al. Development and validation of a model to identify critical brain injuries using natural language processing of text computed tomography reports. JAMA network open. 2022;5(8):e2227109.
5. Sun Z, Tao C, editors. Named entity recognition and normalization for Alzheimer’s disease eligibility criteria. 2023 IEEE 11th International Conference on Healthcare Informatics (ICHI); 2023: IEEE.
6. Yang H, Garibaldi JM. A hybrid model for automatic identification of risk factors for heart disease. Journal of biomedical informatics. 2015;58:S171-S82.
7. Chen Q, Li H, Tang B, Wang X, Liu X, Liu Z, et al. An automatic system to identify heart disease risk factors in clinical texts over time. Journal of biomedical informatics. 2015;58:S158-S63.
8. Karystianis G, Dehghan A, Kovacevic A, Keane JA, Nenadic G. Using local lexicalized rules to identify heart disease risk factors in clinical notes. Journal of biomedical informatics. 2015;58:S183-S8.
9. Kim Y, Garvin JH, Goldstein MK, Hwang TS, Redd A, Bolton D, et al. Extraction of left ventricular ejection fraction information from various types of clinical reports. Journal of biomedical informatics. 2017;67:42-8.
10. Pandey M, Xu Z, Sholle E, Maliakal G, Singh G, Fatima Z, et al. Extraction of radiographic findings from unstructured thoracoabdominal computed tomography reports using convolutional neural network based natural language processing. PLoS One. 2020;15(7):e0236827.
11. Savova GK, Fan J, Ye Z, Murphy SP, Zheng J, Chute CG, et al., editors. Discovering peripheral arterial disease cases from radiology notes using natural language processing. AMIA Annual Symposium Proceedings; 2010.
12. Hu D, Zhang H, Li S, Wang Y, Wu N, Lu X. Automatic extraction of lung cancer staging information from computed tomography reports: deep learning approach. JMIR medical informatics. 2021;9(7):e27955.
13. Kang H, Li J, Wu M, Shen L, Hou L. Building a pharmacogenomics knowledge model toward precision medicine: case study in melanoma. JMIR Medical Informatics. 2020;8(10):e20291.
14. Zhou S, Wang N, Wang L, Liu H, Zhang R. CancerBERT: a cancer domain-specific language model for extracting breast cancer phenotypes from electronic health records. Journal of the American Medical Informatics Association. 2022;29(7):1208-16.
15. Zhang X, Zhang Y, Zhang Q, Ren Y, Qiu T, Ma J, et al. Extracting comprehensive clinical information for breast cancer using deep learning methods. International journal of medical informatics. 2019;132:103985.
16. Leyh-Bannurah S-R, Tian Z, Karakiewicz PI, Wolffgang U, Sauter G, Fisch M, et al. Deep learning for natural language processing in urology: state-of-the-art automated extraction of detailed pathologic prostate cancer data from narratively written electronic health records. JCO clinical cancer informatics. 2018;2:1-9.
17. Yim W-w, Kwan SW, Yetisgen M. Tumor reference resolution and characteristic extraction in radiology reports for liver cancer stage prediction. Journal of biomedical informatics. 2016;64:179-91.
18. Yim W-w, Denman T, Kwan SW, Yetisgen M. Tumor information extraction in radiology reports for hepatocellular carcinoma patients. AMIA Summits on Translational Science Proceedings. 2016;2016:455.
19. Ping X-O, Tseng Y-J, Chung Y, Wu Y-L, Hsu C-W, Yang P-M, et al. Information extraction for tracking liver cancer patients' statuses: from mixture of clinical narrative report types. Telemedicine and e-Health. 2013;19(9):704-10.
20. Tay SB, Low GH, Wong GJE, Tey HJ, Leong FL, Li C, et al. Use of natural language processing to infer sites of metastatic disease from radiology reports at scale. JCO Clinical Cancer Informatics. 2024;8:e2300122.
21. Oliwa T, Maron SB, Chase LM, Lomnicki S, Catenacci DV, Furner B, et al. Obtaining knowledge in pathology reports through a natural language processing approach with classification, named-entity recognition, and relation-extraction heuristics. JCO clinical cancer informatics. 2019;3:1-8.
22. Gao S, Young MT, Qiu JX, Yoon H-J, Christian JB, Fearn PA, et al. Hierarchical attention networks for information extraction from cancer pathology reports. Journal of the American Medical Informatics Association. 2018;25(3):321-30.
23. Ashish N, Dahm L, Boicey C. University of California, Irvine–Pathology Extraction Pipeline: The pathology extraction pipeline for information extraction from pathology reports. Health informatics journal. 2014;20(4):288-305.
24. Sugimoto K, Takeda T, Oh J-H, Wada S, Konishi S, Yamahata A, et al. Extracting clinical terms from radiology reports with deep learning. Journal of Biomedical Informatics. 2021;116:103729.
25. Weissman GE, Harhay MO, Lugo RM, Fuchs BD, Halpern SD, Mikkelsen ME. Natural language processing to assess documentation of features of critical illness in discharge documents of acute respiratory distress syndrome survivors. Annals of the American Thoracic Society. 2016;13(9):1538-45.
26. Mojibian A, Jaskolka J, Ching G, Lee B, Myers R, Devine C, et al. The Efficacy of a Named Entity Recognition AI Model for Identifying Incidental Pulmonary Nodules in CT Reports. Canadian Association of Radiologists Journal. 2025;76(1):68-75.
27. Wang Y, Chen ES, Pakhomov S, Arsoniadis E, Carter EW, Lindemann E, et al., editors. Automated extraction of substance use information from clinical texts. AMIA Annual Symposium Proceedings; 2015.
28. Poulsen M, Troiani V, Freda P, Mowery D, Davoudi A. Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries.
29. Lybarger K, Ostendorf M, Yetisgen M. Annotating social determinants of health using active learning, and characterizing determinants using neural event extraction. Journal of Biomedical Informatics. 2021;113:103631.
30. Uzuner Ö, Goldstein I, Luo Y, Kohane I. Identifying patient smoking status from medical discharge records. Journal of the American Medical Informatics Association. 2008;15(1):14-24.
31. Dai Z, Li Z, Han L, editors. BoneBert: A BERT-based automated information extraction system of radiology reports for bone fracture detection and diagnosis. Advances in Intelligent Data Analysis XIX: 19th International Symposium on Intelligent Data Analysis, IDA 2021, Porto, Portugal, April 26–28, 2021, Proceedings 19; 2021: Springer.
32. Topaz M, Lai K, Dowding D, Lei VJ, Zisberg A, Bowles KH, et al. Automated identification of wound information in clinical notes of patients with heart diseases: Developing and validating a natural language processing application. International journal of nursing studies. 2016;64:25-31.
33. Topaz M, Murga L, Gaddis KM, McDonald MV, Bar-Bachar O, Goldberg Y, et al. Mining fall-related information in clinical notes: Comparison of rule-based and novel word embedding-based machine learning approaches. Journal of biomedical informatics. 2019;90:103103.
34. Zhao X, Rios A. A marker-based neural network system for extracting social determinants of health. Journal of the American Medical Informatics Association. 2023;30(8):1398-407.
35. Lituiev DS, Lacar B, Pak S, Abramowitsch PL, De Marchis EH, Peterson TA. Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients. Journal of the American Medical Informatics Association. 2023;30(8):1438-47.
36. Tian Z, Sun S, Eguale T, Rochefort CM. Automated extraction of VTE events from narrative radiology reports in electronic health records: a validation study. Medical care. 2017;55(10):e73-e80.
37. Mitra A, Rawat BPS, McManus D, Kapoor A, Yu H, editors. Bleeding entity recognition in electronic health records: a comprehensive analysis of end-to-end systems. AMIA Annual Symposium Proceedings; 2021.
38. Yu Z, Yang X, Sweeting GL, Ma Y, Stolte SE, Fang R, et al. Identify diabetic retinopathy-related clinical concepts and their attributes using transformer-based natural language processing methods. BMC medical informatics and decision making. 2022;22(Suppl 3):255.
39. Lin W-C, Chen JS, Kaluzny J, Chen A, Chiang MF, Hribar MR, editors. Extraction of active medications and adherence using natural language processing for glaucoma patients. AMIA Annual Symposium Proceedings; 2022.
40. Wang SY, Huang J, Hwang H, Hu W, Tao S, Hernandez-Boussard T. Leveraging weak supervision to perform named entity recognition in electronic health records progress notes to identify the ophthalmology exam. International journal of medical informatics. 2022;167:104864.
41. Rozova V, Khanina A, Teng JC, Teh JS, Worth LJ, Slavin MA, et al. Detecting evidence of invasive fungal infections in cytology and histopathology reports enriched with concept-level annotations. Journal of Biomedical Informatics. 2023;139:104293.
42. Rozova V, Khanina A, Ong J, Alipour R, Worth L, Slavin M, Thursky K, Verspoor K. PIFIR: PET-CT Invasive Fungal Infection Reports (version 1.0.0). PhysioNet. 2025.
43. Rozova V, Khanina A, Teng J, Teh J, Worth L, Slavin M, Thursky K, Verspoor K. CHIFIR: Cytology and Histopathology Invasive Fungal Infection Reports. PhysioNet. 2024.
44. Sterckx L, Vandewiele G, Dehaene I, Janssens O, Ongenae F, De Backere F, et al. Clinical information extraction for preterm birth risk prediction. Journal of Biomedical Informatics. 2020;110:103544.
45. Mishra R, Burke A, Gitman B, Verma P, Engelstad M, Haendel MA, et al. Data-driven method to enhance craniofacial and oral phenotype vocabularies. The Journal of the American Dental Association. 2019;150(11):933-9.e2.
46. Seong D, Choi YH, Shin S-Y, Yi B-K. Deep learning approach to detection of colonoscopic information from unstructured reports. BMC Medical Informatics and Decision Making. 2023;23(1):28.
47. Denny JC, Peterson JF, Choma NN, Xu H, Miller RA, Bastarache L, et al. Extracting timing and status descriptors for colonoscopy testing from electronic medical records. Journal of the American Medical Informatics Association. 2010;17(4):383-8.
48. Pathak A, Yu Z, Paredes D, Monsour EP, Rocha AO, Brito JP, et al., editors. Extracting thyroid nodules characteristics from ultrasound reports using transformer-based natural language processing methods. AMIA Annual Symposium Proceedings; 2024.
49. Chen T, Dredze M, Weiner JP, Hernandez L, Kimura J, Kharrazi H. Extraction of geriatric syndromes from electronic health record clinical notes: assessment of statistical natural language processing methods. JMIR medical informatics. 2019;7(1):e13039.
50. Valente AS, Trunfio TA, Aiello M, Baldi D, Baldi M, Imbò S, et al. Text mining approach for feature extraction and cartilage disease grade classification using knee MRI radiology reports. Computational and Structural Biotechnology Journal. 2024;24:622-9.
51. Bitterman DS, Goldner E, Finan S, Harris D, Durbin EB, Hochheiser H, et al. An end-to-end natural language processing system for automatically extracting radiation therapy events from clinical texts. International Journal of Radiation Oncology, Biology, Physics. 2023;117(1):262-73.
52. Kormilitzin A, Vaci N, Liu Q, Nevado-Holgado A. Med7: A transferable clinical natural language processing model for electronic health records. Artificial Intelligence in Medicine. 2021;118:102086.
53. Kusa W, Mendoza ÓE, Knoth P, Pasi G, Hanbury A. Effective matching of patients to clinical trials using entity extraction and neural re-ranking. Journal of biomedical informatics. 2023;144:104444.
54. Yang H. Automatic extraction of medication information from medical discharge summaries. Journal of the American Medical Informatics Association. 2010;17(5):545-8.
55. Cho M, Ha J, Park C, Park S. Combinatorial feature embedding based on CNN and LSTM for biomedical named entity recognition. Journal of biomedical informatics. 2020;103:103381.
56. Polignano M, de Gemmis M, Semeraro G, editors. Comparing Transformer-based NER approaches for analysing textual medical diagnoses. CLEF (Working Notes); 2021.
57. Li F, Liu W, Yu H. Extraction of information related to adverse drug events from electronic health record notes: design of an end-to-end model based on deep learning. JMIR medical informatics. 2018;6(4):e12159.
58. Dandala B, Joopudi V, Tsou C-H, Liang JJ, Suryanarayanan P. Extraction of information related to drug safety surveillance from electronic health record notes: Joint modeling of entities and relations using knowledge-aware neural attentive models. JMIR medical informatics. 2020;8(7):e18417.
59. Suresh S, Tavabi N, Golchin S, Gilreath L, Garcia-Andujar R, Kim A, et al., editors. Intermediate domain finetuning for weakly supervised domain-adaptive clinical NER. The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks; 2023.
60. Thukral A, Dhiman S, Meher R, Bedi P. Knowledge graph enrichment from clinical narratives using NLP, NER, and biomedical ontologies for healthcare applications. International Journal of Information Technology. 2023;15(1):53-65.
61. Xu H, Stenner SP, Doan S, Johnson KB, Waitman LR, Denny JC. MedEx: a medication information extraction system for clinical narratives. Journal of the American Medical Informatics Association. 2010;17(1):19-24.
62. Yehia E, Boshnak H, AbdelGaber S, Abdo A, Elzanfaly DS. Ontology-based clinical information extraction from physician’s free-text notes. Journal of biomedical informatics. 2019;98:103276.
63. Platas A, Zotova E, Martínez-Arias P, López-Linares K, Cuadros M. Synthetic Annotated Data for Named Entity Recognition in Computed Tomography Scan Reports. 2024.
64. Jain S, Agrawal A, Saporta A, Truong SQ, Duong DN, Bui T, et al. RadGraph: Extracting clinical entities and relations from radiology reports. arXiv preprint arXiv:2106.14463. 2021.
65. Bear Don't Walk IV O, Pichon A, Reyes Nieva H, Sun T, Li J, Joseph JW, Kinberg S, Richter LR, Crusco S, Kulas K, Ahmed S, Snyder D, Rahbari A, Ranard B, Juneja P, Demner-Fushman D, Elhadad N. C-REACT: Contextualized Race and Ethnicity Annotations for Clinical Text (version 1.0.0). PhysioNet. 2024.
66. Goel A, Gueta A, Gilon O, Erell S, Feder A. Medication Extraction Labels for MIMIC-IV-Note Clinical Database. PhysioNet; 2023.
67. Patrick J, Li M. High accuracy information extraction of medication information from clinical notes: 2009 i2b2 medication extraction challenge. Journal of the American Medical Informatics Association. 2010;17(5):524-7.
68. Uzuner Ö, South BR, Shen S, DuVall SL. 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. Journal of the American Medical Informatics Association. 2011;18(5):552-6.
69. Stubbs A, Kotfila C, Uzuner Ö. Automated systems for the de-identification of longitudinal clinical narratives: Overview of 2014 i2b2/UTHealth shared task Track 1. Journal of biomedical informatics. 2015;58:S11-S9.
70. Pradhan S, Elhadad N, Chapman W, Manandhar S, Savova G, editors. SemEval-2014 task 7: Analysis of clinical text. Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014); 2014.

Repository: BIDS-Xu-Lab/UNER_Prompt_Library
