
Praveen Kumar

Research Assistant Professor
School of Medicine, University of New Mexico

MY SHORT BIO

I hold a Bachelor's degree in Computer Engineering from Sardar Vallabhbhai National Institute of Technology (NIT) Surat, India, and both a Master's and Ph.D. in Computer Science from the University of New Mexico (UNM), Albuquerque, USA. Following my undergraduate studies, I spent about 12 years in the IT industry, working across the banking, insurance, and travel sectors in roles such as Software Engineer, Systems Analyst, and Associate Project Manager. However, my passion for artificial intelligence (AI) and machine learning (ML) led me to transition into academia. I returned to graduate school to pursue advanced studies, culminating in a Ph.D.

My research expertise spans health informatics and cheminformatics, focusing on developing AI/ML algorithms to extract insights from complex, noisy, and high-dimensional datasets. This includes working with patient health records, chemical compound data, and biomedical knowledge graphs. In health informatics, I develop methods to identify uncoded/undocumented mental health conditions and co-occurring disorders from electronic health records (EHR) and claims data. In cheminformatics, my work centers on discovering potential therapeutic compounds by analyzing chemical datasets. I also work on biomedical knowledge graphs to infer associations between biological entities such as genes, compounds, and diseases. Recently, I have been developing ML-based risk models for the early detection of cardiometabolic diseases.

Beyond my academic research, I have experience in teaching and mentoring, primarily gained during my time in the IT industry, where I was responsible for training new team members in both technical skills and domain-specific knowledge. I also hold certifications in Equity Derivatives and Mutual Funds from the National Stock Exchange (NSE) of India.

RESEARCH PUBLICATIONS

Journal/Conference Articles

  • Ahooyi TM, Stear B, Simmons JA, Metzger VT, Kumar P, Evangelista JE, Clarke DJB, Xie Z, Kim H, Jenkins SL, Maurya MR, Ramachandran S, Fahy E, Gillespie TH, Imam FT, Kokash N, Roth ME, Fullem R, Jevtic D, Mihajlovic A, Tiemeyer M, Bakker C, Schroeder AJ, Markowski J, Nedzel J, Hill DD, Terry J, Nemarich C, Boline J, Park PJ, Ardlie KG, Vora J, Mazumder R, Ranzinger R, de Bono B, Subramaniam S, Grethe JS, Yang JJ, Lambert CG, Resnick A, Milosavljevic A, Ma'ayan A, Silverstein JC, Taylor DM. The Data Distillery: A Graph Framework for Semantic Integration and Querying of Biomedical Data. bioRxiv [Preprint]. 2025 Oct 16:2025.08.11.666099. doi: 10.1101/2025.08.11.666099. PMID: 40832351; PMCID: PMC12363844. Link
  • D. Masood, M. Kim, J. Vora, R. Kahsay, P. McNeeley, S. Kim, S. Kulkarni, D. A. Natale, S. Ramachandran, S. Gupta, M. Maurya, C. G. Bologa, T. S. DeNapoli, V. T. Metzger, P. Kumar, N. Ahmed, J. E. Evangelista, S. C. Kelly, J. L. Sepulveda, A. Ma'ayan, J. Silverstein, D. M. Taylor, D. J. Crichton, A. Mahabal, J. J. Yang, C. G. Lambert, S. Subramaniam, M. Tiemeyer, R. Ranzinger, and R. Mazumder. BiomarkerKB: An integrated knowledgebase supporting biomarker-centric exploration of biomedical data, 2026. Link
  • Kumar P, Viszolay AD, Upadhayaya R, Moomtaheen F, Greer DR, Bologa CG, Schneider KA, Davis SE, Matheny ME, van der Goes D, Villarreal G, Zhu Y, Tohen M, Malec SA, Yang JJ, Fielstein EM, Lambert CG. Detecting Undiagnosed Mental Health Conditions Using Positive and Unlabeled Learning: Identifying Uncoded Self-Harm in Veterans’ Electronic Health Records. JMIR Preprints. 05/12/2025:89071. Link
  • Kumar P, Metzger VT, Purushotham ST, Kedia P, Bologa CG, Lambert CG, Yang JJ. KG2ML: integrating knowledge graphs and positive unlabeled learning for identifying disease-associated genes. Front Bioinform. 2026 Jan 8;5:1727953. doi: 10.3389/fbinf.2025.1727953. PMID: 41584517; PMCID: PMC12823822. Link
  • Praveen Kumar, Vincent T. Metzger, and Scott Alexander Malec. 2025. Unsupervised Latent Pattern Analysis for Estimating Type 2 Diabetes Risk in Undiagnosed Populations. Proceedings of the 16th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. Association for Computing Machinery, New York, NY, USA, Article 49, 1–10. Link
  • Kumar P, Lambert CG. Positive Unlabeled Learning Selected Not At Random (PULSNAR): class proportion estimation without the selected completely at random assumption. PeerJ Comput Sci. 2024 Nov 5;10:e2451. doi: 10.7717/peerj-cs.2451. PMID: 39650456; PMCID: PMC11622864. Link
  • Kumar P, Moomtaheen F, Malec SA, Yang JJ, Bologa CG, Schneider KA, Zhu Y, Tohen M, Villarreal G, Perkins DJ, Fielstein EM, Davis SE, Matheny ME, Lambert CG. Detecting Opioid Use Disorder in Health Claims Data With Positive Unlabeled Learning. IEEE J Biomed Health Inform. 2025 Feb;29(2):750-757. doi: 10.1109/JBHI.2024.3515805. Epub 2025 Feb 10. PMID: 40030473; PMCID: PMC11971012. Link
  • Ranjbar M, Yang JJ, Kumar P, Byrd DR, Bearer EL, Oprea TI. Autophagy dark genes: Can we find them with machine learning? Natural Sciences. 2023 Jul;3(3):e20220067. Link
  • Evangelista JE, Clarke DJB, Xie Z, Marino GB, Utti V, Jenkins SL, Ahooyi TM, Bologa CG, Yang JJ, Binder JL, Kumar P, Lambert CG, Grethe JS, Wenger E, Taylor D, Oprea TI, de Bono B, Ma'ayan A. Toxicology knowledge graph for structural birth defects. Commun Med (Lond). 2023 Jul 17;3(1):98. doi: 10.1038/s43856-023-00329-2. PMID: 37460679; PMCID: PMC10352311. Link
  • Jarratt L, Situ J, King RD, Montanez Ramos E, Groves H, Ormesher R, Cossé M, Raboff A, Mahajan A, Thompson J, Ko RF, Paltrow-Krulwich S, Price A, Hurwitz AM, CampBell T, Epler LT, Nguyen F, Wolinsky E, Edwards-Fligner M, Lobo J, Rivera D, Langsjoen J, Sloane L, Hendrix I, Munde EO, Onyango CO, Olewe PK, Anyona SB, Yingling AV, Lauve NR, Kumar P, Stoicu S, Nestsiarovich A, Bologa CG, Oprea TI, Tollestrup K, Myers OB, Anixter M, Perkins DJ, Lambert CG. A Comprehensive COVID-19 Daily News and Medical Literature Briefing to Inform Health Care and Policy in New Mexico: Implementation Study. JMIR Med Educ. 2022 Feb 23;8(1):e23845. doi: 10.2196/23845. PMID: 35142625; PMCID: PMC8908195. Link
  • Binder J, Ursu O, Bologa C, Jiang S, Maphis N, Dadras S, Chisholm D, Weick J, Myers O, Kumar P, Yang JJ, Bhaskar K, Oprea TI. Machine learning prediction and tau-based screening identifies potential Alzheimer's disease genes relevant to immunity. Commun Biol. 2022 Feb 11;5(1):125. doi: 10.1038/s42003-022-03068-7. PMID: 35149761; PMCID: PMC8837797. Link
  • Nestsiarovich A, Kumar P, Lauve NR, Hurwitz NG, Mazurie AJ, Cannon DC, Zhu Y, Nelson SJ, Crisanti AS, Kerner B, Tohen M, Perkins DJ, Lambert CG. Using Machine Learning Imputed Outcomes to Assess Drug-Dependent Risk of Self-Harm in Patients with Bipolar Disorder: A Comparative Effectiveness Study. JMIR Ment Health. 2021 Apr 21;8(4):e24522. doi: 10.2196/24522. PMID: 33688834; PMCID: PMC8100888. Link
  • Kumar P, Nestsiarovich A, Nelson SJ, Kerner B, Perkins DJ, Lambert CG. Imputation and characterization of uncoded self-harm in major mental illness using machine learning. J Am Med Inform Assoc. 2020 Jan 1;27(1):136-146. doi: 10.1093/jamia/ocz173. PMID: 31651956; PMCID: PMC7647246. Link
  • Zahoránszky-Kőhalmi G, Siramshetty VB, Kumar P, Gurumurthy M, Grillo B, Mathew B, Metaxatos D, Backus M, Mierzwa T, Simon R, Grishagin I, Brovold L, Mathé EA, Hall MD, Michael SG, Godfrey AG, Mestres J, Jensen LJ, Oprea TI. A Workflow of Integrated Resources to Catalyze Network Pharmacology Driven COVID-19 Research. J Chem Inf Model. 2022 Feb 14;62(3):718-729. doi: 10.1021/acs.jcim.1c00431. Epub 2022 Jan 20. PMID: 35057621; PMCID: PMC10790216. Link
  • Cavanagh JF, Kumar P, Mueller AA, Richardson SP, Mueen A. Diminished EEG habituation to novel events effectively classifies Parkinson's patients. Clin Neurophysiol. 2018 Feb;129(2):409-418. doi: 10.1016/j.clinph.2017.11.023. Epub 2017 Dec 13. PMID: 29294412; PMCID: PMC5999543. Link

Conference Posters/Talks

  • P. Kumar, K. A. Schneider, F. Moomtaheen, S. A. Malec, J. J. Yang, C. G. Bologa, Y. Zhu, M. Tohen, G. Villarreal, D. J. Perkins, E. M. Fielstein, S. E. Davis, M. E. Matheny, and C. G. Lambert. Evaluating the quality of positive unlabeled learning methods if unlabeled instances cannot be validated. OHDSI Symposium, 2025. Link
  • P. Kumar, K. A. Schneider, F. Moomtaheen, S. A. Malec, J. J. Yang, C. G. Bologa, Y. Zhu, M. Tohen, G. Villarreal, D. J. Perkins, E. M. Fielstein, S. E. Davis, M. E. Matheny, and C. G. Lambert. Data-driven identification of comorbidities and pharmacological patterns in patients with sleep disorders. OHDSI Symposium, 2025. Link
  • P. Kumar, V. Metzger, S. Purushotham, P. Kedia, C. Bologa, C. G. Lambert, and J. Yang. KG2ML: Integrating knowledge graphs and positive unlabeled learning for identifying disease-associated genes with case studies for 12 diseases. Common Fund Data Ecosystem (CFDE) All-Hands Meeting, 2025. Link
  • P. Kumar and V. T. Metzger. Predicting type 2 diabetes risk: A non-negative matrix factorization approach for feature selection. IEEE BHI, 2024. Link
  • P. Kumar, F. Moomtaheen, S. A. Malec, J. J. Yang, C. G. Bologa, K. A. Schneider, Y. Zhu, M. Tohen, G. Villarreal, D. J. Perkins, E. M. Fielstein, S. E. Davis, M. E. Matheny, and C. G. Lambert. Detecting opioid use disorder in health claims data with positive unlabeled learning. IEEE BHI, 2024.
  • P. Kumar, F. Moomtaheen, S. A. Malec, J. J. Yang, C. G. Bologa, K. A. Schneider, Y. Zhu, M. Tohen, G. Villarreal, D. J. Perkins, E. M. Fielstein, S. E. Davis, M. E. Matheny, and C. G. Lambert. Quantifying the opioid use disorder crisis: PULSNAR finds nearly 3/4 undiagnosed. OHDSI Symposium, 2024. Link
  • P. Kumar, V. Metzger, S. Purushotham, P. Kedia, C. G. Lambert, and J. Yang. Illuminating the druggable genome (IDG) scientific use cases powered by the CFDE Data Distillery biomedical knowledge graph, integrating multiple Common Fund datasets. Common Fund Data Ecosystem (CFDE) All-Hands Meeting, 2024. Link
  • P. Kumar and C. G. Lambert. Improving the detection of behavioral health conditions through positive and unlabeled learning: Opioid use disorder. OHDSI Symposium, 2023. Link
  • P. Kumar, J. Tsosie, and C. G. Lambert. Improving the detection of behavioral health conditions through positive and unlabeled learning: Self-harm and opioid use disorder. UNM Brain and Behavioral Health, 2023. Link
  • P. Kumar, S. E. Davis, M. E. Matheny, G. Villarreal, Y. Zhu, M. Tohen, D. J. Perkins, and C. G. Lambert. PULSNAR: Positive Unlabeled Learning Selected Not At Random–towards imputing undocumented conditions in EHRs and estimating their incidence. OHDSI Symposium, 2022. Link
  • S. E. Davis, P. Kumar, N. R. Lauve, S. K. Parr, D. Park, M. E. Matheny, G. Villarreal, Y. Zhu, M. Tohen, G. Uhl, D. J. Perkins, and C. G. Lambert. Disparities in coded and imputed post-traumatic stress disorder and self-harm among US veterans. AMIA, 2021.
  • P. Kumar, N. R. Lauve, S. E. Davis, S. K. Parr, D. Park, M. E. Matheny, G. Villarreal, G. Uhl, Y. Zhu, and M. Tohen. Detecting PTSD and self-harm among US veterans using positive unlabeled learning. OHDSI Symposium, 2021. Link
  • P. Kumar, J. J. Yang, D. Byrd, O. Ursu, C. G. Bologa, S. L. Mathias, J. Berendzen, and T. I. Oprea. ProteinGraphML - predicting disease-to-protein associations from a biomedical knowledge graph. FASEB, 2021.
  • A. Nestsiarovich, P. Kumar, N. R. Lauve, A. J. Mazurie, N. G. Hurwitz, D. C. Cannon, Y. Zhu, S. J. Nelson, A. S. Crisanti, B. Kerner, M. Tohen, D. J. Perkins, and C. G. Lambert. Comparing drug-dependent risk of self-harm in bipolar disorder using machine learning imputed outcomes. OHDSI Symposium, 2020. Link
  • P. Kumar, A. Nestsiarovich, S. J. Nelson, B. Kerner, D. J. Perkins, and C. G. Lambert. Visit level machine learning imputation of uncoded self-harm in major mental illness and characterization of incidence of self-harm. OHDSI Symposium, 2019. Link
  • P. Kumar, A. Nestsiarovich, A. J. Mazurie, N. G. Hurwitz, S. J. Nelson, and C. G. Lambert. Visit level suicidality/self-harm phenotyping in major mental illness. AMIA, 2018.
  • P. Kumar, A. Nestsiarovich, A. J. Mazurie, N. G. Hurwitz, S. J. Nelson, and C. G. Lambert. Visit level suicidality/self-harm phenotyping in bipolar disorder. OHDSI Symposium, 2017. Link
  • P. Kumar, Amritansh, and C. G. Lambert. Transforming the 2.33M-patient Medicare Synthetic Public Use Files to the OMOP CDMv5: ETL-CMS software and processed data available and feature-complete. OHDSI Symposium, 2016. Link

Ph.D. Dissertation

  • Machine learning methods for computational phenotyping using patient healthcare data with noisy labels. Department of Computer Science, The University of New Mexico, Albuquerque, NM, Dec. 2022. Link

Colloquium Talks

  • Imputation and characterization of uncoded self-harm in major mental illness using machine learning. Department of Computer Science, University of New Mexico. 16 September 2020

PEER REVIEW ACTIVITIES

ACADEMIC AND PROFESSIONAL SERVICES

RESEARCH AND INDUSTRY EXPERIENCE

Research Assistant Professor · University of New Mexico Health Sciences Center

Feb 2024 — Present

Data Scientist II · University of New Mexico Health Sciences Center

Dec 2022 — Jan 2024

Graduate Research Assistant · University of New Mexico Health Sciences Center

Mar 2016 — Nov 2022

System Analyst/Associate Project Manager · Interglobe Inc.

Jul 2011 — Aug 2015

Lead Software Engineer · Interglobe Ltd.

May 2007 — Jul 2011

System Analyst · Fiserv Ltd.

Apr 2006 — May 2007

Software Engineer · Computer Sciences Corporation Ltd.

Dec 2004 — Mar 2006

Software Engineer · Satyam Computer Services Ltd.

Jul 2003 — Dec 2004

EDUCATION

Ph.D. in Computer Science · University of New Mexico, Albuquerque, NM, USA

2018 — 2022
Distinction, GPA: 4.04/4.00

M.S. in Computer Science · University of New Mexico, Albuquerque, NM, USA

2015 — 2017
GPA: 3.81/4.00

B.E. in Computer Engineering · Sardar Vallabhbhai National Institute of Technology, Surat, Gujarat, India

1999 — 2003
First Class with Distinction

TECHNICAL SKILLS

Programming Languages: Python, R, MATLAB, PHP, C, SQL, CQL, HTML, JavaScript, CSS

Databases: Neo4j, MySQL, and PostgreSQL

Operating Systems: Windows and Linux

Web Servers: Apache and Nginx

SCHOLARSHIPS, CERTIFICATIONS AND AWARDS

HOBBIES AND INTERESTS

Apart from my research pursuits, I have a wide range of interests. I am passionate about exploring the ancient history of world civilizations and uncovering the mysteries and accomplishments of past cultures. I also have a strong interest in finance and stock market investing, particularly in analyzing economic trends and market behavior. In my free time, I enjoy teaching mathematics to middle and high school students, sharing my enthusiasm for problem-solving and logical thinking.