Haihua Chen
Assistant Professor
University of North Texas
Department of Data Science
(940) 220-0057
Email: Haihua.Chen@unt.edu
Education
- PhD, University of North Texas, 2022
Major: Information Science
Dissertation: Data Quality Evaluation and Improvement for Machine Learning
Teaching
Teaching Experience
- DTSC 3020 - Introduction to Computation with Python, 1 course.
- HINF 5506 - Applications of Artificial Intelligence in Health, 2 courses.
- INFO 4206 - Information Retrieval Systems, 1 course.
- INFO 5502 - Principles and Techniques for Data Science, 2 courses.
- INFO 5506 - Artificial Intelligence in Health, 1 course.
- INFO 5731 - Computational Methods for Information Systems, 17 courses.
- INFO 5810 - Data Analysis and Knowledge Discovery, 8 courses.
- INFO 5900 - Special Problems, 1 course.
- INFO 6660 - Readings in Information Science, 2 courses.
- INFO 6900 - Special Problems, 6 courses.
- INFO 6950 - Doctoral Dissertation, 1 course.
University of North Texas
Research
Published Intellectual Contributions
- Zhang, H., Li, R., Shrestha, S., Mamidala, S., Putta, R., Aggarwal, A., Xiao, T., Ding, J., Chen, H. (2025). ReviewGuard: Enhancing Deficient Peer Review Detection via LLM-Driven Data Augmentation. ACM/IEEE Joint Conference on Digital Libraries.
- Li, R., Zhang, H., Gehringer, E., Xiao, T., Ding, J., Chen, H. (2025). Unveiling the Merits and Defects of LLMs in Automatic Review Generation for Scientific Papers. IEEE International Conference on Data Mining (ICDM).
- Kim, J., Chen, H., Yang, L., Simic, J. (2024). Exploring the application of artificial intelligence and machine learning in GLAM collections. Proceedings of the Association for Information Science and Technology.
- Chen, H., Kim, J., Yang, L., de Fremery, W., Wang, X. (2024). Utilizing AI/ML to enhance information extraction, organization, and retrieval from large-scale archival collections. Proceedings of the 24th ACM/IEEE Joint Conference on Digital Libraries.
- Zhou, Y., Tu, F., Sha, K., Ding, J., Chen, H. (2024). A Survey of Data Quality Evaluation and Tools for Machine Learning. 2024 IEEE Intl. Conference on AI Testing.
- Ding, J., Nguyen, H., Chen, H. (2024). Evaluation of Question-Answering Based Text Summarization using LLM. 2024 IEEE Intl. Conference on AI Testing.
- Zhang, X., Chen, H., Chong, M., Hagen, L. (2024). A Framework for Assessing Country Reputation: Case Study of China during the COVID-19 Pandemic. New York, Association for Computing Machinery. https://dl.acm.org/doi/proceedings/10.1145/3657054
- Nguyen, H., Chen, H., Maganti, R., Hossain, T., Ding, J. Identifying High-quality Informative Comments for Software Review Summarization. IEEE AITest.
- Feng, Y., Vanam, S., Cherukupally, M., Zheng, W., Qiu, M., Chen, H. (2023). Investigating Code Generation Performance of Chat-GPT with Crowdsourcing Social Data. Proceedings of the 47th IEEE Computer Software and Applications Conference. 1-10.
- Qin, C., , Y.Y., Chen, H., Ding, J. (2021). A Comparison Study of Machine Learning and Deep Learning for Legal Contract Understanding. JURISIN 2021: 15 Intl. Workshop on Juris-informatics.
- Chen, J., Chen, H., Tang, M. (2020). An Ontology-based Semantic Information Retrieval System. IEEE.
- Chen, J., Chen, H., Tang, M. (2020). Smart bookshelf for library book management. Proceedings of the Association for Information Science and Technology, 2020.
- Chen, H., Cao, G., Chen, J., Ding, J. (2019). A Practical Framework for Evaluating the Quality of Knowledge Graph. Knowledge Graph and Semantic Computing: Knowledge Computing and Language Understanding. 12. Singapore, Springer.
- Zhang, Y., Wang, Y., Sheng, Q.Z., Yao, L., Chen, H., Wang, K., Mahmood, A., Zhang, W.E., Zaib, M., Sagar, S., et al. (2025). Deep learning meets bibliometrics: A survey of citation function classification. Other. 19 (1) 101608. Elsevier.
- Zhang, H., Li, R., Zhang, Y., Xiao, T., Chen, J., Ding, J., Chen, H. (2025). The evolving role of large language models in scientific innovation: Evaluator, collaborator, and scientist. Other.
- Chen, H., Zhou, Y., Li, R., Illa, A.M., Cleveland, A.D., Ding, J. (2025). A Comprehensive Survey on Medical Concept Normalization: Datasets, Techniques, Applications, and Future Directions. SSRN.
- Li, Y., Chen, H., Lund, B., Ma, R., Le, Y., Sweeney, M. (2025). Libraries in the Age of LLMs: Perceptions, Practices, and the Future of Scholarly Work. Proceedings of the Association for Information Science and Technology. 62 (1) 1254-1257. https://api.elsevier.com/content/abstract/scopus_id/105019359991
- Wang, Z., Wang, N., Zhang, H., Wang, Z., Wang, Z., Ding, J., Chen, H. (2025). IBID-CCT: A Novel Model for Interdisciplinary Breakthrough Innovation Detection based on Cusp Catastrophe Theory. Information Processing & Management.
- Tran, N., Chen, H., Cleveland, A.D., Zhou, Y. (2025). Disaster Informatics after the COVID-19 Pandemic: Bibliometric and Topic Analysis based on Large-scale Academic Literature. arXiv.
- Chen, H., Li, R., Cleveland, A.D., Ding, J. (2025). Enhancing data quality in medical concept normalization through large language models. Journal of Biomedical Informatics. 165.
- Chen, H., Li, R., Cleveland, A., Ding, J. (2025). Enhancing Data Quality in Medical Concept Normalization through Large Language Models. Elsevier.
- Kim, J., Chen, H., Bloechle, M. (2025). Editorial: A roadmap to ethical research. The Electronic Library.
- Zhao, H., Chen, H., Ruggles, T.A., Feng, Y., Singh, D., Yoon, H. (2024). Improving Text Classification with Large Language Model-Based Data Augmentation. Other. 13 (13) 2535. MDPI.
- Jahan, R.I., Fan, H., Chen, H., Feng, Y. (2024). Unlocking Cross-Lingual Sentiment Analysis through Emoji Interpretation: A Multimodal Generative AI Approach. Other.
- Kargozari, K., Ding, J., Chen, H. (2024). Empowering Consumer Decision-Making: Decoding Incentive vs. Organic Reviews for Smarter Choices Through Advanced Textual Analysis. Electronics. 13 (21).
- Chen, H., Kim, J., Chen, J. (2024). Demystifying Oral History with Natural Language Processing and Data Analytics: A Case Study of Densho Digital Collection. The Electronic Library. 42 (4) 643-665. Emerald.
- Tu, F., Wu, L., Kinshuk, X., Ding, J., Chen, H. (2024). Exploring the Influence of Regulated Learning Processes on Learners. Education and Information Technologies.
- Wang, Z., Zhang, H., Chen, H., Feng, Y., Ding, J. (2024). Content-based Quality Evaluation of Scientific Papers using Coarse Feature and Knowledge Entity Network. Journal of King Saud University - Computer and Information Sciences.
- Wang, Z., Qiao, X., Chen, J., Li, L., Zhang, H., Ding, J., Chen, H. (2024). Exploring and evaluating the index for interdisciplinary breakthrough innovation detection. The Electronic Library.
- Ge, J., Xu, G., Zhang, Y., Lu, J., Chen, H., Meng, X. (2023). Joint optimization of computation, communication and caching in D2D-assisted caching-enhanced MEC system. Other. 12 (15) 3249. MDPI.
- Wang, Z., Peng, S., Chen, J., Zhang, X., Chen, H. (2023). ICAD-MI: Interdisciplinary concept association discovery from the perspective of metaphor interpretation. Knowledge-Based Systems. 275 110695. Elsevier BV. http://dx.doi.org/10.1016/j.knosys.2023.110695
- Zhang, Y., Zhao, R., Wang, Y., Chen, H., Mahmood, A., Zaib, M., Zhang, W.E., Sheng, Q.Z. (2022). Towards employing native information in citation function classification. Other. 127 (11) 6557-6577. Springer International Publishing Cham.
- Tran, N., Chen, H., Bhuyen, J., Ding, J. (2022). Data Curation and Quality Evaluation for Machine Learning-Based Cyber Intrusion Detection. IEEE Access. 10 121900 - 121923. IEEE.
- Chen, H., Pieptea, L., Ding, J. (2022). Construction and Evaluation of a High-Quality Corpus for Legal Intelligence Using Semiautomated Approaches. IEEE Transactions on Reliability. 71 (2) 1-17. IEEE.
- Chen, H., Wu, L., Chen, J., Lu, W., Ding, J. (2021). A comparative study of automated legal text classification using random forests and deep learning. Information Processing & Management. 59 (2) Elsevier.
- Tran, N., Chen, H., Jiang, J., Bhuyan, J., Ding, J. (2021). Effect of Class Imbalance on the Performance of Machine Learning-based Network Intrusion Detection. International Journal of Performability Engineering. 17 (9) Totem.
- Cartwright, A.D., Carey, C.D., Chen, H. (2021). Multi-tiered intensive supervision: A culturally-informed method of clinical supervision. Teaching and Supervision in Counseling. https://trace.tennessee.edu/tsc/
- Chen, H., Chen, J., Ding, J. (2021). Data Evaluation and Enhancement for Quality Improvement of Machine Learning. IEEE Transactions on Reliability. 70 (2) IEEE.
- Chen, J., Chen, H., Tang, M. (2020). An ontology-improved vector space model for semantic retrieval. The Electronic Library.
- Chen, J., Lu, W., Chen, H. (2019). Result Diversification in Image Retrieval Based on Semantic Distance. Denton,
- Cui, J., Ma, Y., Zhang, J., Chen, H., Fang, R. (1996). Growth and characterization of diamond film on aluminum nitride. Materials Research Bulletin. 31 (7) 781-785. Pergamon.
- Ding, J., Chen, H., Feng, Y., Hossain, T. (2024). Applications of Deep Learning Techniques. Other. 13 (17) 3354. MDPI.
Conference Proceeding
Journal Article
Other
Presentations Given
- Chen, H. (Coordinator/Organizer), Kim, J. (Coordinator/Organizer), JCDL 2024 Conference, Utilizing AI/ML to enhance information extraction, organization, and retrieval from large-scale archival collections, Hong Kong. (2024 - 2024).
- Kim, J. (Author & Presenter), Chen, H. (Author & Presenter), Yang, L. (Author & Presenter), Simic, J. (Author), 2024 ASIS&T Annual Meeting, Exploring the application of artificial intelligence and machine learning in GLAM collections, Canada. (2024 - 2024).
- Zhou, Y. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), South Central Chapter of the Medical Library Association Annual Meeting, Enhancing Extreme Multi-label Medical Concept Normalization with LLM-based Data Fusion and Quality Improvement, South Central Chapter of the Medical Library Association, Little Rock, AR, United States of America. (2025 - 2025).
- Fulton, S. (Author), Chen, H. (Author), Cleveland, A.D. (Author), Patel, D.N. (Author), Vinnakota, J. (Author), South Central Chapter of the Medical Library Association Annual Meeting, Evaluating Prompting Strategies for Generating Structured Lung Cancer Staging Notes Using Large Language Models, South Central Chapter of the Medical Library Association, Little Rock, AR, United States of America. (2025 - 2025).
- Donepudi, S.S. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), Parimi, S. (Author), Potula, R.A. (Author), South Central Chapter of the Medical Library Association Annual Meeting, Semi-Automatic Dental Taxonomy Construction with Human and Generative AI Collaboration, South Central Chapter of the Medical Library Association, Little Rock, AR, United States of America. (2025 - 2025).
- Reyes, J. (Author & Presenter), Cleveland, A.D. (Author), Philbrick, J.L. (Author), Sharma, S. (Author), Chen, H. (Author), South Central Chapter of the Medical Library Association Annual Meeting, Use of virtual and augmented reality in health sciences libraries: A scoping review, South Central Chapter of the Medical Library Association, Little Rock, AR, United States of America. (2025 - 2025).
- Mantecon, H. (Author & Presenter), Cleveland, A.D. (Author), Sharma, S. (Author), Chen, H. (Author), South Central Chapter of the Medical Library Association Annual Meeting, Understanding Perceptions, Barriers, and Potential of Virtual Reality in the Mental Health Landscape: Graduate Student Perspectives, South Central Chapter of the Medical Library Association, Little Rock, AR, United States of America. (2025 - 2025).
- Zhou, Y. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), Doswell Health Informatics Conference, Data Fusion and Quality Enhancement for Medical Concept Normalization using LLMs, Texas Woman’s University College of Nursing, Dallas, TX, United States of America. (2025 - 2025).
- Mantecon, H. (Author & Presenter), Cleveland, A.D. (Author), Sharma, S. (Author), Chen, H. (Author), Doswell Health Informatics Conference, Exploring Virtual Reality as a Mental Health Intervention: Perceptions, Barriers, and Potential Among Graduate Students, Texas Woman’s University College of Nursing, Dallas, TX, United States of America. (2025 - 2025).
- Chen, H. (Author), Cleveland, A.D. (Author), Ding, J. (Author), Li, R. (Author), South Central Chapter of the Medical Library Association Annual Meeting, Evaluating ChatGPT-based data augmentation for medical concept normalization, South Central Chapter of the Medical Library Association, Dallas, TX, United States of America. (2024 - 2024).
- Tran, N. (Author), Cleveland, A.D. (Author), Chen, H. (Author & Presenter), Medical Library Association Annual Conference, Disaster Informatics Over Time: A Bibliometric Study from 2016 to 2022, Medical Library Association, Portland, OR, United States of America. (2024 - 2024).
- Tran, N. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), Joint Southern / South Central Chapter of the Medical Library Association Annual Meeting, Unveiling Key Research Themes in Disaster Informatics Amidst the Pandemic, Southern & South Central Chapters of the Medical Library Association, New Orleans, LA, United States of America. (2023 - 2023).
- Cleveland, A.D. (Author & Presenter), Chen, H. (Author), Joint Southern/South Central Chapter of the Medical Library Association Annual Meeting, Dental informatics over time: A 20-years longitudinal bibliometric analysis, Southern & South Central Chapters of the Medical Library Association, New Orleans, LA, United States of America. (2023 - 2023).
- Tran, N. (Author & Presenter), Cleveland, A.D. (Author & Presenter), Chen, H. (Author), Joint Southern/South Central Chapter of the Medical Library Association Annual Meeting, Exploring the Shift in Disaster Informatics Research 2016-2022, Southern & South Central Chapters of the Medical Library Association, New Orleans, LA, United States of America. (2023 - 2023).
- Zhou, Y. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), AI & Data Science Executive Summit Challenges, Opportunities, and the Future Ahead, Quality-aware Multi-source Data Fusion and Enhancement for Medical Concept Normalization using Large Language Models, The Anuradha and Vikas Sinha Department of Data Science, Frisco, TX, United States of America. (2025 - 2025).
- Challa, L. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), AI & Data Science Executive Summit Challenges, Opportunities, and the Future Ahead, Synthetic Oncology Data Generation & Evaluation Using Large Language Models, The Anuradha and Vikas Sinha Department of Data Science, Frisco, TX, United States of America. (2025 - 2025).
- Badhon, S.I. (Author & Presenter), Cleveland, A.D. (Author), Chen, H. (Author), Siddiqui, M.H. (Author), Das, D. (Author), Hossain, T. (Author), Fifth Annual Texas Health Informatics Alliance Conference, An LLM-Driven Clinical Support for Rural Nurse Practitioners in Texas, Texas Health Informatics Alliance, Arlington, TX, United States of America. (2025 - 2025).
- Zhou, Y. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), Fifth Annual Texas Health Informatics Alliance Conference, Data Fusion and Quality Enhancement for Medical Concept Normalization Using LLMs, Texas Health Informatics Alliance, Arlington, TX, United States of America. (2025 - 2025).
- Challa, L. (Author & Presenter), Patel, D.N. (Author), Vinnakota, J. (Author), Chen, H. (Author), Cleveland, A.D. (Author), Fifth Annual Texas Health Informatics Alliance Conference, Synthetic Oncology Data Generation & Evaluation Using Large Language Models, Texas Health Informatics Alliance, Arlington, TX, United States of America. (2025 - 2025).
- Badhon, S.I. (Author & Presenter), Cleveland, A.D. (Author), Chen, H. (Author), Das, D. (Author), Siddiqui, M.H. (Author), Hossain, T. (Author), Doswell Health Informatics Conference, Empowering Rural Nurse Practitioners with a Large Language Model-Based Support System, Texas Woman’s University College of Nursing, Dallas, TX, United States of America. (2025 - 2025).
- Zhou, Y. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), Multidisciplinary Information Research Symposium (MIRS), Quality-aware Multi-source Data Fusion and Enhancement for Medical Concept Normalization using Large Language Models, UNT College of Information, Denton, TX, United States of America. (2025 - 2025).
- Tran, N. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), Multidisciplinary Information Research Symposium (MIRS), Uncovering Disaster Informatics in the Pandemic Era: A Topic Modeling Analysis, UNT College of Information, Denton, TX, United States of America. (2025 - 2025).
- Mantecon, H. (Author & Presenter), Cleveland, A.D. (Author), Sharma, S. (Author), Chen, H. (Author), Multidisciplinary Information Research Symposium (MIRS), Virtual Reality for Academic Anxiety and Stress Reduction: A Pilot Study on Perceptions, Barriers, and Adoption, UNT College of Information, Denton, TX, United States of America. (2025 - 2025).
- Vengala, S. (Author & Presenter), Potula, R.A. (Author), Chen, H. (Author), Cleveland, A.D. (Author), University Research Day, An Analysis of the Dental Informatics Literature from 2018 – 2023: Keyword Extraction, University of North Texas, Denton, TX, United States of America. (2024 - 2024).
- Vengala, S. (Author & Presenter), Potula, R.A. (Author), Chen, H. (Author), Cleveland, A.D. (Author), UNT Day of Health Informatics and Data Science, Analysis of the Dental Informatics Literature from 2018 – 2023: Keyword Extraction, UNT College of Information, Frisco, TX, United States of America. (2024 - 2024).
- Tran, N. (Author & Presenter), Chen, H. (Author), Cleveland, A.D. (Author), UNT Day of Health Informatics and Data Science, Exploring Collaboration Patterns and Topics in Disaster Informatics during the Pandemic Era, UNT College of Information, Frisco, TX, United States of America. (2024 - 2024).
- Tran, N. (Author & Presenter), Cleveland, A.D. (Author), Chen, H. (Author), University Research Day, Analyzing the Evolution of the Disaster Informatics Literature From Pre-pandemic to Post-Pandemic, University of North Texas, Denton, TX, United States of America. (2023 - 2023).
- Tran, N. (Author & Presenter), Cleveland, A.D. (Author), Chen, H. (Author), University Research Day, Exploring Emerging Themes and Topics in Disaster Informatics during the Pandemic Era, University of North Texas, Denton, TX, United States of America. (2023 - 2023).
- Cleveland, A.D. (Author & Presenter), Chen, H. (Author & Presenter), Katragadda, N. (Author), Tran, N. (Author & Presenter), Research Brown Bag, Informatics applications in multiple fields: A bibliometric analysis, UNT Department of Information Science, Online, United States of America. (2022 - 2022).
Other
Panel Presentation
Paper
Poster
Webinar
Contracts, Grants, Sponsored Research
- Feng, Y. (Principal), Albert, M.V. (Co-Principal), Zhao, H. (Supporting), Aljedaani, W. (Supporting), Chen, H. (Supporting), Xiao, T. (Supporting), An, Y. (Supporting), Hull, D. (Supporting), "REU Site: Making Generative Artificial Intelligence Responsible," sponsored by National Science Foundation (NSF), Federal, $465000 Funded. (2025).
- Yu, J., Chen, H. (Co-Principal), "Partner, Not Crutch: Designing a Metacognitive Nudge to Promote AI Co-Regulation," sponsored by Learning Institute, University of North Texas, $5000 Funded. (2026 - 2026).
- Xiao, T. (Principal), Ding, J. (Co-Principal), Albert, M.V. (Supporting), Alam, Z.S. (Supporting), Hartmann, F. (Supporting), Wang, Y. (Supporting), Liang, L. (Supporting), Chen, H. (Supporting), Du, J. (Supporting), Azad, R.K. (Supporting), "NSF REU site: Beyond Language: Training to Create and Share Vector Embeddings across Application," sponsored by NSF, Federal, $403547 Funded. (2023 - 2025).
- Ding, J. (Principal), Kinshuk, X. (Co-Principal), Fu, S. (Co-Principal), Ludi, S.A. (Co-Principal), Chen, H. (Co-Principal), "HSI Implementation and Evaluation Project: Develop a High Quality Academic Environment for Broadening Participation of Hispanic Students in Computing," sponsored by National Science Foundation, Federal, $500000 Funded. (2022 - 2025).
- Ding, J., Ludi, S.A., Fu, S., Kinshuk, K., Chen, H., "Developing a High-Quality Academic Environment for Broadening Participation of Hispanic Students in Computing," sponsored by NSF, Federal, $499517 Funded. (2022 - 2025).