Curriculum Vitae
Education
- PhD, Pennsylvania State University, 2017
  Major: Information Sciences and Technology
Current Scheduled Teaching
| CSCE 6290.002 | Advanced Topics in Human/Machine Intelligence | Spring 2026 | | |
| CSCE 5934.878 | Directed Study | Spring 2026 | | |
| CSCE 6940.978 | Individual Research | Spring 2026 | | |
| CSCE 5950.878 | Master's Thesis | Spring 2026 | | |
| CSCE 5290.001 | Natural Language Processing | Spring 2026 | | |
Texas Education Code 51.974 (HB 2504) requires each institution of higher education to make available to the public a syllabus for each undergraduate lecture course offered for credit by the institution.
Previous Scheduled Teaching
| CSCE 6940.878 | Individual Research | Fall 2025 | | |
| CSCE 5290.002 | Natural Language Processing | Fall 2025 | | SPOT |
| CSCE 5900.878 | Special Problems | Fall 2025 | | |
| CSCE 5900.878 | Special Problems | Summer 10W 2025 | | |
| CSCE 5934.878 | Directed Study | Spring 2025 | | |
| CSCE 6940.978 | Individual Research | Spring 2025 | | |
| CSCE 4290.002 | Introduction to Natural Language Processing | Spring 2025 | Syllabus | SPOT |
| CSCE 5950.878 | Master's Thesis | Spring 2025 | | |
| CSCE 5290.002 | Natural Language Processing | Spring 2025 | | SPOT |
| CSCE 5934.878 | Directed Study | Fall 2024 | | |
| CSCE 6940.878 | Individual Research | Fall 2024 | | |
| CSCE 5950.878 | Master's Thesis | Fall 2024 | | |
| CSCE 5290.005 | Natural Language Processing | Fall 2024 | Syllabus | SPOT |
Published Intellectual Contributions
- Kamal, S., Prakash, L.P., Rafiuddin, S.M., Rakib, M., Sen, A., Choudhury, S.R., Inui, K., Sakti, S., Wang, H., Wong, D.F., Bhattacharyya, P., Banerjee, B., Ekbal, A., Chakraborty, T., Singh, D.P. (2025). A Detailed Factor Analysis for the Political Compass Test: Navigating Ideologies of Large Language Models. Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics. 284--303. Mumbai, India, The Asian Federation of Natural Language Processing and The Association for Computational Linguistics. https://aclanthology.org/2025.ijcnlp-short.25/
- Hossain, A.I., Choudhury, S.R., Alhoori, H., Inui, K., Sakti, S., Wang, H., Wong, D.F., Bhattacharyya, P., Banerjee, B., Ekbal, A., Chakraborty, T., Singh, D.P. (2025). SciHallu: A Multi-Granularity Hallucination Detection Dataset for Scientific Writing. Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics. 1277--1304. Mumbai, India, The Asian Federation of Natural Language Processing and The Association for Computational Linguistics. https://aclanthology.org/2025.ijcnlp-long.70/
- Al Azher, I., Mokarrama, M.J., Guo, Z., Choudhury, S.R., Alhoori, H., Christodoulopoulos, C., Chakraborty, T., Rose, C., Peng, V. (2025). BAGELS: Benchmarking the Automated Generation and Extraction of Limitations from Scholarly Text. Findings of the Association for Computational Linguistics: EMNLP 2025. 19279--19294. Suzhou, China, Association for Computational Linguistics. https://aclanthology.org/2025.findings-emnlp.1050/
- Azad, T., Azher, I.A., Choudhury, S.R., Alhoori, H., Ghosal, T., Mayr, P., Singh, A., Naik, A., Rehm, G., Freitag, D., Li, D., Schimmler, S., De Waard, A. (2025). Predicting The Scholarly Impact of Research Papers Using Retrieval-Augmented LLMs. Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025). 124--131. Vienna, Austria, Association for Computational Linguistics. https://aclanthology.org/2025.sdp-1.11/
- Akella, A.P., Choudhury, S.R., Koop, D., Alhoori, H., Serra, E., Spezzano, F. (2024). Navigating the Landscape of Reproducible Research: A Predictive Modeling Approach. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, ID, USA, October 21-25, 2024. 24--33. ACM. https://doi.org/10.1145/3627673.3679831
- Dutt, R., Choudhury, S., Rao, V.V., Rose, C., Vydiswaran, V., Hupkes, D., Dankers, V., Batsuren, K., Kazemnejad, A., Christodoulopoulos, C., Giulianelli, M., Cotterell, R. (2024). Investigating the Generalizability of Pretrained Language Models across Multiple Dimensions: A Case Study of NLI and MRC. Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP. 165--182. Miami, Florida, USA, Association for Computational Linguistics. https://aclanthology.org/2024.genbench-1.11/
- Yaneva, V., North, K., Baldwin, P., Ha, Le An, Rezayi, S., Zhou, Y., Choudhury, S., Harik, P., Clauser, B., Kochmar, E., Bexte, M., Burstein, J., Horbach, A., Laarmann-Quante, R., Tack, Anaïs, Yaneva, V., Yuan, Z. (2024). Findings from the First Shared Task on Automated Prediction of Difficulty and Response Time for Multiple-Choice Questions. Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024). 470--482. Mexico City, Mexico, Association for Computational Linguistics. https://aclanthology.org/2024.bea-1.39
- Choudhury, S.R., Atanasova, P., Augenstein, I., Bouamor, H., Pino, J., Bali, K. (2023). Explaining Interactions Between Text Spans. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023. 12709--12730. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-main.783
- Choudhury, S.R., Kalra, J., Jiang, J., Reitter, D., Deng, S. (2023). Implications of Annotation Artifacts in Edge Probing Test Datasets. Proceedings of the 27th Conference on Computational Natural Language Learning, CoNLL 2023, Singapore, December 6-7, 2023. 575--586. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.conll-1.39
- Choudhury, S.R., Bhutani, N., Augenstein, I., Calzolari, N., Huang, C., Kim, H., Pustejovsky, J., Wanner, L., Choi, K., Ryu, P., Chen, H., Donatelli, L., Ji, H., Kurohashi, S., Paggio, P., Xue, N., Kim, S., Hahm, Y., He, Z., Lee, T.K., Santus, E., Bond, F., Na, S. (2022). Can Edge Probing Tests Reveal Linguistic Knowledge in QA Models? Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022. International Committee on Computational Linguistics. https://aclanthology.org/2022.coling-1.139
- Choudhury, S.R., Rogers, A., Augenstein, I., Calzolari, N., Huang, C., Kim, H., Pustejovsky, J., Wanner, L., Choi, K., Ryu, P., Chen, H., Donatelli, L., Ji, H., Kurohashi, S., Paggio, P., Xue, N., Kim, S., Hahm, Y., He, Z., Lee, T.K., Santus, E., Bond, F., Na, S. (2022). Machine Reading, Fast and Slow: When Do Models "Understand" Language? Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022. International Committee on Computational Linguistics. https://aclanthology.org/2022.coling-1.8
- Lester, B., Choudhury, S.R., Prasad, R., Bangalore, S., Kim, Y., Li, Y., Rambow, O. (2021). Intent Features for Rich Natural Language Understanding. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, NAACL-HLT 2021, Online, June 6-11, 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.naacl-industry.27
- Lester, B., Pressel, D., Hemmeter, A., Choudhury, S.R., Bangalore, S., Cohn, T., He, Y., Liu, Y. (2020). Constrained Decoding for Computationally Efficient Named Entity Recognition Taggers. Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16-20 November 2020. EMNLP 2020 Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.166
- Chiatti, A., Cho, M.J., Gagneja, A., Yang, X., Brinberg, M., Roehrick, K., Choudhury, S.R., Ram, N., Reeves, B., Giles, C.L., Haddad, H.M., Wainwright, R.L., Chbeir, R. (2018). Text extraction and retrieval from smartphone screenshots: building a repository for life in media. Proceedings of the 33rd Annual ACM Symposium on Applied Computing, SAC 2018, Pau, France, April 09-13, 2018. ACM. https://doi.org/10.1145/3167132.3167236
- Pressel, D., Choudhury, S., Lester, B., Zhao, Y., Barta, M. (2018). Baseline: A Library for Rapid Modeling, Experimentation and Development of Deep Learning Algorithms targeting NLP. Proceedings of Workshop for NLP Open Source Software (NLP-OSS). 34--40. Melbourne, Australia, Association for Computational Linguistics. https://aclanthology.org/W18-2506
- Wu, J., Choudhury, S.R., Chiatti, A., Liang, C., Giles, C.L. (2017). HESDK: A Hybrid Approach to Extracting Scientific Domain Knowledge Entities. 2017 ACM/IEEE Joint Conference on Digital Libraries, JCDL 2017, Toronto, ON, Canada, June 19-23, 2017. IEEE Computer Society. https://doi.org/10.1109/JCDL.2017.7991580
- Al-Zaidy, R.A., Choudhury, S.R., Giles, C.L., Khabsa, M., Giles, C.L., Wade, A.D. (2016). Automatic Summary Generation for Scientific Data Charts. Scholarly Big Data: AI Perspectives, Challenges, and Ideas, Papers from the 2016 AAAI Workshop, Phoenix, Arizona, USA, February 13, 2016. WS-16-13 AAAI Press. http://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12661
- Choudhury, S.R., Wang, S., Giles, C.L., Adam, N.R., Lillian (Boots) Cassel, Yesha, Y., Furuta, R., Weigle, M.C. (2016). Curve Separation for Line Graphs in Scholarly Documents. Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, JCDL 2016, Newark, NJ, USA, June 19 - 23, 2016. ACM. https://doi.org/10.1145/2910896.2925469
- Choudhury, S.R., Wang, S., Giles, C.L., Groppe, S., Le Gruenwald. (2016). Scalable algorithms for scholarly figure mining and semantics. Proceedings of the International Workshop on Semantic Big Data, San Francisco, CA, USA, July 1, 2016. ACM. https://doi.org/10.1145/2928294.2928305
- Choudhury, S., Giles, C.L. (2015). An Architecture for Information Extraction from Figures in Digital Libraries. Proceedings of the 24th International Conference on World Wide Web. 667–672. New York, NY, USA, Association for Computing Machinery. https://doi.org/10.1145/2740908.2741712
- Choudhury, S.R., Wang, S., Mitra, P., Giles, C.L. (2015). Automated data extraction from scholarly line graphs. Eleventh IAPR International Workshop on Graphics Recognition (GREC).
- Choudhury, S.R., Mitra, P., Giles, C.L., Vanoirbeek, C., Pierre Genevès. (2015). Automatic Extraction of Figures from Scholarly Documents. Proceedings of the 2015 ACM Symposium on Document Engineering, DocEng 2015, Lausanne, Switzerland, September 8-11, 2015. ACM. https://doi.org/10.1145/2682571.2797085
- Wu, J., Killian, J., Yang, H., Williams, K., Choudhury, S.R., Tuarob, S., Caragea, C., Giles, C.L., Barker, K., José Manuél Gómez-Pérez. (2015). PDFMEF: A Multi-Entity Knowledge Extraction Framework for Scholarly Documents and Semantic Search. Proceedings of the 8th International Conference on Knowledge Capture, K-CAP 2015, Palisades, NY, USA, October 7-10, 2015. ACM. https://doi.org/10.1145/2815833.2815834
- Williams, K., Wu, J., Choudhury, S.R., Khabsa, M., Giles, C.L. (2014). Scholarly big data information extraction and integration in the CiteSeerX digital library. Workshops Proceedings of the 30th International Conference on Data Engineering Workshops, ICDE 2014, Chicago, IL, USA, March 31 - April 4, 2014. IEEE Computer Society. https://doi.org/10.1109/ICDEW.2014.6818305
- Wu, Z., Wu, J., Khabsa, M., Williams, K., Chen, H., Huang, W., Tuarob, S., Choudhury, S.R., Ororbia, A., Mitra, P., Giles, C.L. (2014). Towards building a scholarly big data platform: Challenges, lessons and opportunities. IEEE/ACM Joint Conference on Digital Libraries, JCDL 2014, London, United Kingdom, September 8-12, 2014. IEEE Computer Society. https://doi.org/10.1109/JCDL.2014.6970157
- Choudhury, S.R., Tuarob, S., Mitra, P., Rokach, L., Kirk, A., Szep, S., Pellegrino, D.A., Jones, S., Giles, C.L., Downie, J.S., McDonald, R.H., Cole, T.W., Sanderson, R., Shipman, F. (2013). A figure search engine architecture for a chemistry digital library. 13th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL '13, Indianapolis, IN, USA, July 22 - 26, 2013. ACM. https://doi.org/10.1145/2467696.2467757
- Choudhury, S.R., Mitra, P., Kirk, A., Szep, S., Pellegrino, D.A., Jones, S., Giles, C.L. (2013). Figure Metadata Extraction from Digital Documents. 12th International Conference on Document Analysis and Recognition, ICDAR 2013, Washington, DC, USA, August 25-28, 2013. IEEE Computer Society. https://doi.org/10.1109/ICDAR.2013.34
- Williams, K., Chen, H., Choudhury, S.R., Giles, C.L., Forner, P., Navigli, R., Tufis, D., Ferro, N. (2013). Unsupervised Ranking for Plagiarism Source Retrieval Notebook for PAN at CLEF 2013. Working Notes for CLEF 2013 Conference, Valencia, Spain, September 23-26, 2013. 1179 CEUR-WS.org. https://ceur-ws.org/Vol-1179/CLEF2013wn-PAN-WilliamsEt2013.pdf
- Khabsa, M., Carman, S., Choudhury, S.R., Giles, C.L., Trotman, A., Clarke, C.L., Ounis, I., Culpepper, J.S., Cartright, M., Geva, S. (2012). A Framework for Bridging the Gap Between Open Source Search Tools. Proceedings of the SIGIR 2012 Workshop on Open Source Information Retrieval, OSIR@SIGIR 2012, Portland, Oregon, USA, 16th August 2012. University of Otago, Dunedin, New Zealand.
- Stańczak, K., Choudhury, S., Pimentel, T., Cotterell, R., Augenstein, I. (2023). Quantifying gender bias towards politicians in cross-lingual language models. PLOS One. 18 (11) 1-24. Public Library of Science. https://doi.org/10.1371/journal.pone.0277640
- Lester, B., Pressel, D., Hemmeter, A., Choudhury, S.R., Bangalore, S. (2020). Multiple Word Embeddings for Increased Diversity of Representation. Other. abs/2009.14394 https://arxiv.org/abs/2009.14394
- Kanan, T., Choudhury, S.R., Giles, C.L., Chandrasekar, P., Fox, E.A. (2015). Digital Library and Archiving for Qatar. Other. 11 (2) https://bulletin.jcdl.org/Bulletin/v11n2/papers/kanan.pdf
- Lahiri, S., Choudhury, S.R., Caragea, C. (2014). Keyword and Keyphrase Extraction Using Centrality Measures on Collocation Networks. Other. abs/1401.6571 http://arxiv.org/abs/1401.6571