Sagnik Ray Choudhury

Title: Assistant Professor

Department: Computer Science and Engineering

College: College of Engineering

Education

  • PhD, Pennsylvania State University, 2017
    Major: Information Sciences and Technology

Current Scheduled Teaching

CSCE 5934.878, Directed Study, Spring 2025
CSCE 6940.978, Individual Research, Spring 2025
CSCE 4290.002, Introduction to Natural Language Processing, Spring 2025
CSCE 5950.878, Master's Thesis, Spring 2025
CSCE 5290.002, Natural Language Processing, Spring 2025

Previous Scheduled Teaching

CSCE 5934.878, Directed Study, Fall 2024
CSCE 6940.878, Individual Research, Fall 2024
CSCE 5950.878, Master's Thesis, Fall 2024
CSCE 5290.005, Natural Language Processing, Fall 2024

Published Intellectual Contributions

    Conference Proceeding

  • Akella, A.P., Choudhury, S., Koop, D., Alhoori, H., Serra, E., Spezzano, F. (2024). Navigating the Landscape of Reproducible Research: A Predictive Modeling Approach. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, ID, USA, October 21-25, 2024. 24–33. ACM. https://doi.org/10.1145/3627673.3679831
  • Dutt, R., Ray Choudhury, S., Rao, V.V., Rose, C., Vydiswaran, V., Hupkes, D., Dankers, V., Batsuren, K., Kazemnejad, A., Christodoulopoulos, C., Giulianelli, M., Cotterell, R. (2024). Investigating the Generalizability of Pretrained Language Models across Multiple Dimensions: A Case Study of NLI and MRC. Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP. 165–182. Miami, Florida, USA, Association for Computational Linguistics. https://aclanthology.org/2024.genbench-1.11/
  • Yaneva, V., North, K., Baldwin, P., Ha, Le An, Rezayi, S., Zhou, Y., Ray Choudhury, S., Harik, P., Clauser, B., Kochmar, E., Bexte, M., Burstein, J., Horbach, A., Laarmann-Quante, R., Tack, Anaïs, Yaneva, V., Yuan, Z. (2024). Findings from the First Shared Task on Automated Prediction of Difficulty and Response Time for Multiple-Choice Questions. Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024). 470–482. Mexico City, Mexico, Association for Computational Linguistics. https://aclanthology.org/2024.bea-1.39
  • Choudhury, S.R., Atanasova, P., Augenstein, I., Bouamor, H., Pino, J., Bali, K. (2023). Explaining Interactions Between Text Spans. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023. 12709–12730. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-main.783
  • Choudhury, S.R., Kalra, J., Jiang, J., Reitter, D., Deng, S. (2023). Implications of Annotation Artifacts in Edge Probing Test Datasets. Proceedings of the 27th Conference on Computational Natural Language Learning, CoNLL 2023, Singapore, December 6-7, 2023. 575–586. Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.conll-1.39
  • Choudhury, S.R., Bhutani, N., Augenstein, I., Calzolari, N., Huang, C., Kim, H., Pustejovsky, J., Wanner, L., Choi, K., Ryu, P., Chen, H., Donatelli, L., Ji, H., Kurohashi, S., Paggio, P., Xue, N., Kim, S., Hahm, Y., He, Z., Lee, T.K., Santus, E., Bond, F., Na, S. (2022). Can Edge Probing Tests Reveal Linguistic Knowledge in QA Models? Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022. International Committee on Computational Linguistics. https://aclanthology.org/2022.coling-1.139
  • Choudhury, S.R., Rogers, A., Augenstein, I., Calzolari, N., Huang, C., Kim, H., Pustejovsky, J., Wanner, L., Choi, K., Ryu, P., Chen, H., Donatelli, L., Ji, H., Kurohashi, S., Paggio, P., Xue, N., Kim, S., Hahm, Y., He, Z., Lee, T.K., Santus, E., Bond, F., Na, S. (2022). Machine Reading, Fast and Slow: When Do Models "Understand" Language? Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022. International Committee on Computational Linguistics. https://aclanthology.org/2022.coling-1.8
  • Lester, B., Choudhury, S.R., Prasad, R., Bangalore, S., Kim, Y., Li, Y., Rambow, O. (2021). Intent Features for Rich Natural Language Understanding. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, NAACL-HLT 2021, Online, June 6-11, 2021. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.naacl-industry.27
  • Lester, B., Pressel, D., Hemmeter, A., Choudhury, S.R., Bangalore, S., Cohn, T., He, Y., Liu, Y. (2020). Constrained Decoding for Computationally Efficient Named Entity Recognition Taggers. Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16-20 November 2020. EMNLP 2020 Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.166
  • Chiatti, A., Cho, M.J., Gagneja, A., Yang, X., Brinberg, M., Roehrick, K., Choudhury, S.R., Ram, N., Reeves, B., Giles, C.L., Haddad, H.M., Wainwright, R.L., Chbeir, R. (2018). Text extraction and retrieval from smartphone screenshots: building a repository for life in media. Proceedings of the 33rd Annual ACM Symposium on Applied Computing, SAC 2018, Pau, France, April 09-13, 2018. ACM. https://doi.org/10.1145/3167132.3167236
  • Pressel, D., Ray Choudhury, S., Lester, B., Zhao, Y., Barta, M. (2018). Baseline: A Library for Rapid Modeling, Experimentation and Development of Deep Learning Algorithms targeting NLP. Proceedings of Workshop for NLP Open Source Software (NLP-OSS). 34–40. Melbourne, Australia, Association for Computational Linguistics. https://aclanthology.org/W18-2506
  • Wu, J., Choudhury, S., Chiatti, A., Liang, C., Giles, C.L. (2017). HESDK: A Hybrid Approach to Extracting Scientific Domain Knowledge Entities. 2017 ACM/IEEE Joint Conference on Digital Libraries, JCDL 2017, Toronto, ON, Canada, June 19-23, 2017. IEEE Computer Society. https://doi.org/10.1109/JCDL.2017.7991580
  • Al-Zaidy, R.A., Choudhury, S.R., Giles, C.L., Khabsa, M., Giles, C.L., Wade, A.D. (2016). Automatic Summary Generation for Scientific Data Charts. Scholarly Big Data: AI Perspectives, Challenges, and Ideas, Papers from the 2016 AAAI Workshop, Phoenix, Arizona, USA, February 13, 2016. WS-16-13 AAAI Press. http://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12661
  • Choudhury, S.R., Wang, S., Giles, C.L., Adam, N.R., Lillian (Boots) Cassel, Yesha, Y., Furuta, R., Weigle, M.C. (2016). Curve Separation for Line Graphs in Scholarly Documents. Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, JCDL 2016, Newark, NJ, USA, June 19 - 23, 2016. ACM. https://doi.org/10.1145/2910896.2925469
  • Choudhury, S.R., Wang, S., Giles, C.L., Groppe, S., Le Gruenwald. (2016). Scalable algorithms for scholarly figure mining and semantics. Proceedings of the International Workshop on Semantic Big Data, San Francisco, CA, USA, July 1, 2016. ACM. https://doi.org/10.1145/2928294.2928305
  • Ray Choudhury, S., Giles, C.L. (2015). An Architecture for Information Extraction from Figures in Digital Libraries. Proceedings of the 24th International Conference on World Wide Web. 667–672. New York, NY, USA, Association for Computing Machinery. https://doi.org/10.1145/2740908.2741712
  • Choudhury, S.R., Mitra, P., Giles, C.L., Vanoirbeek, C., Pierre Genevès. (2015). Automatic Extraction of Figures from Scholarly Documents. Proceedings of the 2015 ACM Symposium on Document Engineering, DocEng 2015, Lausanne, Switzerland, September 8-11, 2015. ACM. https://doi.org/10.1145/2682571.2797085
  • Wu, J., Killian, J., Yang, H., Williams, K., Choudhury, S., Tuarob, S., Caragea, C., Giles, C.L., Barker, K., José Manuél Gómez-Pérez. (2015). PDFMEF: A Multi-Entity Knowledge Extraction Framework for Scholarly Documents and Semantic Search. Proceedings of the 8th International Conference on Knowledge Capture, K-CAP 2015, Palisades, NY, USA, October 7-10, 2015. ACM. https://doi.org/10.1145/2815833.2815834
  • Williams, K., Wu, J., Choudhury, S.R., Khabsa, M., Giles, C.L. (2014). Scholarly big data information extraction and integration in the CiteSeerX digital library. Workshops Proceedings of the 30th International Conference on Data Engineering Workshops, ICDE 2014, Chicago, IL, USA, March 31 - April 4, 2014. IEEE Computer Society. https://doi.org/10.1109/ICDEW.2014.6818305
  • Wu, Z., Wu, J., Khabsa, M., Williams, K., Chen, H., Huang, W., Tuarob, S., Choudhury, S.R., Ororbia, A., Mitra, P., Giles, C.L. (2014). Towards building a scholarly big data platform: Challenges, lessons and opportunities. IEEE/ACM Joint Conference on Digital Libraries, JCDL 2014, London, United Kingdom, September 8-12, 2014. IEEE Computer Society. https://doi.org/10.1109/JCDL.2014.6970157
  • Choudhury, S., Tuarob, S., Mitra, P., Rokach, L., Kirk, A., Szep, S., Pellegrino, D.A., Jones, S., Giles, C.L., Downie, J.S., McDonald, R.H., Cole, T.W., Sanderson, R., Shipman, F. (2013). A figure search engine architecture for a chemistry digital library. 13th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL '13, Indianapolis, IN, USA, July 22 - 26, 2013. ACM. https://doi.org/10.1145/2467696.2467757
  • Choudhury, S., Mitra, P., Kirk, A., Szep, S., Pellegrino, D.A., Jones, S., Giles, C.L. (2013). Figure Metadata Extraction from Digital Documents. 12th International Conference on Document Analysis and Recognition, ICDAR 2013, Washington, DC, USA, August 25-28, 2013. IEEE Computer Society. https://doi.org/10.1109/ICDAR.2013.34
  • Williams, K., Chen, H., Choudhury, S.R., Giles, C.L., Forner, P., Navigli, R., Tufis, D., Ferro, N. (2013). Unsupervised Ranking for Plagiarism Source Retrieval Notebook for PAN at CLEF 2013. Working Notes for CLEF 2013 Conference, Valencia, Spain, September 23-26, 2013. 1179 CEUR-WS.org. https://ceur-ws.org/Vol-1179/CLEF2013wn-PAN-WilliamsEt2013.pdf
  • Khabsa, M., Carman, S., Choudhury, S.R., Giles, C.L., Trotman, A., Clarke, C.L., Ounis, I., Culpepper, J.S., Cartright, M., Geva, S. (2012). A Framework for Bridging the Gap Between Open Source Search Tools. Proceedings of the SIGIR 2012 Workshop on Open Source Information Retrieval, OSIR@SIGIR 2012, Portland, Oregon, USA, 16th August 2012. University of Otago, Dunedin, New Zealand.

    Journal Article

  • Stańczak, K., Ray Choudhury, S., Pimentel, T., Cotterell, R., Augenstein, I. (2023). Quantifying gender bias towards politicians in cross-lingual language models. PLOS One. 18 (11) 1-24. Public Library of Science. https://doi.org/10.1371/journal.pone.0277640
  • Lester, B., Pressel, D., Hemmeter, A., Choudhury, S.R., Bangalore, S. (2020). Multiple Word Embeddings for Increased Diversity of Representation. Other. abs/2009.14394 https://arxiv.org/abs/2009.14394
  • Kanan, T., Choudhury, S.R., Giles, C.L., Chandrasekar, P., Fox, E.A. (2015). Digital Library and Archiving for Qatar. Other. 11 (2) https://bulletin.jcdl.org/Bulletin/v11n2/papers/kanan.pdf
  • Lahiri, S., Choudhury, S.R., Caragea, C. (2014). Keyword and Keyphrase Extraction Using Centrality Measures on Collocation Networks. Other. abs/1401.6571 http://arxiv.org/abs/1401.6571
  • Overall Summative Rating (median):
    This rating represents the combined responses of students to the four global summative items and is presented to provide an overall index of the class’s quality. The overall summative items are listed below; responses use a Likert scale on which 5 = Excellent, 3 = Good, and 1 = Very poor:
    • The course as a whole was
    • The course content was
    • The instructor’s contribution to the course was
    • The instructor’s effectiveness in teaching the subject matter was
  • Challenge and Engagement Index:
    This rating combines student responses to several SPOT items relating to how academically challenging students found the course to be and how engaged they were. The Challenge and Engagement Index items are listed below; responses use a Likert scale on which 7 = Much higher, 4 = Average, and 1 = Much lower:
    • Do you expect your grade in this course to be
    • The intellectual challenge presented was
    • The amount of effort you put into this course was
    • The amount of effort to succeed in this course was
    • Your involvement in the course (doing assignments, attending classes, etc.) was