I am a Ph.D. student in Computer and Information Science at the University of Oregon, working with Prof. Thien Huu Nguyen in the UONLP lab on Multilingual Natural Language Processing, with a focus on Information Extraction. Before starting my Ph.D., I received my bachelor’s degree in Computer Science from Hanoi University of Science and Technology, where I was a member of the DS lab under the supervision of Prof. Khoat Than and Dr. Linh Ngo Van.


  • University of Oregon
    • Ph.D. in Computer and Information Science, 2019 - .
    • Advisor: Prof. Thien Huu Nguyen.
  • Hanoi University of Science and Technology
    • B.S. in Computer Science, 2014 - 2019.
    • Advisor: Dr. Linh Ngo Van.


  • FourIE: a neural information extraction system that annotates text for entity mentions (names, pronouns, and nominals of people, organizations, locations, etc.), relations (between two entity mentions), event triggers, and argument roles, using the information schema defined in the ACE 2005 dataset. FourIE leverages deep learning and graph convolutional networks to jointly perform four information extraction tasks (entity mention detection, relation extraction, event detection, and argument role prediction) in an end-to-end fashion. Demo.
  • Trankit: a light-weight transformer-based toolkit for multilingual NLP that can process raw text and supports fundamental NLP tasks for 56 languages. Trankit is based on recent advances in multilingual pre-trained language models, providing state-of-the-art performance for Sentence Segmentation, Part-of-Speech Tagging, Morphological Feature Tagging, Dependency Parsing, and Named Entity Recognition over 90 Universal Dependencies treebanks. Trankit is written in Python and can be installed via pip. Github, Demo, Documentation.


  • IARPA Better Extraction from Text Towards Enhanced Retrieval (BETTER)
    • Research Assistant, January 2020 - .
    • As a Research Assistant on this project, I build cross-lingual information extraction systems (with English as the source language) that extract events in the form of who-did-what-to-whom-when-where, at different levels of granularity, across various target languages.

Publications (*=equal contribution)


  • Cross-Task Instance Representation Interactions and Label Dependencies for Joint Information Extraction with Graph Convolutional Networks [Paper] [Demo]
    Minh Van Nguyen, Viet Dac Lai and Thien Huu Nguyen.
    Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021).

  • Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing [Paper] [Github] [Demo] [Documentation]
    Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh and Thien Huu Nguyen.
    Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations (EACL 2021 Demo).
    (EACL2021 Outstanding Demo Paper Award)

  • Improving Cross-Lingual Transfer for Event Argument Extraction with Language-Universal Sentence Structures [To Appear]
    Minh Van Nguyen and Thien Huu Nguyen.
    Proceedings of the 6th Arabic Natural Language Processing Workshop at EACL 2021 (WANLP@EACL 2021).

  • Graph Learning Regularization and Transfer Learning for Few-Shot Event Detection [To Appear]
    Viet Dac Lai, Minh Van Nguyen, Thien Huu Nguyen, and Franck Dernoncourt.
    Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021).

  • Event Extraction from Historical Texts: A New Dataset for Black Rebellions [To Appear]
    Viet Dac Lai, Minh Van Nguyen, Heidi Kaufman, and Thien Huu Nguyen.
    Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Findings ACL-IJCNLP 2021).


  • Who is Killed by Police: Introducing Supervised Attention for Hierarchical LSTMs [Paper]
    Minh Van Nguyen and Thien Huu Nguyen.
    Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018).

  • A Deep Learning Model with Hierarchical LSTMs and Supervised Attention for Anti-Phishing [Paper]
    Minh Van Nguyen, Toan Nguyen and Thien Huu Nguyen.
    Proceedings of the 1st Anti-Phishing Shared Pilot at the 4th ACM International Workshop on Security and Privacy Analytics (IWSPA@CODASPY 2018).


  • CIS 471: Introduction to Artificial Intelligence [Class Page]
    • Teaching Assistant, Fall 2019.
    • I helped undergraduate students understand fundamental concepts and problems in Artificial Intelligence.

Academic Service

Program Committee: AAAI 2021, EMNLP 2021.