Release of the VinAI Translate text translation models

12/10/2023 / AI Update

VinAI is pleased to publicly release the pre-trained text translation models “vinai/vinai-translate-vi2en” and “vinai/vinai-translate-en2vi” that are currently used in the translation component of our VinAI Translate system. The pre-trained models are state-of-the-art text translation models for Vietnamese-to-English and English-to-Vietnamese, which can be used with the popular library “transformers”.
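As a rough illustration of how the two checkpoints might be loaded through "transformers", here is a minimal sketch. The model names come from this announcement. Everything else is an assumption made for illustration: the helper names (`load_translator`, `translate`, `demo`), the mBART-style language codes "vi_VN" and "en_XX", and the beam-search settings. Please check the GitHub repository for the recommended usage.

```python
from typing import Any, Tuple

# Model names as given in the announcement.
MODEL_VI2EN = "vinai/vinai-translate-vi2en"
MODEL_EN2VI = "vinai/vinai-translate-en2vi"


def load_translator(model_name: str, src_lang: str) -> Tuple[Any, Any]:
    """Load a tokenizer/model pair from the Hugging Face Hub (hypothetical helper)."""
    # Imported lazily so that translate() can be used without loading the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Assumption: the checkpoints accept an mBART-style source language code.
    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


def translate(text: str, tokenizer: Any, model: Any, tgt_lang: str) -> str:
    """Translate one sentence; tgt_lang is a language-code token such as "en_XX"."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        # Assumption: decoding starts from the target language-code token.
        decoder_start_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        num_beams=5,  # illustrative setting, not a tuned value
        early_stopping=True,
    )
    # Decode the single best hypothesis back to text.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


def demo() -> None:
    """Download the Vietnamese-to-English model and translate one sentence."""
    tokenizer, model = load_translator(MODEL_VI2EN, src_lang="vi_VN")
    print(translate("Tôi là sinh viên.", tokenizer, model, tgt_lang="en_XX"))
```

Calling `demo()` downloads the Vietnamese-to-English checkpoint on first use. For the opposite direction, the same helpers could presumably be used with `MODEL_EN2VI`, `src_lang="en_XX"` and `tgt_lang="vi_VN"`.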

Please find details about the pre-trained models at: https://github.com/VinAIResearch/VinAI_Translate.

Experimental results of the pre-trained models can be found in our VinAI Translate system paper “A Vietnamese-English Neural Machine Translation System”, which will be presented at the Interspeech 2022 Show & Tell session.

Other NLP resources from VinAI:

  • BARTpho (INTERSPEECH 2022): Pre-trained sequence-to-sequence models for Vietnamese.
  • QA-CarManual (IUI 2022): Demo video of a Vietnamese speech-based question answering system over car manuals.
  • PhoMT (EMNLP 2021): A high-quality and large-scale benchmark dataset for Vietnamese-English machine translation.
  • PhoATIS (INTERSPEECH 2021): An intent detection and slot filling dataset for Vietnamese.
  • PhoNLP (NAACL 2021): A BERT-based multi-task learning toolkit for Vietnamese POS tagging, named entity recognition and dependency parsing.
  • PhoNER_COVID19 (NAACL 2021): A dataset for Vietnamese named entity recognition.
  • ViText2SQL (EMNLP 2020 Findings): A dataset for Vietnamese Text2SQL semantic parsing.
  • PhoBERT (EMNLP 2020 Findings): Pre-trained language models for Vietnamese.
  • BERTweet (EMNLP 2020): A pre-trained language model for English Tweets.
  • COVID19Tweet (WNUT 2020): A dataset released for the WNUT 2020 Shared Task on “Identification of informative COVID-19 English Tweets”.