TRANSFORMER BASED ARCHITECTURES FOR LOW RESOURCE LANGUAGE UNDERSTANDING AND GENERATION

James Osei Gonzalez

TRANSFORMER BASED ARCHITECTURES FOR LOW RESOURCE LANGUAGE UNDERSTANDING AND GENERATION

Authors

James Osei Gonzalez

Assistant Professor – Machine Learning, United Kingdom

Keywords:

Transformer, Low Resource Languages, Natural Language Understanding, Natural Language Generation, Multilingual Models

Synopsis

Purpose: This paper investigates transformer based neural architectures optimized for low resource languages, addressing both understanding (e.g., classification) and generation (e.g., machine translation). Design/Methodology/Approach: We review seminal transformer developments and analyze adaptations for limited data scenarios, including multilingual pre training and cross lingual transfer techniques. Findings: Transformer models such as mBERT and XLM have shown improved performance over prior architectures, although data scarcity remains a core challenge. Practical Implications: Insights inform best practices for model design, data augmentation, and fine tuning in real world low resource applications. Originality/Value: Presents a consolidated perspective on transformer evolution and targeted strategies for low resource NLP, filling a gap between general transformer literature and under studied language contexts.

References

[1] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems.

[2] Gummadi, V. P. K. (2023). MuleSoft batch processing: High-volume streaming architecture. Computer Fraud & Security, 2023(12), 50–57. https://doi.org/10.52710/cfs.886

[3] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre training of Deep Bidirectional Transformers for Language Understanding. NAACL HLT.

[4] Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., & Le, Q. V. (2019). XLNet: Generalized Autoregressive Pretraining for Language Understanding. arXiv.

[5] Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Ott, M., … & Zettlemoyer, L. (2019). Unsupervised Cross lingual Representation Learning at Scale. arXiv.

[6] van Biljon, E., Pretorius, A., & Kreutzer, J. (2020). On Optimal Transformer Depth for Low Resource Language Translation. arXiv.

[7] Armengol Estapé, J., Costa jussà, M. R., & Escolano, C. (2020). Enriching the Transformer with Linguistic Factors for Low Resource Machine Translation. arXiv.

[8] Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A Primer in BERTology: What We Know About How BERT Works. arXiv.

[9] Hanslo, R. (2021). Deep Learning Transformer Architecture for Named Entity Recognition on Low Resourced Languages: State of the art results. arXiv.

[10] Schuster, M., et al. (2019). Multilingual Language Models and Cross lingual Transfer. ACL Anthology.

[11] Wang, Z., & Karthikeyan, S. (2020). Extending Multilingual BERT to Low Resource Languages. Proceedings.