About me
I am a Research Scientist at Meta working on the Llama family of models. My research interests include, but are not limited to, machine translation, quantisation, low-level optimisation, multilinguality and evaluation.
Previously, I was a postdoc at the University of Edinburgh's Institute for Language, Cognition and Computation, working with Kenneth Heafield. I completed my PhD in 2019 at the same institute, supervised by Adam Lopez and Kenneth Heafield. My thesis is Fast machine translation on parallel and massively parallel hardware [pdf] [bib].
I like languages, logographic writing systems, game theory and GPUs. I (try to) make things run faster and enjoy (premature) optimization. In my spare time I learn languages and play football. Please check out my blog where I post random things about Chinese characters, code and life.
Projects
I am an avid supporter of and believer in open source software, and have contributed to various projects, including:
- Various contributions to the marian machine translation framework.
- 8 bit integer GPU TensorCore decoding
- 8 bit integer CPU decoding
- Paraphraser, joint work with Pinzhen Chen.
- Forced decoder for parallel corpora mining, joint work with Pinzhen Chen.
- translateLocally, a cross-platform desktop application for offline machine translation.
- OpusCleaner and OpusTrainer, a modern machine translation data cleaner and trainer, built together with Jelmer van der Linde.
- Privacy-focused machine translation with Firefox.
- gemmBench, a benchmark framework for various low-precision GEMM frameworks.
- Collaborator on intgemm, an 8-bit and 16-bit integer GEMM framework by Kenneth Heafield.
- bfTile, an experimental VNNI GEMM tiling.
- imageSelector, a cross-platform photo library organiser.
- gLM, a GPU n-gram language model.
- ProbingPT, a probing phrase table for statistical machine translation.
Consulting
I have consulted for the following companies:
Talks
Robustness and Real-World Applications at MT Marathon 2023
Efficient Machine translation at MT Marathon 2022
Efficient Machine translation tutorial at MT Half Marathon 2021
Privacy focused machine translation in Firefox at W3C Workshop on Web and Machine Learning 2020
Interviews
Mechanical Ink Podcast conversation about NLP and open source, January 2023
translateLocally, a desktop translation platform (in Bulgarian), February 2022
Publications
2024
The Llama Team The Llama 3 Herd of Models Arxiv preprint, 2024 [pdf] [bib]
Nikolay Bogoychev, Pinzhen Chen, Barry Haddow, Alexandra Birch The Ups and Downs of Large Language Model Inference with Vocabulary Trimming by Language Heuristics In Proceedings of the Fifth Workshop on Insights from Negative Results in NLP, Mexico City, Mexico, 2024 [pdf] [bib]
Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Barry Haddow, Kenneth Heafield Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca In Findings of the Association for Computational Linguistics: EACL 2024, St. Julian’s, Malta [pdf] [bib] [video]
2023
Nikolay Bogoychev, Jelmer van der Linde, Graeme Nail, Barry Haddow, Jaume Zaragoza-Bernabeu, Gema Ramírez-Sánchez, Lukas Weymann, Tudor Nicolae Mateiu, Jindřich Helcl, Mikko Aulamo OpusCleaner and OpusTrainer, open source toolkits for training Machine Translation and Large language models Arxiv preprint, 2023 [pdf] [bib]
Laurie Burchell, Alexandra Birch, Nikolay Bogoychev, Kenneth Heafield An Open Dataset and Model for Language Identification In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, Canada, 2023 [pdf] [bib]
Ramon Sanabria, Nikolay Bogoychev, Nina Markl, Andrea Carmantini, Ondrej Klejch, Peter Bell The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023 [pdf] [bib]
Mikko Aulamo, Nikolay Bogoychev, Shaoxiong Ji, Graeme Nail, Gema Ramírez-Sánchez, Jörg Tiedemann, Jelmer van der Linde, Jaume Zaragoza HPLT: High Performance Language Technologies In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, Tampere, Finland, 2023 [pdf] [bib]
2022
Nikolay Bogoychev, Biao Zhang, Maximiliana Behnke, Graeme Nail, Jelmer van der Linde, Sidharth Kashyap, Kenneth Heafield Edinburgh’s Submission to the WMT 2022 Efficiency Task In Proceedings of the Seventh Conference on Machine Translation (WMT), Abu Dhabi, United Arab Emirates [pdf] [bib] [poster]
Kenneth Heafield, Biao Zhang, Graeme Nail, Jelmer Van Der Linde, Nikolay Bogoychev Findings of the WMT 2022 Shared Task on Efficient Translation In Proceedings of the Seventh Conference on Machine Translation (WMT), Abu Dhabi, United Arab Emirates [pdf] [bib] [presentation]
Andreas Grivas, Nikolay Bogoychev, Adam Lopez Low-Rank Softmax Can Have Unargmaxable Classes in Theory but Rarely in Practice In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland [pdf] [bib] [code] [demo]
2021
Nikolay Bogoychev Not all parameters are born equal: Attention is mostly what you need In Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, Punta Cana, Dominican Republic [pdf] [bib] [blog] [poster]
Nikolay Bogoychev and Pinzhen Chen The Highs and Lows of Simple Lexical Domain Adaptation Approaches for Neural Machine Translation In Proceedings of the Second Workshop on Insights from Negative Results in NLP, Punta Cana, Dominican Republic [pdf] [bib] [blog] [presentation] [slides]
Nikolay Bogoychev, Jelmer Van der Linde, Kenneth Heafield TranslateLocally: Blazing-fast translation running on the local CPU Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Punta Cana, Dominican Republic [pdf] [bib] [blog] [demo]
Maximiliana Behnke, Nikolay Bogoychev, Alham Fikri Aji, Kenneth Heafield, Graeme Nail, Qianqian Zhu, Svetlana Tchistiakova, Jelmer van der Linde, Pinzhen Chen, Sidharth Kashyap and Roman Grundkiewicz Efficient Machine Translation with Model Pruning and Quantization In Proceedings of the Sixth Conference on Machine Translation (WMT21), Punta Cana, Dominican Republic [pdf] [bib]
Pinzhen Chen, Jindřich Helcl, Ulrich Germann, Laurie Burchell, Nikolay Bogoychev, Antonio Valerio Miceli Barone, Jonas Waldendorf, Alexandra Birch and Kenneth Heafield The University of Edinburgh’s English-German and English-Hausa Submissions to the WMT21 News Translation Task In Proceedings of the Sixth Conference on Machine Translation (WMT21), Punta Cana, Dominican Republic [pdf] [bib]
2020
Pinzhen Chen, Nikolay Bogoychev, Ulrich Germann Character Mapping and Ad-hoc Adaptation: Edinburgh’s IWSLT 2020 Open Domain Translation System. In Proceedings of the 17th International Conference on Spoken Language Translation, Seattle, USA [pdf] [bib]
Nikolay Bogoychev, Roman Grundkiewicz, Alham Fikri Aji, Maximiliana Behnke, Kenneth Heafield, Sidharth Kashyap, Emmanouil-Ioannis Farsarakis, Mateusz Chudyk Edinburgh’s Submissions to the 2020 Machine Translation Efficiency Task. In Proceedings of the Fourth Workshop on Neural Generation and Translation, Seattle, USA [pdf] [bib]
Pinzhen Chen, Nikolay Bogoychev, Kenneth Heafield, Faheem Kirefu Parallel Sentence Mining by Constrained Decoding. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Seattle, USA [pdf] [bib]
Alham Fikri Aji, Nikolay Bogoychev, Kenneth Heafield In Neural Machine Translation, What Does Transfer Learning Transfer? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Seattle, USA [pdf] [bib]
2019
Nikolay Bogoychev and Rico Sennrich Domain, Translationese and Noise in Synthetic Data for Neural Machine Translation. arxiv preprint. [pdf] [bib]
Young Jin Kim, Marcin Junczys-Dowmunt, Hany Hassan, Alham Fikri Aji, Kenneth Heafield, Roman Grundkiewicz, Nikolay Bogoychev From Research to Production and Back: Ludicrously Fast Neural Machine Translation. In Proceedings of the 3rd Workshop on Neural Generation and Translation, Hong Kong. [pdf] [bib]
Alham Fikri Aji, Kenneth Heafield, Nikolay Bogoychev Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training. In Proceedings of EMNLP, Hong Kong. [pdf] [bib]
Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, Antonio Valerio Miceli Barone, Alexandra Birch The University of Edinburgh's Submissions to the WMT19 News Translation Task. In Proceedings of the Fourth Conference on Machine Translation, Florence, Italy [pdf] [bib]
Nikolay Bogoychev Fast machine translation on parallel and massively parallel hardware PhD Dissertation, Edinburgh, United Kingdom [pdf] [bib]
Lushi Chen, Abeer Aldayel, Nikolay Bogoychev, Tao Gong Similar Minds Post Alike: Assessment of Suicide Risk Using a Hybrid Model. In Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology, Minneapolis, Minnesota, USA. [pdf] [bib]
2018
Nikolay Bogoychev, Marcin Junczys-Dowmunt, Kenneth Heafield, Alham Fikri Aji Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation. In Proceedings of EMNLP, Brussels, Belgium. [pdf] [bib]
Barry Haddow, Nikolay Bogoychev, Denis Emelin, Ulrich Germann, Roman Grundkiewicz, Kenneth Heafield, Antonio Valerio Miceli Barone, Rico Sennrich The University of Edinburgh's Submissions to the WMT18 News Translation Task. In Proceedings of the Third Conference on Machine Translation, Brussels, Belgium. [pdf] [bib]
Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, Alexandra Birch Marian: Fast Neural Machine Translation in C++. In Proceedings of ACL, System Demonstrations, Sydney, Australia. [pdf] [bib]
2016
Nikolay Bogoychev and Adam Lopez N-gram language models for massively parallel devices. In Proceedings of ACL, Berlin, Germany. [pdf] [bib]
Nikolay Bogoychev and Hieu Hoang Fast and highly parallelizable phrase table for statistical machine translation. In Proceedings of WMT, Berlin, Germany. [pdf] [bib]
Hieu Hoang, Nikolay Bogoychev, Lane Schwartz and Marcin Junczys-Dowmunt Fast, Scalable Phrase-Based SMT Decoding. In Proceedings of AMTA 2016, Austin, Texas [pdf] [bib]
2015
Barry Haddow, Matthias Huck, Alexandra Birch, Nikolay Bogoychev, Philipp Koehn The Edinburgh/JHU Phrase-based Machine Translation Systems for WMT 2015. In Proceedings of WMT, Lisboa, Portugal. [pdf] [bib]
2014
Alexandra Birch, Matthias Huck, Nadir Durrani, Nikolay Bogoychev, Philipp Koehn Edinburgh SLT and MT System Description for the IWSLT 2014 Evaluation. In Proceedings of IWSLT, Lake Tahoe, USA. [pdf] [bib]