Machine translation has become ubiquitous in our society. It's on Twitter, it's on Facebook, it's in your browser, it's in your inbox... But who provides those services? When you receive an email in German about the dubious purchases you made while in a nightclub in Hamburg, you copy it into Google Translate and have a good laugh with your mates... and Google, of course, always gets the last laugh.

At Evil Corp we really value your data... And your privacy, of course!

When you use an online translation service, you automatically compromise your privacy. You listened to a Spanish summer song? You copy/pasted the lyrics into an online translation engine? Now, your browser is full of Spanish holiday ads.

At least Spain is pretty...

Now, realistically, the implication of this privacy intrusion is just a bit of annoyance, or maybe some unintended spending. However, sometimes the information you submit for translation could leak in a data breach, as the Norwegian state oil company Statoil (now Equinor) found out when a bunch of its contracts were accidentally leaked by translate.com. This is not only embarrassing, but also a PR nightmare that might lead to some very expensive lawsuits.

We promise we won't do it again!

Why do we do this? Because historically, machine translation has been deemed too computationally intensive to be performed on the end user's hardware. We are used to relying on the cloud's vast computational resources to hide the computational costs from end users, but this comes with some major tradeoffs.

Can we do better than a cloud-based solution? Yes, we can!

translateLocally

We present translateLocally, a blazingly fast, privacy-focussed machine translation service, running entirely on your local computer. We provide binaries for Ubuntu 20.04, Arch Linux, Windows and Mac (x86 only for now) at translatelocally.com, and the source code is available on GitHub.

Translation!

We support 7 language pairs, including German, Spanish and Czech, with many more in the works. The models can be downloaded in advance, or automatically fetched from a secure server via the application's interface. An average model is about 15 MB in size, and users can supply their own custom-built models.

Model Download!

The main innovation of translateLocally is its speed! Previous work on bringing machine translation to the end PC user exists, but it supports fewer platforms and is slow. How slow? Roughly 100 times slower than translateLocally, going by the throughput figures below.

Previous work achieves translation speeds of about 70 words per second on a modern 15-inch MacBook Pro. This is fast enough for simultaneous translation to be performed while typing, but woefully inadequate for copy/pasting large chunks of text. translateLocally, on the other hand, achieves translation speeds of around 7k words per second on comparable hardware, making translation feel instantaneous and snappy.
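To get a feel for what those throughput figures mean in practice, here is a quick back-of-the-envelope sketch. The two speeds are the ones quoted above; the 10,000-word document and the function name are my own assumptions for illustration, and real timings will vary with hardware, model and text.

```python
def translation_seconds(word_count: int, words_per_second: float) -> float:
    """Estimate how long a steady-throughput translator needs for a text."""
    if words_per_second <= 0:
        raise ValueError("words_per_second must be positive")
    return word_count / words_per_second


# Throughput figures quoted in the post (approximate).
PREVIOUS_WORK_WPS = 70
TRANSLATE_LOCALLY_WPS = 7_000

# Hypothetical size of a large pasted document (an assumption, not a benchmark).
DOCUMENT_WORDS = 10_000

slow = translation_seconds(DOCUMENT_WORDS, PREVIOUS_WORK_WPS)
fast = translation_seconds(DOCUMENT_WORDS, TRANSLATE_LOCALLY_WPS)

print(f"previous work:    {slow:.1f} s")
print(f"translateLocally: {fast:.1f} s")
print(f"speed-up:         {slow / fast:.0f}x")
```

Under these assumptions, the pasted document takes well over two minutes at 70 words per second and under two seconds at 7k words per second, which is the difference between going to make a coffee and not noticing the wait at all.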

But don't take my word for it. Go download and test it (or see the demo). Or read more about it in our EMNLP demo paper. Or watch a YouTube video about it. Or two.

If you care about your translation's privacy, your best bet is to do it locally! If you need help with it, get in touch with me at me@nbogoychev.com.

This work was made possible through the joint effort of a lot of people, working together on the Bergamot project. Special thanks to Jelmer van der Linde, Kenneth Heafield, Jerin Philip, Ulrich Germann, Vladimira Kalanovska and Graeme Nail.

Happy translating everybody!

Nick

Image sources: flickr pexels pixabay

Privacy focused machine translation for the end user

Nikolay Bogoychev
