Artificial Intelligence (AI) has recently taken many industries by storm, and the translation market is no exception. With the new possibilities provided by tools such as ChatGPT, we observe a significant shift in a number of areas, one of them being open-source software translation. We’ve worked with ChatGPT to prepare an AI-automated translation of OpenLMIS, exploring the tool’s capabilities and testing its effectiveness as the main translator of software.
Breaking barriers: The importance of language accessibility
Developing a great solution can help thousands of people. But not adjusting it to the individual needs of end users can stop millions from actually benefiting from it. One of the things that still hugely affect the accessibility of systems and applications is the language barrier. The lack of support for a needed language often stops local professionals from choosing a solution that would otherwise meet their requirements.
Open-source software translations
This issue is especially present in open-source projects that aim to solve problems and have the technical skills and knowledge to do so, but fall short when it comes to the resources needed to hire professional translators for every language required by end users. It hinders the growth and reach of such initiatives, and deprives institutions and communities around the world of game-changing opportunities.
The usual practice in open-source communities is for the contributors to take on the task of translating software into new languages. This approach, however, comes with its own problems. To translate software, there must first be someone who knows the language and has a lot of time on their hands, which is rare. It’s especially challenging when it comes to more niche languages, where the chances of finding a translator within the community are small. As a result, translations are often left unfinished, and they vary in quality. Many of them also cannot be verified for correctness, since the system maintainers don’t know the languages either.
The rise of AI-powered translations
With the growing need for efficient, cost-effective translations, solutions alternative to human translators are gaining popularity. AI-powered tools like ChatGPT are quickly becoming a strong option for those who need quick translations.
As an active contributor to open-source projects designed to solve various problems, especially in underserved countries, we often face the challenge of adding support for new languages. Searching for ways to optimize this process, we’ve worked with ChatGPT, testing its translation capabilities and learning how to use the strengths of this tool, such as cost-effectiveness and quick response time, without sacrificing the quality of translations.
Case study: Translating OpenLMIS with ChatGPT
OpenLMIS Angola – English to Portuguese translation
Initial tests were conducted when preparing a new language version for OpenLMIS – an open-source healthcare logistics management system. With multiple implementations across Africa, the ability to quickly and cost-effectively translate OpenLMIS, primarily written in English, to other languages is truly invaluable.
In this particular case, OpenLMIS was being implemented in Angola, so a Portuguese version of the system was required. One of the more challenging aspects of creating new translations for OpenLMIS is that it involves a lot of specialized, technical language that often requires additional research to avoid errors. With a professional translator, this research takes more time and generates higher costs.
Looking for alternative options to minimize the resources required, we’ve decided to test ChatGPT as our main translation tool.
Initial translation attempt
First, we exported the content that needed to be translated from Transifex (a tool designed to support the process of translating software) in a .csv format, and uploaded it to ChatGPT. It took us just a few minutes to receive the first results, and the main translation process didn’t require the involvement of any external help, making it much quicker and more cost-effective.
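In our case the exported file was uploaded as-is. Where a file upload is not available, the same export could be split into smaller prompts. The following is only a minimal sketch of that idea: the column names `key` and `source`, the sample rows, and the prompt wording are all hypothetical and would need to match the actual export.

```python
import csv
import io


def read_source_strings(csv_text, key_column="key", source_column="source"):
    """Return (key, source_text) pairs from an exported CSV.

    The column names are hypothetical; adjust them to the real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[key_column], row[source_column]) for row in reader]


def batch(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_prompt(pairs, target_language):
    """Format one batch of strings as a plain-text translation request."""
    lines = [f"{key}\t{text}" for key, text in pairs]
    return (
        f"Translate the following UI strings from English to {target_language}. "
        "Keep the keys and any placeholders unchanged.\n\n" + "\n".join(lines)
    )


# Made-up sample rows standing in for a real export.
sample = (
    "key,source\n"
    "login.title,Sign in\n"
    "login.submit,Submit\n"
    "stock.soh,Stock on hand\n"
)
pairs = read_source_strings(sample)
prompts = [build_prompt(chunk, "Portuguese") for chunk in batch(pairs, 2)]
```

Keeping the message key next to each string makes it easier to map the returned translations back to the right entries.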
Next, the translation generated by ChatGPT needed to be reviewed by a human to verify that it met a satisfactory level of quality. To do that, we connected with the OpenLMIS community. From members’ feedback we learned that the majority of translations were correct. However, probably due to the insufficient context provided to the AI, there were some mistakes in word choice, and these parts required manual adjustments.
Overall, the first attempt at ChatGPT translation was successful, and we learned a lot about its capabilities in translating a specialized language as well as its contextual limitations.
Introducing Arabic and RtL support
Having gained a basic understanding of what to expect when translating software with ChatGPT, we prepared for our second attempt: introducing the Arabic language to OpenLMIS, along with right-to-left (RtL) language support.
This project was more challenging, since adding Arabic to the system involved working with an alphabet that is not only new to us, but also written from right to left. It required additional focus on the layout, which had to be dynamically switchable between LtR and RtL orientations, with consideration of all corresponding design aspects. With this many things to figure out, it was very convenient to have an AI tool handle the translations.
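The core of a switchable layout is deciding the text direction from the active locale. The snippet below is an illustrative sketch only, not the OpenLMIS implementation; the function name and the short list of right-to-left languages are assumptions made for the example.

```python
# A few widely used right-to-left languages; this list is not exhaustive.
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def text_direction(locale):
    """Return "rtl" or "ltr" for a locale code such as "ar" or "pt-AO"."""
    language = locale.replace("_", "-").split("-")[0].lower()
    return "rtl" if language in RTL_LANGUAGES else "ltr"
```

A UI could use the returned value to set the direction of the page and mirror direction-dependent styles when the user switches languages.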
Already knowing that most of the mistakes made by ChatGPT in the English-to-Portuguese translation came from a lack of context, this time we put more effort into providing the tool with as much information about the app as possible. We shared descriptions of the system, translations from other languages, and comprehensive explanations of more complex features.
Once finished, we reviewed the results with the OpenLMIS community to see whether the changes we had introduced to the process brought any improvement in quality. It turned out that, with this many details to work with, ChatGPT generated much more accurate and contextually correct translations.
Techniques to improve AI translation accuracy
The experiment helped us gather a lot of information on how to work with ChatGPT in order to gain the maximum value that this tool has to offer. We also learned that ChatGPT is, in fact, a solid option for software translations, and can bring plenty of benefits to open-source communities, as long as it is used attentively and responsibly.
Here is a summary of our lessons learned:
Adding context to translations
The more precise information you feed to an AI tool, the more accurate results you will get. It’s important to provide a lot of details and points of reference. In our case, we quickly learned that even if some information might be obvious to us, since we work with OpenLMIS on a daily basis and know it well, ChatGPT won’t know these things until we share them. The same applies to working with a human translator: to provide an accurate translation, they first need to learn the full context.
Some examples of how to help AI gain a contextual understanding:
- Providing translations from other languages as a point of reference. It can greatly improve the accuracy of translation, since AI will observe word choices made in different languages and apply them to the translation.
- Offering descriptions of the application’s function. Some features might require additional explanations in order for AI to fully grasp the meaning of related texts. It happens especially when a certain functionality is more complex or unique.
- Translating specific views in the app. Some errors in translation stem from the fact that certain words have multiple meanings, and different meanings should be selected based on the place in which the word shows up in the app. When presented with translated examples of specific UI views (like screenshots or text from particular pages of the app), AI can better understand the context in which certain terms are used.
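The three kinds of context listed above can be combined into a single request. The sketch below shows one possible way to do that; the function name, the message keys, the sample French string, and the prompt layout are all made up for illustration and are not the exact prompts we used.

```python
def build_context_prompt(app_description, references, strings, target_language):
    """Assemble a translation request that carries context.

    references: message key -> {language code: existing translation}
    strings:    message key -> (English text, where it appears in the UI)
    """
    parts = [
        f"You are translating a software UI into {target_language}.",
        f"About the application: {app_description}",
        "Strings (key | English | where it appears | existing translations):",
    ]
    for key, (text, location) in strings.items():
        existing = references.get(key, {})
        existing_text = "; ".join(
            f"{lang}: {value}" for lang, value in sorted(existing.items())
        )
        parts.append(f"{key} | {text} | {location} | {existing_text or 'none'}")
    return "\n".join(parts)


# Illustrative data only; keys and translations are not taken from OpenLMIS.
prompt = build_context_prompt(
    "Open-source healthcare logistics management system.",
    {"stock.soh": {"fr": "Stock disponible"}},
    {
        "stock.soh": ("Stock on hand", "Stock management table header"),
        "order.label": ("Order", "Requisition list column"),
    },
    "Arabic",
)
```

Stating where each string appears helps with ambiguous words such as "Order", which can be a noun in a table header and a verb on a button.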
Glossaries for specialized terms
One of the more difficult aspects of software translation, especially when it comes to open-source solutions, is the involvement of specialized language related to a certain field the app is dealing with. For example, OpenLMIS, a tool created to manage healthcare logistics, contains a lot of medical language that requires specific knowledge, and a lot of precision in order to be translated correctly. Providing AI with a glossary of specialized terms can help refine the translation, and avoid incorrect word choices.
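A glossary is also useful after the translation has been generated, as a quick automated check before human review. The following is a simple sketch under the assumption that the glossary is a plain mapping from source terms to required target terms; the entry shown is illustrative and has not been vetted by a translator.

```python
def find_glossary_violations(source, translation, glossary):
    """Return glossary entries whose source term appears in `source`
    while the required target term is missing from `translation`."""
    violations = []
    for source_term, target_term in glossary.items():
        term_used = source_term.lower() in source.lower()
        target_present = target_term.lower() in translation.lower()
        if term_used and not target_present:
            violations.append((source_term, target_term))
    return violations


# Illustrative glossary entry only.
glossary = {"requisition": "requisição"}

ok = find_glossary_violations("Submit requisition", "Enviar requisição", glossary)
flagged = find_glossary_violations("Submit requisition", "Enviar pedido", glossary)
```

A substring check like this ignores inflected forms, so it can only flag candidates for a reviewer to look at; it cannot confirm that a translation is correct.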
Using translation memory
Another thing we found very beneficial is translation memory. ChatGPT stores generated translations in the database. When translating new content, the system scans the translation memory for segments that match or are similar to the new text. This feature positively affects the consistency of translations, and makes them more and more accurate with each use, since AI’s knowledge of the app expands.
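To illustrate the general idea of a translation memory lookup, here is a small local sketch based on fuzzy string matching from Python's standard library. It is not a description of how ChatGPT works internally; the memory contents, the function name, and the 0.8 similarity cutoff are assumptions chosen for the example.

```python
import difflib


def lookup(memory, text, cutoff=0.8):
    """Return (matched_source, stored_translation) for the most similar
    source string in `memory`, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(text, list(memory), n=1, cutoff=cutoff)
    if not matches:
        return None
    return matches[0], memory[matches[0]]


# Illustrative memory of previously accepted translations.
memory = {
    "Submit requisition": "Enviar requisição",
    "Cancel": "Cancelar",
}

hit = lookup(memory, "Submit requisitions")  # near match of a stored segment
miss = lookup(memory, "Delete user")         # nothing similar is stored
```

Reusing the stored translation for a near match keeps terminology consistent between releases, and only segments without a match need a fresh translation.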
Translating software: Comparison of options
Based on our tests, translating OpenLMIS into Portuguese and Arabic with ChatGPT, we drew some conclusions on the effectiveness of AI translation as an alternative to human translators.
Sticking to OpenLMIS as our example, we analyzed three different translation approaches considering pros and cons for each. Being a complex, modular system, OpenLMIS contains around 3000-4000 translation keys that need to be translated for every new language version.
Option 1: Contributor translation
As already mentioned above, it’s the most common choice for translating open-source software due to its cost-effectiveness. It does, however, have many limitations, the most critical one being that it largely relies on luck. If there happens to be a person within the community who knows a certain language and is willing to take on the translation, then the system will be translated. But what if there is no one to do the job? Or if the job gets abandoned halfway through due to time constraints?
Consider the number of translation keys in OpenLMIS. Even if each of them were translated in just a few minutes, which is a very optimistic scenario, translating the whole thing would take from 150 to 300 hours. That is an almost impossible amount of time for someone to commit purely out of passion or willingness to help, and it doesn’t even include the time needed to get familiar with the application, research terminology, upload the translation, etc.
Such an amount of work is not only a burden on the contributors, but also significantly delays the moment when the translated version of the system can reach its end users, which is the ultimate goal of the whole project.
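The estimate can be reproduced with simple arithmetic. In the sketch below, the per-key times of 3 and 4.5 minutes are assumptions chosen because they match the stated range for 3000 to 4000 keys; the article itself only says "a few minutes".

```python
def translation_hours(keys, minutes_per_key):
    """Total translation time in hours for a given number of keys."""
    return keys * minutes_per_key / 60


low_estimate = translation_hours(3000, 3)     # lower bound, assumed 3 min/key
high_estimate = translation_hours(4000, 4.5)  # upper bound, assumed 4.5 min/key
```

Under these assumptions the two bounds come out at 150 and 300 hours.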
Option 2: Professional translation
Hiring a professional translator can undoubtedly provide the highest quality of translation. For the longest time it was the only available option for software that is not open-source and doesn’t have its own community of contributors.
However, professional translation is also the most expensive choice. Most translators charge around $0.20 to $0.30 per word. For applications such as OpenLMIS, assuming one translation key consists of a few words, the cost of professional translation would range from $3,000 to $12,000.
Another possible issue is the highly specialized language required to translate the open-source applications that we work on, since they’re designed for healthcare, information, logistics, and production management purposes. When hiring a professional translator, we would have to make sure that they’re familiar with the relevant industry.
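The cost range follows from the same kind of arithmetic. In this sketch, the averages of 5 and 10 words per key are assumptions that reproduce the stated range; the per-word rate is expressed in cents to keep the arithmetic exact.

```python
def translation_cost_usd(keys, words_per_key, rate_cents_per_word):
    """Estimated cost in US dollars; the rate is given in cents per word."""
    return keys * words_per_key * rate_cents_per_word / 100


low_cost = translation_cost_usd(3000, 5, 20)    # assumed 5 words/key at $0.20
high_cost = translation_cost_usd(4000, 10, 30)  # assumed 10 words/key at $0.30
```

Under these assumptions the two bounds come out at $3,000 and $12,000.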
On top of that, even with the help of a professional translator, it would still take at least 150 hours to translate software the size of OpenLMIS.
Option 3: AI translation
While this option is still evolving and changing, it can already serve as a strong, efficient, and cost-effective alternative to traditional translation. Generating translations for OpenLMIS took us just a few hours. A lot of this time was spent testing different approaches to prompts in order to work out the most accurate translation. As discussed previously, equipping ChatGPT, or any other AI tool, with enough context is the key to receiving satisfactory results. It is also the part of the process that still has a lot of room for improvement, and that we’ll continue to optimize.
Of course, this method of translation doesn’t come without its downsides, the main one being mistakes made by AI. For that reason, we wouldn’t recommend putting 100% of your trust into ChatGPT-generated translations. However, with enough attention given to prompts, and using the techniques mentioned in this article, we’ve managed to significantly lower the number of incorrect word choices made by the AI. Correcting mistranslated terminology is also much faster and easier than translating the whole thing from scratch.
AI technology is flourishing and constantly evolving. It’s highly possible that in the near future it will outgrow its current capabilities and become an even more powerful and reliable choice for software translation. We’ll continue to observe its development and leverage its potential.
Final thoughts
Our main conclusion from this experiment is that, as long as it is used reasonably, ChatGPT outshines other translation methods, and will serve as a great help to open-source communities, allowing powerful solutions to reach those who need them faster and more consistently.
In scenarios requiring rapid development or handling large volumes of text, AI tools work wonders in terms of time, scalability and cost-efficiency. Although still not perfect, and still making noticeable contextual mistakes, with enough comprehensive resources provided, AI can be trained to return more accurate results, minimizing the number of errors.
As AI models continue to develop, they will become an even more viable alternative to traditional translators, providing faster results at a fraction of the cost.