Mastering machine translation: Customization vs. training for winning content


Companies are turning to Machine Translation (MT) more than ever, and adoption will continue to grow, perhaps more quickly than experts predict. According to a report by Mordor Intelligence, the machine translation market is expected to grow at a CAGR of 7.1% between 2021 and 2026. Within half a year of implementing a translation memory (TM) system, the digital marketing agency Dataduck could automatically translate 15% of its content using TM alone, which significantly reduced per-word translation costs (Webincare). You can attribute this trend to the technology’s increasingly predictable results and to intense market pressure to produce more content quickly, in many languages, within the same or an even smaller budget. MT technology delivers translations with a speed and cost efficiency that human translators cannot match, but companies must also address quality issues. To succeed in increasingly digital markets, they must provide personalized multilingual content that is domain-specific, hits a specific tone, and maintains a consistent brand voice across all channels.

Maximizing Machine Translation Potential

Want to optimize your MT initiatives and achieve your goals? Consider two methods to enhance Machine Translation effectiveness: Machine Translation customization and Machine Translation training. Each approach improves MT output quality and minimizes post-editing needs. However, MT customization and MT training aren’t interchangeable.

Discover how these methods work, their distinctions, and how to choose the right approach based on your use case.

The Limitations of Generic MT Engines

Companies often achieve satisfactory results with generic, untrained Machine Translation engines such as Amazon, Bing NMT, DeepL, Google NMT, or Yandex for general, straightforward content. However, the output might sometimes fall short.

Why? A generic engine may struggle with highly specialized content in fields such as life sciences or law, including the terminology specific to those domains. It might fail to apply the correct definition for a word with multiple meanings. Additionally, it may not preserve your unique brand voice or determine when formal vs. informal language is necessary to engage your audience effectively.

MT customization and MT training help overcome these limitations, producing better translation output for specific requirements that generic engines can’t meet.

Understanding MT Customization

MT customization adapts a pre-existing Machine Translation engine using a translation glossary and Do Not Translate (DNT) list to enhance the accuracy of machine-generated translations. (A translation glossary contains essential terms for a company and their translations, while a DNT list comprises terms a company doesn’t want to translate.)

MT customization operates by uploading a list of source terms and their translations before the engine starts its work. This list guides the MT engine on translating terms or intentionally preventing their translations. As a result, the engine’s suggestions improve, helping the company maintain its brand name, adhere to terminology, and achieve regional variations. Enhanced translations reduce the necessity for post-editing.

Challenges of MT Customization

Although MT customization is typically easier to implement than MT training, certain challenges arise during the process. Uploading terms into a Machine Translation system may be simple, but selecting the right terms can be daunting. The effectiveness of MT customization hinges on the MT expert’s expertise and ability to manage input and output normalization rules, DNT lists, and glossaries, which all contribute to improved output. Novice authors might unintentionally cause the MT to generate subpar suggestions, negatively affecting the overall translation quality.

Exploring MT Training

MT training involves constructing and training an MT engine using extensive bilingual data from corpora and Translation Memories (previously translated content) to boost the accuracy of machine-generated translations.

By supplying the generic MT engine with company-specific bilingual corpora, the engine learns the company’s translation expectations. Rather than providing a generic translation suggestion, the engine creates a customized output based on the corpora. As a result, MT training allows companies to refine output to achieve a specific brand voice or style, producing more consistent translations. You can override the default formal tone of generic MT engines to accomplish a more informal tone. Similar to MT customization, companies obtain desired outcomes with less post-editing since the engine is more likely to generate accurate translations with minimal errors.

For successful MT training, a company should provide at least 15.6K unique bilingual segments that are high quality, free of inconsistencies, and free of duplicated source and translation pairs. Insufficient data will likely result in minimal or no impact on output quality.
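
One way to check a translation memory against that threshold before committing to training is to count the segment pairs that survive de-duplication. The sketch below is illustrative only: the function names, the whitespace and case normalization, and the choice to drop any source segment that has conflicting translations are all assumptions, and the minimum is passed in as a parameter rather than hard-coded.

```python
from collections import defaultdict


def usable_segments(pairs):
    """Return unique (source, target) pairs, dropping inconsistent sources.

    `pairs` is an iterable of (source, target) strings. Sources are compared
    after collapsing whitespace and lowercasing (an assumed normalization;
    real TM tools may compare segments differently).
    """
    by_source = defaultdict(dict)
    for source, target in pairs:
        src = " ".join(source.split())
        tgt = " ".join(target.split())
        if src and tgt:
            # Remember the first-seen spelling of the source for each target.
            by_source[src.lower()].setdefault(tgt, src)

    unique = []
    for targets in by_source.values():
        if len(targets) == 1:  # exactly one translation, so it is consistent
            tgt, src = next(iter(targets.items()))
            unique.append((src, tgt))
    return unique


def enough_for_training(pairs, minimum):
    """True when the de-duplicated, consistent pairs reach `minimum`."""
    return len(usable_segments(pairs)) >= minimum


# Tiny hypothetical TM; a real one would hold thousands of segments.
tm = [
    ("Add to cart", "Añadir al carrito"),
    ("add to  cart", "Añadir al carrito"),  # duplicate after normalization
    ("Checkout", "Pagar"),
    ("Checkout", "Finalizar compra"),       # conflicting translations
    ("Your order", "Tu pedido"),
]
clean = usable_segments(tm)
```

Here only two of the five rows count toward the minimum, which shows how a TM that looks large can shrink once duplicates and inconsistencies are removed.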

Distinguishing MT Customization from MT Training

While both methods enhance MT output and decrease post-editing, they are distinct and not interchangeable.

MT customization modifies an existing MT engine using glossaries and Do Not Translate (DNT) lists, while MT training constructs and trains the engine from scratch using extensive bilingual data from corpora and Translation Memories.

Customization offers greater versatility than MT training and produces suggestions suitable for most companies’ needs. However, customization incurs a one-time cost to update the profile sent to the MT engine and ongoing costs to maintain a glossary.

MT training is ideal for companies with highly specialized content and intricate use cases. Implementing MT training involves initial training costs and potential additional training expenses over time if MT performance monitoring indicates potential improvements.

Deciding Between MT Training and MT Customization

Does your company need to translate scientific materials or highly technical manuals? Is preserving your unique brand voice crucial? The answers to these questions will guide you in choosing the most suitable method, be it MT customization or MT training.

Opting for MT Customization

MT customization suits two key scenarios:

  • You need precise translation of specialized terminology
  • You need regional language variations, like English (United States) vs. English (United Kingdom), but lack sufficient data for training

MT customization is an ideal choice for technical and detail-oriented content, as accurate translation of terminology is crucial. Opt for MT customization when there isn’t enough data to make MT training effective.

Choosing MT Training

MT training suits two primary use cases:

  • You want a distinct brand voice, tone, or style while minimizing post-editing requirements
  • You need regional language variations, such as French (Switzerland) vs. French (France), and have adequate data for training

MT training is well-suited for marketing and creative content, where maintaining a specific brand voice, tone, and style is vital. However, ensure that you have a sufficient data volume to effectively train the engines.

Embracing a Hybrid Approach

In some cases, a combination of both methods yields the best results. For example, MT engines may generate superior suggestions when complemented by customization during MT training.

ABC Translations facilitates the implementation of a hybrid approach for its customers with its enterprise MT solution, Smart MT Portal. Additionally, customers can access professional training services from ABC Translations’ skilled teams. By collaborating with these teams, companies often adopt a more comprehensive MT approach, utilizing both MT training and customization to optimize output. Various tests help determine which strategies produce the best results, enabling a tailored MT approach.

Deciding Between MT Customization and MT Training

The most effective strategy for enhancing MT output depends on your specific needs. You might be tempted to view MT training as the primary solution or be swayed by the buzz around continuous training. Consider the following points as you explore your options:

Avoid Pitfall #1: The Offering of MT Training as the Sole Solution

MT Training: A Targeted Approach

MT training can be highly effective in enhancing MT output, but it is crucial to ensure it addresses specific and identified concerns.

As the use of MT grows, many providers turn to MT training as their primary solution to deliver value to their clients. However, this strategy can sometimes backfire. Some companies that exclusively used training in hopes of improving MT output have later approached ABC Translations for assistance, expressing disappointment with the training after evaluating its cost-effectiveness. They were dissatisfied with the engine’s suggestions and sought a more economical solution. The reason for their dissatisfaction? In their particular circumstances, other methods would have been more suitable.

Innovative MT providers, such as ABC Translations, employ MT training when appropriate but heavily rely on customization to attain desired MT results at a lower cost compared to MT training.

Steer Clear of Pitfall #2: The Allure of Continuous Training during MT Training

While exploring MT solutions, you may encounter providers that promote the idea of continuous training for engines after the completion of individual projects. Exercise caution when faced with such claims. Continuous training is only feasible for custom engines that necessitate constant updates.

It is essential to emphasize that successful MT training requires at least 15.6K unique segments for a single project to train the engine effectively. If a company lacks sufficient data, it may resort to using project content to update customization features, which is often inaccurately labeled as “training.”

The Ultimate Goal

Customization offers a more adaptable solution than MT training, catering to the majority of companies’ needs. By employing customization, you can significantly enhance MT suggestions, preserving your brand identity and adhering to specific terminology. This reduces the workload for post-editors in verifying these aspects. The one-time expense of updating the MT engine’s profile and the ongoing costs of maintaining a glossary are generally more cost-effective than the expenses associated with MT training.

MT Customization Best Practices

When implementing MT customization, it’s crucial to follow best practices for optimal results.

Input and Output Normalization Rules

Establish a library of input and output normalization rules for the most commonly used languages to manage MT input and enhance its output. These rules allow you to meet your unique requirements.

For example, an output normalization rule might replace double quotes [“…”] with las comillas angulares [« … »] in Spanish MT output. This rule elevates the quality of Spanish translations, as Spanish-speaking readers expect las comillas angulares rather than double quotes. Businesses can apply input and output normalization rules to make similar adjustments for regional variants of a parent language, such as the Spanish of Spain, Mexico, or Argentina.
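
The quotation-mark rule described above could be sketched as a small post-processing step applied to the text the engine returns. This is a minimal illustration, not any vendor’s rule syntax: the function name and the regular expression are assumptions, and it handles only simple, non-nested quoted spans.

```python
import re

# Matches a span wrapped in straight or curly double quotes, with no
# double-quote characters inside it (nested quotes are not handled).
_QUOTED = re.compile(r'["“]([^"“”]+)["”]')


def to_angular_quotes(text):
    """Replace "…" or “…” with «…» in a Spanish MT suggestion."""
    return _QUOTED.sub(r"«\1»", text)


suggestion = 'Haga clic en "Guardar" y luego en “Cerrar”.'
normalized = to_angular_quotes(suggestion)
# normalized == 'Haga clic en «Guardar» y luego en «Cerrar».'
```

Keeping each rule as a small, separately testable function makes it easier to maintain one rule library per target language or regional variant.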

Do Not Translate Lists and Rules

Develop a list of terms you want to remain untranslated, and create a rule that replaces any identified Do Not Translate (DNT) term with a token before it enters the engine. This action renders the term invisible to the engine, preventing translation. Once the translation is complete and the MT suggestion returns, apply the output normalization rule to replace the token with the original DNT term.
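
The mask-and-restore flow above can be sketched as follows. Everything named here is hypothetical: the `__DNT0__` token format, the helper names, and the `fake_engine` stand-in, which only imitates an MT call. A real engine may alter or split placeholder tokens, so the token format would need testing against the engine in use.

```python
import re


def mask_dnt(text, dnt_terms):
    """Replace each DNT term with a placeholder token before MT.

    Returns the masked text and a token -> original term mapping.
    """
    mapping = {}
    # Longest terms first so "Acme Cloud Pro" wins over "Acme Cloud".
    ordered = sorted(dnt_terms, key=len, reverse=True)
    if not ordered:
        return text, mapping
    alternatives = "|".join(re.escape(term) for term in ordered)
    # Lookarounds keep the match from starting or ending inside a word.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")

    def _swap(match):
        token = f"__DNT{len(mapping)}__"
        mapping[token] = match.group(0)
        return token

    return pattern.sub(_swap, text), mapping


def unmask_dnt(text, mapping):
    """Output rule: put the original DNT terms back in place of the tokens."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


def fake_engine(text):
    """Stand-in for a real MT call; it only rewrites two fixed phrases."""
    return (text.replace("Open", "Abra")
                .replace("to sync your files", "para sincronizar sus archivos"))


source = "Open Acme Cloud Pro to sync your files."
masked, tokens = mask_dnt(source, ["Acme Cloud", "Acme Cloud Pro"])
restored = unmask_dnt(fake_engine(masked), tokens)
# masked   == "Open __DNT0__ to sync your files."
# restored == "Abra Acme Cloud Pro para sincronizar sus archivos."
```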

Preparing a Glossary

Take the time to carefully prepare your glossary, ensuring accurate and consistent translations. Keep the key factors listed in Table 1 in mind when deciding whether to include a term in your glossary.

General Guidelines for Glossary Compilation

| Consideration | Question to Pose | Inclusion in the Glossary? |
|---|---|---|
| Frequency | What is the term’s prevalence in the source text? | Exclude infrequent terms. |
| Ambiguity | Are there multiple meanings or potential confusion with other words? | Include ambiguous terms, but verify alternate meanings are rare in the source text. |
| Specialized terminology | Does the term pertain to a specific domain or subject? | Include domain-specific terms. |
| Consistency | Has the term been translated uniformly in previous instances? | Exclude consistently translated terms. |
| Importance | Is the term crucial to understanding the text’s overall meaning? | Include terms vital to the text’s meaning. |
| Complexity | Is the term intricate, posing challenges for the Machine Translation system? | Include complex terms that may be difficult for the MT system to translate accurately. |

Table 1. Key Factors to Consider When Creating a Glossary for Machine Translation

Do’s and Don’ts

We highly recommend adhering to these do’s and don’ts during the glossary creation process (examples in Spanish):

  • Don’t include contradictory terms: Avoid including “banco” as both “bank” and “bench” in the same glossary.
  • Don’t include duplicate entries: Ensure there are no duplicate entries for “ordenador” and “computadora” referring to the same meaning, “computer”.
  • Don’t include generic terms: Avoid including common words like “casa” (house) or “comer” (to eat), which may negatively impact machine translation quality, sentence structure, agreement, and word order.
  • Don’t separate lengthy terms: Keep multiword expressions like “redes sociales” (social networks) together as one term.
  • Do incorporate DNT (Do Not Translate) terms: Include terms like brand names, e.g., “Coca-Cola”, which should not be translated.
  • Do include specific product names: Add product names like “iPhone 13” or “Samsung Galaxy S22” to the glossary.
  • Do limit entries to one translation per source term: Use only one translation for “computer”, either “ordenador” or “computadora”, depending on the target audience’s regional preference.
  • Do utilize multiword expressions: Include phrases like “atención al cliente” (customer service) or “diseño web” (web design) in the glossary.
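
Several of these checks can be automated before a glossary is uploaded. The following is a rough sketch under stated assumptions: the function name, the issue labels, and the caller-supplied list of generic words are all hypothetical, and it covers only conflicting entries, exact duplicates, and generic terms.

```python
from collections import defaultdict


def lint_glossary(entries, generic_terms=frozenset()):
    """Return a list of (issue, source_term, targets) tuples.

    `entries` is a list of (source_term, target_term) pairs. Source terms
    are compared case-insensitively after trimming whitespace.
    """
    targets_by_source = defaultdict(list)
    for source, target in entries:
        targets_by_source[source.strip().lower()].append(target.strip())

    issues = []
    for source, targets in targets_by_source.items():
        distinct = sorted(set(targets))
        if len(distinct) > 1:
            # Same source term mapped to different translations.
            issues.append(("conflict", source, distinct))
        elif len(targets) > 1:
            # Same source and target entered more than once.
            issues.append(("duplicate", source, distinct))
        if source in generic_terms:
            # Common word that is better left to the engine.
            issues.append(("generic", source, distinct))
    return issues


glossary = [
    ("banco", "bank"),
    ("banco", "bench"),
    ("redes sociales", "social networks"),
    ("redes sociales", "social networks"),
    ("casa", "house"),
    ("atención al cliente", "customer service"),
]
report = lint_glossary(glossary, generic_terms={"casa", "comer"})
```

In this example the report flags “banco” as a conflict, “redes sociales” as a duplicate, and “casa” as generic, while “atención al cliente” passes. Deciding which words count as generic remains a judgment call for the MT expert.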

ABC Translations’ Approach to MT Customization and MT Training

Our ABC Go!MT Portal simplifies the process of implementing MT customization for our clients, allowing customization to function across multiple MT engines simultaneously. You can compile your MT glossaries and DNT lists and upload these terms, which will then apply to every MT engine. This technology empowers you to avoid engine lock-in and switch engines as needed to achieve optimal results.

Furthermore, our MT technology can be easily complemented by relevant services provided by our MT experts. When engaged, we assist companies in identifying the most effective MT strategy and how to best execute it.

Whether you are just starting to explore MT usage, seeking to enhance existing MT efforts through customization, or considering MT training as a viable option due to increased content creation — we have a solution tailored to your needs.

Comparing Machine Translation Training and Machine Translation Customization

Refer to Table 2 for an at-a-glance comparison of MT training and MT customization, helping you determine the most suitable method for your content needs.

| Aspect | MT Customization | MT Training |
|---|---|---|
| Definition and Function | Refining an existing Machine Translation engine using glossaries and Do Not Translate (DNT) lists to enhance the accuracy of machine-generated translations. | Developing and training an MT engine using comprehensive bilingual data from corpora and Translation Memories (TMs) to elevate the accuracy of machine-generated translations. |
| Purpose | Boosting MT suggestions for improved accuracy and minimizing post-editing requirements. | Augmenting MT suggestions for better accuracy and lessening post-editing demands. |
| Specific Advantages | Allows companies to maintain brand names, terminology, and achieve regional variations. | Empowers companies to achieve a distinct brand voice, tone, style, and regional variations. |
| Potential Risks | Improper execution may lead to subpar MT suggestions and negatively impact overall quality. | Inadequate training data or excessive terminology use by inexperienced authors can result in poor MT suggestions and a decline in overall quality. |
| Appropriate Usage | Optimal for technological and detail-oriented content requiring accurate translations of terminology or regional variations with insufficient data for MT training. | Best suited for highly specialized, marketing, and creative content requiring a specific brand voice, tone, style, or regional variations with adequate data for MT training. |
| Key Success Factors | An adept MT expert capable of managing input and output normalization rules, glossaries, and DNT. | A minimum of 15.6K unique segments for effective engine training. |
| Cost Considerations | One-time cost for updating the MT engine profile and ongoing costs for glossary maintenance; generally more affordable than MT training when factoring in potential benefits. | Initial training costs and possible additional training expenses over time based on MT performance monitoring; a justifiable investment in certain cases, considering potential benefits. |

Table 2. Comparing MT Customization and MT Training: Key Aspects and Considerations

Unlock the Full Potential of Machine Translation with ABC Translations

As companies increasingly rely on Machine Translation (MT) to meet the growing demand for multilingual content, it’s essential to optimize MT initiatives for improved quality and efficiency. Both MT customization and MT training offer valuable solutions to enhance translation output, but the right approach depends on your specific needs and use cases. Customization provides a more versatile and cost-effective solution for most companies, while training is well-suited for highly specialized content or maintaining a distinct brand voice. In some instances, a hybrid approach combining both methods may yield the best results.

By understanding the limitations of generic MT engines and adopting appropriate strategies, companies can achieve personalized, domain-specific, and consistent translations across multiple channels. Following best practices for MT customization, such as input and output normalization rules, and effectively managing Do Not Translate lists, ensures that your MT initiatives are successful and meet the ever-evolving demands of today’s digital markets. With ABC Translations as your trusted partner, you can unlock the full potential of Machine Translation and achieve outstanding results tailored to your unique requirements.