Adding support for global audiences
Your customers come from all around the globe. You need an assistant that can talk to them in their own language and in a familiar style. Choose the approach that best fits your business needs.
- Quickest solution: The simplest way to add language support is to author the assistant in a single language. You can translate each message that is sent to your assistant from the customer's local language to the assistant language. Later you can translate each response from the assistant language back to the customer's local language.
This approach simplifies the process of authoring and maintaining the conversation. You can build one assistant and use it for all languages. However, the intention and meaning of the customer message can be lost in the translation.
For more information about webhooks you can use for translation, see Webhook overview.
- Most precise solution: If you have the time and resources, the best user experience can be achieved when you build multiple assistants, one for each language that you want to support. IBM® watsonx™ Assistant has built-in support for all languages. Use one of 13 language-specific models or the universal model, which adapts to any other language you want to support.
When you build an assistant that is dedicated to a language, a language-specific classifier model is used by the assistant. The precision of the model means that your assistant can better understand and recognize the goals of even the most colloquial message from a customer.
Use the universal language model to create an assistant that is fluent even in languages that watsonx Assistant doesn't support with built-in models.
To deploy, use the web chat integration with your French-speaking assistant to deploy to a French-language page on your website. Deploy your German-speaking assistant to the German page of your website. Maybe you have a support phone number for French customers. You can configure your French-speaking assistant to answer those calls, and configure another phone number that German customers can use.
You can enable the download of language data files, in CSV format, so you can translate training examples and assistant responses from English into other languages and use them in other assistants. For more information, see Using multilingual downloads for translation.
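The translation round trip described under the quickest solution can be sketched as follows. This is a minimal illustration, not a watsonx Assistant API: `translated_turn`, the `translate` callable, and the `send` callable are hypothetical names that stand in for whatever translation service and assistant client you use.

```python
from typing import Callable

# Hypothetical signature: translate(text, source_lang, target_lang) -> translated text
Translator = Callable[[str, str, str], str]
# Hypothetical signature: send(text) -> reply text from the single-language assistant
AssistantSend = Callable[[str], str]


def translated_turn(
    message: str,
    customer_lang: str,
    assistant_lang: str,
    translate: Translator,
    send: AssistantSend,
) -> str:
    """Run one conversation turn through a single-language assistant."""
    if customer_lang == assistant_lang:
        # No translation is needed when the customer already uses the assistant language.
        return send(message)
    # Translate the customer message into the assistant language.
    inbound = translate(message, customer_lang, assistant_lang)
    reply = send(inbound)
    # Translate the assistant response back into the customer's language.
    return translate(reply, assistant_lang, customer_lang)
```

Because each turn passes through translation twice, any meaning that is lost on the way in also affects how the assistant interprets the message, which is the tradeoff noted for this approach.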
Understanding the universal language model
An assistant that uses the universal language model applies a set of shared linguistic characteristics and rules from multiple languages as a starting point. It then learns from the training data that you add to it.
The universal language classifier can adapt to a single language per assistant. It cannot be used to support multiple languages within a single assistant. However, you can use the universal language model in one assistant to support one language, such as Russian, and in another assistant to support another language, such as Hindi. The key is to add enough training examples or intent user examples in your target language to teach the model about the unique syntactic and grammatical rules of the language.
Use the universal language model when you want to create a conversation in a language for which no dedicated model is available, and which is unique enough that an existing model is insufficient.
As you follow the normal steps to design a conversational flow, you teach the universal language model about the language that you want your assistant to support. The universal model is built from the training data that you add in the target language.
For more information about feature support in the universal language model, see Supported languages.
Integration considerations
Keep these tips in mind for integrations:
- Web chat: Web chat has some hardcoded strings that you can customize to reflect your target language. For more information, see Supporting global audiences in web chat.
- Phone integration: If you want to deploy an assistant that uses the universal language model, you must connect to custom Speech service language models that can understand the language you're using. For more information about supported language models, see the Speech to Text and Text to Speech documentation.
- Search integration: If you build an assistant that specializes in a single language, be sure to connect it to data collections that are written in that language. For more information about the languages that are supported by Discovery, see Language support.
Supported languages
IBM® watsonx™ Assistant supports individual features to varying degrees per language. It has classifier models that are designed specifically to support conversations in the following languages:
Language | Code |
---|---|
English | en-us |
Arabic | ar |
Chinese (Simplified) | zh-cn |
Chinese (Traditional) | zh-tw |
Czech | cs |
Dutch | nl |
French | fr |
German | de |
Italian | it |
Japanese | ja |
Korean | ko |
Portuguese (Brazilian) | pt-br |
Spanish | es |
Universal* | xx |
*If you want to support conversations in a language for which watsonx Assistant does not have a dedicated model, such as Russian, use Universal.
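If you route customers to language-specific assistants, you might map a customer locale to one of the codes in the table. The following sketch assumes a hypothetical helper, `model_code_for`; the fallback rules (a regional variant falls back to its base language, any English variant maps to en-us, everything else maps to xx) are assumptions made for illustration, not documented product behavior.

```python
# Codes taken from the table of dedicated language models.
DEDICATED_MODELS = {
    "en-us", "ar", "zh-cn", "zh-tw", "cs", "nl", "fr",
    "de", "it", "ja", "ko", "pt-br", "es",
}
UNIVERSAL = "xx"


def model_code_for(locale: str) -> str:
    """Pick a language model code for a locale such as 'fr-CA' or 'pt_BR'."""
    tag = locale.strip().lower().replace("_", "-")
    if tag in DEDICATED_MODELS:
        return tag
    base = tag.split("-", 1)[0]
    if base in DEDICATED_MODELS:
        # Assumption: a regional variant can use the base-language model.
        return base
    if base == "en":
        # Assumption: every English variant uses the en-us model.
        return "en-us"
    # No dedicated model, for example Russian: use the universal model.
    return UNIVERSAL
```

In this sketch, variants such as pt-PT or zh-HK fall through to the universal model because the table lists only region-specific codes for those languages. You might prefer an explicit mapping for them.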
Changing an assistant language
After an assistant is created, its language cannot be modified.
Working with accented characters
In a conversational setting, users might or might not use accents with the watsonx Assistant service. As such, both accented and non-accented versions of words might be treated the same for intent detection and entity recognition.
However, for some languages like Spanish, some accents can alter the meaning of the entity. Thus, for entity detection, although the original entity might implicitly have an accent, your assistant can also match the non-accented version of the same entity, but with a slightly lower confidence score.
For example, for the word "barrió", which has an accent and corresponds to the past tense of the verb "barrer" (to sweep), your assistant can also match the word "barrio" (neighborhood), but with a slightly lower confidence.
The system provides the highest confidence scores in entities with exact matches. For example, "barrio" isn't detected if "barrió" is in the training set; and "barrió" isn't detected if "barrio" is in the training set.
You are expected to train the system with the proper characters and accents. For example, if you are expecting "barrió" as a response, put "barrió" into the training set.
Although not an accent mark, the same applies to words that use the Spanish letter ñ versus the letter n, such as "uña" versus "una". In this case, the letter ñ is not an n with an accent; it is a unique, Spanish-specific letter.
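To see which of your training values differ only by accents, you can compare them after stripping combining marks. The following sketch uses Python's standard `unicodedata` module; the helper names `strip_accents` and `accent_collisions` are hypothetical, and the code illustrates the problem rather than how watsonx Assistant matches entities. Unicode decomposes ñ into n plus a combining tilde, so naive accent stripping also turns "uña" into "una", which is exactly the loss of meaning described above.

```python
import unicodedata
from collections import defaultdict


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_collisions(values):
    """Group values that become identical once accents are stripped."""
    groups = defaultdict(set)
    for value in values:
        groups[strip_accents(value)].add(value)
    # Keep only the groups where two or more distinct spellings collide.
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}
```

Running `accent_collisions` over your entity values and user examples shows the pairs, such as "barrio" and "barrió", where you need to decide which spelling belongs in the training set.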
Using multilingual downloads for translation
You can enable the download of language data files, in CSV format, so you can translate training examples and assistant responses into other languages and use them in other assistants.
Each CSV file includes translatable_string data that you can use with a machine or human translation service.
Each CSV file also includes id, resource_type, and locator data that watsonx Assistant can use in another assistant to re-create your source assistant. You don't need to edit this information.
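If you use a machine translation service, a script can rewrite only the translatable_string column and copy the other columns unchanged. The following is a minimal sketch under stated assumptions: the column names appear in a header row, `translate_csv` and the `translate` callable are hypothetical names, and the output is written as UTF-8 with a byte order mark, which is how spreadsheet tools typically save the CSV UTF-8 format.

```python
import csv
from typing import Callable

TRANSLATABLE_COLUMN = "translatable_string"


def translate_csv(src_path: str, dst_path: str, translate: Callable[[str], str]) -> int:
    """Translate one column of a CSV file; return the number of rows translated."""
    # utf-8-sig tolerates an optional byte order mark when reading.
    with open(src_path, newline="", encoding="utf-8-sig") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        if TRANSLATABLE_COLUMN not in fieldnames:
            raise ValueError(f"{src_path} has no {TRANSLATABLE_COLUMN} column")
        rows = list(reader)

    translated = 0
    for row in rows:
        # id, resource_type, and locator are left exactly as downloaded.
        if row[TRANSLATABLE_COLUMN]:
            row[TRANSLATABLE_COLUMN] = translate(row[TRANSLATABLE_COLUMN])
            translated += 1

    with open(dst_path, "w", newline="", encoding="utf-8-sig") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return translated
```

Write the output to a separate working folder under the original file name, because the file names must not change when you upload the package.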
The overview of the multilingual process is:
- Enable multilingual download: In your source assistant, enable multilingual download
- Translate content: Use the CSV files with a translation service
- Upload to language-specific assistants: In a destination assistant for another language, use the CSV files to upload translated training and responses
Enabling multilingual download
To enable multilingual download:
- Open Assistant settings.
- In the Download/Upload section, click Enable multilingual download.
  Enabling multilingual download might take a few minutes to process, but you can work elsewhere in the assistant. The Download/Upload button is disabled until this process finishes. After multilingual download is enabled in an assistant, it can't be disabled.
- Click Download/Upload files.
- On the Download tab, choose Multilingual file package.
- Select a published version, then click Download. You need at least one version for the download to be available.
  Downloading your first file might take a few minutes to process, but you can work elsewhere in the assistant. The Download/Upload button is disabled until this process finishes.
- When the download finishes, your .zip file contains:
  - action-responses.csv
  - action-training.csv
  - Data (Do not edit) folder, which contains assistant.bin
  If you are using dialog, the .zip file also contains:
  - dialog-responses.csv
  - dialog-training.csv
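Before you start translating, you can confirm that the downloaded package contains the files listed above. The following sketch uses Python's standard `zipfile` module; `missing_from_package` is a hypothetical helper, and it compares base file names only, so it makes no assumption about the exact folder layout inside the archive.

```python
import zipfile

ACTION_FILES = ("action-responses.csv", "action-training.csv")
DIALOG_FILES = ("dialog-responses.csv", "dialog-training.csv")
DATA_FILE = "assistant.bin"  # stored in the "Data (Do not edit)" folder


def missing_from_package(zip_path: str, uses_dialog: bool = False) -> list:
    """Return the expected file names that are absent from the .zip package."""
    with zipfile.ZipFile(zip_path) as package:
        # Reduce each archive member to its base name, ignoring folders.
        present = {name.rsplit("/", 1)[-1] for name in package.namelist()}
    expected = list(ACTION_FILES) + [DATA_FILE]
    if uses_dialog:
        expected += DIALOG_FILES
    return [name for name in expected if name not in present]
```

An empty list means that every expected file is present.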
Translating content
To translate content:
- Translate the content in the CSV files, for example, from English to French. Save files with translations as CSV UTF-8 (Comma Delimited) to account for any language-specific characters.
- When you are finished with the translation process, create a new .zip file that includes:
  - action-responses.csv with your translations. Don't change the name of the file.
  - action-training.csv with your translations. Don't change the name of the file.
  - The original, untouched Data (Do not edit) folder, which contains assistant.bin
  If you translated dialog content, also include dialog-responses.csv and dialog-training.csv in the .zip file.
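The repackaging step can be scripted. The following sketch assumes that your translated CSV files and the untouched Data (Do not edit) folder sit together in one working folder, and that the files belong at the top level of the archive; both the layout and the helper name `build_package` are assumptions for illustration.

```python
import zipfile
from pathlib import Path

ACTION_FILES = ["action-responses.csv", "action-training.csv"]
DIALOG_FILES = ["dialog-responses.csv", "dialog-training.csv"]
DATA_FILE = "Data (Do not edit)/assistant.bin"


def build_package(work_dir: str, out_zip: str, include_dialog: bool = False) -> list:
    """Create the upload .zip from a working folder; return the member names."""
    work = Path(work_dir)
    members = ACTION_FILES + [DATA_FILE]
    if include_dialog:
        members = members + DIALOG_FILES
    # Stop before writing anything if a required file is missing.
    missing = [name for name in members if not (work / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing from {work}: {', '.join(missing)}")
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as package:
        for name in members:
            # Keep the original file and folder names inside the archive.
            package.write(work / name, arcname=name)
    return members
```

The function copies assistant.bin byte for byte and never opens it, which matches the instruction to leave the Data (Do not edit) folder untouched.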
Uploading to language-specific assistants
To upload to a language-specific assistant:
- Create or switch to a destination assistant that uses the language for your translations.
- In the destination assistant, open Assistant settings.
- If you translated dialog training and responses, ensure that dialog is active in the destination assistant.
- In the Download/Upload section, click Download/Upload files.
  You don't need to enable multilingual download in the destination assistant.
- On the Upload tab, choose Multilingual file package.
- Attach your multilingual .zip file package, then click Upload. The translated content is added to your draft environment, so you can work on publishing the translated assistant.
Content language support
These languages are supported for content in actions, dialog, and the search integration.
Language | Actions | Dialog | Search integration |
---|---|---|---|
English (en) | | | |
Arabic (ar) | | | |
Chinese (Simplified) (zh-cn) | | | |
Chinese (Traditional) (zh-tw) | | | |
Czech (cs) | | | |
Dutch (nl) | | | |
French (fr) | | | |
German (de) | | | |
Italian (it) | | | |
Japanese (ja) | | | |
Korean (ko) | | | |
Portuguese (Brazilian) (pt-br) | | | |
Spanish (es) | | | |
Universal (xx) | | | |
The watsonx Assistant service supports multiple languages as noted, but the user interface itself (such as descriptions and labels) is in English. All supported languages can be input and trained through the English interface.
GB18030 compliance: GB18030 is a Chinese standard that specifies an extended code page for use in the Chinese market. This code page standard is important for the software industry because the China National Information Technology Standardization Technical Committee mandates that any software application that is released for the Chinese market after September 1, 2001, be enabled for GB18030. The watsonx Assistant service supports this encoding, and is certified GB18030-compliant.
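If your own tooling exchanges text in this encoding, you can check that it survives a round trip. The following sketch uses Python's built-in gb18030 codec; `gb18030_round_trip` is a hypothetical helper that illustrates the encoding itself and says nothing about how watsonx Assistant processes text internally.

```python
def gb18030_round_trip(text: str) -> bytes:
    """Encode text as GB18030 and verify that decoding restores it exactly."""
    encoded = text.encode("gb18030")
    if encoded.decode("gb18030") != text:
        raise ValueError("Text did not survive the GB18030 round trip")
    return encoded
```

GB18030 is a variable-width encoding: ASCII characters take one byte, common Chinese characters take two bytes, and characters outside the Basic Multilingual Plane take four bytes.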
Dialog language support
Support for the following dialog features varies by language.
Intent feature support
Language | Content Catalog | Algorithm version |
---|---|---|
English (en) | | |
Arabic (ar) | (except Covid-19) | |
Chinese (Simplified) (zh-cn) | | |
Chinese (Traditional) (zh-tw) | | |
Czech (cs) | | |
Dutch (nl) | | |
French (fr) | | |
German (de) | (except Covid-19) | |
Italian (it) | (except Covid-19) | |
Japanese (ja) | (except Covid-19) | |
Korean (ko) | | |
Portuguese (Brazilian) (pt-br) | | |
Spanish (es) | | |
Universal (xx) | | |
User input processing support
Language | Dictionary-based entity support | Fuzzy matching (Misspelling) | Fuzzy matching (Stemming) | Fuzzy matching (Partial match) | Autocorrection |
---|---|---|---|---|---|
English (en) | | | | | |
Arabic (ar) | | | | | |
Chinese (Simplified) (zh-cn) | | | | | |
Chinese (Traditional) (zh-tw) | | | | | |
Czech (cs) | | | | | |
Dutch (nl) | | | | | |
French (fr) | Beta | | | | |
German (de) | | | | | |
Italian (it) | | | | | |
Japanese (ja) | | | | | |
Korean (ko) | | | | | |
Portuguese (Brazilian) (pt-br) | | | | | |
Spanish (es) | | | | | |
Universal (xx) | | | | | |
Entity feature support
Language | Contextual entities | System entities |
---|---|---|
English (en) | | |
Arabic (ar) | | |
Chinese (Simplified) (zh-cn) | | |
Chinese (Traditional) (zh-tw) | | |
Czech (cs) | | |
Dutch (nl) | | |
French (fr) | Beta | |
German (de) | | |
Italian (it) | | |
Japanese (ja) | | |
Korean (ko) | | |
Portuguese (Brazilian) (pt-br) | | |
Spanish (es) | | |
Universal (xx) | | |