TTS and Voice Transcription
In Configuration Manager, open Speech Services (next to Optional Services) to configure Text-to-Speech and Voice Transcription providers for voicemail and call recordings, and for recording Voice Prompts used in IVR and elsewhere. While we describe the configuration steps here, please refer to your provider documentation for more detail.
To enable TTS and Voice Transcription for voicemail and recorded calls, configure the related services in your cloud provider (Google or AWS) or use Custom or Thirdlane providers as applicable, then assign the services in Configuration Manager. Once the configuration is in place, you can enable Voice Transcription at the Tenant and User Extension levels.
Custom provider (Text-to-Speech and Transcription)
You can define services whose Provider is Custom for either Text-to-Speech or Transcription.
- Set Purpose to Text-to-Speech or Transcription, then set Provider to Custom.
- Choose a script from the dropdown. Only executable files under the following directories are allowed:
  - Text-to-Speech: /usr/local/share/thirdlane/service/tts
  - Transcription: /usr/local/share/thirdlane/service/transcribe
  The form stores the full path; the list shows the file name only.
- Optionally add Environment variables as grid rows. Each row is stored as one KEY=value line (newline-separated in the database). Duplicate variable names are not allowed. Empty values are allowed.
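The rules above for script location and environment rows can be sketched in a few lines of Python. This is only an illustration of the stated constraints, not the product's actual validation code: the function names, the handling of blank lines, and the use of symlink resolution are assumptions.

```python
import os

# Directories named in this section; scripts outside them are rejected.
ALLOWED_DIRS = {
    "tts": "/usr/local/share/thirdlane/service/tts",
    "transcribe": "/usr/local/share/thirdlane/service/transcribe",
}

def parse_env_lines(text):
    """Parse newline-separated KEY=value lines into a dict.

    Empty values are accepted; duplicate names raise ValueError.
    """
    env = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines are ignored
        key, sep, value = line.partition("=")  # split on the first "=" only
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"not a KEY=value line: {line!r}")
        if key in env:
            raise ValueError(f"duplicate variable name: {key}")
        env[key] = value
    return env

def is_allowed_script(path, purpose, allowed_dirs=ALLOWED_DIRS):
    """Return True if path is an executable file under the directory for purpose."""
    base = os.path.realpath(allowed_dirs[purpose])
    real = os.path.realpath(path)
    inside = real != base and os.path.commonpath([base, real]) == base
    return inside and os.path.isfile(real) and os.access(real, os.X_OK)
```

For example, `parse_env_lines("API_KEY=abc\nEMPTY=")` yields a dictionary with an empty string for `EMPTY`, while repeating `API_KEY` on a second row raises an error.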
At runtime, custom environment entries are applied when the script runs. Custom Text-to-Speech uses the same generic CLI contract as other CLI-based TTS integrations (Polly-style arguments and JSON on stdout). Custom Transcription uses the generic transcription command runner (executable path with file substitution).
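As a rough illustration of what applying custom environment entries at run time can look like, the Python sketch below layers a dictionary of variables over the inherited environment before starting a command. It is not the product's runner: the function name is hypothetical, layering over the inherited environment is an assumption, and the sketch does not model the Text-to-Speech arguments, the JSON output, or the file substitution mentioned above.

```python
import os
import subprocess
import sys

def run_with_custom_env(command, custom_env):
    """Run command with custom_env layered over the inherited environment.

    Returns the command's standard output as text; a non-zero exit
    status raises subprocess.CalledProcessError.
    """
    env = dict(os.environ)   # start from the current environment (assumption)
    env.update(custom_env)   # custom entries win on name clashes
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout

# Stand-in for a provider script: it only echoes one variable.
demo_cmd = [sys.executable, "-c", "import os; print(os.environ['DEMO_KEY'])"]
print(run_with_custom_env(demo_cmd, {"DEMO_KEY": "abc"}).strip())
```

A real script would read its own variables, such as an API key or endpoint, from the environment in the same way.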
For Amazon and Google cloud setup, see the sections below.
AWS TTS and Voice Transcription setup
Setup on AWS web site
Sign up for an AWS account. https://portal.aws.amazon.com/billing/signup
Create an IAM (Identity and Access Management) user with the required permissions. https://signin.aws.amazon.com/signin
Open IAM console, then navigate to Users in the left pane and click Add user button. https://console.aws.amazon.com/iam/
Select AWS credential type: Access key - Programmatic access. Click Next: Permissions button.
Click Attach existing policies directly button. Select the following policies: AmazonTranscribeFullAccess, AmazonPollyFullAccess, AmazonS3FullAccess.
Click Next: Tags button.
Click Next: Review button.
Click Create User button.
Save Access key ID and Secret access key.
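If you prefer scripting to the console, the same user, policies, and access key can be created with the AWS CLI. The Python sketch below only builds and prints the commands; it does not run them. It assumes the AWS CLI is installed and already configured with credentials that may manage IAM, and the user name `thirdlane-speech` is a hypothetical example.

```python
import shlex

# The three AWS managed policies named in the steps above.
POLICY_NAMES = [
    "AmazonTranscribeFullAccess",
    "AmazonPollyFullAccess",
    "AmazonS3FullAccess",
]

def iam_setup_commands(user_name):
    """Return AWS CLI commands that mirror the console steps above."""
    commands = [["aws", "iam", "create-user", "--user-name", user_name]]
    for name in POLICY_NAMES:
        commands.append([
            "aws", "iam", "attach-user-policy",
            "--user-name", user_name,
            "--policy-arn", f"arn:aws:iam::aws:policy/{name}",
        ])
    # The output of create-access-key contains the Access key ID and the
    # Secret access key; the secret is shown only once, so save it.
    commands.append(["aws", "iam", "create-access-key", "--user-name", user_name])
    return commands

# "thirdlane-speech" is a hypothetical user name.
for cmd in iam_setup_commands("thirdlane-speech"):
    print(shlex.join(cmd))
```

Review the printed commands before running them in a shell.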
AWS setup in Configuration Manager
Description. Short description of this service.
Region. AWS region for speech services (Polly and Transcribe).
Access Key ID. Access Key ID obtained from AWS.
Secret Access Key. Secret Access Key obtained from AWS.
1st Language. First language considered in Voice to Text transcription.
2nd Language. Second language considered in Voice to Text transcription.
3rd Language. Third language considered in Voice to Text transcription.
4th Language. Fourth language considered in Voice to Text transcription.
Google TTS and Voice Transcription setup
Setup on Google web site
Sign in to your Google account. https://accounts.google.com/Login
If you don’t have an account, sign up for a new account at https://accounts.google.com/SignUp
Open GCP Console at console.cloud.google.com and create a new project.
Choose a name for the project. In our example, we use thirdlane-transcribe. Google requires the project ID to be a globally unique identifier.
Navigate to APIs & Services and select the Library menu item.
Enable the Text-to-Speech and Speech-to-Text APIs.
Create a service account key by navigating to Credentials in the left pane and clicking Create credentials button. Choose Service account from the drop-down menu.
Click Create and Continue button. Grant this service account access role: Owner
Click Done button. To create the service account key, click on the Service Account you just created.
Open Keys tab.
Click Add Key button and select Create new key.
Select JSON as key type and click Create button.
Save JSON file on your computer.
Google setup in Configuration Manager
Description. Short description of this service.
Key File. Enter the content of the Key File obtained from Google.
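Before pasting the key into the Key File field, it can help to confirm that the text is an intact service account key. The Python sketch below checks for fields that a Google service account JSON key contains. It is a local sanity check only, not something Configuration Manager is documented to do, and the function name is hypothetical.

```python
import json

# Fields present in a Google service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")

def check_service_account_key(text):
    """Return a list of problems found in the key text; empty means it looks plausible."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)
    ]
    key_type = data.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type}")
    return problems
```

An empty result does not prove that the key is valid or still active; it only rules out a truncated or wrong file.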
1st Language. First language considered in Voice to Text transcription.
2nd Language. Second language considered in Voice to Text transcription.
3rd Language. Third language considered in Voice to Text transcription.
4th Language. Fourth language considered in Voice to Text transcription.
Enabling Recorded Calls Transcription
Enabling Recorded Calls Transcription at the Tenant level
Recorded Calls Transcription can be enabled or disabled at the Tenant level.
Allow Recorded Calls Transcription. Specify whether Recorded Calls Transcription will be available for this Tenant.
Enable Transcription by default? Specify whether Recorded Calls Transcription will be enabled by default when creating User Extensions for this Tenant.
Enabling Recorded Calls Transcription for User Extension
If Recorded Calls Transcription is enabled for the Tenant, you can enable it for User Extensions in that Tenant.
Transcribe to Text. Specify whether Recorded Calls Transcription will be enabled.
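The way the Tenant and User Extension settings combine can be sketched as follows. The class and function names are hypothetical, and the sketch assumes that a Tenant-level "not allowed" always overrides the extension's own setting, which the description above implies but does not state explicitly.

```python
from dataclasses import dataclass

@dataclass
class Tenant:
    allow_transcription: bool   # "Allow Recorded Calls Transcription"
    enable_by_default: bool     # "Enable Transcription by default?"

@dataclass
class Extension:
    transcribe_to_text: bool    # "Transcribe to Text"

def new_extension(tenant):
    """Create a User Extension whose setting starts from the Tenant default."""
    return Extension(
        transcribe_to_text=tenant.allow_transcription and tenant.enable_by_default
    )

def is_transcribed(tenant, extension):
    """Recorded calls are transcribed only if both levels permit it (assumption)."""
    return tenant.allow_transcription and extension.transcribe_to_text
```

Under these assumptions, an extension created in a Tenant that allows transcription but does not enable it by default starts with Transcribe to Text off, and an administrator can turn it on later.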
See also
- Speech Services overview — summary of the Speech Services section.
- AI Services — Chat AI for Connect (composer rewrite, summarize, translate) and Recording AI for post-call analysis (summary, sentiment, categorization, etc.).