Audio

Stereo

We strongly recommend uploading stereo audio files with one speaker on each channel. To set up your account, we need to know which speaker is on which channel, either:

left channel is the customer, right is the agent
left channel is the agent, right is the customer

The format can be MP3, WAV, or any other format supported by ffmpeg.

Audio files with a duration of 6300 seconds (1 hour 45 minutes) or more will not be processed.

Mono

Although we are able to process mono audio files, we strongly recommend using stereo. Many features are not available with mono files, especially speech analytics capabilities.

The format can be MP3, WAV, or any other format supported by ffmpeg. Please be aware that no speaker diarization is performed on the audio. Although speaker diarization can give fairly good results, our studies show that for natural telephone conversations a significant amount of data is lost because of frequent overtalk and the poor audio quality of this medium.

Audio files with a duration of 6300 seconds (1 hour 45 minutes) or more will not be processed.

Recipes for audio manipulation

Here are a few common audio manipulation recipes to help you prepare files for Jupload.

Get audio file information

The following command outputs information about an audio file:

ffprobe -i audio_file.mp3 -v error -show_entries 'stream=codec_type,codec_name,channels,channel_layout,sample_rate,duration'

# Example output:
[STREAM]
codec_name=pcm_s16le  # 16 bits Little Endian
codec_type=audio
sample_rate=8000  # 8kHz
channels=1  # One channel
channel_layout=mono
duration=612.576000
[/STREAM]
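
The fields shown above can be checked automatically before uploading. The following is a minimal sketch, not part of Jupload: check_audio_info is a hypothetical helper that reads ffprobe's key=value lines on standard input, reports whether the file is stereo, and fails when the duration reaches the 6300-second limit stated earlier.

```shell
# Hypothetical helper (an assumption, not a Jupload tool): reads ffprobe
# "key=value" lines on stdin and checks the channel count and the duration.
MAX_DURATION=6300   # limit in seconds, as stated in this document

check_audio_info() {
    awk -F= -v max="$MAX_DURATION" '
        $1 == "channels" { channels = $2 + 0 }   # numeric part of the value
        $1 == "duration" { duration = $2 + 0 }
        END {
            if (channels == 2) print "channels: stereo"
            else               print "channels: " channels " (stereo is recommended)"
            if (duration >= max) { print "duration: too long"; exit 1 }
            print "duration: ok"
        }'
}

# Example (not run here):
#   ffprobe -i audio_file.mp3 -v error -show_entries 'stream=channels,duration' | check_audio_info
```

The function exits with a non-zero status when the file is too long, so it can be used as a guard in an upload script.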

Convert a single mono file to stereo

If you only have a mono file, you can create a stereo file with the audio on one channel and silence on the other. Jupload will then treat the file as if there were only one speaker.

If you want to have the original mono audio on the left channel, and silence on the right one, run this command:

ffmpeg -i input_mono.wav -map_channel '0.0.0' -map_channel '0.0.1?' output_left.mp3

Alternatively, if you want to have the original mono audio on the right channel, and silence on the left one, run this command:

ffmpeg -i input_mono.wav -map_channel '0.0.1?' -map_channel '0.0.0' output_right.mp3

You can now upload the output file to the SFTP server.
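
On recent ffmpeg releases the -map_channel option may not be available, as it has been deprecated and later removed. As a hedged alternative, the same result can usually be obtained with the pan audio filter. The sketch below only builds the filter string; mono_pan_filter is a hypothetical helper name, and the resulting command should be verified against your ffmpeg version.

```shell
# Hypothetical helper: prints a pan filter that places a mono input on one
# side of a stereo output. c0 is the only input channel; "0*c0" applies a
# gain of zero, which yields a silent output channel.
mono_pan_filter() {
    case "$1" in
        left)  printf '%s\n' 'pan=stereo|c0=c0|c1=0*c0' ;;
        right) printf '%s\n' 'pan=stereo|c0=0*c0|c1=c0' ;;
        *)     echo "usage: mono_pan_filter left|right" >&2; return 1 ;;
    esac
}

# Example (not run here):
#   ffmpeg -i input_mono.wav -af "$(mono_pan_filter left)" output_left.mp3
#   ffmpeg -i input_mono.wav -af "$(mono_pan_filter right)" output_right.mp3
```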

Convert two separate mono files to stereo

If you have two separate mono files, one for the agent and one for the customer, you can create a single stereo file with the agent on one channel and the customer on the other.

Run one of the following commands, depending on which channel each speaker should be on:

# Customer to left channel, agent to right channel
ffmpeg -i mono-customer.mp3 -i mono-agent.mp3 -filter_complex "[0]apad[a];[a][1]amerge[aout]" -map "[aout]" stereo-left_customer-right_agent.mp3

# Agent to left channel, customer to right channel
ffmpeg -i mono-agent.mp3 -i mono-customer.mp3 -filter_complex "[0]apad[a];[a][1]amerge[aout]" -map "[aout]" stereo-left_agent-right_customer.mp3

Naming the audio file

The name of the audio file, without its extension, must appear in the metadata file.

For example my_call.mp3 should be accompanied by either:

my_call.json if the metadata is in JSON format
my_call.xml if the metadata is in XML format
A CSV file containing a row whose first cell is my_call, if the metadata is in CSV format
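
The naming rule above can be verified locally before uploading. The following is a minimal sketch under stated assumptions: has_metadata is a hypothetical helper, the metadata is assumed to sit in the same directory as the audio file, and CSV files are assumed to be comma-separated without quoted cells.

```shell
# Hypothetical helper: succeeds if metadata for the given audio file exists
# in the same directory (JSON, XML, or a CSV row whose first cell matches).
has_metadata() {
    audio=$1
    dir=$(dirname -- "$audio")
    name=$(basename -- "$audio")
    name=${name%.*}            # strip the extension: my_call.mp3 -> my_call

    [ -f "$dir/$name.json" ] && return 0
    [ -f "$dir/$name.xml" ] && return 0

    for csv in "$dir"/*.csv; do
        [ -f "$csv" ] || continue
        # Assumption: comma-separated cells, no quoting around the first cell.
        if awk -F, -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$csv"; then
            return 0
        fi
    done
    return 1
}

# Example (not run here):
#   has_metadata my_call.mp3 || echo "missing metadata for my_call.mp3"
```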