Audio
Stereo
We strongly recommend uploading stereo audio files with one speaker on each channel. To set up your account, we need to know which of the following channel layouts your files use, either:
- left channel is the customer, right is the agent
- left channel is the agent, right is the customer
Format can be mp3, wav, etc. Any format supported by ffmpeg should be fine.
Audio files lasting 6300 seconds (1 hour 45 minutes) or longer will not be processed.
Mono
Although we can process mono audio files, we strongly recommend using stereo. Many features are unavailable with mono files, especially speech analytics capabilities.
Format can be mp3, wav, etc. Any format supported by ffmpeg should be fine. Please be aware that no speaker diarization will be performed on the audio. Although speaker diarization can give fairly good results, our studies show that for natural telephone conversations a significant amount of data is lost because of frequent overtalk and the poor audio quality of this medium.
Audio files lasting 6300 seconds (1 hour 45 minutes) or longer will not be processed.
Recipes for audio manipulation
Here are a few common audio manipulation recipes to help prepare files for Jupload.
Get audio file information
The following command outputs information about an audio file:
ffprobe -i audio_file.mp3 -v error -show_entries 'stream=codec_type,codec_name,channels,channel_layout,sample_rate,duration'
# Example output:
[STREAM]
codec_name=pcm_s16le # 16 bits Little Endian
codec_type=audio
sample_rate=8000 # 8kHz
channels=1 # One channel
channel_layout=mono
duration=612.576000
[/STREAM]
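The reported duration can be compared with the 6300-second limit before uploading. The following is a minimal sketch, not part of Jupload: is_duration_ok and check_file are hypothetical helper names, and check_file assumes ffprobe is available on your PATH.

```shell
# Limit stated in this documentation: files of 6300 seconds or more are not handled.
MAX_DURATION=6300

# Hypothetical helper: succeeds when $1 is a number of seconds below the limit.
is_duration_ok() {
    # awk copes with the fractional seconds that ffprobe reports (e.g. 612.576000)
    # and rejects non-numeric values such as "N/A" or an empty string.
    awk -v d="$1" -v max="$MAX_DURATION" \
        'BEGIN { exit !(d ~ /^[0-9]+(\.[0-9]+)?$/ && d + 0 < max + 0) }'
}

# Hypothetical helper: read the duration of a file with ffprobe and test it.
check_file() {
    duration=$(ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$1")
    if is_duration_ok "$duration"; then
        echo "$1: OK ($duration s)"
    else
        echo "$1: too long or unreadable ($duration s)"
        return 1
    fi
}
```

For example, check_file audio_file.mp3 prints one line and returns a non-zero status when the file should not be uploaded.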
Convert a single mono file to stereo
If you only have a mono file, you can create a stereo file with the audio on one channel and silence on the other. Jupload will then treat the file as if there were only one speaker.
If you want to have the original mono audio on the left channel, and silence on the right one, run this command:
ffmpeg -i input_mono.wav -map_channel '0.0.0' -map_channel '0.0.1?' output_left.mp3
Alternatively, if you want to have the original mono audio on the right channel, and silence on the left one, run this command:
ffmpeg -i input_mono.wav -map_channel '0.0.1?' -map_channel '0.0.0' output_right.mp3
You can now upload the output file to the SFTP server.
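If your ffmpeg build rejects -map_channel (to our knowledge the option has been deprecated in recent releases), the pan audio filter is one possible alternative. This is a sketch of a different technique from the commands above, not the documented recipe: pan_filter_for is a hypothetical helper name, and the filter relies on pan muting any output channel that is not listed.

```shell
# Hypothetical helper: print a pan filter that places a mono input on one side
# of a stereo output. Output channels that are not listed stay silent.
pan_filter_for() {
    case "$1" in
        left)  echo "pan=stereo|c0=c0" ;;   # mono input on the left, right muted
        right) echo "pan=stereo|c1=c0" ;;   # mono input on the right, left muted
        *)     echo "usage: pan_filter_for left|right" >&2; return 1 ;;
    esac
}

# Usage (requires ffmpeg):
# ffmpeg -i input_mono.wav -af "$(pan_filter_for left)" output_left.mp3
# ffmpeg -i input_mono.wav -af "$(pan_filter_for right)" output_right.mp3
```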
Convert two separate mono files to stereo
If you have two separate mono files, one for the agent and one for the customer, you can create a single stereo file with the agent on one channel and the customer on the other.
Run one of the following commands, depending on which channel the agent and the customer should be on:
# Customer to left channel, agent to right channel
ffmpeg -i mono-customer.mp3 -i mono-agent.mp3 -filter_complex "[0]apad[a];[a][1]amerge[aout]" -map "[aout]" stereo-left_customer-right_agent.mp3
# Agent to left channel, customer to right channel
ffmpeg -i mono-agent.mp3 -i mono-customer.mp3 -filter_complex "[0]apad[a];[a][1]amerge[aout]" -map "[aout]" stereo-left_agent-right_customer.mp3
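Before uploading, it can be worth confirming that the merged file really has two channels. The sketch below assumes the key=value output format that ffprobe prints by default; channel_count is a hypothetical helper name.

```shell
# Hypothetical helper: extract the channel count from ffprobe's key=value
# output read on standard input. Prints nothing if no channels= line is found.
channel_count() {
    sed -n 's/^channels=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage (requires ffprobe):
# ffprobe -v error -show_entries stream=channels \
#     stereo-left_customer-right_agent.mp3 | channel_count
```

A result of 2 indicates a stereo file; 1 means the merge did not produce the expected layout.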
Naming the audio file
Each audio file is matched to its metadata by its name without the extension.
For example, my_call.mp3 should be accompanied by either:
- my_call.json if metadata are JSON formatted
- my_call.xml if metadata are XML formatted
- A CSV file with a row having my_call as value of its first cell if metadata are CSV formatted
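The naming rule can be checked locally before uploading. This is a minimal sketch under several assumptions that this documentation does not state: the JSON or XML file sits in the same directory as the audio file, the CSV is comma-separated, and its first cell is not quoted. has_metadata is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when metadata can be found for an audio file.
# $1: path to the audio file; $2 (optional): path to a CSV metadata file.
has_metadata() {
    audio=$1
    csv=${2:-}
    dir=$(dirname "$audio")
    base=$(basename "$audio")
    name=${base%.*}                 # strip the extension: my_call.mp3 -> my_call

    # Assumption: JSON/XML metadata lives next to the audio file.
    [ -f "$dir/$name.json" ] && return 0
    [ -f "$dir/$name.xml" ] && return 0

    # Assumption: comma-separated CSV with an unquoted first cell.
    if [ -n "$csv" ] && [ -f "$csv" ]; then
        # Appending "" forces a string comparison, so 001 does not match 1.
        awk -F',' -v n="$name" \
            '($1 "") == (n "") { found = 1 } END { exit !found }' "$csv"
        return $?
    fi
    return 1
}
```

For example, has_metadata my_call.mp3 succeeds when my_call.json or my_call.xml exists, and has_metadata my_call.mp3 calls.csv also accepts a CSV row whose first cell is my_call (calls.csv is an illustrative file name).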