Stream API, for human-to-human interactions.

Welcome to our automatic speech recognition API guide.

Here you will find the documentation you need to use our API and build your products with it.

If you know what you're looking for, you might be interested in our API Reference.

Getting started

What you'll learn here: how to send an audio stream to our API and receive a text transcription, live.

Overview

This guide describes how to consume our WebSocket API by sending it audio and receiving a transcription. It also introduces the basic concepts to keep in mind when using this API. Once you have read this page, you will know how to:

  • authenticate to our servers
  • send us some audio
  • deal with the transcription you'll get as a response

Examples are based on our SDKs, but you can dig into our API reference to implement the same logic in the language of your choice.

Get your credentials

Our API is still at an early stage of development. Please send us an e-mail so that we can provide you with an API token and identifier. Please also specify whether you'd like to use English or French transcription. You will then be able to pass these credentials as query parameters.
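As an illustration, credentials could be attached to the connection URL as in the minimal sketch below. The endpoint `wss://api.example.com/stream` and the parameter names `token` and `identifier` are placeholders assumed for this example, not the actual values; check the API reference for the real ones.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Hypothetical endpoint: replace with the URL given in the API reference.
BASE_URL = "wss://api.example.com/stream"


def build_stream_url(token: str, identifier: str, base_url: str = BASE_URL) -> str:
    """Return base_url with the credentials appended as query parameters.

    The parameter names "token" and "identifier" are assumptions made
    for this sketch; the real names may differ.
    """
    query = urlencode({"token": token, "identifier": identifier})
    parts = urlsplit(base_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


if __name__ == "__main__":
    print(build_stream_url("my-token", "my-id"))
```

Using `urlencode` rather than string concatenation ensures that any special characters in the credentials are escaped correctly.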

Install an SDK (or not)

For your convenience we provide SDKs to ease the use of our API. Of course you can dig into the API reference if you want to use your own implementation, or if your language of choice is not yet available.

Limitation

There is a bandwidth limit of 20 kB per second per WebSocket connection. We therefore strongly recommend using one WebSocket connection for each audio stream you send. Both the Python and JavaScript SDKs handle this transparently.

This limit allows us to provide the best possible quality of service to every user, while still permitting audio to be streamed in real time, in the spirit of this API.
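To see how this limit relates to an audio stream, the sketch below computes the data rate of uncompressed PCM audio and the minimum pause between chunks needed to stay under the limit. It assumes that 20 kB means 20,000 bytes (the guide does not say whether 1 kB is 1000 or 1024 bytes) and that the audio is uncompressed; the function names are illustrative and not part of any SDK.

```python
# Assumption: "20 kB" is read as 20,000 bytes per second.
MAX_BYTES_PER_SECOND = 20_000


def pcm_bytes_per_second(sample_rate_hz: int, bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Data rate of uncompressed PCM audio streamed in real time."""
    return sample_rate_hz * bytes_per_sample * channels


def fits_limit(sample_rate_hz: int, bytes_per_sample: int = 2, channels: int = 1,
               limit: int = MAX_BYTES_PER_SECOND) -> bool:
    """True if a real-time stream in this format stays within the limit."""
    return pcm_bytes_per_second(sample_rate_hz, bytes_per_sample, channels) <= limit


def min_chunk_interval(chunk_size_bytes: int, limit: int = MAX_BYTES_PER_SECOND) -> float:
    """Shortest delay, in seconds, between chunks of this size that keeps
    the average rate at or below the limit."""
    return chunk_size_bytes / limit


if __name__ == "__main__":
    for rate in (8_000, 16_000):
        print(rate, "Hz:", pcm_bytes_per_second(rate), "B/s, fits:", fits_limit(rate))
    print("Interval for 2000-byte chunks:", min_chunk_interval(2_000), "s")
```

Under these assumptions, 16-bit mono PCM at 8 kHz (16,000 bytes per second) fits within a single connection, while the same format at 16 kHz (32,000 bytes per second) would not.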