Getting started

What are you building here?

This getting-started guide walks you through building a very simple web app that runs locally in your browser, lets you talk into your microphone, and displays the transcription in the HTML document.

Include the library

Since the library is still in an early phase of development, the package is not yet published to NPM. You must download it and install it with:

npm install --no-cache /path/to/allomedia-uhlive-0.14.1.tgz

For now, the JavaScript SDK is compatible with Chrome-based browsers and Firefox. Support for other browsers and Node.js will come later.
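Given this limited browser support, you may want to verify that the environment can capture microphone audio before initializing the SDK. This is a minimal sketch using the standard `navigator.mediaDevices` API; `isMicrophoneSupported` is a hypothetical helper, not part of the uhlive SDK:

```javascript
// Hypothetical helper: returns true if the environment exposes the
// standard getUserMedia API needed to capture microphone audio.
function isMicrophoneSupported() {
    return (
        typeof navigator !== "undefined" &&
        !!navigator.mediaDevices &&
        typeof navigator.mediaDevices.getUserMedia === "function"
    );
}
```

In an unsupported environment you can then show a message to the user instead of initializing the SDK.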

Quick start

You need Node.js and NPM installed on your workstation to use the JS SDK locally. (NPM is the dependency manager for Node.js.)

In this example we use Vite, but the steps are also valid for other bundlers and build tools such as Parcel, Webpack, or Gulp.

1. Setup a new app

# npm 6.x
npm init @vitejs/app uhlive-getting-started --template vanilla

# npm 7+, extra double-dash is needed:
npm init @vitejs/app uhlive-getting-started -- --template vanilla

Then go to the newly created directory: cd uhlive-getting-started

2. Install dependencies

Download the latest version of the JS SDK and move the .tgz file to the root of uhlive-getting-started, then install it with npm install && npm install --no-cache ./allomedia-uhlive-0.14.1.tgz

3. Prepare the HTML

Open the index.html file and add <div id="uhlive"></div> at the end of the body:

<div id="app"></div>
<script type="module" src="/main.js"></script>
<div id="uhlive"></div>

4. Prepare the Javascript

Don't forget to replace YOUR-IDENTIFIER and YOUR-TOKEN with your own values.

Open the main.js file and replace its contents with:

import { Uhlive } from "@allomedia/uhlive";

const uhlive = new Uhlive("YOUR-IDENTIFIER", "YOUR-TOKEN", {
    // Add UhliveOptions here if needed
});
uhlive.connect().join("your-conversation-id", {
    // Add ConversationOptions here if needed
});

Check the conversation options or Uhlive options for the list of all possible parameters.

5. Launch

Launch the app with npm run dev, go to http://localhost:3000/, authorize microphone access, and speak!

API usage

Connect to the server

Don't forget to replace YOUR-IDENTIFIER and YOUR-TOKEN with your own values.

import { Uhlive } from "@allomedia/uhlive";

// Set with your own identifier and token
const uhlive = new Uhlive("YOUR-IDENTIFIER", "YOUR-TOKEN");
uhlive.connect();

Check the connect() API reference.

Join a conversation

With a single connection, you can join several conversations. Think of a conversation as a conference call instance, or a room on a chat service. Only people who have joined the conversation can access the exchanged data.

const conversation = uhlive.join("my-conversation-id");

The join() method also takes an optional options parameter. Here is an example:

const conversation = uhlive.join("my-conversation-id", {
    speaker: "my-speaker-id",
    model: "fr",
    interim_results: false,
    rescoring: true,
});

If you want to disable the automatic display of the transcript in the wrapper, set the wrapper property to an empty string or a null value.
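For example, a set of join options with the automatic display disabled might look like this (a sketch; the wrapper property is the one described above, the other names are those from the previous example):

```javascript
// Join options with the automatic transcript display turned off;
// you then handle the decoded events yourself.
const options = {
    speaker: "my-speaker-id",
    wrapper: null, // no automatic display in the page
};
```

You would then pass this object as the second argument to join().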

You can read more about ASR parameters here.

There is no verification on our side of the uniqueness of the speaker ids you provide. If you need them to be unique, you must enforce that on your side.
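Since uniqueness is your responsibility, one simplistic client-side scheme (a hypothetical example, not part of the SDK) is to derive the id from a timestamp plus a random suffix:

```javascript
// Generate a reasonably unique speaker id on the client side.
// Collisions are unlikely but not impossible; prefer your own
// user ids if you have them.
function makeSpeakerId() {
    const random = Math.random().toString(36).slice(2, 10);
    return `speaker-${Date.now()}-${random}`;
}

const speakerId = makeSpeakerId();
```

The resulting id can then be passed as the speaker option when joining a conversation.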

Send audio

We currently only support capturing audio from the microphone. This happens automatically when you join a conversation, and the browser will ask for permission to access your microphone.

// This will ask for permission to access your microphone
const conversation = uhlive.join("my-conversation-id");

Receive transcription

To receive the transcript of the streamed audio, subscribe to the words_decoded and/or segment_decoded events, which will pass the payload to your callback function.

Note that this is not needed by default, because Uhlive-js populates the wrapper defined in the Conversation.join method ("uhlive" by default). You only need these methods if you want to perform custom actions on decoded words and segments.

The words_decoded event is triggered when the backend sends an interim transcript. Subsequent words_decoded events for the same audio may differ, until a final transcript is sent with the segment_decoded event.

conversation.onWordsDecoded((payload) => {
    // Do something with `payload`
});

conversation.onSegmentDecoded((payload) => {
    // Do something with `payload`
});

Alternatively, you can use the following syntax:

uhlive
    .join("my-conversation-id")
    .onWordsDecoded((payload) => {
        // Do something with `payload`
    })
    .onSegmentDecoded((payload) => {
        // Do something with `payload`
    });
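To make the interim/final distinction concrete, here is a small, SDK-independent sketch of how you might accumulate a transcript from these two events (the payload is simplified to a plain string here; the real payloads are richer objects):

```javascript
// Tiny transcript accumulator: interim text is overwritten on every
// words_decoded, and committed as final on segment_decoded.
function createTranscript() {
    let finalText = "";
    let interimText = "";
    return {
        handleWordsDecoded(text) {
            interimText = text; // each interim replaces the previous one
        },
        handleSegmentDecoded(text) {
            finalText = finalText ? `${finalText} ${text}` : text;
            interimText = ""; // the final segment supersedes the interim
        },
        text() {
            return [finalText, interimText].filter(Boolean).join(" ");
        },
    };
}
```

You would call handleWordsDecoded from your onWordsDecoded callback and handleSegmentDecoded from your onSegmentDecoded callback.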

Get notified when a speaker leaves the conversation

It can be useful to execute an action when a speaker leaves the conversation. This can be done with the speaker_left event, as follows:

uhlive.onSpeakerLeft((speakerId) => {
    // Do something with `speakerId`
});

Leave a conversation

// or
// you can also leave all the conversations at once

Disconnect from the server

uhlive.disconnect();


Enrich events

You can read a description of what Enrich events are here.

You don't need to listen to these events if you use the wrapper option (which is enabled by default). Those events are useful only if you want to perform custom actions on Enrich events.

Listen for number found event

const uhlive = new Uhlive("my-token");
const myConversation = uhlive.join("my-conversation");
myConversation.onEntityNumberFound((entity) => {
    // Do something with `entity`...
});

Listen for ordinal found event

const uhlive = new Uhlive("my-token");
const myConversation = uhlive.join("my-conversation");
myConversation.onEntityOrdinalFound((entity) => {
    // Do something with `entity`...
});

There you go! You are now able to send audio and receive its transcription.

To dive in deeper, you can browse the API Reference documentation.

If you are stuck, want to suggest something, or just want to say hello, send us an e-mail to