Getting started
What are you building here?
This getting started guide walks you through building a very simple web app that runs locally in your browser: you talk into your microphone and see the transcription displayed in the HTML document.
Quick start
You need Node.js and npm installed on your workstation to use the JS SDK locally (npm is the package manager for Node.js).
In this example we use Vite to build the application, but you can use other bundlers such as Parcel, Webpack, or Gulp.
1. Set up a new app (optional)
If you are starting from scratch, we recommend using Vite to bootstrap the web application.
# npm 6.x
npm init vite uhlive-getting-started --template vanilla
# npm 7+, extra double-dash is needed:
npm init vite uhlive-getting-started -- --template vanilla
Then execute the following commands:
cd uhlive-getting-started
npm install
2. Install the @uhlive/javascript-sdk package
The package is available publicly on npm and can be installed with:
npm install @uhlive/javascript-sdk
For now, the JavaScript SDK is compatible with Chrome-based browsers and Firefox. Support for other browsers and Node.js will come later.
3. Prepare the JavaScript
Open or create the main.js file and replace its content with:
import { Uhlive } from "@uhlive/javascript-sdk";
const uhlive = new Uhlive("YOUR-IDENTIFIER", "YOUR-TOKEN", {
    // Add UhliveOptions here if needed
});
uhlive.connect().join("your-conversation-id", {
    // Add ConversationOptions here if needed
});
Check the conversation options or uhlive options for the list of all possible parameters.
Don't forget to replace YOUR-IDENTIFIER and YOUR-TOKEN with your own values.
4. Prepare the HTML
Open or create the index.html file and replace its content with the following code. The presence of the <div id="uhlive"></div> tag tells the SDK to automatically insert the transcription of the audio stream into it.
<body>
    <script type="module" src="/main.js"></script>
    <div id="uhlive"></div>
</body>
5. Launch
If you're using Vite, launch the app with npm run dev, then go to http://localhost:3000/, authorize microphone access, and speak!
API usage
Connect to the server
Don't forget to replace YOUR-IDENTIFIER and YOUR-TOKEN with your own values.
import { Uhlive } from "@uhlive/javascript-sdk";
// Set with your own identifier and token
const uhlive = new Uhlive("YOUR-IDENTIFIER", "YOUR-TOKEN");
uhlive.connect();
Check the connect() API reference.
Join a conversation
With a single connection, you can join several conversations. Think of a conversation as a conference call instance, or a room on a chat service. Only people who have joined the conversation can access the exchanged data.
const conversation = uhlive.join("my-conversation-id");
The join() method also takes an optional options parameter. Here is an example:
const conversation = uhlive.join("my-conversation-id", {
    speaker: "my-speaker-id",
    model: "fr",
    interim_results: false,
    rescoring: true
});
If you want to disable the automatic display of the transcript in the wrapper, remove the element with id uhlive.
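For example, assuming you kept the wrapper element from the quick start, here is a minimal sketch that removes it at runtime using standard DOM APIs (not part of the SDK):
// Removing the wrapper element disables the automatic transcript display
document.getElementById("uhlive")?.remove();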
You can read more about ASR parameters here.
There is no verification on our side of the uniqueness of the speaker IDs you provide. If you need them to be unique, you must enforce that on your side.
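For instance, a minimal sketch that generates a unique speaker ID on the client and passes it through the speaker option; crypto.randomUUID() is a standard browser API, not part of the SDK:
// Generate a speaker ID that is unique for this browser session
const speakerId = `speaker-${crypto.randomUUID()}`;
const conversation = uhlive.join("my-conversation-id", {
    speaker: speakerId,
});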
Send audio
We currently only support capturing audio from the microphone. This is done automatically when joining a conversation, and the browser will ask for permission to access your microphone.
// This will ask for permission to access your microphone
uhlive.join("my-conversation-id");
Receive transcription
To receive the transcript of the streamed audio, you must subscribe to the words_decoded and/or segment_decoded events, which will pass the payload to your callback function.
Note that this is not needed by default, because the SDK populates the wrapper defined in the Conversation.join method ("uhlive" by default). You only need these methods if you want to perform custom actions on decoded words or segments.
The words_decoded event is triggered when the backend sends an interim transcript. Subsequent words_decoded events for the same audio may differ from one another, until a final transcript is sent with the segment_decoded event.
conversation.onWordsDecoded((payload) => {
    // Do something with `payload`
});
conversation.onSegmentDecoded((payload) => {
    // Do something with `payload`
});
Alternatively, you can use the following syntax:
uhlive
    .join("my-conversation-id")
    .onWordsDecoded((payload) => {
        // Do something with `payload`
    })
    .onSegmentDecoded((payload) => {
        // Do something with `payload`
    });
Get notified when a speaker leaves the conversation
It can be useful to execute an action when a speaker leaves the conversation. This can be done with the speaker_left event, as follows:
uhlive.onSpeakerLeft((speakerId) => {
    // Do something with `speakerId`
});
Leave a conversation
conversation.leave();
// or
uhlive.leave("my-conversation-id");
// You can also leave all conversations at once
uhlive.leaveAllConversations();
Disconnect from the server
uhlive.disconnect();
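For instance, here is a minimal sketch of cleaning up when the page is closed, assuming you want to leave every conversation before disconnecting; the beforeunload listener is a standard browser API, not part of the SDK:
window.addEventListener("beforeunload", () => {
    // Leave all conversations, then close the connection to the server
    uhlive.leaveAllConversations();
    uhlive.disconnect();
});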
Enrich events
You can read a description of what Enrich events are here.
You don't need to listen to these events if you use the wrapper option (which is enabled by default). Those events are useful only if you want to perform custom actions on Enrich events.
Listen for number found event
const uhlive = new Uhlive("my-token");
const myConversation = uhlive.join("my-conversation");
myConversation.onEntityNumberFound((entity) => {
// Do something with `entity`...
});
Listen for ordinal found event
const uhlive = new Uhlive("my-token");
const myConversation = uhlive.join("my-conversation");
myConversation.onEntityOrdinalFound((entity) => {
// Do something with `entity`...
});
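Putting it all together, here is a minimal end-to-end sketch combining the calls covered in this guide; the identifier, token, conversation ID and option values are placeholders you should replace with your own:
import { Uhlive } from "@uhlive/javascript-sdk";

const uhlive = new Uhlive("YOUR-IDENTIFIER", "YOUR-TOKEN");
uhlive.connect();

const conversation = uhlive.join("my-conversation-id", {
    speaker: "my-speaker-id",
    model: "fr",
});

conversation.onWordsDecoded((payload) => {
    // Interim transcript: later words_decoded events may revise it
});
conversation.onSegmentDecoded((payload) => {
    // Final transcript for the segment
});

uhlive.onSpeakerLeft((speakerId) => {
    // Do something with `speakerId`
});

// When you are done:
uhlive.leaveAllConversations();
uhlive.disconnect();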
There you go! You are now able to send audio and receive its transcription.
To dive in deeper, you can browse the API Reference documentation.
If you are stuck, want to suggest something, or just want to say hello, send us an e-mail to support@allo-media.fr.