Output - Allo-Media Documentation

This page describes what server responses look like and what clients can expect from the server after sending a command.

Output events

RECOGNITION-COMPLETE

An event sent by the server to the client following a RECOGNIZE command, indicating that the recognition is complete. The event's content includes the speech recognition result and, depending on the grammar, its interpretation.

Successful recognition

MRCP example, followed by a WebSocket example:

MRCP/2.0 544 RECOGNITION-COMPLETE 2 COMPLETE
Channel-Identifier: 39ac4ea9750a4790@speechrecog
Completion-Cause: 000 success
Completion-Reason: success
Content-Type: application/x-nlsml
Content-Length: 331

<?xml version="1.0" encoding="UTF-8"?>
<result version="1.25.0">
  <interpretation grammar="session:demo-grammar-0" confidence="1.00">
    <instance>je veux changer mon billet</instance>
    <input mode="speech" timestamp-start="2020-12-22T11:11:45.620+01:00" timestamp-end="2020-12-22T11:11:47.060+01:00" confidence="1.00" asr-model="fr.basic">je veux changer mon billet</input>
  </interpretation>
</result>

{
  "event": "RECOGNITION-COMPLETE",
  "request_id": 8,
  "channel_id": "0x56432a7207d8",
  "headers": {},
  "completion_cause": "Success",
  "completion_reason": "success",
  "body": {
    "asr": {
      "transcript": "salut comment ça va",
      "confidence": 0.842793,
      "start": 1660899270646,
      "end": 1660899271696
    },
    "nlu": {
      "type": "builtin:speech/transcribe",
      "value": "salut comment ça va",
      "confidence": 0.842793
    },
    "grammar_uri": "session:transcribe",
    "asr_model": "fr.basic",
    "version": "1.25.0"
  }
}

With MRCP, a set of headers is returned, followed by a body (much like an HTTP response). With our WebSocket protocol, a single JSON message is returned, and the equivalents of both headers and body are found in the same object.

Headers are described below. The body includes the raw transcription and the ASR confidence (in the input tag for MRCP, in the asr object for WebSocket), as well as the interpretation and its own confidence (in the instance tag and the confidence attribute of the enclosing interpretation tag for MRCP, in the nlu object for WebSocket).
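As an illustration of the MRCP layout described above, here is a minimal Python sketch that splits a raw event into start line, headers and body, then reads the NLSML body with the standard library. The function names (`split_mrcp_message`, `parse_nlsml`) and the returned dictionary keys are hypothetical, the sample is an abridged copy of the first MRCP example on this page, and the sketch assumes the body carries no XML namespace, as in the examples shown here.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the first MRCP example on this page (timestamps removed).
RAW_EVENT = "\r\n".join([
    "MRCP/2.0 544 RECOGNITION-COMPLETE 2 COMPLETE",
    "Channel-Identifier: 39ac4ea9750a4790@speechrecog",
    "Completion-Cause: 000 success",
    "Completion-Reason: success",
    "Content-Type: application/x-nlsml",
    "Content-Length: 331",
    "",
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<result version="1.25.0">',
    '  <interpretation grammar="session:demo-grammar-0" confidence="1.00">',
    "    <instance>je veux changer mon billet</instance>",
    '    <input mode="speech" confidence="1.00" asr-model="fr.basic">'
    "je veux changer mon billet</input>",
    "  </interpretation>",
    "</result>",
])


def split_mrcp_message(raw):
    """Return (start_line, headers, body); header names are lower-cased."""
    head, _, body = raw.replace("\r\n", "\n").partition("\n\n")
    start_line, *header_lines = head.split("\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


def _text(element):
    """Concatenated text of an element and its children, or None if empty."""
    if element is None:
        return None
    return "".join(element.itertext()).strip() or None


def _confidence(element):
    """The confidence attribute as a float, or None when absent."""
    value = element.get("confidence") if element is not None else None
    return float(value) if value is not None else None


def parse_nlsml(body):
    """Extract the interpretation and the raw transcription from an NLSML body."""
    root = ET.fromstring(body.strip().encode("utf-8"))
    interp = root.find("interpretation")
    instance = interp.find("instance") if interp is not None else None
    input_el = interp.find("input") if interp is not None else None
    return {
        "interpretation": _text(instance),
        "interpretation_confidence": _confidence(interp),
        "transcript": _text(input_el),  # also picks up text inside <nomatch>
        "asr_confidence": _confidence(input_el),
        "asr_model": input_el.get("asr-model") if input_el is not None else None,
    }


start_line, headers, body = split_mrcp_message(RAW_EVENT)
result = parse_nlsml(body)
print(headers["completion-cause"], result["interpretation"])
```

Reading the transcript with `itertext()` means the same function also handles the no-match and no-input bodies shown later on this page, where the text sits inside a `nomatch` child or is absent.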

No match

MRCP example, followed by a WebSocket example:

MRCP/2.0 658 RECOGNITION-COMPLETE 11 COMPLETE
Channel-Identifier: 1f38218a1f9d13e1@speechrecog
Completion-Cause: 001 no-match
Completion-Reason: unable to match grammar
Content-Type: application/x-nlsml
Content-Length: 427

<?xml version="1.0" encoding="UTF-8"?>
<result version="1.25.0">
  <interpretation confidence="0.94">
    <instance/>
    <input mode="speech" timestamp-start="2022-08-19T09:13:39.310+00:00" timestamp-end="2022-08-19T09:13:40.090+00:00" confidence="0.94" asr-model="fr-basic">
      <nomatch>je n'ai rien</nomatch>
    </input>
  </interpretation>
</result>

{
  "event": "RECOGNITION-COMPLETE",
  "request_id": 8,
  "channel_id": "0x564765bd0cd8",
  "headers": {},
  "completion_cause": "NoMatch",
  "completion_reason": "unable to match grammar",
  "body": {
    "asr": {
      "transcript": "je n'ai rien",
      "confidence": 0.998894,
      "start": 1660918888549,
      "end": 1660918889209
    },
    "nlu": null,
    "grammar_uri": "",
    "version": "1.25.0",
    "asr_model": "fr.basic"
  }
}

No input

If the user does not speak at all, or the sound level is too low to trigger voice activity detection (VAD), a no-input-timeout completion cause is returned. For example:

MRCP example, followed by a WebSocket example:

MRCP/2.0 365 RECOGNITION-COMPLETE 3 COMPLETE
Channel-Identifier: 8b9fca343f941dea@speechrecog
Completion-Cause: 002 no-input-timeout
Completion-Reason: no voice
Content-Type: application/x-nlsml
Content-Length: 142

<?xml version="1.0" encoding="UTF-8"?>
<result version="1.25.0">
  <interpretation>
    <instance/>
    <input asr-model="en.basic"><noinput/></input>
  </interpretation>
</result>

{
  "event": "RECOGNITION-COMPLETE",
  "request_id": 8,
  "channel_id": "0x557de1c01d58",
  "headers": {},
  "completion_cause": "NoInputTimeout",
  "completion_reason": "no voice",
  "body": {
    "asr": null,
    "nlu": null,
    "grammar_uri": "",
    "version": "1.25.0",
    "asr_model": "en.basic"
  }
}
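The three WebSocket messages shown above (success, no match, no input) differ mainly in `completion_cause` and in whether `asr` and `nlu` are null. A client could reduce them to a single summary along these lines. This is only a sketch: `summarize_recognition` and the keys of the returned dictionary are hypothetical names, and only the three cause values that appear in the examples on this page are exercised.

```python
import json


def summarize_recognition(message):
    """Reduce a RECOGNITION-COMPLETE WebSocket message to a small summary."""
    event = json.loads(message)
    if event.get("event") != "RECOGNITION-COMPLETE":
        raise ValueError("not a RECOGNITION-COMPLETE event")

    body = event.get("body") or {}
    asr = body.get("asr") or {}  # null when nothing was heard
    nlu = body.get("nlu") or {}  # null when the grammar did not match
    cause = event.get("completion_cause")

    return {
        "cause": cause,
        "matched": cause == "Success",
        "heard": asr.get("transcript"),
        "understood": nlu.get("value"),
    }


# Abridged version of the no-match example above.
no_match = json.dumps({
    "event": "RECOGNITION-COMPLETE",
    "completion_cause": "NoMatch",
    "completion_reason": "unable to match grammar",
    "body": {"asr": {"transcript": "je n'ai rien", "confidence": 0.998894},
             "nlu": None},
})
print(summarize_recognition(no_match))
```

Treating a null `asr` or `nlu` as an empty object keeps the lookups uniform, so the caller only has to check for `None` values in the summary.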

START-OF-INPUT

An event sent by the server to the client, indicating that the recognizer has detected speech. It is sent only in normal mode.

Receiving this event is useful in barge-in scenarios, where the IVR/bot prompt should stop as soon as the caller starts speaking.
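A barge-in handler can be sketched as follows. The shape of the START-OF-INPUT message is not shown on this page, so the sketch assumes it carries an `event` field like RECOGNITION-COMPLETE does. `PromptPlayer` is a hypothetical stand-in for whatever component plays the prompt in your application.

```python
class PromptPlayer:
    """Hypothetical stand-in for the component that plays the IVR/bot prompt."""

    def __init__(self):
        self.playing = False

    def play(self, prompt):
        self.playing = True

    def stop(self):
        self.playing = False


def on_server_event(event, player):
    """Stop the prompt as soon as the server reports detected speech.

    Assumes `event` is the decoded message with an "event" field
    (an assumption: the START-OF-INPUT payload is not documented here).
    Returns True when a barge-in was handled.
    """
    if event.get("event") == "START-OF-INPUT" and player.playing:
        player.stop()
        return True
    return False


player = PromptPlayer()
player.play("Bonjour, que puis-je faire pour vous ?")
on_server_event({"event": "START-OF-INPUT"}, player)
print(player.playing)
```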

GET-PARAMS

Headers & statuses

The following sections present the most useful headers and how to interpret their values.

Completion cause

This is a header indicating the reason the recognition request completed. It is sent in DEFINE-GRAMMAR and RECOGNIZE responses.
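In the MRCP examples on this page the header value is a three-digit code followed by the cause name, for instance `000 success` or `002 no-input-timeout`. A small parser for that form might look like this. The `CompletionCause` type and `parse_completion_cause` function are hypothetical names, and the sketch assumes every value follows the code-then-name layout seen in the examples.

```python
from typing import NamedTuple


class CompletionCause(NamedTuple):
    code: int
    name: str


def parse_completion_cause(value):
    """Parse a Completion-Cause header value such as "001 no-match"."""
    code, _, name = value.strip().partition(" ")
    name = name.strip()
    if len(code) != 3 or not code.isdigit() or not name:
        raise ValueError("unexpected Completion-Cause value: %r" % value)
    return CompletionCause(int(code), name)


cause = parse_completion_cause("002 no-input-timeout")
print(cause.code, cause.name)
```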

success

partial-match

In response to a RECOGNIZE command. The Speech-Incomplete-Timeout expired before there was a full match, but whatever was spoken up to that point was a partial match for one or more grammars. This can only happen in normal mode.

no-match

no-input-timeout

no-match-maxtime

In response to a RECOGNIZE command. The Recognition-Timeout expired, and whatever was spoken up to that point did not match any of the grammars. This cause may also be returned if the recognizer does not support detecting partial grammar matches.

success-maxtime

hotword-maxtime

partial-match-maxtime

In response to a RECOGNIZE command. The Recognition-Timeout expired before a full match was achieved, but whatever was spoken up to that point was a partial match for one or more grammars.

grammar-load-failure

recognizer-error

In response to a RECOGNIZE command. Something went wrong on the server's side.

language-unsupported

Completion reason

Other headers

Active-Request-Id-List

When the client asks the server to STOP the recognition, the response includes this header. It contains the request ID of the RECOGNIZE request that was actually stopped.
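A client can use this header to confirm which request was stopped. The sketch below assumes the value is a comma-separated list of numeric request IDs, as in the MRCPv2 specification. With a single stopped RECOGNIZE request, the list holds one entry. `parse_active_request_ids` is a hypothetical helper name.

```python
from typing import List


def parse_active_request_ids(value: str) -> List[int]:
    """Parse an Active-Request-Id-List value into a list of request IDs.

    Assumes a comma-separated list of integers (an assumption taken from
    the MRCPv2 header definition, not from an example on this page).
    """
    return [int(part) for part in value.split(",") if part.strip()]


stopped = parse_active_request_ids("2")
print(stopped)
```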
