A series of questions and answers to help you debug responses from the Stream H2B API and choose the best parameter values.

I'm having an error with the WebSocket API

First things first, update your SDK to the latest version if you are using one, and check the API Changelog. It might resolve your issue. If not, please check the other entries on this page, or contact us with details about your issue.

Why do I get a no-match when the result matches my parameters?

For example, I request digits?length=1 and say "One", but I get a no-match even though "One" is the output of the ASR.

First things first, please check the Completion-Reason response header, where the server explains its result.

In the described case, the message is most certainly "confidence too low". The default confidence threshold is 0.5, and you can modify it. If the response confidence is below this threshold, a no-match is returned, even if a human can see that the result was in fact correct.
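When diagnosing a no-match, it helps to surface that header systematically rather than eyeballing raw responses. Below is a minimal sketch; the shape of the response object (a `.headers` mapping) is an assumption about your HTTP/WebSocket client, not part of the API itself.

```python
def explain_no_match(response):
    """Return a human-readable reason for a no-match, pulled from the
    Completion-Reason header the server attaches to its result.
    The `.headers` attribute is an assumption about your client library."""
    reason = response.headers.get("Completion-Reason", "unknown")
    return f"no-match: {reason}"

class FakeResponse:
    """Stand-in for whatever response object your client exposes,
    used here only to demonstrate the helper."""
    headers = {"Completion-Reason": "confidence too low"}

print(explain_no_match(FakeResponse()))  # → no-match: confidence too low
```

Logging this reason on every no-match makes the difference between "confidence too low" and a genuine recognition miss immediately visible.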

If your scenario asks the user to answer with a single word, we highly recommend lowering the confidence threshold to 0.35 or 0.4. With only one word, the ASR has far less data to work with and cannot be as certain as with a full sentence.
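The server-side decision described above can be mirrored locally to reason about which threshold fits your scenario. This is a sketch of the logic only; the function name and signature are illustrative, not part of the API.

```python
def is_match(result_text, confidence, threshold=0.5):
    """Mirror of the server-side decision: a recognized result whose
    confidence falls below the threshold is turned into a no-match,
    even if the transcription itself is correct. The default 0.5
    matches the API's documented default threshold."""
    if result_text and confidence >= threshold:
        return "match"
    return "no-match"

# "One" recognized with confidence 0.42 is rejected at the default threshold...
print(is_match("One", 0.42))                 # → no-match
# ...but accepted once the threshold is lowered for single-word answers.
print(is_match("One", 0.42, threshold=0.4))  # → match
```

This is why a single-word scenario benefits from a 0.35–0.4 threshold: short utterances routinely land in the 0.4–0.5 confidence band.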

I'm streaming audio fine but I still get a no_input_timeout each time

  • Check that you aren't actually streaming silence;
  • Check that you specified the right codec on join (for WebSocket);
  • Check that your code is properly listening to the events returned by the API.

I have a very low success rate when using barge-in.

If so, it is highly probable that you are doing "full barge-in", that is, you emit the RECOGNIZE command at the same time as you start playing the prompt.

Remember that in "barge-in" mode, your equipment will mute the prompt as soon as the user's voice is detected.

So if you use "full barge-in", the prompt can potentially be muted before the user even hears it! And that happens quite frequently:

  • if there is some latency in your IVR's reaction (and there is more often than not), and you never confirm to the user that their response was understood before stepping into the next question, they may repeat their previous answer. That repetition overlaps the following barge-in RECOGNIZE, stopping the next question's prompt before it starts and making the iteration fail.
  • if you have updated your IVR, changing some response formats and instructions on barge-in questions, regular users who were used to answering quickly without listening to the prompt may never take note of the new instructions and continue to give their answers in the older, now unrecognized, format.

How to fix

If you have complete control over your equipment, you should program a "partial barge-in": launch the voice recognition later, somewhere in the middle or near the end of the prompt.
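The timing pattern of partial barge-in can be sketched as follows. This is a minimal illustration with stubbed prompt playback and a stand-in for emitting the RECOGNIZE command; the function names and the 0.6 ratio are assumptions, not API features.

```python
import asyncio

async def ask_with_partial_barge_in(play_prompt, send_recognize,
                                    prompt_duration_s, recognize_at=0.6):
    """Start the prompt immediately, but delay the RECOGNIZE command
    until a fraction (`recognize_at`) of the prompt has played, so the
    caller hears the beginning of the question before barge-in can
    mute it. Both callbacks are placeholders for your own I/O."""
    prompt = asyncio.create_task(play_prompt(prompt_duration_s))
    await asyncio.sleep(prompt_duration_s * recognize_at)
    send_recognize()  # in full barge-in, this would have fired at t=0
    await prompt

# Minimal demo with stubbed I/O: the event log shows RECOGNIZE is sent
# while the prompt is still playing, not at its very start.
events = []

async def fake_prompt(duration_s):
    events.append("prompt_started")
    await asyncio.sleep(duration_s)
    events.append("prompt_finished")

asyncio.run(ask_with_partial_barge_in(
    fake_prompt, lambda: events.append("recognize_sent"),
    prompt_duration_s=0.2))
print(events)  # → ['prompt_started', 'recognize_sent', 'prompt_finished']
```

Where exactly to place `recognize_at` depends on your prompts: late enough that the key instructions have been heard, early enough that fast callers are not cut off.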

If you can't, consider inserting an intermediate step that gives the user feedback on the answer they've just given, before launching the next question if that question is in full barge-in mode.

Finally, you may also disable barge-in altogether if it is not absolutely necessary.