
How to know if my microphone works?

By Olivier Anguenot
Published in api
March 20, 2022
5 min read

Table Of Contents

1. Introduction
2. Permission to use a device
3. Capture the Output Stream
4. Detect Noise or silence
5. Microphone Status
6. Optimize using Worklet
7. Use cases
8. Conclusion

Sometimes users complain that they cannot hear the sound from one person in a conference, or from their recipient when in a peer-to-peer call.

If the problem is not located on the receiver side, it may come from the sender. This article focuses on detecting whether the microphone really works, in other words whether it actually captures audio data or not.

Verifying that the microphone works can be done prior to the call; this is obviously the right moment to check, before the conversation starts. But during the call, the user can manipulate his microphone or his computer and introduce “bad things”, accidentally or not… So having a live check during the call is interesting too, to detect any trouble and to help the user recover from the situation.

Introduction

In general, two cases are interesting to detect:

  • When the user speaks and his microphone is muted physically or is not working correctly

  • When the microphone is opened (not muted by the application) and no sound is detected

In these 2 cases, the user thinks that his recipients hear him but in reality, nobody is able to hear him.

For this article, I used the following microphones:

  • Rode NT-USB Microphone which works in stereo

  • Konftel Ego that mixes a microphone and a loudspeaker



Permission to use a device

The first thing to do when dealing with devices is to check that the browser is able to access them.

This can be done by querying the Permissions API, which indicates whether the browser already has the authorization to access and use the microphone.

It is interesting to know whether the user has denied the authorization, because he often does not pay attention to the authorization prompt and denies it accidentally.

const permission = await navigator.permissions.query({ name: 'microphone' });

if (permission.state === 'granted') {
  // OK - Access has been granted to the microphone
} else if (permission.state === 'denied') {
  // KO - Access has been denied. Microphone can't be used
} else {
  // Permission should be asked
}

permission.onchange = () => {
  // React when the permission changes
};

And by listening to the change event, the application is able to react when the permission changes.

Note: At the time of writing, the Permissions API (for the microphone) works only in Chrome.
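
Because of that, it is safer to guard the query: browsers that do not recognize the 'microphone' permission name throw an error, and the application can fall back to calling getUserMedia directly. A minimal sketch (the checkMicrophonePermission helper and the 'unsupported' value are mine):

// Returns 'granted', 'denied', 'prompt' or 'unsupported'
async function checkMicrophonePermission() {
  if (!navigator.permissions) {
    // Permissions API not available: the state stays unknown until getUserMedia is called
    return "unsupported";
  }
  try {
    const permission = await navigator.permissions.query({ name: "microphone" });
    return permission.state;
  } catch (err) {
    // 'microphone' is not a recognized permission name in this browser
    return "unsupported";
  }
}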

Capture the Output Stream

Input audio from the microphone can be captured using the getUserMedia function from the MediaDevices APIs. When the browser can’t access the device, an error is thrown and the application can react.

try {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
} catch (err) {
  // Errors when accessing the device
}

Then, the application can check that an active audio track exists. An active audio track is a track that is actively sending media (audio) data.

const audioTracks = stream.getAudioTracks();

if (audioTracks.length === 0) {
  // No audio from the microphone has been captured
  return;
}

// We asked only for the microphone, so there is a single track
const track = audioTracks[0];

if (track.muted) {
  // Track is muted, which means that the track is temporarily unable to provide media data.
  // Muting is controlled by the browser or the system, not by the application,
  // and the track can become unmuted later (see the mute/unmute events below).
}

if (!track.enabled) {
  // Track is disabled by the application (what telephony calls "muted"),
  // which means that the track provides silence instead of real data.
  // When disabled, a track can be enabled again.
  // In that case, the user can't be heard until the track is enabled again.
}

if (track.readyState === "ended") {
  // Possibly a disconnection of the device.
  // When ended, a track can't become active again:
  // this track will not provide data anymore.
}

When the state of a track changes, events are fired to inform the application:

track.addEventListener("ended", () => {
  // track.readyState has switched to "ended"
});

track.addEventListener("mute", () => {
  // track.muted has switched to true
});

track.addEventListener("unmute", () => {
  // track.muted has switched to false
});

If a track fires the event ended (property readyState goes to ended), the track is terminated and becomes obsolete: No more audio data will be received.

Detect Noise or silence

The Web Audio API lets the application build an audio pipeline where the audio stream goes through a chain of nodes; each node can access and modify the audio before passing the transformed stream to the next one, until the final output node.

Here, we can plug an AnalyserNode into that pipeline to look at the signal generated by the microphone.

// Get the stream
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

// Create and configure the audio pipeline
const audioContext = new AudioContext();
const analyzer = audioContext.createAnalyser();
analyzer.fftSize = 512;
analyzer.smoothingTimeConstant = 0.1;
const sourceNode = audioContext.createMediaStreamSource(stream);
sourceNode.connect(analyzer);

// Analyze the sound
setInterval(() => {
  // Compute the max volume level (-Infinity...0)
  const fftBins = new Float32Array(analyzer.frequencyBinCount); // Number of values manipulated for each sample
  analyzer.getFloatFrequencyData(fftBins);
  // audioPeakDB varies from -Infinity up to 0
  const audioPeakDB = Math.max(...fftBins);

  // Compute an average level (0...)
  const frequencyRangeData = new Uint8Array(analyzer.frequencyBinCount);
  analyzer.getByteFrequencyData(frequencyRangeData);
  const sum = frequencyRangeData.reduce((p, c) => p + c, 0);
  // audioMeter varies from 0 (silence) to roughly 16
  const audioMeter = Math.sqrt(sum / frequencyRangeData.length);
}, 100);

Using the variables audioPeakDB and audioMeter, the application can deduce the level of sound and display something on screen representing the activity of the microphone.
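
For example, the audioMeter value could drive a simple visual indicator. A minimal sketch, assuming the page contains a <meter id="mic-level" min="0" max="16"> element (the element and its id are mine):

// Reflect the computed audioMeter value in a level indicator
// (to be called from the setInterval callback above)
function displayAudioLevel(audioMeter) {
  const meterElement = document.getElementById("mic-level");
  meterElement.value = audioMeter;
  meterElement.title = audioMeter > 0 ? "Microphone is capturing sound" : "No sound detected";
}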

By mixing together the Audio API, the getUserMedia API and the Permissions API, we get a clear view of the parameters to monitor. The following table summarizes the different cases where the application could consider that no audible sound will be sent to the recipient(s).

| API | Status | Description |
| Permissions API | state = denied | Access to the device has not been granted |
| getUserMedia API | Throws an error | Error when accessing the device |
| getUserMedia API | No audio track | No audio captured (never seen such a problem) |
| MediaStreamTrack API | track.muted = true | Audio has been muted for that stream (eg: direction="recvonly") |
| MediaStreamTrack API | track.readyState = ended | Device is disconnected or the track is unable to provide audio data anymore |
| MediaStreamTrack API | track.enabled = false | Audio is temporarily inactive |
| Audio API | peakDBLevel = -Infinity | Audio is (temporarily) inactive |

Microphone Status

From the previous paragraphs, we can deduce a number of situations where it is interesting for the application to know whether the microphone works well or not.

All the information gathered allows deducing 8 states:

| States | How to detect? | Status |
| Active sound (voice activity or loud sound) | audioPeakDB is above -50 dB | OK |
| Background noise | audioPeakDB is below -50 dB and audioMeter > 0 | OK |
| Quiet | audioMeter equals 0 and audioPeakDB is different from -Infinity | OK |
| Disabled "In-app" | track.enabled equals false | OK |
| Muted "In-app" | track.muted equals true | OK |
| Muted | audioMeter equals 0 and audioPeakDB goes to -Infinity (below -900 dB in practice) | ? |
| Ended | track.readyState equals ended | KO |
| Not accessible | Permission is denied or getUserMedia throws an error | KO |

This information is useful to identify the status of the microphone and so to detect a potential problem:

  • When in state Ended or Not accessible, there is no doubt: something is wrong. The microphone will not be able to provide any sound.
  • When in the state Muted, something independent of the application has muted the audio from the microphone. It may or may not be an issue (it can be voluntary). So the application could ask the user to check his microphone in that case, or at least inform him that the microphone seems muted.
  • For the other states, the microphone is under the application's control and works as expected (a sketch combining these checks follows).
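
A minimal sketch of how these checks could be combined into a single helper; the getMicrophoneStatus function, its labels and the exact ordering are mine, and the Not accessible case is detected earlier (from the Permissions API or from the getUserMedia error):

// Deduce a microphone status from the track state and the measured levels.
// Thresholds follow the table above; the "Not accessible" case is handled
// before this point (permission denied or getUserMedia error).
function getMicrophoneStatus(track, audioPeakDB, audioMeter) {
  if (track.readyState === "ended") {
    return "ended";            // KO - device disconnected or track terminated
  }
  if (!track.enabled) {
    return "disabled-in-app";  // OK - the application disabled the track
  }
  if (track.muted) {
    return "muted-in-app";     // OK - the browser/system flagged the track as muted
  }
  if (audioMeter === 0 && audioPeakDB === -Infinity) {
    return "muted";            // ? - no signal at all, possibly a hardware mute
  }
  if (audioPeakDB > -50) {
    return "active-sound";     // OK - voice activity or loud sound
  }
  if (audioMeter > 0) {
    return "background-noise"; // OK - low level sound
  }
  return "quiet";              // OK - audioMeter is 0 but audioPeakDB is not -Infinity
}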

Optimize using Worklet

Using an AnalyserNode and setInterval is not the most efficient way to deal with the Audio API.

If the sound is analyzed too often (a short interval of a few milliseconds) and over a long time, it will affect the performance of the application.

This computation can be optimized by using a Worklet:

The Worklet interface is a lightweight version of Web Workers and gives developers access to low-level parts of the rendering pipeline. With Worklets, you can run JavaScript and WebAssembly code to do graphics rendering or audio processing where high performance is required. MDN Web Docs.

By using an AudioWorklet, the audio processing is done outside of the main thread. Like any AudioNode, the AudioWorkletProcessor processes 128 frames at a time. This ensures that no extra latency is added, but if you want to work on more frames, you will need to implement your own buffer (a sketch of such a buffer follows the example below).

Here is an example of an AudioWorkletProcessor.

// Put this code in a file named audioMeter.js
const SMOOTHING_FACTOR = 0.99;

class AudioMeter extends AudioWorkletProcessor {
  constructor() {
    super();
    this._volume = 0;
    this.port.onmessage = (event) => {
      // Deal with messages received from the main thread - event.data
    };
  }

  process(inputs, outputs, parameters) {
    const input = inputs[0];
    const samples = input[0];
    const sumSquare = samples.reduce((p, c) => p + (c * c), 0);
    const rms = Math.sqrt(sumSquare / (samples.length || 1));
    this._volume = Math.max(rms, this._volume * SMOOTHING_FACTOR);
    this.port.postMessage({ volume: this._volume });
    // Don't forget to return true - otherwise the worklet is ended
    return true;
  }
}

registerProcessor('audioMeter', AudioMeter);
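
And if the analysis needs more than the 128 frames received per call (as mentioned above), a simple accumulation buffer can be added. A minimal sketch; the BufferedMeter class and the 2048-sample window are my own choices, not part of the Web Audio API:

// Variant that accumulates samples until a larger analysis window is filled
const WINDOW_SIZE = 2048; // multiple of 128

class BufferedMeter extends AudioWorkletProcessor {
  constructor() {
    super();
    this._buffer = new Float32Array(WINDOW_SIZE);
    this._offset = 0;
  }

  process(inputs) {
    const samples = inputs[0][0];
    if (!samples) {
      return true;
    }
    // Copy the 128 incoming frames into the window
    this._buffer.set(samples, this._offset);
    this._offset += samples.length;
    if (this._offset >= WINDOW_SIZE) {
      // The window is full: compute the RMS over the whole window and reset it
      const sumSquare = this._buffer.reduce((p, c) => p + c * c, 0);
      this.port.postMessage({ volume: Math.sqrt(sumSquare / WINDOW_SIZE) });
      this._offset = 0;
    }
    return true;
  }
}

registerProcessor('bufferedMeter', BufferedMeter);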

The AudioWorkletProcessor can then be loaded and executed from the application:

// Get the audio stream
const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: false });

// Create the Audio Context
const audioContext = new AudioContext();
const source = audioContext.createMediaStreamSource(stream);

// Load the worklet
await audioContext.audioWorklet.addModule('./audioMeter.js');
const node = new AudioWorkletNode(audioContext, 'audioMeter');
node.port.onmessage = (event) => {
  // Deal with messages received from the Worklet processor - event.data
};

// Connect the audio pipeline - this will start the processing
source.connect(node).connect(audioContext.destination);

A more complete description of AudioWorklet can be found here.

Using an AudioWorklet, your application can monitor the microphone over a long period of time without having to worry about performance.

Note: Be careful to embed your worklet file correctly when using Webpack.
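
One possible approach, assuming Webpack 5, which recognizes the new URL(..., import.meta.url) pattern and copies the referenced file to the build output:

// Let the bundler resolve the worklet file to its final URL in the build output
await audioContext.audioWorklet.addModule(
  new URL('./audioMeter.js', import.meta.url)
);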

Use cases

Now that we know how to interpret the microphone state, we can see what happens when the user manipulates his computer or his microphone.

Detecting actions done from a physical device

When an external microphone is plugged into the computer, the user can interact with it and sometimes put it in a bad state: the cable can be unplugged, the physical mute button can be pressed unintentionally…

The following table summarizes what can be detected:

| Actions | All browsers | State |
| Pressing the mute button | peakDBLevel = -Infinity | Muted |
| Disconnecting the device (unplugged / Bluetooth disconnected) | track.readyState = ended, peakDBLevel = -Infinity | Ended |

Note: The 3 major browsers detect these changes.

Note 2: Be careful with devices that combine a microphone and a speaker. Detection of the Muted state does not always work: it is as if there were always some residual noise that prevents reaching that state (-Infinity).


Detecting actions done from the System Preferences

On macOS (this should be the same on other OS), we can go to the Sound panel in the System Preferences and modify the input level of the microphone.

This has an impact when using the Audio API because the captured level will be different: lower if you decrease the input level, higher if you increase it.

Chrome

| Actions | Chrome | State |
| Put the input level at 1% | Average level is around 40 dB lower than when at 100% | Active Sound / Background-Noise / Quiet |
| Put the input level to 0% | track.muted switches to true, average level goes down to -Infinity | Muted "in-app" |

Note: Chrome fires the mute and unmute events when the input level goes to 0% and back.

Safari & Firefox

| Actions | Safari & Firefox | State |
| Put the input level at 1% | Average level is around 40 dB lower than when at 100% | Active Sound / Background-Noise / Quiet |
| Put the input level to 0% | Average level goes down to -Infinity | Muted "in-app" |

Note: Safari and Firefox do not fire the mute/unmute events when the input level reaches 0%.


Detecting actions done from the browser itself

Mute/Unmute microphone when in call (Safari)

In Safari, the user has the possibility to mute or unmute the microphone during a call directly from the browser itself. This action is available by clicking on the microphone icon located at the end of the URL field.

| Actions | Safari | State |
| Click on the microphone button to mute/unmute the microphone | mute/unmute event is fired, track.muted switches to true/false, average level goes down to -Infinity | Muted "in-app" |

Disable authorization when in call (Chrome)

Authorization can be disabled during a call in Chrome. This action is available by clicking on the microphone icon located in the URL field.

| Actions | Chrome | State |
| Click on the microphone button to remove the authorization | Permission changes to denied, the ended event is fired, track.readyState goes to ended, average level goes down to -Infinity | Not accessible |

Disable authorization when in call (Firefox)

Authorization can be disabled during a call in Firefox. This action is available by clicking on the microphone icon located in the URL field.

| Actions | Firefox | State |
| Click on the microphone button to remove the authorization | The ended event is fired, track.readyState goes to ended, average level goes down to -Infinity | Ended |

Detecting actions done from the application

When in a call, the application can offer to mute the microphone. The easiest way to do that is to manipulate the MediaStreamTrack by setting its enabled property to false; setting it back to true unmutes the microphone.
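
A minimal sketch of such an in-app mute (the setMicrophoneMuted helper name is mine):

// Mute or unmute all audio tracks of the captured stream
function setMicrophoneMuted(stream, muted) {
  stream.getAudioTracks().forEach((track) => {
    track.enabled = !muted;
  });
}

// Example: mute the microphone
setMicrophoneMuted(stream, true);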

| Actions | All browsers | State |
| Mute/Unmute the microphone from the application | track.enabled goes to false/true, average level goes down to -Infinity | Muted "in-app" |

Switching to a Virtual Audio recorder

If a virtual microphone (virtual audio recorder) is selected by default in the System Preferences, or if the user selects it accidentally, this can lead to trouble too.

Safari

| Actions | Safari | State |
| Using a virtual device | Error "A MediaStreamTrack ended due to a capture failure", the ended event is fired, track.readyState goes to ended, average level is always equal to -Infinity | Ended |

Chrome & Firefox

| Actions | Chrome & Firefox | State |
| Using a virtual device | Average level is always equal to -Infinity | Muted |

Conclusion

Monitoring the microphone can be done at 2 levels: At the Device level (for the permission and the authorization) and at the Media level (by using the MediaStreamTrack and the MediaStream WebRTC interfaces).

Having that monitoring in place can help prevent users from complaining about sound issues. But as seen in this article, there are a number of different cases to handle…

To remember: Safari does not allow capturing several audio streams at the same time by calling the getUserMedia API several times in a row. Each call to getUserMedia automatically ends the previously obtained track: the current audio stream captured from the microphone is ended if getUserMedia is called again during a call.
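
One way to cope with this behavior is to keep a single stream and stop its tracks explicitly before requesting a new one. A minimal sketch (the replaceMicrophoneStream helper name is mine):

let currentStream = null;

// Stop the previous capture before asking for a new one
async function replaceMicrophoneStream(constraints = { audio: true }) {
  if (currentStream) {
    currentStream.getTracks().forEach((track) => track.stop());
  }
  currentStream = await navigator.mediaDevices.getUserMedia(constraints);
  return currentStream;
}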


Tags: #getusermedia #permissions #worklet
