Control audio or video playback with your keyboard

Revision Information
  • Revision id: 207442
  • Creator: Angela Lazar
  • Comment: Cleanup
  • Reviewed: Yes
  • Reviewed by: anlazar
  • Is approved? Yes
  • Is current revision? No
  • Ready for localization: Yes
  • Readied for localization by: anlazar
Revision Content

What is the media control?

Any media playback inside Firefox can be controlled using the media control feature. It provides several ways to control media playback without interacting with the playing media element itself: pressing the hardware control buttons on a keyboard or headset, pressing a button on the virtual control interface, or sending commands via a specific protocol, such as MPRIS.

When used together with the Media Session API, this feature allows websites to customize what happens when a user presses media control keys and to decide what information is shown on the virtual control interface. If a website doesn't use that API, we still provide some default operations when a user wants to control media with this feature.
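As an illustration of the website side, a page might register metadata and action handlers roughly as follows. This is a minimal sketch, not code from Firefox or from any particular site: the function name `setupMediaSession`, the `TrackInfo` shape, and all sample values are hypothetical, while `navigator.mediaSession`, `MediaMetadata`, and `setActionHandler` come from the Media Session API. It assumes a browser environment and does nothing when the API is unavailable.

```typescript
// Hypothetical shape describing the track a page wants to present.
interface TrackInfo {
  title: string;
  artist: string;
  artworkUrl: string;
}

// Registers metadata and play/pause handlers for a media element.
// Returns false when the Media Session API is not available.
function setupMediaSession(
  media: { play(): Promise<void>; pause(): void },
  track: TrackInfo,
): boolean {
  if (typeof navigator === "undefined" || !("mediaSession" in navigator)) {
    return false;
  }

  // Information the browser may show on the virtual control interface.
  navigator.mediaSession.metadata = new MediaMetadata({
    title: track.title,
    artist: track.artist,
    artwork: [{ src: track.artworkUrl }],
  });

  // Invoked when the user presses the corresponding media control key.
  navigator.mediaSession.setActionHandler("play", () => {
    void media.play();
  });
  navigator.mediaSession.setActionHandler("pause", () => {
    media.pause();
  });
  return true;
}
```

In a page, `media` would typically be an audio or video element, for example the result of `document.querySelector("video")`.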

How can I enable this feature on Firefox?

To control media playback in Firefox, set the pref media.hardwaremediakeys.enabled to true in about:config. If you want to use the Media Session API along with media control, you also need to set the pref dom.media.mediasession.enabled to true.

Currently we only enable these two prefs on Nightly. They will be enabled by default in Firefox when this feature is stable enough.

What platform supports this feature?

You can use this feature on macOS (10.12.1 or above), Windows (8.1 or above), and Linux (GTK-based distributions).

On Android, we don't have a platform-level implementation. Instead, we currently use separate Android media components to provide media control. The media control feature on Android is different from what we have on other platforms, and it doesn't support the Media Session API. You don't need to enable any pref to use media control on Fenix.

How is media control supported?

What kinds of media can be controlled by this feature?

Currently we only support controlling media played by audio and video elements, so this feature cannot control media from Web Audio, Web Speech, or the Flash plugin. Media becomes controllable in any of the following cases:

  • An audible media is playing from an audio or video element.
  • A playable media enters fullscreen.
  • A playable media enters picture-in-picture mode.

Once the media reaches the end, we stop controlling it. This behavior can be turned off by setting the pref media.mediacontrol.stopcontrol.aftermediaend to false.

What kinds of media can NOT be controlled by this feature?

Besides media from Web Audio, Web Speech, and the Flash plugin, the following kinds of media are not controllable:

Inaudible media

It's become common for websites to use a silent video, which usually doesn’t contain an audio track, as a background image, or as a GIF-like image.

Users usually don't want to control this kind of media, because it is not seen as content and controlling a background video serves no purpose. In addition, controlling media means that we have to intercept media keys from the platform, which might affect other apps that use media keys, such as a background music app.

If we intercept media keys in an inappropriate situation, users might be unable to control the app they actually wanted to control.

Therefore, we only control media that has been audible at least once.

If media is always inaudible, we do not regard it as controllable. This can happen because the media is muted through the HTMLMediaElement API, the media has no audio track, the audio track contains only silence, or the whole tab is muted through the tab sound indicator.

Notification sound

Websites commonly play a short sound as a notification. As with inaudible media, this is not the kind of media users want to control.

Our current method is to filter out such sounds based on their duration. The threshold is adjustable through the pref media.mediacontrol.eligible.media.duration.s, and its default value is 3 seconds.
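The eligibility rules described in this section can be summarized in a small model. This is a sketch of one possible reading of the text, not Firefox's implementation: the types `MediaState` and `AudibilityInputs` and both function names are hypothetical. How the rules combine is also an assumption. In particular, the sketch treats fullscreen or picture-in-picture media as controllable regardless of audibility and duration, and treats a duration exactly equal to the threshold as eligible.

```typescript
// Hypothetical inputs that decide whether media is audible right now.
interface AudibilityInputs {
  muted: boolean;         // muted through the HTMLMediaElement API
  hasAudioTrack: boolean; // false for silent, GIF-like videos
  audioIsSilent: boolean; // the audio track contains only silence
  tabMuted: boolean;      // muted through the tab sound indicator
}

function isAudibleNow(a: AudibilityInputs): boolean {
  return !a.muted && a.hasAudioTrack && !a.audioIsSilent && !a.tabMuted;
}

// Hypothetical snapshot of a piece of media, for illustration only.
interface MediaState {
  kind: "audio" | "video" | "web-audio" | "web-speech" | "plugin";
  hasBeenAudible: boolean;    // true once isAudibleNow() has ever been true
  inFullscreenOrPip: boolean; // fullscreen or picture-in-picture mode
  durationSeconds: number;
}

// Mirrors the documented default of media.mediacontrol.eligible.media.duration.s.
const DEFAULT_ELIGIBLE_DURATION_S = 3;

function isControllable(
  m: MediaState,
  minDurationS: number = DEFAULT_ELIGIBLE_DURATION_S,
): boolean {
  // Only media from audio and video elements is supported.
  if (m.kind !== "audio" && m.kind !== "video") return false;
  // Assumption: fullscreen and picture-in-picture media bypass the filters below.
  if (m.inFullscreenOrPip) return true;
  // Media that has never been audible is treated as background media.
  if (!m.hasBeenAudible) return false;
  // Short sounds are treated as notifications.
  return m.durationSeconds >= minDurationS;
}
```

Under this model, a two-minute video that has been audible is controllable, while a one-second notification sound and a silent background video are not.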

What information would be displayed on the virtual control interface?

If a user has enabled the Media Session API and the website they are browsing uses it, we display the website's media metadata on the control interface.

One tab might contain multiple media sessions. A common example is a page that embeds YouTube videos, where each embedded iframe contains its own media session. In this situation, we show the metadata from the iframe that most recently became audible, which is the approach the spec recommends.

However, if (1) the website doesn't use the Media Session API, (2) it uses the API but doesn't set media metadata, or (3) the metadata it sets is empty, we generate default media metadata, which uses the website's title as the artist name and the default favicon as the album artwork.

If media playback happens in a private browsing window, then we would always use default metadata, and the title would be "Firefox is playing media".
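The metadata rules above can be modeled as a single function. This is an illustrative sketch rather than Firefox's code: the `Metadata` and `SessionRecord` types, the `DEFAULT_FAVICON` placeholder, and the function names are hypothetical. It assumes that "empty" metadata means no title, artist, or artwork is set, and that the private-browsing default carries only the fixed title and the default artwork.

```typescript
interface Metadata {
  title?: string;
  artist?: string;
  artwork?: string;
}

// Hypothetical record for one media session (for example, one iframe).
interface SessionRecord {
  metadata?: Metadata;   // undefined when the site never set metadata
  lastAudibleAt: number; // time at which this session last became audible
}

const DEFAULT_FAVICON = "default-favicon"; // placeholder, not a real resource name
const PRIVATE_TITLE = "Firefox is playing media";

// Assumption: metadata counts as empty when no field is set.
function isEmptyMetadata(m: Metadata | undefined): boolean {
  return !m || (!m.title && !m.artist && !m.artwork);
}

function resolveMetadata(
  sessions: SessionRecord[],
  pageTitle: string,
  isPrivateWindow: boolean,
): Metadata {
  // Private browsing always uses the default metadata with a fixed title.
  if (isPrivateWindow) {
    return { title: PRIVATE_TITLE, artwork: DEFAULT_FAVICON };
  }

  // Pick the session that most recently became audible.
  let active: SessionRecord | undefined;
  for (const s of sessions) {
    if (!active || s.lastAudibleAt > active.lastAudibleAt) {
      active = s;
    }
  }

  const metadata = active?.metadata;
  if (metadata && !isEmptyMetadata(metadata)) {
    return metadata;
  }

  // Default: website title as artist name, default favicon as artwork.
  return { artist: pageTitle, artwork: DEFAULT_FAVICON };
}
```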

If I have multiple tabs playing media, which tab would be controlled?

If multiple tabs are playing media at the same time, we control the tab that started playing media most recently.

However, there is an exception. If a user is consuming media in picture-in-picture mode, the tab that the picture-in-picture video belongs to is always the tab being controlled.
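The tab selection rule can be expressed as a short function. This is a sketch under stated assumptions, not the actual implementation: `TabState` and `pickControlledTab` are hypothetical names, and "last tab playing media" is interpreted as the tab with the most recent playback start time.

```typescript
// Hypothetical per-tab state, for illustration only.
interface TabState {
  id: number;
  startedPlayingAt: number; // time at which this tab last started playing media
  hasPipVideo: boolean;     // tab owns the current picture-in-picture video
}

function pickControlledTab(tabs: TabState[]): TabState | undefined {
  // The tab owning the picture-in-picture video always wins.
  const pipTab = tabs.find((t) => t.hasPipVideo);
  if (pipTab) return pipTab;

  // Otherwise, control the tab that started playing most recently.
  return tabs.reduce<TabState | undefined>(
    (latest, t) =>
      !latest || t.startedPlayingAt > latest.startedPlayingAt ? t : latest,
    undefined,
  );
}
```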

In addition, users can enable the pref media.audioFocus.management, which allows only one tab to play at a time. When tabs compete for audio, the tab that starts playing later stops playback in the tab that was playing before. (This pref is turned on by default on Android and off on other platforms.)

Why do media control keys sometimes control other apps instead of Firefox, or control other apps at the same time?

Each platform has its own framework or mechanism to decide which application owns the audio focus, that is, which application should receive media control keys.

However, if applications use different mechanisms to listen for media control keys when they play media, multiple applications might receive media control keys at the same time.

If an application starts listening for media control keys even though it isn't playing any media, it incorrectly intercepts media control keys from the app that is actually playing media.

If you run into this kind of problem, we suggest closing the other apps or not using them at the same time.

Can I stop controlling the media after it pauses for a certain time?

This feature is off by default, but it can be enabled if a user wants Firefox to stop controlling media after it has been paused for a certain period. The idea is that the user might have shifted their focus to other tasks, so it is better to stop controlling the media and release media control to other applications.

This feature is enabled by the pref media.mediacontrol.stopcontrol.timer, and the period is adjustable through the pref media.mediacontrol.stopcontrol.timer.ms. The default value is 60 seconds.

However, if the media is being used in fullscreen or picture-in-picture mode, we do not start a stop timer for that media.
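The stop-timer behavior can be modeled as a pure function of the pause time and the current time. This is a sketch, not Firefox's implementation: the type and function names are hypothetical, the two fields of `StopTimerPrefs` stand in for the two prefs mentioned above, and treating an elapsed time exactly equal to the timeout as expired is an assumption.

```typescript
// Stand-ins for the two prefs described above.
interface StopTimerPrefs {
  enabled: boolean;  // media.mediacontrol.stopcontrol.timer
  timeoutMs: number; // media.mediacontrol.stopcontrol.timer.ms
}

// Documented defaults: feature off, 60 seconds.
const DEFAULT_STOP_TIMER_PREFS: StopTimerPrefs = {
  enabled: false,
  timeoutMs: 60 * 1000,
};

// Hypothetical snapshot of paused media.
interface PausedMedia {
  pausedAtMs: number;
  inFullscreen: boolean;
  inPictureInPicture: boolean;
}

function shouldStopControlling(
  media: PausedMedia,
  nowMs: number,
  prefs: StopTimerPrefs = DEFAULT_STOP_TIMER_PREFS,
): boolean {
  // The feature is off by default.
  if (!prefs.enabled) return false;
  // No stop timer for fullscreen or picture-in-picture media.
  if (media.inFullscreen || media.inPictureInPicture) return false;
  // Stop once the media has been paused for the configured period.
  return nowMs - media.pausedAtMs >= prefs.timeoutMs;
}
```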

Where can I report Media Control related bugs?

File a bug on Bugzilla

  1. Go to this bug which is used for tracking all media control related issues
  2. Press the New/Clone button in the top-right corner
  3. Press ... that blocks this bug.
  4. Enter your issue and press Submit Bug when finished.

Report a bug on Matrix

Feel free to join the room #media and report bugs to us directly! (tag @alwu @chunmin)