Comparison of WebRTC Codecs for Video and Screen Sharing

By Olivier Anguenot

Published in dev

April 16, 2025

Methodology

Video Call (720p)

Sharing a screen (Full HD)

Conclusion

This article marks my first in-depth exploration of video codec performance, driven by significant discrepancies I noticed during screen-sharing sessions on a Windows laptop compared to a MacBook.

While the MacBook delivered consistently clean results, the Windows laptop often displayed quality issues, prompting me to investigate further.

My goal was to understand the root cause of these differences and determine whether recommending specific codecs based on the user’s system and platform would be necessary.

This evaluation was also inspired by insights from Tsahi Levent-Levi’s “Built with WebRTC – Jitsi” event, where the Jitsi team highlighted a significant reduction in user complaints regarding blurry screen-sharing after introducing AV1 into their product offerings.

In this article, I’ll compare video and screen-sharing performance across Windows and Mac platforms, detailing some of the issues encountered with particular codecs.

Methodology

To thoroughly evaluate codec performance for video, I’ve designed a testing scenario using my default camera set at 720p resolution and 30 frames per second. I’ll establish a local peer-to-peer connection, displaying the resulting video at its actual received size.

There won’t be unpredictable network conditions; instead, I’ll simulate varying network constraints by beginning with a bandwidth of 3 Mbps and progressively decreasing it until the codec struggles to maintain the resolution and consequently reduces to a lower resolution.
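
The bandwidth constraint described above can be approximated by capping `maxBitrate` on the sender's encodings. This is only a sketch under assumptions: the article does not say how the limit was applied, the ladder values are hypothetical, and `SenderLike`, `withMaxBitrate`, and `capSenderBitrate` are illustrative names (in a browser, an `RTCRtpSender` should fit `SenderLike`).

```typescript
// Structural stand-ins so the sketch type-checks outside a browser.
interface EncodingLike {
  maxBitrate?: number; // bits per second
}
interface SendParametersLike {
  encodings: EncodingLike[];
}
interface SenderLike {
  getParameters(): SendParametersLike;
  setParameters(params: SendParametersLike): Promise<unknown>;
}

// Hypothetical ladder in kbps. The article only states that the test
// starts at 3 Mbps and then decreases; the exact steps are an assumption.
const ASSUMED_LADDER_KBPS: number[] = [3000, 2000, 1000, 600, 500, 200];

// Caps every encoding at `kbps`. The parameters object is mutated in place
// because setParameters expects the object obtained from getParameters.
function withMaxBitrate(params: SendParametersLike, kbps: number): SendParametersLike {
  for (const encoding of params.encodings) {
    encoding.maxBitrate = kbps * 1000;
  }
  return params;
}

// Reads the sender's current parameters, caps them, and writes them back.
function capSenderBitrate(sender: SenderLike, kbps: number): Promise<unknown> {
  return sender.setParameters(withMaxBitrate(sender.getParameters(), kbps));
}
```

In a test page, each step of the ladder would be applied with `capSenderBitrate(sender, kbps)`, where `sender` comes from `pc.getSenders()`.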

For each bandwidth level tested, I’ll assess the codecs through several key performance indicators (KPIs).

Codecs evaluated

I tested all the codecs available in Chrome 135 (and Chrome 136 Beta): VP8, VP9, H264 (OpenH264 software encoder), H264 (hardware encoder), H265 (hardware encoder, available starting with Chrome 136), and AV1.
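
To force one codec per test run, a common approach is to reorder the codec list before negotiation and pass it to `setCodecPreferences`. The sketch below is an assumption about how such a test could be wired, not the actual harness used here; `CodecLike` and `preferCodec` are hypothetical names.

```typescript
// Minimal shape of a codec capability entry (mirrors the fields used here).
interface CodecLike {
  mimeType: string;
  sdpFmtpLine?: string;
}

// Moves every entry matching `mimeType` to the front, keeping the relative
// order of all other entries (including rtx/red/ulpfec) unchanged.
function preferCodec<T extends CodecLike>(codecs: readonly T[], mimeType: string): T[] {
  const wanted = mimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const others = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...preferred, ...others];
}

// Browser usage (not executed here):
//   const codecs = RTCRtpReceiver.getCapabilities("video")?.codecs ?? [];
//   transceiver.setCodecPreferences(preferCodec(codecs, "video/AV1"));
```

Keeping the non-matching entries in the list leaves a fallback available if the remote peer cannot handle the preferred codec.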

Subjective Quality Measures

I’ll first observe the perceived quality from two distinct scenarios:

a) Static Person: The individual remains immobile, only speaking without significant movements.

b) Active Body Language: The subject actively moves their body, their head and hands while speaking, representing typical conversational gestures.

These evaluations will primarily focus on the individual, intentionally minimizing attention to background details.

I’ll be watching closely for visual artifacts, blurring, graininess, and latency.

Objective Performance Measures

To capture the worst-case performance, I’ve added a stress-test scenario:

c) High Motion (Quality Smoothness Test): An intentionally artificial scenario designed to maximize pixel changes by rapidly moving my hand close to the camera.

The following performance metrics will be collected during scenario c):

Encode Time: Maximum time required to encode a frame.
QP: Highest cumulative Quantization Parameter (QP) value observed.
CPU Usage: Maximum CPU utilization measured directly from the Chrome browser tab.
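
Encode time and QP can be derived from the cumulative `totalEncodeTime`, `qpSum`, and `framesEncoded` counters of the `outbound-rtp` stats by comparing successive snapshots. The sketch below is one possible way to do it, assuming periodic polling; the article does not describe the exact collection method, and the function and type names are hypothetical.

```typescript
// Subset of the cumulative counters found in an outbound-rtp video stat.
interface OutboundVideoSample {
  framesEncoded: number;
  totalEncodeTime: number; // seconds, cumulative
  qpSum: number; // cumulative
}

interface IntervalMetrics {
  avgEncodeTimeMs: number;
  avgQp: number;
}

// Per-frame averages between two snapshots, or null if no frame was encoded.
function intervalMetrics(prev: OutboundVideoSample, curr: OutboundVideoSample): IntervalMetrics | null {
  const frames = curr.framesEncoded - prev.framesEncoded;
  if (frames <= 0) return null;
  return {
    avgEncodeTimeMs: ((curr.totalEncodeTime - prev.totalEncodeTime) / frames) * 1000,
    avgQp: (curr.qpSum - prev.qpSum) / frames,
  };
}

// Highest per-interval values seen across a series of snapshots.
function worstCaseMetrics(samples: readonly OutboundVideoSample[]): IntervalMetrics | null {
  let worst: IntervalMetrics | null = null;
  let prev: OutboundVideoSample | undefined;
  for (const curr of samples) {
    if (prev) {
      const m = intervalMetrics(prev, curr);
      if (m) {
        worst = {
          avgEncodeTimeMs: Math.max(worst ? worst.avgEncodeTimeMs : 0, m.avgEncodeTimeMs),
          avgQp: Math.max(worst ? worst.avgQp : 0, m.avgQp),
        };
      }
    }
    prev = curr;
  }
  return worst;
}
```

Note that these are maxima of per-interval averages, so a single slow frame inside an interval is smoothed out; shorter polling intervals get closer to a true per-frame maximum.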

These performance metrics represent a worst-case scenario.

Under typical usage, you can expect significantly lower values.

Note: During these video tests, I will not impose any specific contentHint constraints.

Video Call (720p)

Here are the measurements taken from my Windows laptop (a professional-grade device). I’ll then compare these results to those obtained from my Mac Mini M2 Pro. All data has been collected to enable a direct comparison between the different codecs.

Codecs Table

Key findings

When there are no bandwidth constraints, the perceived quality is fairly consistent across all codecs. However, a closer look at the numbers—especially from a CPU usage perspective—reveals meaningful differences.

As bandwidth becomes more limited, newer codecs begin to stand out more clearly in terms of efficiency and quality.

AV1 stood out as the top performer across all bandwidth conditions. It maintained excellent visual quality even at bitrates as low as 200–600 kbps. That said, its high CPU usage (peaking at 225%) makes it less suitable for devices with limited processing power.
H265 (HEVC) and VP9 both delivered strong performance, closely trailing AV1 at moderate bitrates. In particular, H265 impressed with its low CPU usage, thanks to hardware acceleration. This makes it an ideal candidate for resource-constrained or battery-sensitive environments.
H264 (both hardware and software variants) performed well when bandwidth exceeded 1 Mbps, but its quality dropped quickly under tighter constraints. Despite its widespread compatibility, it struggles to adapt to lower bandwidth conditions compared to newer codecs.
VP8 remains the most conservative of the group. It performs adequately at bitrates up to 600 kbps, which covers many common scenarios. However, its high CPU usage can be problematic, especially for longer or multi-participant sessions. Remarkably, over 12 years after its release, there’s still no hardware VP8 decoder on most laptops or desktops. Only some Android devices benefit from chipsets that can encode and decode VP8 in hardware (e.g., c2.exynos.vp8).

🤯 Four surprises from this test:

1) AV1’s magic at low bitrates: Even at 200 kbps, the quality remained stunning. It was genuinely difficult to break AV1’s visual fidelity, even under severe constraints.

2) H265’s extremely low CPU usage: This was the biggest surprise. Despite offering very high quality, it did so with impressively minimal CPU demand—making it ideal for performance-sensitive applications.

3) VP8’s ongoing appetite for CPU: On my Windows laptop, VP8 behaved like a relic from another era—still consuming CPU like an old coal-powered train 🚂. Nostalgic, perhaps, but definitely not efficient for modern use cases.

4) The choice of H264 profile is decisive: results depend heavily on whether the encoder used is hardware-accelerated or software-based.

Metrics comparison Mac/PC

The following table summarizes the results.

Codec	Lenovo ThinkPad (Windows)	Mac Mini M2 Pro
VP8	Enc: Libvpx; Dec: Libvpx; Max CPU: 110%; Encoding Time: 10 ms	Enc: Libvpx; Dec: Libvpx; Max CPU: 38%; Encoding Time: 4 ms
H264 (hard)	Enc: MediaFoundation (hard); Dec: D3D11VideoDecoder (hard); Max CPU: 20%; Encoding Time: 11 ms	Enc: VideoToolbox (hard); Dec: VideoToolbox (hard); Max CPU: 14%; Encoding Time: 7 ms
H264 (OpenH264)	Enc: OpenH264; Dec: D3D11VideoDecoder (hard); Max CPU: 60%; Encoding Time: 14 ms	Enc: OpenH264; Dec: VideoToolbox (hard); Max CPU: 54%; Encoding Time: 14 ms
H265 (hard)	Enc: MediaFoundation (hard); Dec: D3D11VideoDecoder (hard); Max CPU: 19%; Encoding Time: 14 ms	Enc: VideoToolbox (hard); Dec: VideoToolbox (hard); Max CPU: 14%; Encoding Time: 9 ms
VP9	Enc: Libvpx; Dec: D3D11VideoDecoder (hard); Max CPU: 180%; Encoding Time: 18 ms	Enc: Libvpx; Dec: VideoToolbox (hard); Max CPU: 80%; Encoding Time: 8 ms
AV1	Enc: Libaom; Dec: dav1d; Max CPU: 225%; Encoding Time: 15 ms	Enc: Libaom; Dec: dav1d; Max CPU: 100%; Encoding Time: 6 ms

The good news is that codecs are treated equally across both platforms: hardware encoders and decoders are leveraged in a similar way.

When hardware acceleration is available, performance is comparable between Mac and Windows. However, the story changes when relying on software encoders. My Mac—designed with content creation in mind—handles video tasks like encoding with ease. In contrast, my Windows laptop is clearly optimized for productivity tools like Visual Studio Code and PowerPoint, not for media-heavy workloads.

From a codec perspective, it’s worth noting that VP8 and AV1 lack hardware decoders on both platforms, meaning they always rely on software decoding.

Even though AV1, VP8, and VP9 are encoded in software on macOS, encoding is significantly faster than on the Windows machine (6 ms versus 15 ms for AV1, for example). Software encoding still costs far more CPU than the hardware encoders, though: 38% to 100% versus 14% on the Mac. As always, performance has its price.

Metrics comparison Android/iPad

I compared an iPad Pro M1 to a Lenovo tablet running Android 12, both tested in Web (browser) mode.

However, this comparison isn’t fully representative, primarily because the iPad provides limited visibility into key metrics—such as encoder/decoder details and power efficiency statistics. These were unavailable during testing on both Chrome and Safari.

Additionally, it’s worth noting that tablets are less commonly used for real-time video applications. Instead, smartphones are typically favored, as they tend to feature more powerful and up-to-date hardware and software stacks.

Codec	Lenovo Tablet (Android)	iPad
VP8	Enc: Libvpx; Dec: MediaCodec (hard); Encoding Time: 38 ms	Enc: -; Dec: -; Encoding Time: 5 ms
H264	Enc: c2.mtk.avc.encoder (hard); Dec: MediaCodec (hard); Encoding Time: 22 ms	Enc: -; Dec: -; Encoding Time: -
VP9	Enc: Libvpx; Dec: MediaCodec (hard); Encoding Time: 32 ms	Enc: -; Dec: -; Encoding Time: 8 ms
H265	Not available	Enc: -; Dec: -; Encoding Time: -

Note: The Android ecosystem is so large that behavior varies widely from one device to another. For example, some devices have a hardware VP8 decoder (c2.qti.vp8.decoder), but I’m fairly sure others don’t. Be careful even when using VP8 on Android!

This makes direct comparison difficult.

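However, as you can see, the encoding time on this Android tablet is roughly two to four times that of the Windows laptop (38 ms versus 10 ms for VP8, 22 ms versus 11 ms for H264, 32 ms versus 18 ms for VP9). At 30 FPS, each frame has a budget of about 33 ms, so maintaining a steady frame rate under these conditions is quite challenging.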
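
The arithmetic behind that concern is simple: at 30 FPS each frame has about 33 ms available, and encoding is only one part of the pipeline. The helper names below are hypothetical and only illustrate the calculation.

```typescript
// Time available per frame, in milliseconds, at a given frame rate.
function frameBudgetMs(fps: number): number {
  return 1000 / fps;
}

// Fraction of the per-frame budget consumed by encoding alone.
// A value above 1 means the encoder cannot keep up with the frame rate.
function encodeBudgetShare(encodeTimeMs: number, fps: number): number {
  return encodeTimeMs / frameBudgetMs(fps);
}

// Encoding times reported in the tables above, at 30 FPS:
const tabletVp8Share = encodeBudgetShare(38, 30); // Android tablet, VP8
const laptopVp8Share = encodeBudgetShare(10, 30); // Windows laptop, VP8
```

With the tablet's 38 ms VP8 encode time the share exceeds 1, while the laptop's 10 ms uses less than a third of the budget.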

Ramp-up time

This is an important aspect that’s often overlooked.

When starting a call, it takes time to reach the target resolution (in this case, 720p). The browser doesn’t immediately send high-quality video—it begins with a low resolution, then gradually increases it as it probes available bandwidth and adapts accordingly.

I initially assumed this ramp-up behavior was consistent across all codecs. However, during testing, I observed notable differences in how each codec handles this adaptation phase.

Codec	Ramp-up (Lenovo ThinkPad)	Ramp-up (Mac Mini)
VP8	17 seconds at 3 Mbps; 45 seconds at 500 kbps	13 seconds at 3 Mbps; 50 seconds at 500 kbps
OpenH264	17 seconds at 3 Mbps; 27 seconds at 500 kbps	15 seconds at 3 Mbps; 25 seconds at 500 kbps
H264 (hard)	Immediate at 3 Mbps; immediate at 500 kbps	Immediate at 3 Mbps; immediate at 500 kbps
H265 (hard)	Immediate at 3 Mbps; immediate at 500 kbps	Immediate at 3 Mbps; immediate at 500 kbps
VP9	12 seconds at 3 Mbps; 18 seconds at 500 kbps	14 seconds at 3 Mbps; 18 seconds at 500 kbps
AV1	17 seconds at 3 Mbps; 21 seconds at 500 kbps	13 seconds at 3 Mbps; 21 seconds at 500 kbps

Note: This represents the time it takes for the received video to reach 720p resolution. It explains why video quality may appear lower at the beginning of the call and then gradually improves as the connection stabilizes.
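
Ramp-up time as defined in this note can be measured on the receiving side by polling the `frameHeight` field of the `inbound-rtp` stats and recording when it first reaches the target. This is a sketch under that assumption; the sample type and function name are hypothetical, and the article does not state how the timings were taken.

```typescript
// One polled observation of the received video.
interface InboundSample {
  timestampMs: number;
  frameHeight?: number; // undefined until the first frame is decoded
}

// Seconds between the first sample and the first sample whose height
// reaches `targetHeight`, or null if the target is never reached.
function rampUpSeconds(samples: readonly InboundSample[], targetHeight: number): number | null {
  let startMs: number | undefined;
  for (const sample of samples) {
    if (startMs === undefined) startMs = sample.timestampMs;
    if ((sample.frameHeight ?? 0) >= targetHeight) {
      return (sample.timestampMs - startMs) / 1000;
    }
  }
  return null;
}
```

The resolution of the result is bounded by the polling interval, so a one-second poll gives ramp-up values accurate to about one second.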

On this point, the behavior is consistent across both platforms: with software encoders, it takes roughly 12 to 17 seconds in the best case (at 3 Mbps) for the stream to reach full resolution.

The exception lies with hardware encoders (H264 & H265), which appear to handle this process differently—at least in this particular case. Instead of ramping up gradually, they seem to immediately use the full maxBitrate, resulting in an almost instant jump to high resolution. It feels like a case of “first in, take it all.”

Wrap-up

From the tests done, there are many things to take into account to select the best codec to use:

Bandwidth constraints: Below 1 Mbps, codecs such as H264 seem to deliver lower quality than the other codecs.
CPU limitations: Codecs such as VP9 or AV1 should be used only when CPU is not a constraint, to avoid a drastic drop in quality.
Ecosystem: Sometimes one user would like to use a codec but the other can’t. Depending on the devices used, some codecs may be unavailable.

Sharing a screen (Full HD)

Next, I performed the same test using screen sharing. The protocol remained the same, but this time I required two screens: one to share the content, and another to display the received stream at full size.

For this test, I selected a Full HD resolution (1080p) and limited the frame rate to 10 FPS, which is a common setting for screen sharing scenarios.

Here, I’ve set the contentHint to detail.
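
A screen-share capture with these settings could be configured roughly as follows. This is a sketch, not the exact test code: the helper names are hypothetical, and the constraints use `max` values because `getDisplayMedia` does not accept `min` or `exact` constraints.

```typescript
// Minimal shape of a media track for this sketch; a MediaStreamTrack fits it.
interface HintableTrack {
  kind: string;
  contentHint: string;
}

// Capture constraints capped at Full HD and 10 FPS by default.
function screenShareConstraints(maxWidth = 1920, maxHeight = 1080, maxFps = 10) {
  return {
    video: {
      width: { max: maxWidth },
      height: { max: maxHeight },
      frameRate: { max: maxFps },
    },
    audio: false,
  };
}

// Marks a video track as detail-oriented content (text, slides).
function applyDetailHint(track: HintableTrack): void {
  if (track.kind === "video") {
    track.contentHint = "detail";
  }
}

// Browser usage (not executed here):
//   const stream = await navigator.mediaDevices.getDisplayMedia(screenShareConstraints());
//   stream.getVideoTracks().forEach(applyDetailHint);
```

The `detail` hint asks the browser to favor sharpness over frame rate when it has to degrade, which suits text and slides.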

This evaluation focuses on three key pillars of screen sharing quality:

Is the shared content instantly clear when presenting slides?
Is scrolling smooth and readable without noticeable delay or artifacts?
Is there any perceptible latency between the action and what the viewer sees?

The goal is to step into the shoes of the viewer and assess the overall quality and responsiveness of the screen sharing experience.

Codecs Table

Key findings

When there are little to no bandwidth constraints, all codecs deliver good results. During fast text scrolling, readability is maintained almost immediately across the board. The only exception I noticed was with VP8, where the scrolling wasn’t consistently smooth—text clarity lagged slightly behind.

As bandwidth becomes more restricted, the differences between codecs become much more apparent:

• AV1 clearly stands out under tight bandwidth conditions. Even at very low bitrates, the viewer can easily follow a presentation without any degradation in readability or experience.

• H265 also performs exceptionally well—delivering high visual quality while consuming very little CPU, making it an excellent choice for efficient screen sharing.

• VP8 and VP9 begin to show limitations at lower bitrates. However, VP9 holds up well when bandwidth is reasonable, outperforming VP8 in most scenarios. Among all the codecs tested, VP8 showed the lowest overall performance (except for hardware-accelerated H264; see below for an explanation).

• OpenH264 behaves like a conservative and consistent codec—its performance is solid across all conditions, degrading gradually as bitrate drops, without major quality drops or spikes.

🤯 Big surprises here:

1) H264 (hardware-accelerated) was completely unusable. This was the biggest surprise. Even with high bitrate, the viewer couldn’t read the scrolling text. It seemed like something fundamental—like proper keyframe (FIR) signaling—was missing. The video stream never fully synchronized, making it unusable for screen sharing in this context. I suspect this is a Chrome integration issue.

2) AV1 is impressively responsive. Even when scrolling text rapidly, AV1 kept the content readable almost instantly, or at worst within a second. It’s an ideal choice for text-heavy screen sharing.

3) H265 is incredibly CPU-efficient: Sharing a Full HD screen with minimal CPU usage is a huge win—especially when you’re presenting unplugged and trying to conserve battery life 😅.

4) MacBook’s internal screen resolution is massive: Possibly due to the Retina display, my MacBook screen runs at a native resolution of 3600×2338. That is about four times as many pixels as Full HD (roughly 8.4 million versus 2.1 million)! This is something to keep in mind. Without constraints, screen sharing from high-resolution displays can quickly max out the CPU and consume excessive bandwidth. Always set resolution and frame rate limits to avoid unnecessary overhead.

Regarding H264 (hardware-accelerated), I tested on two laptops (from different vendors): One was clean, but on the Lenovo, it was a nightmare as described in point 1).

Importance of resolution

As noted above, the resolution used plays a major role in CPU consumption.
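
To put numbers on this, the sketch below compares the 3600×2338 display mentioned earlier with Full HD and computes the downscale factor needed to fit inside 1920×1080. The function names are hypothetical; the resulting factor is the kind of value one could pass as `scaleResolutionDownBy` in the sender's encoding parameters.

```typescript
// How many times more pixels a source has than a reference resolution.
function pixelRatio(width: number, height: number, refWidth = 1920, refHeight = 1080): number {
  return (width * height) / (refWidth * refHeight);
}

// Smallest uniform downscale factor (>= 1) that makes the source fit
// inside the given bounds while preserving its aspect ratio.
function scaleDownToFit(width: number, height: number, maxWidth = 1920, maxHeight = 1080): number {
  return Math.max(1, width / maxWidth, height / maxHeight);
}

const retinaRatio = pixelRatio(3600, 2338); // about 4.06 times Full HD
const retinaScale = scaleDownToFit(3600, 2338); // about 2.16, limited by height
```

Because the encoder's workload grows with the number of pixels per frame, a fourfold pixel count is a plausible explanation for a large jump in CPU usage, though the exact relationship depends on the codec and encoder.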

Here are the results of tests done with the Mac Mini M2 Pro

Wrap-up

The quality of screen sharing can change a lot depending on the codec used:

AV1 and H265 (if supported) seem to be the best choices. H264 seems to be the best compromise.
On Windows, the H264 codec with hardware support should be selected with care. Depending on the chipset used, the result can be very good or unusable. Be careful: this is the first codec listed in the SDP (42001f).
Screen resolution has a large impact on CPU consumption. Watch out for Mac users!
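
The 42001f value is an H264 `profile-level-id`: three hex bytes carrying the profile, the constraint flags, and the level. The sketch below decodes it from an SDP `fmtp` line. It is deliberately simplified (it only names a few common profiles and ignores special cases such as level 1b), and the function and type names are hypothetical.

```typescript
interface H264ProfileInfo {
  profileIdc: number; // first byte, e.g. 0x42 = 66
  profileIop: number; // second byte, constraint flags
  levelIdc: number; // third byte, e.g. 0x1f = 31
  level: number; // levelIdc / 10, e.g. 3.1
  profileName: string;
}

// Extracts and decodes profile-level-id from an fmtp line, or returns null.
function parseProfileLevelId(fmtp: string): H264ProfileInfo | null {
  const match = /profile-level-id=([0-9a-fA-F]{6})/.exec(fmtp);
  const hex = match ? match[1] : undefined;
  if (!hex) return null;
  const profileIdc = parseInt(hex.slice(0, 2), 16);
  const profileIop = parseInt(hex.slice(2, 4), 16);
  const levelIdc = parseInt(hex.slice(4, 6), 16);
  let profileName = "Other";
  if (profileIdc === 0x42) {
    // constraint_set1_flag (0x40) distinguishes Constrained Baseline.
    profileName = (profileIop & 0x40) !== 0 ? "Constrained Baseline" : "Baseline";
  } else if (profileIdc === 0x4d) {
    profileName = "Main";
  } else if (profileIdc === 0x64) {
    profileName = "High";
  }
  return { profileIdc, profileIop, levelIdc, level: levelIdc / 10, profileName };
}
```

Applied to 42001f, this yields the Baseline profile at level 3.1, whereas 42e01f would be reported as Constrained Baseline at the same level.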

Conclusion

The first thing to note is that older codecs are still very much in the game. VP8 (for video) and H264 (commonly used for screen sharing) continue to offer the best balance between compatibility and performance—especially across a wide range of devices and browsers.

That said, newer codecs like AV1 represent a significant leap forward in efficiency, particularly when operating under bandwidth constraints.

This is one of the key reasons why the Jitsi team has embraced AV1. But their implementation goes far beyond simply calling setCodecPreferences. They’ve invested heavily in making AV1 adoption adaptive and intelligent, dynamically switching between codecs based on real-time performance measurements—especially for devices with limited processing power. It’s a nuanced system designed for flexibility, not a one-size-fits-all approach.

However, the road to AV1 isn’t without its hurdles. Mobile platforms, in particular, still struggle. On many Android devices, decoding AV1 can severely impact battery life, as highlighted in Meta’s benchmark report: Benchmarking and Deploying AV1 in the Android Ecosystem.

If you’re planning to adopt newer codecs like AV1 or VP9 in your WebRTC application, one key takeaway from the conversation between Tsahi, Emil, and Jonathan is this: don’t let codec logic overcomplicate your architecture. If your QA team can’t properly test the matrix of possible codec combinations, fallback scenarios, and performance thresholds, you may end up introducing more issues than benefits. Users won’t care about your codec strategy if their experience degrades—they’ll just complain or leave.

Statistics such as powerEfficientEncoder, encoderImplementation, and decoderImplementation are incredibly valuable when trying to correlate performance metrics with perceived quality. Unfortunately, these stats are not available in either Firefox or Safari 🙅.
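
These fields can be read from the report returned by `getStats()`. The sketch below collects them defensively, since they may be missing depending on the browser; `StatsLike` and `readImplementationInfo` are hypothetical names, and the report is typed loosely so the code also runs outside a browser.

```typescript
// Anything with a forEach over stat dictionaries; an RTCStatsReport fits.
interface StatsLike {
  forEach(callback: (stat: Record<string, unknown>) => void): void;
}

interface CodecImplementationInfo {
  encoderImplementation?: string;
  powerEfficientEncoder?: boolean;
  decoderImplementation?: string;
}

// Picks encoder fields from outbound-rtp and decoder fields from inbound-rtp
// video stats. Fields that are absent are simply left undefined.
function readImplementationInfo(report: StatsLike): CodecImplementationInfo {
  const info: CodecImplementationInfo = {};
  report.forEach((stat) => {
    if (stat["kind"] !== "video") return;
    if (stat["type"] === "outbound-rtp") {
      const encoder = stat["encoderImplementation"];
      const efficient = stat["powerEfficientEncoder"];
      if (typeof encoder === "string") info.encoderImplementation = encoder;
      if (typeof efficient === "boolean") info.powerEfficientEncoder = efficient;
    } else if (stat["type"] === "inbound-rtp") {
      const decoder = stat["decoderImplementation"];
      if (typeof decoder === "string") info.decoderImplementation = decoder;
    }
  });
  return info;
}

// Browser usage (not executed here):
//   const info = readImplementationInfo(await pc.getStats());
```

An empty result is a useful signal in itself: it tells the application that it cannot know which encoder or decoder is in use on that browser.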