Real-Time Video Quality Assessment with pVMAF

Axel De Decker
Senior Video Algorithms Engineer

Jan De Cock
Director Codec Development

Introducing pVMAF

Real-time video quality measurement (VQM) is a persistent challenge in the streaming and broadcast industry. In the first entry of this blog series, we highlighted the limitations of current methods: many either lack the speed required for real-time analysis or fail to accurately capture the nuances of the Human Visual System (HVS) when assessing video quality.

What if we could combine the predictive accuracy of a metric like VMAF with the computational efficiency of a simpler method like PSNR? Enter pVMAF (predictive VMAF), Synamedia’s solution designed to deliver real-time, accurate video quality measurement with minimal computational overhead. pVMAF operates within the encoding loop, leveraging key encoding parameters and pre-analysis statistics to achieve VMAF-like accuracy without the heavy computational load. This post dives into how pVMAF works, how it was developed, and why it is positioned to become a game-changer in real-time video quality measurement.

How does pVMAF work?

Feature extraction

pVMAF’s success in balancing accuracy and efficiency lies in its smart feature extraction. During video encoding, parameters like the quantization parameters (QPs) are already calculated to control the bitrate of the produced bitstream. These parameters also contain valuable information for assessing quality degradation — essentially offering quality-aware features “for free” since the encoder would compute them anyway. By repurposing these parameters, pVMAF avoids adding any significant computational load.

Another valuable source of quality information comes from pre-analysis statistics calculated on the pixels of original video frames. These statistics help optimize the encoding process and ensure the best quality-to-bitrate trade-off. Unlike PSNR, which only measures pixel-level differences, pVMAF takes into account how viewers perceive quality by factoring in measures for texture, brightness, and content masking — making it more closely aligned with human perception.
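As a rough illustration of this kind of pre-analysis, simple perceptual proxies can be computed directly on the luma plane of each source frame. The statistics below (mean brightness, gradient-based texture, and block-variance masking) are our own illustrative stand-ins, not the actual pVMAF feature set, which is not published:

```python
import numpy as np

def preanalysis_stats(luma: np.ndarray) -> dict:
    """Illustrative pre-analysis statistics on an 8-bit luma plane."""
    luma = luma.astype(np.float64)
    brightness = luma.mean()  # average luminance of the frame
    # Texture proxy: mean absolute horizontal/vertical gradient.
    gx = np.abs(np.diff(luma, axis=1)).mean()
    gy = np.abs(np.diff(luma, axis=0)).mean()
    texture = 0.5 * (gx + gy)
    # Masking proxy: mean variance over 8x8 blocks
    # (busy, high-variance areas tend to hide coding artifacts).
    h, w = luma.shape
    blocks = luma[: h // 8 * 8, : w // 8 * 8].reshape(h // 8, 8, w // 8, 8)
    masking = blocks.var(axis=(1, 3)).mean()
    return {"brightness": brightness, "texture": texture, "masking": masking}
```

Because these statistics are computed once on the original frames (and are typically needed by the encoder anyway), they add essentially no cost at inference time.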

Additionally, pVMAF integrates PSNR as a core feature to capture distortion, while leveraging various encoding parameters that directly influence visual quality. These parameters provide a solid basis for assessing distortions in the bitstream. In parallel, the pre-analysis statistics gathered from the original video frames capture the nuanced characteristics of the HVS, especially regarding how perceptible visual artifacts are in different types of content, based on textures, brightness, and other visual characteristics. This combined feature set — distortion data from encoding parameters and PSNR, along with perceptual insights from pre-analysis statistics — allows pVMAF to make highly accurate predictions that closely align with human quality perception.
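PSNR itself is cheap to compute from the reference and reconstructed frames, and the extracted values can then be collected into a per-frame feature vector. The vector composition below is a hedged sketch (the exact pVMAF feature list is not reproduced here), and `frame_features` is a hypothetical helper:

```python
import numpy as np

def psnr(ref: np.ndarray, dist: np.ndarray, peak: float = 255.0) -> float:
    """Standard peak signal-to-noise ratio in dB between two 8-bit frames."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def frame_features(psnr_db, avg_qp, brightness, texture):
    # Order and contents are illustrative, not the published pVMAF feature set:
    # distortion signals (PSNR, QP) alongside perceptual pre-analysis signals.
    return np.array([psnr_db, avg_qp, brightness, texture], dtype=np.float32)
```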


Fig1. High-level overview of pVMAF’s integration within the encoding loop. This diagram illustrates how pVMAF accumulates stream-based features from the encoding process, together with pre-analysis statistics and PSNR calculations, to generate real-time quality predictions.

Inference

Once these features are extracted, inference – the process of transforming input features into a quality score – is handled by a lightweight, shallow neural network with only one or two hidden layers, ensuring minimal computational overhead. The implementation is further accelerated with Single Instruction, Multiple Data (SIMD) instructions, making pVMAF highly suitable for real-time applications.
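Conceptually, the forward pass of such a network is a handful of matrix-vector products. The sketch below assumes a single hidden layer with a ReLU activation and a score clipped to the VMAF range of 0–100; the actual trained weights, layer sizes, and activation functions of pVMAF are not public:

```python
import numpy as np

def pvmaf_infer(features, W1, b1, W2, b2):
    """Map a per-frame feature vector to a VMAF-like score in [0, 100].

    W1: (hidden, n_features) weights, b1: (hidden,) biases,
    W2: (hidden,) output weights, b2: scalar output bias.
    """
    hidden = np.maximum(W1 @ features + b1, 0.0)  # ReLU hidden layer
    score = W2 @ hidden + b2                      # linear output neuron
    return float(np.clip(score, 0.0, 100.0))
```

A network this small evaluates in microseconds per frame, which is why the inference cost is negligible next to the encoding itself.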

Fig 2. pVMAF model architecture: a shallow feed-forward neural network.

pVMAF performance metrics

pVMAF has been trained to closely replicate the quality predictions of VMAF, which is renowned for its accuracy but is too compute-intensive for real-time applications. When tested on a diverse dataset, pVMAF demonstrated a high correlation with VMAF scores, as seen in the performance numbers in Table 1. We report the Spearman Rank Order Correlation Coefficient (SROCC) and the Pearson Linear Correlation Coefficient (PLCC), which measure the correlation between VMAF and pVMAF, as well as the Mean Absolute Error (MAE) and the Root Mean Squared Error (RMSE) which measure the average prediction error between pVMAF scores and true VMAF scores.

Performance Metric   Frame-level   Sequence-level
SROCC                0.941         0.988
PLCC                 0.947         0.985
MAE                  2.75          1.34
RMSE                 4.51          2.20

Table 1: Correlation coefficients (SROCC, PLCC) and prediction error measures (MAE, RMSE) of pVMAF, evaluated at both the frame level and the sequence level.
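For reference, the four agreement metrics in Table 1 can be computed from paired pVMAF and VMAF scores as follows. This is a minimal NumPy sketch; for brevity the rank computation ignores ties, which a production implementation (or `scipy.stats.spearmanr`) would handle:

```python
import numpy as np

def _rank(x):
    # Simple ranks (no tie handling) for illustration.
    r = np.empty_like(x)
    r[np.argsort(x)] = np.arange(len(x), dtype=x.dtype)
    return r

def agreement_metrics(pred, true):
    """Return (SROCC, PLCC, MAE, RMSE) between predicted and true scores."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    plcc = np.corrcoef(pred, true)[0, 1]               # linear correlation
    srocc = np.corrcoef(_rank(pred), _rank(true))[0, 1]  # rank correlation
    mae = np.abs(pred - true).mean()
    rmse = np.sqrt(((pred - true) ** 2).mean())
    return srocc, plcc, mae, rmse
```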

These results highlight pVMAF’s ability to deliver highly accurate VQM with only a negligible increase in CPU usage (~0.06% increase in CPU cycles for FHD encoding on a medium encoding preset) — making it ideal for real-time video quality assessment.

Use cases

pVMAF opens the door to efficient, real-time video quality measurement during encoding, delivering VMAF-level accuracy with minimal computational overhead. This versatility makes it ideal for a range of practical applications:

  • In-depth video quality analysis: pVMAF’s frame-by-frame evaluations reveal precisely which video segments are more challenging to encode and where visual quality degradation occurs. This insight helps diagnose and troubleshoot specific encoding issues.
  • Live encoding monitoring: pVMAF serves as a real-time monitoring tool, capable of raising alerts when video quality falls below a defined threshold. Automating quality checks reduces the need for human oversight, freeing up resources and improving operational efficiency during live broadcasts.
  • Proactive rate control: pVMAF can also play an active role in rate control, acting as a feedback mechanism during constant-quality encoding. It informs the rate controller if the encoding process is overshooting or undershooting the target quality. This functionality is already integrated into Synamedia’s QC-VBR (Quality-Controlled Variable Bitrate) rate control algorithm, ensuring that the target quality is consistently met without sacrificing efficiency.
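As a hedged sketch of the live-monitoring use case, a simple alerting rule might flag any run of consecutive frames whose predicted score stays below a threshold. The threshold and minimum run length below are illustrative, not values used by any Synamedia product:

```python
def quality_alerts(scores, threshold=60.0, min_run=3):
    """Yield (start, end) frame index ranges where the predicted score
    stays below `threshold` for at least `min_run` consecutive frames."""
    run_start = None
    for i, s in enumerate(scores):
        if s < threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                yield (run_start, i - 1)
            run_start = None
    # Flush a run that extends to the end of the score list.
    if run_start is not None and len(scores) - run_start >= min_run:
        yield (run_start, len(scores) - 1)
```

Requiring a minimum run length filters out single-frame dips, so alerts correspond to sustained quality drops rather than transient noise in the per-frame predictions.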

The future of pVMAF: from proprietary solutions to open-sourcing

Currently, pVMAF is integrated into Synamedia’s in-house H.264/AVC and H.265/HEVC encoders, but the goal is to make it widely accessible. In the near future, we plan to release a new version of pVMAF designed for the open-source encoder x264, encouraging community collaboration and further optimization.

Stay tuned for the next post, where we’ll dive into the specifics of this open-source release and share the code repository for pVMAF.

About the Authors

Axel De Decker:

Axel De Decker is a doctoral researcher at Synamedia’s Research and Development division, pursuing an industry Ph.D. in collaboration with Ghent University. His work focuses on video quality assessment and video compression. 

Jan De Cock:

Jan leads the compression team at Synamedia and is responsible for the company’s codec development operations. Having spent his entire career in the compression space, most recently as Manager of Video and Image Encoding at Netflix, Jan is one of the industry’s foremost encoding experts.

Prior to his role at Netflix, Jan was Assistant Professor in the Department of Electronics and Information Systems at Ghent University in Belgium.

Jan holds a PhD in Engineering from Ghent University in Belgium. He was general co-chair of the 2018 Picture Coding Symposium (PCS), co-organizer of the 1st AOMedia Research Symposium in 2019, and has been a presenter and speaker at a wide range of international conferences.
