Videos uploaded to Meta's Family of Apps are transcoded into multiple bitstreams of various codec formats, resolutions, and quality levels to provide the best video quality across a wide variety of devices and connection bandwidth constraints. On Facebook alone, there are more than 4 billion video views per day. To handle video processing at this scale, we needed a solution that delivers the best possible video quality with the shortest encoding time, all while being energy efficient, programmable, and scalable. In this paper, we present the Meta Scalable Video Processor (MSVP), which performs video processing at quality on par with software (SW) solutions but at a small fraction of the compute time and energy. Each MSVP ASIC offers a peak SIMO (Single Input Multiple Output) transcoding performance of 4K at 15fps at the highest quality configuration and scales up to 4K at 60fps at the standard quality configuration. This performance is achieved at ~10W of PCIe module power. We achieved a throughput gain of ~9x for H.264 compared with libx264 SW encoding. For VP9, we achieved a throughput gain of ~50x compared with the libvpx speed 2 preset. Key components of MSVP transcoding include video decoding, scaling, encoding, and quality metric computation. In this paper, we describe the ASIC architecture of MSVP and the design of its individual components, and compare its perf/W versus quality against industry-standard SW encoders.
This paper describes the FB-MOS metric, which measures video quality at scale in the Facebook ecosystem. Because the quality of uploaded user-generated content (UGC) sources varies widely, FB-MOS consists of both a no-reference component to assess input (upload) quality and a full-reference component, based on SSIM, to assess the quality preserved through the transcoding and delivery pipeline. The same video may be watched on a variety of devices (mobile, laptop, TV) under varying network conditions that cause quality fluctuations; moreover, the viewer can switch between in-line and full-screen view during the same viewing session. We show how the FB-MOS metric accounts for all this variation in viewing conditions while minimizing computation overhead. Validation of the metric on Facebook content shows an SROCC of 0.9147 on internally selected videos. The paper also discusses optimizations that reduce metric computation complexity and scale that complexity in proportion to video popularity.
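The SROCC quoted above is the Spearman rank-order correlation coefficient, which measures how well one set of scores preserves the ordering of another. The abstract does not say what FB-MOS was correlated against or how the value was computed, so the following is only a minimal sketch of the standard definition (Pearson correlation of average ranks). The function names and the sample scores are hypothetical and are not taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: objective metric scores and reference ratings for
# six videos (illustrative values only, not results from the paper).
metric_scores = [62.0, 71.5, 55.3, 90.1, 80.4, 68.2]
reference_ratings = [3.1, 3.6, 2.9, 4.7, 4.0, 3.8]
print(f"SROCC = {srocc(metric_scores, reference_ratings):.4f}")
```

Because only ranks enter the computation, the result is unaffected by any monotonic rescaling of either score, which is one reason rank correlation is a common choice when validating quality metrics.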