Maestro: Netflix’s Workflow Orchestrator
Netflix Tech Blog
by Netflix Technology Blog
2d ago
By Jun He, Natallia Dzenisenka, Praneeth Yenugutala, Yingyi Zhang, and Anjali Norwood
TL;DR
We are thrilled to announce that the Maestro source code is now open to the public! Please visit the Maestro GitHub repository to get started. If you find it useful, please give us a star.
What is Maestro
Maestro is a general-purpose, horizontally scalable workflow orchestrator designed to manage large-scale workflows such as data pipelines and machine learning model training pipelines. It oversees the entire lifecycle of a workflow, from start to finish, including retries, queuing, task ..read more
Enhancing Netflix Reliability with Service-Level Prioritized Load Shedding
Netflix Tech Blog
by Netflix Technology Blog
1M ago
Applying Quality of Service techniques at the application level
Anirudh Mendiratta, Kevin Wang, Joey Lynch, Javier Fernandez-Ivern, Benjamin Fedorka
Introduction
In November 2020, we introduced the concept of prioritized load shedding at the API gateway level in our blog post, Keeping Netflix Reliable Using Prioritized Load Shedding. Today, we’re excited to dive deeper into how we’ve extended this strategy to the individual service level, focusing on the video streaming control plane and data plane, to further enhance user experience and system resilience.
The Evolution of Load Shedding a ..read more
A Recap of the Data Engineering Open Forum at Netflix
Netflix Tech Blog
by Netflix Technology Blog
1M ago
A summary of sessions at the first Data Engineering Open Forum at Netflix on April 18th, 2024
At Netflix, we aspire to entertain the world, and our data engineering teams play a crucial role in this mission by enabling data-driven decision-making at scale. Netflix is not the only place where data engineers are solving challenging problems with creative solutions. On April 18th, 2024, we hosted the inaugural Data Engineering Open Forum at our Los Gatos office, bringing together data engineers from various industries to sh ..read more
Round 2: A Survey of Causal Inference Applications at Netflix
Netflix Tech Blog
by Netflix Technology Blog
1M ago
At Netflix, we want to ensure that every current and future member finds content that thrills them today and excites them to come back for more. Causal inference is an essential part of the value that Data Science and Engineering adds towards this mission. We rely heavily on both experimentation and quasi-experimentation to help our teams make the best decisions for growing member joy. Building off of our last successful Causal Inference and Experimentation Summit, we held another week-long internal conference this year to learn from our stunning colleagues. We brought together speakers f ..read more
The Making of VES: the Cosmos Microservice for Netflix Video Encoding
Netflix Tech Blog
by Netflix Technology Blog
3M ago
Liwei Guo, Vinicius Carvalho, Anush Moorthy, Aditya Mavlankar, Lishan Zhu
This is the second post in a multi-part series from Netflix. See here for Part 1 which provides an overview of our efforts in rebuilding the Netflix video processing pipeline with microservices. This blog dives into the details of building our Video Encoding Service (VES), and shares our learnings.
Cosmos is the next generation media computing platform at Netflix. Combining microservice architecture with asynchronous workflows and serverless functions, Cosmos aims to modernize Netflix’s media processing pipelines wi ..read more
Reverse Searching Netflix’s Federated Graph
Netflix Tech Blog
by Netflix Technology Blog
4M ago
By Ricky Gardiner, Alex Hutter, and Katie Lefevre
Since our previous posts regarding Content Engineering’s role in enabling search functionality within Netflix’s federated graph (the first post, where we identify the issue and elaborate on the indexing architecture, and the second post, where we detail how we facilitate querying) there have been significant developments. We’ve opened up Studio Search beyond Content Engineering to the entirety of the Engineering organization at Netflix and renamed it Graph Search. There are over 100 applications integrated with Graph Search and nearly 50 i ..read more
Sequential Testing Keeps the World Streaming Netflix Part 2: Counting Processes
Netflix Tech Blog
by Netflix Technology Blog
4M ago
Sequential A/B Testing Keeps the World Streaming Netflix Part 2: Counting Processes
Michael Lindon, Chris Sanden, Vache Shirikian, Yanjun Liu, Minal Mishra, Martin Tingley
Have you ever encountered a bug while streaming Netflix? Did your title stop unexpectedly, or not start at all? In the first installment of this blog series on sequential testing, we described our canary testing methodology for continuous metrics such as play-delay. One of our readers commented:
What if the new release is not related to a new play/streaming feature? For example, what if the new release includes modified ..read more
Supporting Diverse ML Systems : Netflix Tech Blog
Netflix Tech Blog
by Netflix Technology Blog
5M ago
David J. Berg, Romain Cledat, Kayla Seeley, Shashank Srikanth, Chaoying Wang, Darin Yu
Netflix uses data science and machine learning across all facets of the company, powering a wide range of business applications from our internal infrastructure and content demand modeling to media understanding. The Machine Learning Platform (MLP) team at Netflix provides an entire ecosystem of tools around Metaflow, an open source machine learning infrastructure framework we started, to empower data scientists and machine learning practitioners to build and manage a variety of ML systems. Since i ..read more
Bending pause times to your will with Generational ZGC
Netflix Tech Blog
by Netflix Technology Blog
5M ago
The surprising and not so surprising benefits of generations in the Z Garbage Collector.
By Danny Thomas, JVM Ecosystem Team
The latest long term support release of the JDK delivers generational support for the Z Garbage Collector. Netflix has switched by default from G1 to Generational ZGC on JDK 21 and later, because of the significant benefits of concurrent garbage collection. More than half of our critical streaming video services are now running on JDK 21 with Generational ZGC, so it’s a good time to talk about our experience and the benefits we’ve seen. If you’re interested in how we us ..read more
Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…
Netflix Tech Blog
by Netflix Technology Blog
5M ago
Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data Platform
by Binbing Hou, Stephanie Vezich Tamayo, Xiao Chen, Liang Tian, Troy Ristow, Haoyuan Wang, Snehal Chennuru, Pawan Dixit
This is the first of the series of our work at Netflix on leveraging data insights and Machine Learning (ML) to improve the operational automation around the performance and cost efficiency of big data jobs. Operational automation–including but not limited to, auto diagnosis, auto remediation, auto configuration, auto tuning, auto scaling, auto debugging, and auto ..read more