Revenue Automation Series: Building Revenue Data Pipeline
Yelp Engineering Blog
by Yizheng Zhang, Software Engineer; Yirun Zhou, Software Engineer
1M ago
Background: As Yelp’s business continues to grow, the revenue streams have become more complex due to the increased number of transactions, new products and services. These changes over time have challenged the manual processes involved in Revenue Recognition. As described in the first post of the Revenue Automation Series, Yelp invested significant resources in modernizing its Billing System to fulfill the prerequisite of automating the revenue recognition process. In this blog, we would like to share how we built the Revenue Data Pipeline that facilitates third-party integration with a R ...
Search Query Understanding with LLMs: From Ideation to Production
Yelp Engineering Blog
by Loc Trinh, Software Engineer; Ali Rokni, Tech Lead; John Hawksley, Group Tech Lead
1M ago
How we bring LLM intelligence to millions of daily searches at Yelp. From the moment a user enters a search query to when we present a list of results, understanding the user’s intent is crucial for meeting their needs. Were they looking for a general category of business for that evening, a particular dish or service, or one specific business nearby? Does the query contain nuanced location or attribute information? Is the query misspelled? Is their phrasing unusual, so that it might not align well with our business data? All of the above questions represent Natural Language Understanding task ...
Enhancing Neural Network Training at Yelp: Achieving 1,400x Speedup with WideAndDeep
Yelp Engineering Blog
by Yunhui Zhang, Software Engineer
2M ago
At Yelp, we encountered challenges that prompted us to improve the training time of our ad-revenue generating models, which use a Wide and Deep Neural Network architecture for predicting ad click-through rates (pCTR). These models handle large tabular datasets with small parameter spaces, requiring innovative data solutions. This blog post delves into our journey of optimizing training time using TensorFlow and Horovod, along with the development of ArrowStreamServer, our in-house library for low-latency data streaming and serving. Together, these components have allowed us to achieve a 1,400x ...
Revisiting Compute Scaling
Yelp Engineering Blog
by Ilkin Mammadzada and Ankit Tripathi, Site Reliability Engineers
3M ago
As mentioned in our earlier blog post Fine-tuning AWS ASGs with Attribute Based Instance Selection, we recently embarked on an exciting journey to enhance our Kubernetes cluster’s node autoscaler infrastructure. In this blog post, we’ll delve into the rationale behind transitioning from our internally developed Clusterman autoscaler to AWS Karpenter. Join us as we explore the reasons for our switch, address the challenges with Clusterman, and embrace the opportunities with Karpenter. Clusterman and its challenges: At Yelp, we used Clusterman to handle autoscaling of nodes in Kubernetes clusters ...
Revenue Automation Series: Modernizing Yelp's Legacy Billing System
Yelp Engineering Blog
by Simon Zeng, Payments Tech Lead; Supriya Lal, Commerce Platform Group Tech Lead
3M ago
This blog focuses on how Yelp successfully implemented a multi-year, cross-organizational initiative to modernize its billing processes. The goal was to automate its revenue recognition system by enhancing integration capabilities with third-party financial systems, all while maintaining the accuracy and reliability our users expect. Summary: When Yelp first developed its billing system a decade ago, the database design was based on the requirements known at that time. These initial choices laid the foundation for the billing system, upon which multiple Yelp systems and processes were built. Ho ...
Loading data into Redshift with DBT
Yelp Engineering Blog
by Christopher Arnold, Software Engineer
4M ago
At Yelp, we embrace innovation and thrive on exploring new possibilities. With our consumers’ ever-growing appetite for data, we recently revisited how we could load data into Redshift more efficiently. In this blog post, we explore how DBT can be used seamlessly with Redshift Spectrum to read data from Data Lake into Redshift to significantly reduce runtime, resolve data quality issues, and improve developer productivity. Starting Point: Our method of loading batch data into Redshift had been effective for years, but we continually sought improvements. We primarily used Spark jobs to read S3 d ...
Migrating in-place from PostgreSQL to MySQL
Yelp Engineering Blog
by Alex Toumazis, Software Engineer
5M ago
The Yelp Reservations service (yelp_res) is the service that powers reservations on Yelp. It was acquired along with Seatme in 2013, and is a Django service and webapp. It powers the reservation backend and logic for Yelp Guest Manager, our iPad app for restaurants, and handles diner and partner flows that create reservations. Along with that, it serves a web UI and backend API for our Yelp Reservations app, which has been superseded by Yelp Guest Manager but is still used by many of our restaurant customers. This service was built using a DB-centric architecture, and uses a “DB sync ...
Boosting ML Pipeline Efficiency: Direct Cassandra Ingestion from Spark
Yelp Engineering Blog
by Muhammad Junaid Muzammil, Software Engineer; Arnold Ziesche-Blank, Machine Learning Engineer
6M ago
Machine Learning Feature Stores. ML Feature Store at Yelp: Many of Yelp’s core capabilities such as business search, ads, and reviews are powered by Machine Learning (ML). In order to ensure these capabilities are well supported, we have built a dedicated ML platform. One of the pillars of this infrastructure is the Feature Store, which is a centralized data store for ML Features that are the input of ML models. Having a centralized dedicated datastore for ML Features serves a number of purposes: Data Quality and Data Governance; Feature discovery; Improved operational efficiency; Availability of F ...
Dbt Generic Tests in Sessions Validation at Yelp
Yelp Engineering Blog
by Tian, Yukang, Software Engineer
7M ago
Sessions, Where Everything Started: For the past few years, Yelp has been using dbt as one of the tools to develop data products that power data marts, which are one-stop shops for high-visibility dashboards pertaining to top-level business metrics. One of the key data products that’s owned by my team, Clickstream Analytics, is the Sessions Data Mart. This product is our in-house solution to understand what consumers do during their session interaction with Yelp products and provide insights on top of it. This blog post will walk you through how dbt is used as an important test ...
Implementing multi-metric scaling: making changes to legacy code safely
Yelp Engineering Blog
by David R. Morrison (Contractor); Charan Gangaraju (Software Engineer)
8M ago
We’re excited to announce that multi-metric horizontal autoscaling is available for all services at Yelp. This allows us to scale services using multiple metrics, such as the number of in-flight requests and CPU utilization, rather than relying on a single metric. We expect this to provide us with better resilience and faster recovery during outages. This year, PaaSTA (Yelp’s platform-as-a-service, which we use to manage all of the applications running on our infrastructure) turns eleven years old! The first commit was on August 20th, 2013, and the first public commit was on October 22nd, 2015 ...