
Yelp Engineering Blog
Read news and blog posts to see the technical side of Yelp Engineering. Our engineers have shared their thoughts with the world on our blog. Take a peek into our engineering & product teams and all the work that we do. Yelp connects people with great local businesses.
1M ago
Background: As Yelp’s business continues to grow, its revenue streams have become more complex due to the increased number of transactions and new products and services. These changes have challenged the manual processes involved in Revenue Recognition. As described in the first post of the Revenue Automation Series, Yelp invested significant resources in modernizing its Billing System to fulfill the prerequisite for automating the revenue recognition process. In this blog post, we would like to share how we built the Revenue Data Pipeline that facilitates the third-party integration with a R ...
1M ago
How we bring LLM intelligence to millions of daily searches at Yelp. From the moment a user enters a search query to when we present a list of results, understanding the user’s intent is crucial for meeting their needs. Were they looking for a general category of business for that evening, a particular dish or service, or one specific business nearby? Does the query contain nuanced location or attribute information? Is the query misspelled? Is their phrasing unusual, so that it might not align well with our business data? All of the above questions represent Natural Language Understanding task ...
2M ago
At Yelp, we encountered challenges that prompted us to reduce the training time of our ad-revenue generating models, which use a Wide and Deep Neural Network architecture for predicting ad click-through rates (pCTR). These models handle large tabular datasets with small parameter spaces, requiring innovative data solutions. This blog post delves into our journey of optimizing training time using TensorFlow and Horovod, along with the development of ArrowStreamServer, our in-house library for low-latency data streaming and serving. Together, these components have allowed us to achieve a 1400x ...
3M ago
As mentioned in our earlier blog post Fine-tuning AWS ASGs with Attribute Based Instance Selection, we recently embarked on an exciting journey to enhance our Kubernetes cluster’s node autoscaler infrastructure. In this blog post, we’ll delve into the rationale behind transitioning from our internally developed Clusterman autoscaler to AWS Karpenter. Join us as we explore the reasons for our switch, address the challenges with Clusterman, and embrace the opportunities with Karpenter. Clusterman and its challenges: At Yelp, we used Clusterman to handle autoscaling of nodes in Kubernetes clusters ...
3M ago
This blog post focuses on how Yelp successfully implemented a multi-year, cross-organizational initiative to modernize its billing processes. The goal was to automate its revenue recognition system by enhancing integration capabilities with third-party financial systems, all while maintaining the accuracy and reliability our users expect. Summary: When Yelp first developed its billing system a decade ago, the database design was based on the requirements known at that time. These initial choices laid the foundation for the billing system, upon which multiple Yelp systems and processes were built. Ho ...
4M ago
At Yelp, we embrace innovation and thrive on exploring new possibilities. With our consumers’ ever-growing appetite for data, we recently revisited how we could load data into Redshift more efficiently. In this blog post, we explore how DBT can be used seamlessly with Redshift Spectrum to read data from the Data Lake into Redshift to significantly reduce runtime, resolve data quality issues, and improve developer productivity. Starting Point: Our method of loading batch data into Redshift had been effective for years, but we continually sought improvements. We primarily used Spark jobs to read S3 d ...
5M ago
The Yelp Reservations service (yelp_res) is the service that powers reservations on Yelp. It was acquired along with Seatme in 2013, and is a Django service and webapp. It powers the reservation backend and logic for Yelp Guest Manager, our iPad app for restaurants, and handles diner and partner flows that create reservations. Along with that, it serves a web UI and backend API for our Yelp Reservations app, which has been superseded by Yelp Guest Manager but is still used by many of our restaurant customers. This service was built using a DB-centric architecture, and uses a “DB sync ...
6M ago
Machine Learning Feature Stores. ML Feature Store at Yelp: Many of Yelp’s core capabilities such as business search, ads, and reviews are powered by Machine Learning (ML). In order to ensure these capabilities are well supported, we have built a dedicated ML platform. One of the pillars of this infrastructure is the Feature Store, which is a centralized data store for ML Features that are the input of ML models. Having a centralized dedicated datastore for ML Features serves a number of purposes: Data Quality and Data Governance; Feature discovery; Improved operational efficiency; Availability of F ...
7M ago
Sessions, Where Everything Started: For the past few years, Yelp has been using dbt as one of the tools to develop data products that power data marts, which are one-stop shops for high-visibility dashboards pertaining to top-level business metrics. One of the key data products owned by my team, Clickstream Analytics, is the Sessions Data Mart. This product is our in-house solution for understanding what consumers do during their session interaction with Yelp products and for providing insights on top of it. This blog post will walk you through how dbt is used as an important test ...
8M ago
We’re excited to announce that multi-metric horizontal autoscaling is available for all services at Yelp. This allows us to scale services using multiple metrics, such as the number of in-flight requests and CPU utilization, rather than relying on a single metric. We expect this to provide us with better resilience and faster recovery during outages. This year, PaaSTA (Yelp’s platform-as-a-service, which we use to manage all of the applications running on our infrastructure) turns eleven years old! The first commit was on August 20th, 2013, and the first public commit was on October 22nd, 2015 ...