What is Declarative Computing?
Sync Computing Blog
by Jeffrey Chou
1w ago
The problem today: In the world of cloud computing, one echo of the old server days still haunts us: manual compute resource selection. At a very high level, there are always two pieces of information you need to provide to a compute cluster before your job can run: (1) your code and data, and (2) compute resources (e.g. warehouse size, instance types, memory, number of workers). Some examples of popular platforms with recurring jobs and their basic infrastructure choices are shown in the table below. In reality, many of ..read more
April 2024 Release Notes
Sync Computing Blog
by McKinley Culbert
2w ago
Our April releases are here! Take a look at Sync’s latest product updates and features. Sync’s Databricks Workspace health check is now self-serve and available as a notebook that you simply download and run on your own. With the notebook, you’ll be able to answer questions such as: What is the distribution of job runs by compute type? What does Photon usage look like? What are the most frequently used instance types? Are APC clusters being auto-terminated or sitting idle? What are my most expensive jobs? The best part? It’s a free tool that gives you actionable insights so you can wor ..read more
Sync’s Health Check for Databricks Workspaces
Sync Computing Blog
by Kartik Nagappa
3w ago
Whether you’re a data engineer, a manager of a data team, or an executive overseeing a data platform, your focus might be on growth, and to continue to build and innovate. However, this may come at the expense of ballooning costs that are getting harder and harder to get under control. This ultimately leads to a point where you need to make some tough cost-cutting decisions — like migrating to a less expensive platform — or even tougher decisions — like laying off part of your team. Our data platform costs are increasing 20% MoM. How do we reduce our costs and get our budget under control? Se ..read more
Why are your Databricks jobs performances changing over time?
Sync Computing Blog
by Jeffrey Chou
1M ago
For those running and tracking their production Databricks jobs, many may often see “random” fluctuations in runtime or slowly changing performance over days and weeks. Immediately, people may often wonder: “Why is my runtime increasing since last week?” “Is the cost of this job also increasing?” “Is the input data size changing?” “Is my job spilling to disk more than before?” “Is my job in danger of crashing?” To help give engineers and managers more visibility into how their production jobs are performing over time, we just launched a new visualization feature in Gradient that will hopeful ..read more
March 2024 Release Notes
Sync Computing Blog
by McKinley Culbert
1M ago
Our team has been hard at work to deliver industry-leading features to support users in achieving optimal performance within the Databricks ecosystem. Take a look at our most recent releases below. Worker Instance Recommendations: Introducing Worker Instance Recommendations directly from the Sync UI. With this feature, you are able to tap into optimal cluster configuration recos so that your configs are optimized for individual jobs. The instance recos within Gradient not only optimize the number of workers, but also the worker size. For example, if you are using i3.2xl instances, Gradient will ..read more
Databricks Delta Live Tables 101
Sync Computing Blog
by McKinley Culbert
2M ago
Since its release in 2022, Databricks Delta Live Tables (DLT) has quickly become a go-to resource for data engineers looking to build opinionated ETL pipelines for streaming data and big data. The pipeline management framework is considered one of the most valuable offerings on the Databricks platform, and is used by over 1,000 companies including Shell and H&R Block. In our quest to help customers manage, understand, and optimize their Databricks workloads, we set out to understand the value proposition for both customers and Databricks. In this post, we break down DLT as both a ..read more
February 2024 Release Notes
Sync Computing Blog
by McKinley Culbert
3M ago
We’re excited to share all the new and improved features that our team has recently released to help our customers gain full governance over their Databricks infrastructure. Databricks Workspace Integration: Introducing the Databricks Workspace Integration for Gradient. With this new feature, you’re able to further simplify the process of connecting your Databricks Workspace to the Sync platform. This capability eases the tedious process of consolidating with the Gradient UI without the use of the Sync CLI. To get started, head to the integrations tab in your Sync dashboard. Here you’ll see a l ..read more
How Forma.ai improved their Databricks costs quickly and easily with Gradient
Sync Computing Blog
by Jeffrey Chou
3M ago
Forma.ai is a B2B SaaS startup based in Toronto, Canada, building an AI-powered sales compensation system for enterprise. Specifically, they seamlessly unify the design, execution, and orchestration of sales compensation to better mobilize sales teams and optimize go-to-market performance. Behind the scenes, Forma.ai deploys their pipelines on Databricks to process sales compensation pipelines for their customers. They process hundreds of terabytes of data per month across Databricks Jobs clusters and ad-hoc all-purpose compute clusters. As their customer count grows, so w ..read more
Rethinking Serverless: The Price of Convenience
Sync Computing Blog
by Vinoo Ganesh
3M ago
As is the case with many concepts in technology, the term Serverless is abusively vague. As such, discussing the idea of “serverless” usually invokes one of two feelings in developers. Either it’s thought of as the catalyst for a potentially incredible future, finally freeing developers from having to worry about resources or scaling concerns, or it’s thought of as the harbinger of yet another “we don’t need DevOps anymore” trend. The root cause of this confusion is that the catch-all term “Serverless” actually comprises two large operating models: functions and ..read more
Why Your Databricks Cluster EBS Settings Matter
Sync Computing Blog
by Sean Gorsky
3M ago
Sean Gorsky & Cayman Williams. Figure 1: Point comparison between the cost and runtime of a Databricks Job run using the Default EBS settings and Sync’s Optimized EBS settings. More details about the job that was used to create this data can be found in the lower left plot in Figure 4. Choosing the right hardware configuration for your Databricks jobs can be a daunting task. Between instance types, cluster size, runtime engine, and beyond, there are an enormous number of choices that can be made. Some of these choices will dramatically impact your cost and application duration, othe ..read more
