Rockset Blog
Fast SQL on NoSQL data from Kafka, DynamoDB, S3 and more. Rockset is a scalable, reliable search and analytics service in the cloud that makes it easy to build fast operational applications on TBs of data simply using SQL. Rockset delivers millisecond-latency SQL directly on raw data, including nested JSON, XML, Parquet and CSV, without any ETL. Use Rockset to build a Python application that..
3w ago
The speed and scalability of the data behind an application, which are closely tied to its cost, are critical concerns for every development team. This blog describes how we optimized Rockset’s hot storage tier to improve efficiency by more than 200%. We delve into how we architect for efficiency by leveraging new hardware, maximizing the use of available storage, implementing better orchestration techniques and using snapshots for data durability. With these efficiency gains, we were able to reduce costs while keeping the same performance and pass along the savings to users. Rockset’s new ..read more
1M ago
Today, Confluent announced the general availability of its serverless Apache Flink service. Flink is one of the most popular stream processing technologies, ranked as a top five Apache project and backed by a diverse committer community including Alibaba and Apple. It powers stream processing at many companies including Uber, Netflix, and LinkedIn.
Rockset customers using Flink often share how challenging it is to self-manage Flink for streaming transformations. That’s why we’re thrilled that Confluent Cloud is making it easier to use Flink, providing efficient and performant stream processing ..read more
1M ago
A good CPU profiler is worth its weight in gold. Measuring performance in situ usually means using a sampling profiler. Sampling profilers provide a lot of information with very low overhead. In a concurrent system, however, it is hard to use the resulting data to extract high-level insights. Samples don’t include context like query IDs and application-level statistics; they show you what code was run, but not why.
This blog introduces trampoline histories, a technique Rockset has developed to efficiently attach application-level information (query IDs) to the samples of a CPU profile. This lets us u ..read more
2M ago
Introduction
Indexes are a crucial part of proper data modeling for all databases, and DynamoDB is no exception. DynamoDB's secondary indexes are a powerful tool for enabling new access patterns for your data.
In this post, we'll look at DynamoDB secondary indexes. First, we'll start with some conceptual points about how to think about DynamoDB and the problems that secondary indexes solve. Then, we'll look at some practical tips for using secondary indexes effectively. Finally, we'll close with some thoughts on when you should use secondary indexes and when you should look for other solutions ..read more
2M ago
Klarna is a leading buy-now-pay-later company, giving shoppers more time to pay while paying merchants in full upfront. With a number of payment options, including direct payments, pay after delivery and installment plans, Klarna provides shoppers flexibility in how they pay with zero interest. The number of new payment options helps over 500k merchants using Klarna to attract, convert and retain global shoppers.
Klarna integrates seamlessly into the payment experience offering one-click purchases, regardless of the payment plan. The flexible options enable shoppers to make larger purchases re ..read more
3M ago
In 2023, Rockset announced a new cloud architecture for search and analytics with compute-storage and compute-compute separation. With this architecture, users can separate ingestion compute from query compute, all while accessing the same real-time data. This is a game changer in disaggregated, real-time architectures. It also unlocks ways to make it easier and cheaper to build applications on Rockset.
Today, Rockset releases new features that make search and analytics more affordable than ever before:
General purpose instance class: A new ratio of compute and memory resources that is suitabl ..read more
3M ago
Elasticsearch is an open-source search and analytics engine based on Apache Lucene. When building applications on change data capture (CDC) data using Elasticsearch, you’ll want to architect the system to handle frequent updates or modifications to the existing documents in an index.
In this blog, we’ll walk through the different options available for updates including full updates, partial updates and scripted updates. We’ll also discuss what happens under the hood in Elasticsearch when modifying a document and how frequent updates impact CPU utilization in the system.
Example application wit ..read more
3M ago
Data mutability is the ability of a database to support mutations (updates and deletes) to the data that’s stored inside it. It’s a critical feature, especially in real-time analytics where data constantly changes and you need to present the latest version of that data to your customers and end users. Data can arrive late, it can be out of order, it can be incomplete or you might have a scenario where you need to enrich and extend your datasets with additional information for them to be complete. In all of these cases, the ability to change your data is very important.
Rockset is fully mutable
Rocks ..read more
4M ago
Data modeling in Elasticsearch is not as obvious as it is when dealing with relational databases. Unlike traditional relational databases that rely on data normalization and SQL joins, Elasticsearch requires alternative approaches for managing relationships.
There are four common workarounds for managing relationships in Elasticsearch:
Application-side joins
Data denormalization
Nested field types and nested queries
Parent-child relationships
In this blog, we’ll discuss how you can design your data model to handle relationships using the nested field type and parent-child relationships. We’ll ..read more
4M ago
Overview
In this guide, we will:
Understand the blueprint of any modern recommendation system
Dive into a detailed analysis of each stage within the blueprint
Discuss infrastructure challenges associated with each stage
Cover special cases within the stages of the recommendation system blueprint
Get introduced to some storage considerations for recommendation systems
And finally, end with what the future holds for recommendation systems
Introduction
In a recent insightful talk at the Index conference, Nikhil, an expert in the field with a decade-long journey in machine learning and infrastru ..read more