DNS Zone Setup Best Practices on Azure
Cloudera | Data Engineering
by Dongkai Yu
2M ago
In Cloudera deployments on public cloud, one of the key configuration elements is the DNS. Get it wrong and your deployment may become wholly unusable, with users unable to access and use the Cloudera data services. If the DNS is set up less ideally than it could be, connectivity and performance issues may arise. In this blog, we’ll take you through our tried and tested best practices for setting up your DNS for use with Cloudera on Azure. To get started and give you a feel for the dependencies for the DNS, in an Azure deployment for Cloudera, these are the Azure managed services being used: ..read more
Using Dead Letter Queues with SQL Stream Builder
Cloudera | Data Engineering
by Cloudera
1y ago
What is a dead letter queue (DLQ)? Cloudera SQL Stream Builder gives non-technical users the power of a unified stream processing engine so they can integrate, aggregate, query, and analyze both streaming and batch data sources in a single SQL interface. This allows business users to define events of interest for which they need to continuously monitor and respond quickly. A dead letter queue (DLQ) can be used if there are deserialization errors when events are consumed from a Kafka topic. DLQ is useful to see if there are any failures due to invalid input in the source Kafka topic and makes i ..read more
Trusted Data: Alchemy For Misinformation
Cloudera | Data Engineering
by Shayde Christian
1y ago
The best description of untrusted data I’ve ever heard is, “We all attend the QBR – Sales, Marketing, Finance – and present quarterly results, except the Sales reports and numbers don’t match Marketing numbers and neither match Finance reports. We argue about where the numbers came from, then after 45 minutes of digging for common ground, we chuck our shovels and abandon the call in disgust.” How would you go about fixing that situation? How would you get the trust into trusted data? Consult the Book of Spells Our spells are cast from our Enterprise Business Glossary. Our wizard is Data ..read more
Materialized Views in SQL Stream Builder
Cloudera | Data Engineering
by Cloudera
1y ago
What is a materialized view? Cloudera SQL Stream Builder (SSB) gives the power of a unified stream processing engine to non-technical users so they can integrate, aggregate, query, and analyze both streaming and batch data sources in a single SQL interface. This allows business users to define events of interest for which they need to continuously monitor and respond quickly. There are many ways to distribute the results of SSB’s continuous queries to embed actionable insights into business processes. In this blog we will cover materialized views—a special type of sink that makes t ..read more
Implementing and Using UDFs in Cloudera SQL Stream Builder
Cloudera | Data Engineering
by Cloudera
1y ago
Cloudera’s SQL Stream Builder (SSB) is a versatile platform for data analytics using SQL. As a part of Cloudera Streaming Analytics, it enables users to easily write, run, and manage real-time SQL queries on streams with a smooth user experience, while it attempts to expose the full power of Apache Flink. SQL has been around for a long time, and it is a very well understood language for querying data. The SQL standard has had time to mature, and thus it provides a complete set of tools for querying and analyzing data. Nevertheless, as good as it is, sometimes it is necessary, or at least desirab ..read more
Job Notifications in SQL Stream Builder
Cloudera | Data Engineering
by Botond Kismoni
1y ago
Special co-author credits: Adam Andras Toth, Software Engineer Intern. With enterprises’ needs for data analytics and processing getting more complex by the day, Cloudera aims to keep up with these needs, offering constantly evolving, cutting-edge solutions to all your data-related problems. Cloudera Stream Processing aims to take real-time data analytics to the next level. We’re excited to highlight job monitoring with notifications, a new feature for SQL Stream Builder (SSB). What problem are we solving with job notifications? The sudden failing of a complex data pipeline can lead to devasta ..read more
Spark Technical Debt Deep Dive
Cloudera | Data Engineering
by François Reynald
1y ago
How Bad is Bad Code: The ROI of Fixing Broken Spark Code Once in a while I stumble upon Spark code that looks like it has been written by a Java developer and it never fails to make me wince because it is a missed opportunity to write elegant and efficient code: it is verbose, difficult to read, and full of distributed processing anti-patterns. One such occurrence happened a few weeks ago when one of my colleagues was trying to make some churn analysis code downloaded from GitHub work. I was looking for some broken code to add a workshop to our Spark Performance Tuning class and write a blog p ..read more
Optimizing the Energy Sector with Data Analytics
Cloudera | Data Engineering
by Pablo Boixeda
1y ago
Across the energy supply chain from generation to consumer, we can see that the trend toward investing in renewable energy has picked up pace as demand has grown for energy companies to actively pursue investments in energies with little or no environmental impact in the quest for decarbonisation. McKinsey estimates that by 2035, 50% of energy will be wind and solar. The move toward renewable energy has a distinct and significant impact on energy generation and distribution that needs to be carefully managed. Efficient use of data will therefore be critical to improving the competitiveness an ..read more
Cloudera Named a Leader in the 2022 Gartner® Magic Quadrant™ for Cloud Database Management Systems (DBMS)
Cloudera | Data Engineering
by David Dichmann
1y ago
We are pleased to announce that Cloudera has been named a Leader in the 2022 Gartner® Magic Quadrant for Cloud Database Management Systems. Cloudera has been recognized in this cloud DBMS report since its inception in 2020. This year we’ve been named a Leader. This validates our significant momentum in global enterprises. Together with our recent recognition in the Gartner Peer Insights Customer Choice Distinction for Cloud DBMS, it cements our position as an industry leader. We’re proud to be recognized for the data management and data analytics innovations we have delivered in the new Clo ..read more
Implement A Multi-cloud Open Lakehouse with Apache Iceberg in Cloudera Data Platform
Cloudera | Data Engineering
by Bill Zhang
1y ago
Since we announced the general availability of Apache Iceberg in Cloudera Data Platform, Cloudera customers, such as Teranet, have built open lakehouses to future-proof their data platforms for all their analytical workloads. Cloudera partners are also benefiting from Apache Iceberg in CDP. For example, Modak Nabu is helping their enterprise customers accelerate data ingestion, curation and consumption at petabyte scale. Today, we are thrilled to share some new advancements in Cloudera’s integration of Apache Iceberg in CDP, such as to help accelerate your multi-cloud open data lakehouse impl ..read more