One instance of Cloudera Manager (CM) can manage N clusters. In the current Role Based Access Control (RBAC) model, CM users hold privileges and permissions across everything in CM’s purview (including every cluster that CM manages). For example, a Read-Only user, John, can perform every Read-Only action on all clusters managed by CM. Likewise, a “Cluster Admin” user, Chris, is a cluster administrator of every cluster managed by CM.
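To make the difference between CM-wide and per-cluster privileges concrete, here is a minimal sketch in Python. It is not Cloudera Manager's API: the `Grant` and `User` classes, the action names (`view`, `restart`, `configure`), the cluster names, and the user Dana are all hypothetical and exist only for illustration. A grant with no cluster models the current behavior described above, while a grant tied to one cluster models what a finer-grained scheme might look like.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical role-to-action mapping; the action names are assumptions,
# not Cloudera Manager's real privilege list.
ROLE_ACTIONS = {
    "Read-Only": {"view"},
    "Cluster Admin": {"view", "restart", "configure"},
}


@dataclass(frozen=True)
class Grant:
    role: str
    # None means the role applies to every cluster CM manages
    # (the current model); a name scopes it to that one cluster.
    cluster: Optional[str] = None


@dataclass
class User:
    name: str
    grants: List[Grant] = field(default_factory=list)


def is_allowed(user: User, action: str, cluster: str) -> bool:
    """Return True if any grant covers this cluster and permits the action."""
    for grant in user.grants:
        in_scope = grant.cluster is None or grant.cluster == cluster
        if in_scope and action in ROLE_ACTIONS.get(grant.role, set()):
            return True
    return False


# Current model: roles apply across all clusters.
john = User("John", [Grant("Read-Only")])
chris = User("Chris", [Grant("Cluster Admin")])

# Hypothetical fine-grained grant: admin of a single cluster only.
dana = User("Dana", [Grant("Cluster Admin", cluster="cluster-1")])
```

With these definitions, `is_allowed(chris, "restart", "cluster-2")` is true for any cluster name, whereas `is_allowed(dana, "restart", "cluster-2")` is false because Dana's grant is scoped to `cluster-1`.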

The post Fine Grained Access Control in Cloudera Manager appeared first on Cloudera Engineering Blog.

Cloudera Enterprise Backup and Disaster Recovery (BDR) enables you to replicate data across data centers for disaster recovery scenarios. As a lower-cost route to geographical redundancy, or as a means to perform an on-premises-to-cloud migration, BDR can also replicate HDFS and Hive data to and from Amazon S3 or a Microsoft Azure Data Lake Store.

Many customers require an automated way to create, run, and manage replication schedules, either to minimize Recovery Point Objectives (RPOs) for late-arriving data or to automate recovery after a disaster.

The post How-to: Automate Replications with Cloudera Manager API appeared first on Cloudera Engineering Blog.

Cloudera Director 2.8 introduces a simpler way to create clusters in AWS or Microsoft Azure that requires less information to get started than the standard procedure. A new configuration export capability enables retrieval of a client configuration file for any cluster as a starting point to create new clusters.

Cloudera Director helps you deploy, scale, and manage Cloudera clusters in AWS, Microsoft Azure, or Google Cloud Platform.

The post What’s New in Cloudera Director 2.8? appeared first on Cloudera Engineering Blog.

Self-service BI and exploratory analytics are some of the most common use cases we see our customers running on Cloudera’s analytic database solution. Over the past year, we made significant advancements to provide a simpler user experience for SQL developers and make them more productive for their everyday self-service BI tasks and workflows by leveraging Hue as the SQL development workbench.

With the recent release of Cloudera 5.15,

The post New in Cloudera 5.15: Simplifying the end user Data Catalog for the Self Service Analytic Database appeared first on Cloudera Engineering Blog.

The multi-part blog post Untangling Apache Hadoop YARN provided an overview of how the YARN scheduler works. In this post we discuss technical details around how FairScheduler Preemption works and best practices to consider when configuring it.

We also present a recent overhaul of FairScheduler Preemption in CDH 5.11 which attempts to address a number of issues as documented in YARN-4752.

Definitions

Before we begin,

The post YARN FairScheduler Preemption Deep Dive appeared first on Cloudera Engineering Blog.

The previous two sections concentrated on infrastructure considerations and on service and role layouts for categories of workloads such as Analytic DB and Operational DB. Many of the concepts described there apply predominantly to on-premises clusters, while others apply to clusters deployed either on-premises or in the cloud. This section concentrates on the considerations that apply to cloud deployments only.

At the time of this writing, Cloudera supports 3 Infrastructure as a Service (IaaS) platforms: Amazon Elastic Compute Cloud (AWS),

The post Deploy Cloudera EDH Clusters Like a Boss Revamped – Part 3: Cloud Considerations appeared first on Cloudera Engineering Blog.

Cloudera by Alex Moundalexis - 1M ago

As a member of Cloudera’s Partner Engineering team, I evaluate hardware and cloud computing platforms offered by commercial partners who want to certify their products for use with Cloudera software. One of my primary goals is to make sure that these platforms provide a stable and well-performing base upon which our products will run, a state of operation that a wide variety of customers performing an even wider variety of tasks can appreciate.

The post Evaluating Partner Platforms appeared first on Cloudera Engineering Blog.

It has been a long and patient wait for Apache Hadoop 3.0 to mature. A major new version of the storage layer obviously impacts all our integrated components, including Apache Solr and all our integrations with the rest of the platform, commonly referred to as Cloudera Search. Since our customers’ Search deployments are so often mission critical, we’ve made sure to take time to do extensive integration testing and focus on the upgrade experience.

Now the moment has finally come to announce Solr 7.0 in Cloudera Search and available as of our new major release,

The post New in Cloudera Enterprise 6.0: Analytic Search appeared first on Cloudera Engineering Blog.

Traditional messaging models fall into two categories: Shared Message Queues and Publish-Subscribe models. Both models have their own pros and cons, and neither could successfully handle big data ingestion at scale due to limitations in their design. Apache Kafka implements a publish-subscribe messaging model that provides fault tolerance and the scalability to handle large volumes of streaming data for real-time analytics. It was developed at LinkedIn in 2010 to meet its growing data pipeline needs. Apache Kafka fills the gaps that traditional messaging models left open.
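As a rough illustration of how a consumer group lets Kafka combine the two models, the sketch below simulates partition assignment in plain Python without any Kafka client. Within one group, each partition of a topic is owned by exactly one consumer, so the group behaves like a shared queue; separate groups each receive every partition, which gives publish-subscribe behavior. The function `assign_partitions`, the consumer names, and the simple round-robin strategy are assumptions made for this example; Kafka's built-in assignors are more involved.

```python
from typing import Dict, Iterable, List


def assign_partitions(
    partitions: Iterable[int], consumers: Iterable[str]
) -> Dict[str, List[int]]:
    """Spread partitions over the members of ONE consumer group, round-robin.

    Every partition ends up with exactly one owner inside the group.
    """
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {member: [] for member in members}
    if not members:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


# A hypothetical topic with six partitions, numbered 0 through 5.
topic_partitions = range(6)

# Three consumers in one group split the partitions between them.
analytics_group = assign_partitions(topic_partitions, ["c1", "c2", "c3"])

# A second, independent group still sees every partition.
audit_group = assign_partitions(topic_partitions, ["a1"])

# More consumers than partitions: the extra members stay idle.
oversized_group = assign_partitions(topic_partitions, [f"w{i}" for i in range(8)])
```

In this simulation, adding consumers to a group raises parallelism only up to the number of partitions; beyond that point the additional members receive no partitions.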

The post Scalability of Kafka Messaging using Consumer Groups appeared first on Cloudera Engineering Blog.

Two of the worst things that can happen in mission-critical production environments are loss of data and downtime. For a search service that provides end users with easy access to data using natural language, downtime would mean a complete halt for those parts of your organization. Even worse, if the search service fuels your online business, downtime interrupts your customers’ access and the end-user experience.

That is why we designed multiple options of backup and disaster recovery for your data served via Cloudera Search,

The post Backup and Disaster Recovery for Cloudera Search appeared first on Cloudera Engineering Blog.
