Adaltas
Adaltas is a team of consultants with a focus on Open Source, Big Data and distributed systems based in France, Canada and Morocco.
Adaltas
6M ago
The Trunk Data Platform (TDP) is a 100% open source big data distribution, based on Apache Hadoop and compatible with HDP 3.1. Initiated in 2021 by EDF, the DGFiP and Adaltas, the project is governed by TOSIT, an association under the French 1901 law whose objective is to promote open source to major companies and institutions.
Version 1.1, whose release is expected during the fourth quarter of 2023, adds features necessary for managing a production cluster (see #308). Support and training offers are already available from consulting firms such as Adaltas with Alliage.
TDP is aimed at anyone wis ..read more
Adaltas
7M ago
The new TDP (Trunk Data Platform) website is online. We invite you to browse its pages to discover the platform, stay informed, and cultivate contact with the TDP community.
TDP is a completely open-source big data platform leveraging the Hadoop ecosystem. Initiated 3 years ago by DGFIP, EDF and Adaltas, it is governed by the TOSIT association, whose mission is to promote open source and support the emergence of code, software and IT solutions under open source licenses.
The site is aimed at all those wishing to discover the features of the platform. The documentation will ..read more
Adaltas
8M ago
In this hands-on lab session we demonstrate how to build an end-to-end big data solution with Cloudera Data Platform (CDP) Public Cloud, using the infrastructure we have deployed and configured over the course of the series.
This is the final article in a series of six:
CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
CDP part 2: CDP Public Cloud deployment on AWS
CDP part 3: Data Services activation on CDP Public Cloud environment
CDP part 4: user management on CDP Public Cloud with Keycloak
CDP part 5: user permission management on CDP Public Cloud
CDP part 6: end ..read more
Adaltas
10M ago
The Cloudera Data Platform (CDP) Public Cloud provides the foundation upon which full featured data lakes are created.
In a previous article, we introduced the CDP platform. This article is the second in a series of six to learn how to build end-to-end big data architectures with CDP:
CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
CDP part 2: CDP Public Cloud deployment on AWS
CDP part 3: Data Services activation on CDP Public Cloud environment
CDP part 4: user management on CDP Public Cloud with Keycloak
CDP part 5: user permission management on CDP Public Cloud ..read more
Adaltas
10M ago
Cloudera Data Platform (CDP) is a hybrid data platform for big data transformation, machine learning and data analytics. In this series we describe how to build and use an end-to-end big data architecture with CDP Public Cloud on Amazon Web Services (AWS).
Our architecture is designed to retrieve data from an API, store it in a data lake, move it to a data warehouse and eventually serve it in a data visualization application to analytics end users.
This series includes the following six articles:
CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
CDP part 2: CDP Publi ..read more
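The data flow described above (API → data lake → data warehouse → visualization) can be sketched as a minimal pipeline. All function and record names below are hypothetical illustrations of the stages, not CDP APIs:

```python
# Minimal sketch of the ingest -> lake -> warehouse flow described above.
# Every name here is a hypothetical illustration, not a CDP API.

def fetch_from_api():
    # Stand-in for the API call; returns raw records.
    return [{"city": "Paris", "temp_c": 21}, {"city": "Lyon", "temp_c": 19}]

def store_in_lake(records):
    # A data lake keeps raw, schema-on-read data, addressed by path.
    return {"raw/weather.json": records}

def load_into_warehouse(lake):
    # The warehouse holds cleaned, query-ready rows for analytics users.
    rows = [(r["city"], r["temp_c"]) for r in lake["raw/weather.json"]]
    return sorted(rows)

lake = store_in_lake(fetch_from_api())
print(load_into_warehouse(lake))  # rows ready for a visualization tool
```

In the actual series, each stage maps to a CDP service rather than a Python function, but the shape of the flow is the same.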
Adaltas
11M ago
As a Big Data Solutions Architect and InfraOps, I need development environments to install and test software. They have to be configurable, flexible, and performant. Working with distributed systems, the best-fitting setups for this use case are local virtualized clusters of several Linux instances.
For a few years, I have been using HashiCorp’s Vagrant to manage libvirt/KVM instances. This worked well, but I recently tried another setup that works better for me: LXD to manage instances and Terraform (another HashiCorp tool) to operate LXD. In this article, I explain what the advanta ..read more
Adaltas
1y ago
Kubernetes is not the first platform that comes to mind to run Apache Kafka clusters. Indeed, Kafka’s strong dependency on storage might be a pain point with regard to Kubernetes’ way of handling persistent storage. Kafka brokers are unique and stateful; how can we implement this in Kubernetes?
Let’s go through the basics of Strimzi, a Kafka operator for Kubernetes maintained by Red Hat, and see what problems it solves.
Special focus is given to how to plug additional Kafka tools into a Strimzi installation.
We will also compare Strimzi with other Kafka operators by providing t ..read more
Adaltas
1y ago
Docker has become the new standard for building your application. In a Docker image we place our source code, its dependencies, and some configuration, and our application is almost ready to be deployed on our workstation or in production, whether in the cloud or on premises. For several years now, Docker has been eclipsed by the open-source OCI (Open Container Initiative) standard. Today it is not even necessary to use a Dockerfile to build our applications! Let’s have a look at what Buildpacks offers in this respect, but first we need to understand what an OCI image really is.
OCI Layers
Let’s ta ..read more
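One useful fact behind the article's OCI discussion: an OCI image is a set of content-addressed layers, each identified by the SHA-256 digest of its bytes, which is how OCI-compliant tools verify and deduplicate layers. A minimal sketch of that addressing scheme (the layer content is made up):

```python
import hashlib

def oci_digest(blob: bytes) -> str:
    # OCI content addressing: "sha256:" followed by the hex digest of the blob.
    return "sha256:" + hashlib.sha256(blob).hexdigest()

layer = b"fake tar+gzip layer content"  # a real layer is a compressed tar archive
print(oci_digest(layer))
# Identical bytes always yield the identical digest, so two images sharing
# a base layer store (and pull) that layer only once.
```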
Adaltas
1y ago
Job description
Big Data and distributed computing are at the core of Adaltas. We accompany our partners in the deployment, maintenance, and optimization of some of the largest clusters in France. We also recently began providing support for day-to-day operations.
As a strong defender of and active contributor to open source, we are at the forefront of the data platform initiative TDP (TOSIT Data Platform).
During this internship, you will contribute to the development of TDP, its industrialization, and the integration of new open source components and new functionalities. You will be accompanied by t ..read more
Adaltas
1y ago
Good tech adventures start with some frustration, a need, or a requirement. This is the story of how I simplified the management and access of my local web applications with the help of Traefik and dnsmasq. The reasoning applies just as well for a production server using Docker.
My dev environment is composed of a growing number of web applications self-hosted on my laptop. Such applications include several websites, tools, editors, registries, … They use databases, REST APIs, or more complex backends. Take the example of Supabase: its Docker Compose file includes the Studio, the Kong API gate ..read more
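Conceptually, dnsmasq resolves a wildcard domain to the laptop, and Traefik then routes each request to a backend based on its Host header. That routing half can be sketched as a plain lookup table; the hostnames and ports below are made-up examples, not part of the original setup:

```python
# Host-header routing, the core of what Traefik does for local web apps.
# Hostnames and ports are hypothetical examples.
routes = {
    "supabase.test": "http://127.0.0.1:8000",
    "registry.test": "http://127.0.0.1:5000",
}

def route(host: str) -> str:
    # Traefik matches the request's Host header against its router rules
    # and forwards the request to the matching backend.
    backend = routes.get(host)
    if backend is None:
        raise LookupError(f"no router rule for {host}")
    return backend

print(route("supabase.test"))  # -> http://127.0.0.1:8000
```

In the real setup, Traefik builds this table automatically from Docker labels instead of a hand-written dict.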