Installation Guide to TDP, the 100% open source big data platform
Adaltas · 6M ago
The Trunk Data Platform (TDP) is a 100% open source big data distribution based on Apache Hadoop and compatible with HDP 3.1. Initiated in 2021 by EDF, the DGFiP, and Adaltas, the project is governed by the TOSIT, an association under the French 1901 law whose objective is to promote open source among major companies and institutions. Version 1.1, whose release is expected during the 4th quarter of 2023, adds features necessary for managing a production cluster (see #308). Support and training offers are already available from consulting firms such as Adaltas with Alliage. TDP is aimed at anyone wishing…
New TDP website launched
Adaltas · 7M ago
The new TDP (Trunk Data Platform) website is online. We invite you to browse its pages to discover the platform, stay informed, and keep in touch with the TDP community. TDP is a completely open source big data platform leveraging the Hadoop ecosystem. Initiated 3 years ago by the DGFIP, EDF, and Adaltas, it is governed by the TOSIT association, whose mission is to promote open source by supporting the emergence of code, software, and IT solutions under open source licenses. The site is aimed at all those wishing to discover the features of the platform. The documentation will…
CDP part 6: end-to-end data lakehouse ingestion pipeline with CDP
Adaltas · 8M ago
In this hands-on lab session we demonstrate how to build an end-to-end big data solution with Cloudera Data Platform (CDP) Public Cloud, using the infrastructure we have deployed and configured over the course of the series. This is the final article in a series of six:
- CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
- CDP part 2: CDP Public Cloud deployment on AWS
- CDP part 3: Data Services activation on CDP Public Cloud environment
- CDP part 4: user management on CDP Public Cloud with Keycloak
- CDP part 5: user permission management on CDP Public Cloud
- CDP part 6: end-to-end data lakehouse ingestion pipeline with CDP
CDP part 2: CDP Public Cloud deployment on AWS
Adaltas · 10M ago
The Cloudera Data Platform (CDP) Public Cloud provides the foundation upon which full-featured data lakes are created. In a previous article, we introduced the CDP platform. This article is the second in a series of six on how to build end-to-end big data architectures with CDP:
- CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
- CDP part 2: CDP Public Cloud deployment on AWS
- CDP part 3: Data Services activation on CDP Public Cloud environment
- CDP part 4: user management on CDP Public Cloud with Keycloak
- CDP part 5: user permission management on CDP Public Cloud
- CDP part 6: end-to-end data lakehouse ingestion pipeline with CDP
CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
Adaltas · 10M ago
Cloudera Data Platform (CDP) is a hybrid data platform for big data transformation, machine learning, and data analytics. In this series we describe how to build and use an end-to-end big data architecture with CDP Public Cloud on Amazon Web Services (AWS). Our architecture is designed to retrieve data from an API, store it in a data lake, move it to a data warehouse, and eventually serve it in a data visualization application to analytics end users; a minimal sketch of the first ingestion step follows the article list below. This series includes the following six articles:
- CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
- CDP part 2: CDP Public Cloud deployment on AWS
- CDP part 3: Data Services activation on CDP Public Cloud environment
- CDP part 4: user management on CDP Public Cloud with Keycloak
- CDP part 5: user permission management on CDP Public Cloud
- CDP part 6: end-to-end data lakehouse ingestion pipeline with CDP
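To make the first step of this architecture concrete, here is a minimal Python sketch of the API-to-data-lake ingestion. The endpoint, bucket, and key names are placeholder assumptions rather than the series' actual code; on AWS, the CDP data lake is backed by S3, which boto3 can write to directly.

```python
import json

import boto3     # AWS SDK for Python
import requests

# Hypothetical source API and data lake locations: adapt these to
# your own CDP environment.
API_URL = "https://example.com/api/measurements"
BUCKET = "my-cdp-datalake-bucket"
KEY = "raw/measurements/batch-0001.json"

def ingest() -> None:
    # 1. Retrieve data from the API.
    response = requests.get(API_URL, timeout=30)
    response.raise_for_status()
    records = response.json()

    # 2. Land the raw payload in the data lake (S3 on AWS).
    s3 = boto3.client("s3")
    s3.put_object(Bucket=BUCKET, Key=KEY, Body=json.dumps(records))

if __name__ == "__main__":
    ingest()
```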
Local development environments with Terraform + LXD
Adaltas · 11M ago
As a Big Data Solutions Architect and InfraOps engineer, I need development environments to install and test software. They have to be configurable, flexible, and performant. Working with distributed systems, the best-fitting setups for this use case are local virtualized clusters of several Linux instances. For a few years, I have been using HashiCorp's Vagrant to manage libvirt/KVM instances. This works well, but I recently tried another setup that works better for me: LXD to manage instances and Terraform (another HashiCorp tool) to operate LXD. In this article, I explain what the advantages…
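To give an idea of the workflow, here is a minimal sketch that drives Terraform from Python to spin up a small LXD cluster. It assumes Terraform, LXD, and the community terraform-lxd/lxd provider are available; resource and attribute names vary between provider versions, so treat the configuration as illustrative rather than authoritative.

```python
import subprocess
from pathlib import Path

# Illustrative Terraform configuration: three Ubuntu instances managed
# by the community LXD provider. Check the provider documentation for
# the exact resource schema of your version.
MAIN_TF = """
terraform {
  required_providers {
    lxd = {
      source = "terraform-lxd/lxd"
    }
  }
}

resource "lxd_instance" "node" {
  count = 3
  name  = "dev-node-${count.index}"
  image = "ubuntu:22.04"
}
"""

workdir = Path("lxd-dev-cluster")
workdir.mkdir(exist_ok=True)
(workdir / "main.tf").write_text(MAIN_TF)

# Download the provider, then create the instances.
subprocess.run(["terraform", "init"], cwd=workdir, check=True)
subprocess.run(["terraform", "apply", "-auto-approve"], cwd=workdir, check=True)
```

In practice you would invoke terraform directly rather than through Python; the point is that the whole cluster is described declaratively and can be torn down and recreated in seconds with terraform destroy and terraform apply.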
Operating Kafka in Kubernetes with Strimzi
Adaltas · 1y ago
Kubernetes is not the first platform that comes to mind for running Apache Kafka clusters. Indeed, Kafka's strong dependency on storage can be a pain point given the way Kubernetes handles persistent storage. Kafka brokers are unique and stateful: how can we implement this in Kubernetes? Let's go through the basics of Strimzi, a Kafka operator for Kubernetes curated by Red Hat, and see what problems it solves. A special focus is placed on how to plug additional Kafka tools into a Strimzi installation. We will also compare Strimzi with other Kafka operators by providing t…
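Strimzi's core idea is declarative: you describe the desired cluster in a Kafka custom resource and the operator reconciles broker pods, persistent volume claims, and services to match it. Below is a hedged sketch that submits such a resource with the official Kubernetes Python client; the names and the much-simplified spec are illustrative assumptions, not a production configuration.

```python
from kubernetes import client, config

# Assumes the Strimzi operator is already installed and watching the
# "kafka" namespace; see the Strimzi documentation for the full
# kafka.strimzi.io/v1beta2 schema.
config.load_kube_config()

kafka_cluster = {
    "apiVersion": "kafka.strimzi.io/v1beta2",
    "kind": "Kafka",
    "metadata": {"name": "my-cluster", "namespace": "kafka"},
    "spec": {
        "kafka": {
            "replicas": 3,
            "listeners": [
                {"name": "plain", "port": 9092, "type": "internal", "tls": False}
            ],
            # Persistent storage: the point where Kafka's statefulness
            # meets Kubernetes' persistent volume machinery.
            "storage": {"type": "persistent-claim", "size": "10Gi"},
        },
        "zookeeper": {
            "replicas": 3,
            "storage": {"type": "persistent-claim", "size": "5Gi"},
        },
        "entityOperator": {"topicOperator": {}, "userOperator": {}},
    },
}

# The operator reacts to the new custom resource and creates the
# brokers, volumes, and services on our behalf.
client.CustomObjectsApi().create_namespaced_custom_object(
    group="kafka.strimzi.io",
    version="v1beta2",
    namespace="kafka",
    plural="kafkas",
    body=kafka_cluster,
)
```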
How to build your OCI images using Buildpacks
Adaltas · 1y ago
Docker has become the new standard for building applications. In a Docker image we place our source code, its dependencies, and some configuration, and our application is almost ready to be deployed on our workstation or in production, whether in the cloud or on premises. For several years, however, Docker has been progressively eclipsed by the open standard OCI (Open Container Initiative). Today it is not even necessary to use a Dockerfile to build our applications! Let's have a look at what Buildpacks offers in this respect, but first we need to understand what an OCI image really is.
OCI Layers
Let's ta…
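To see for ourselves what an image is made of, the sketch below walks an image stored in the OCI layout format (an index.json plus content-addressed blobs) and prints its layers. The ./my-app-oci path is a placeholder for an image exported with a tool such as skopeo (skopeo copy docker://my-app oci:./my-app-oci).

```python
import json
from pathlib import Path

# Placeholder path to a directory in OCI layout format.
layout = Path("./my-app-oci")

# The index references one or more image manifests by digest.
index = json.loads((layout / "index.json").read_text())
manifest_digest = index["manifests"][0]["digest"].split(":")[1]

# Each manifest lists the content-addressed layer blobs of the image.
manifest = json.loads((layout / "blobs" / "sha256" / manifest_digest).read_text())
for layer in manifest["layers"]:
    print(layer["mediaType"], layer["digest"], layer["size"])
```

Every layer is just a (usually compressed) tarball identified by its digest, which is what lets tools other than Docker, Buildpacks included, produce and reuse them.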
Big data infrastructure internship
Adaltas · 1y ago
Job description
Big Data and distributed computing are at the core of Adaltas. We accompany our partners in the deployment, maintenance, and optimization of some of the largest clusters in France. Recently, we also began providing support for day-to-day operations. As a strong advocate of and active contributor to open source, we are at the forefront of the TDP (TOSIT Data Platform) data platform initiative. During this internship, you will contribute to the development of TDP, its industrialization, and the integration of new open source components and functionalities. You will be accompanied by t…
Traefik, Docker and dnsmasq to simplify container networking
Adaltas · 1y ago
Good tech adventures start with some frustration, a need, or a requirement. This is the story of how I simplified the management of, and access to, my local web applications with the help of Traefik and dnsmasq. The reasoning applies just as well to a production server using Docker. My dev environment is composed of a growing number of web applications self-hosted on my laptop. Such applications include several websites, tools, editors, registries, … They use databases, REST APIs, or more complex backends. Take the example of Supabase: its Docker Compose file includes the Studio, the Kong API gateway…
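The combination rests on two small pieces: dnsmasq resolves a wildcard domain to the local machine, and Traefik routes each incoming Host header to the right container based on Docker labels. Here is a minimal sketch with the Docker SDK for Python, assuming a Traefik container is already watching the Docker socket and dnsmasq has an entry such as address=/docker.test/127.0.0.1 (the domain and names are placeholder assumptions):

```python
import docker  # Docker SDK for Python (pip install docker)

client = docker.from_env()

client.containers.run(
    "traefik/whoami",  # tiny demo web service
    name="whoami",
    detach=True,
    labels={
        "traefik.enable": "true",
        # Traefik builds the route from this label: any request whose
        # Host header is whoami.docker.test is forwarded to the container.
        "traefik.http.routers.whoami.rule": "Host(`whoami.docker.test`)",
    },
)
# The app is now reachable at http://whoami.docker.test with no
# per-application port publishing and no /etc/hosts editing.
```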