Data Architecture & Databases on Feedspot

How to Be Useful: Unpacking Arnold Schwarzenegger’s Secrets to Success

Data Architecture & Databases

by abuData

3w ago

Did you know that the man who conquered bodybuilding, Hollywood, and the political arena believes that his multifaceted success boils down to just seven principles? Yes, Arnold Schwarzenegger, in his book “Be Useful: Seven Tools for Life,” distills the essence of his incredible journey into actionable insights for anyone looking to reinvent themselves or simply aiming for greatness. But what makes Schwarzenegger’s advice stand out in a sea of self-help gurus and motivational speakers? The Seven Pillars of Success According to Arnold: What are the seven tools for life that Arnold Schwarzenegg ..read more

Data visualization with Flourish

by abuData

4M ago

Flourish is a data visualization and storytelling platform that helps data enthusiasts understand and communicate complex data. With a wide range of customizable templates and interactive features, Flourish makes it easy to create beautiful and engaging visualizations that tell a story and bring data to life. There are many different types of visualizations available in Flourish, including maps, charts, and diagrams. You can use these visualizations to explore data trends, discover insights, and communicate findings to others. For example, you might create a map to show the distribution of a ..read more

Predictions about data for 2023 and beyond

by abuData

4M ago

End of the year: it is time for predictions. Let’s have a look at some predictions regarding data. There are many predictions for Machine Learning, Deep Learning, and AI – explainability, professionalisation, and automation are often covered. The article lists some other topics that are hot for 2023 and beyond. Data management predictions for 2023: data engineering teams will spend more time on FinOps / data cloud cost optimisation. The cloud is mainstream, and the costs are increasing. Storage may be cheap ..read more

Data Vault and Star Schema with PlantUML: Entity Relationship Diagram as Code

by abuData

4M ago

Entity Relationship Diagram as code means developers use the same tools for creating the diagrams – or documentation in general – as for coding. Documentation includes more than just source code and some comments. If the documentation is text rather than binary, it can be versioned easily, and continuous integration can build both the executable software and the documentation. PlantUML is a tool for creating UML (Unified Modeling Language) diagrams using a text editor. UML is a standardized modelling language that is commonly used in software engineering to design and document software systems. Pla ..read more
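
The diagram-as-code idea can be sketched in a few lines: the script below emits PlantUML entity-relationship text for a tiny star schema, so the diagram source can live in version control next to the code. This is only an illustration under stated assumptions. The table and column names (fact_sales, dim_customer) and the helper functions entity and star_schema are hypothetical and not taken from the article.

```python
def entity(name, keys, attributes):
    """Render one PlantUML entity; key columns go above the '--' separator."""
    lines = [f"entity {name} {{"]
    lines += [f"  * {column}" for column in keys]
    lines.append("  --")
    lines += [f"  {column}" for column in attributes]
    lines.append("}")
    return "\n".join(lines)


def star_schema(fact, dimensions):
    """Render a fact table, its dimensions, and one-to-many relations."""
    parts = ["@startuml"]
    for dim_name, (keys, attributes) in dimensions.items():
        parts.append(entity(dim_name, keys, attributes))
    fact_name, (fact_keys, fact_attributes) = fact
    parts.append(entity(fact_name, fact_keys, fact_attributes))
    for dim_name in dimensions:
        # One dimension row relates to zero or many fact rows.
        parts.append(f"{dim_name} ||--o{{ {fact_name}")
    parts.append("@enduml")
    return "\n".join(parts)


# Hypothetical example schema, for illustration only.
diagram = star_schema(
    fact=("fact_sales", (["sale_id : int"], ["customer_id : int", "amount : numeric"])),
    dimensions={"dim_customer": (["customer_id : int"], ["name : text"])},
)
print(diagram)
```

The printed text can be saved to a file and rendered with PlantUML, and a change to the schema then shows up as an ordinary text diff.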

Materialization examples of Data Engineering with dbt

by abuData

4M ago

dbt offers several materialization options to create ETL/ELT processes. The article shows and compares various approaches to using dbt for ETL/ELT. A previous post contains an introduction to dbt: Data Engineering with dbt – first steps using PostgreSQL and Oracle. The article has three main sections: (1) setup of the data in the staging tables and the dbt models/snapshots; (2) data flow execution with initial (full) and delta load, including materializations with dbt; (3) result discussion for view, table, incremental, and snapshot materialization. Setup: There are two staging tables stg_place ..read more
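
The difference between a full (table) load and a delta (incremental) load can be sketched outside of dbt. The block below is not dbt code. It is a plain Python simulation under stated assumptions: the rows, the column names id and updated_at, and both function names are hypothetical, and the high-water-mark merge is one common way to implement incremental loading rather than a description of what dbt generates.

```python
def materialize_table(source_rows, key="id"):
    """Table-style load: rebuild the target from all source rows."""
    return sorted(source_rows, key=lambda row: row[key])


def materialize_incremental(target_rows, source_rows, key="id", ts="updated_at"):
    """Incremental-style load: merge only rows newer than the target's high-water mark."""
    high_water = max((row[ts] for row in target_rows), default=None)
    merged = {row[key]: row for row in target_rows}
    for row in source_rows:
        if high_water is None or row[ts] > high_water:
            merged[row[key]] = row  # insert a new key or overwrite a changed row
    return sorted(merged.values(), key=lambda row: row[key])


# Hypothetical data: the target after the initial load, then a changed source.
target = [{"id": 1, "updated_at": 1, "name": "Bern"}]
source = [
    {"id": 1, "updated_at": 3, "name": "Berne"},
    {"id": 2, "updated_at": 2, "name": "Basel"},
]

full = materialize_table(source)
delta = materialize_incremental(target, source)
print(delta)
```

In this small case both approaches end with the same target rows. The incremental variant only touches rows whose timestamp is newer than anything already loaded, which is the reason to prefer it for large tables.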

Data Engineering with dbt – first steps using PostgreSQL and Oracle

by abuData

4M ago

dbt is a Data Engineering tool supporting version control with CI/CD for transformations and materialization. The approach with dbt differs from tools like SSIS, Data Factory, and Informatica. The developer models the target tables/views and the transformations. dbt uses these models to create target views or tables and to run the transformations from the source to the target views/tables. The models are plain text, so version control with CI/CD is easily supported. dbt claims to be a self-service Data Engineering tool so that not only Data Engineers but especially Data Scientists, Data ..read more

PostgreSQL application_name

by abuData

4M ago

PostgreSQL application_name can be set in the connection string. The view pg_stat_activity shows the application_name to help identify the sessions. The article shows how to set application_name and how to benefit from it. It is highly recommended to set the application_name in any application regardless of the programming language. The command within the PostgreSQL command line psql is: SET application_name = '<application>'; The application_name makes it easier to identify sessions during troubleshooting, monitoring, or similar activities. The following examples show the d ..read more
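
As a minimal sketch of setting application_name in the connection string, the block below builds a libpq-style connection URI with application_name as a query parameter, plus a query against pg_stat_activity to find the session afterwards. The host, database, user, and application name are hypothetical placeholders, and build_dsn is an illustrative helper, not part of any driver. No connection is opened here.

```python
from urllib.parse import quote, urlencode


def build_dsn(host, dbname, user, application_name, port=5432):
    """Return a libpq-style connection URI that carries application_name."""
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    query = urlencode({"application_name": application_name}, quote_via=quote)
    return f"postgresql://{quote(user)}@{host}:{port}/{quote(dbname)}?{query}"


# Hypothetical connection details, for illustration only.
dsn = build_dsn("localhost", "demo", "app_user", "nightly_etl")
print(dsn)

# Query to locate the session once a connection with this URI is open.
FIND_SESSION_SQL = (
    "SELECT pid, application_name, state "
    "FROM pg_stat_activity "
    "WHERE application_name = 'nightly_etl';"
)
```

The resulting URI can be passed to a driver that accepts libpq connection URIs. Setting the name at connect time means the session is labelled from its first statement.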

PostgreSQL columnar extension cstore_fdw

by abuData

4M ago

PostgreSQL columnar extension cstore_fdw is a storage extension suited for OLAP-/DWH-style queries and data-intensive applications. Columnar analytical databases have unique characteristics compared to row-oriented data access. Many commercial products exist. Pure columnar analytical databases like Vertica, Exasol, Snowflake (cloud-only), Amazon Redshift (cloud-only), and Google BigQuery (cloud-only) are ideal for analytical workloads only. SAP HANA is optimized for columnar analytical workloads but also supports row-oriented workloads (OLTP-style). Classical row-oriented databases like O ..read more

PostgreSQL partitioning guide

by abuData

4M ago

PostgreSQL partitioning is a powerful feature when dealing with huge tables. Partitioning allows breaking a table into smaller chunks, called partitions. When accessing the data, there logically appears to be only one table, but physically there are several partitions. Queries reading a lot of data can become faster if only some partitions have to be read instead of the whole table. Maintenance tasks also benefit from the smaller partitions. The blog post first introduces the data used in the article and how to reproduce the examples, then explains partition pruning, and finally shows range, li ..read more
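
To make range partitioning concrete, the sketch below generates the DDL for twelve monthly partitions of a parent table that is assumed to have been created with PARTITION BY RANGE on a date column. The parent table name sales and the helper functions are hypothetical and not taken from the article. The script only builds SQL strings and does not connect to a database.

```python
from datetime import date


def month_starts(year):
    """Return the first day of each month of `year`, plus the first day of the next year."""
    return [date(year, month, 1) for month in range(1, 13)] + [date(year + 1, 1, 1)]


def range_partition_ddl(parent, year):
    """Build one CREATE TABLE ... PARTITION OF statement per month."""
    bounds = month_starts(year)
    statements = []
    for lower, upper in zip(bounds, bounds[1:]):
        name = f"{parent}_{lower:%Y_%m}"
        # The lower bound is inclusive and the upper bound is exclusive.
        statements.append(
            f"CREATE TABLE {name} PARTITION OF {parent} "
            f"FOR VALUES FROM ('{lower}') TO ('{upper}');"
        )
    return statements


# Hypothetical parent table name, for illustration only.
ddl = range_partition_ddl("sales", 2023)
print("\n".join(ddl))
```

With such a layout, a query that filters on a single month of the partition key only needs to read one of the twelve partitions, which is the effect partition pruning provides.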

Anonymization techniques and data privacy

by abuData

4M ago

Anonymization techniques are essential for data analytics and for test/dev databases. Anonymization and pseudonymization are very different but often confused. GDPR no longer applies to anonymized data. GDPR still applies to pseudonymized data, which can be produced by hashing or tokenization. There are various anonymization techniques, such as data redaction; differential privacy; grouping/clustering with k-anonymity, l-diversity, and similar methods; and generating synthetic data with the help of lookups or with machine learning models. Trust is becoming essential when working with data – ethi ..read more
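
Two of the ideas above can be sketched briefly: pseudonymization by hashing and a k-anonymity check on grouped data. The block uses a keyed hash (HMAC-SHA256) as one possible hashing approach, which is an assumption rather than the article's stated method. The key, the sample rows, and the function names are hypothetical.

```python
import hashlib
import hmac
from collections import Counter


def pseudonymize(value, key):
    """Replace a value with a keyed hash; whoever holds the key can re-link records."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def k_anonymity(rows, quasi_identifiers):
    """Return k: the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


# Hypothetical key and data, for illustration only.
token = pseudonymize("alice@example.com", b"demo-secret-key")

rows = [
    {"zip": "101**", "age": "30-39"},
    {"zip": "101**", "age": "30-39"},
    {"zip": "102**", "age": "40-49"},
    {"zip": "102**", "age": "40-49"},
]
k = k_anonymity(rows, ["zip", "age"])
print(token, k)
```

The same input always maps to the same token, so joins across tables still work. That linkability is why the result counts as pseudonymized rather than anonymized data.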
