Percona announces the release of Percona Server for MongoDB 3.4.21-2.19 on June 20, 2019. Download the latest version from the Percona website or the Percona software repositories.

Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.4 Community Edition. It supports MongoDB 3.4 protocols and drivers.

Percona Server for MongoDB extends Community Edition functionality by including the Percona Memory Engine storage engine, as well as several enterprise-grade features. It also includes the MongoRocks storage engine, which is now deprecated. Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.21. In Percona Server for MongoDB 3.4.21-2.19, there are no additional improvements or new features on top of those upstream fixes.

The Percona Server for MongoDB 3.4.21-2.19 release notes are available in the official documentation.


Percona recognizes MongoDB’s performance, reliability, and flexibility and has been evaluating, developing, and extending MongoDB software since 2015.

Along with many others, Percona believes that fully open source software provides great value to users and we will continue to follow that ethos and practice.

We are involved with MongoDB software, services, and support on many levels. Some of the most notable are detailed below.

Percona Server for MongoDB

Percona Server for MongoDB is a free and open source drop-in replacement for MongoDB Community Edition. It offers all the features and benefits of MongoDB Community Edition, plus additional enterprise-grade functionality.

The first GA version of Percona Server for MongoDB was released on December 14, 2015. Since then we have regularly released updates, training, and bug fixes, ensuring our users are fully informed and have the most up-to-date software possible.

Our latest version, Percona Server for MongoDB 4.0.10-5, includes HashiCorp Vault integration for added security. You can read our recent blog for more information on the new and improved features it contains.

Percona Backup for MongoDB

As part of our commitment to MongoDB, we continue to create features and software components that help users achieve optimal database performance and security.

Having observed a business need, we set about creating an open source MongoDB backup tool that gives users enhanced database disaster recovery abilities. As a result, on June 17, 2019, we were excited to announce the early release of our latest software product, Percona Backup for MongoDB 0.5.0.

Percona Backup for MongoDB is a free, open source backup solution for consistent backups of MongoDB sharded clusters and replica sets. It enables you to manage your own backups without third-party involvement or costly licenses.

The GA version of Percona Backup for MongoDB is scheduled to be released later in 2019.

Percona Monitoring and Management (PMM) for MongoDB

Percona Monitoring and Management (PMM) is a free and open source platform which allows you to manage and monitor the performance of your MongoDB database.

You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis to ensure that your data works as efficiently as possible.

Percona Monitoring and Management can be used on-premises and in the cloud with providers such as Amazon Web Services (AWS), Google Cloud, Microsoft Azure, and others.

We will be releasing PMM 2, a greatly enhanced version of our PMM software, later in 2019. Further details can be found here.

Percona Toolkit for MongoDB

Percona Toolkit is a collection of advanced open source command-line tools, developed and used by the Percona technical staff, engineered to perform a variety of MongoDB server and system tasks that are too difficult or complex to perform manually. This frees up your DBAs to focus on work that helps you achieve your business goals.

These tools are ideal alternatives to private or “one-off” scripts because they are professionally developed, formally tested, and fully documented. They are also fully self-contained, so installation is quick and easy, and no libraries are installed.

MongoDB User Services

In addition to our own open source versions of MongoDB software, we offer a range of services, including Support and Consulting.

Many companies have mission-critical applications which depend on their MongoDB database environment. But what happens if your database goes down or isn’t running at the optimum level? Percona brings world-class database expertise and open source values to provide a comprehensive, responsive, and cost-effective plan to help your business succeed.

Percona services are available to help your business succeed, whether you rely on Percona Server for MongoDB or MongoDB Community.

MongoDB Webinars and Updates

One of our core values is to keep users informed and up-to-date on software and market changes. Consequently, we provide regular webinars and updates on a variety of topics. Past webinars can be accessed via our website.

We have three upcoming MongoDB-themed webinars led by Percona experts:

Tuesday, June 25th, 2019 – Aayushi Mangal presents: MongoDB Transactions

Transactions are a standard feature of relational database systems, but MongoDB is a document-oriented, non-relational database. MongoDB 4.0 introduced a new multi-document transaction feature for replica set environments. This webinar will explore how transactions are achieved in MongoDB.
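As a taste of what the webinar covers, here is a minimal, hedged sketch using the 4.0 mongo shell session API against a replica set (database, collection, and document names are illustrative):

# Assumes a 4.0 replica set on localhost:27017; note that in 4.0 the
# collection must already exist before it is used inside a transaction.
$ mongo --eval '
    var session = db.getMongo().startSession();
    var accounts = session.getDatabase("bank").accounts;
    session.startTransaction();
    accounts.updateOne({ _id: "alice" }, { $inc: { balance: -100 } });
    accounts.updateOne({ _id: "bob" },   { $inc: { balance:  100 } });
    session.commitTransaction();   // both updates become visible atomically
    session.endSession();
'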

Wednesday, June 26th, 2019 – Adamo Tonete presents: MongoDB Data Security – Custom Rules and Views

Adamo will discuss how to create custom rules when the default rules are not enough for the application, and how to use views to hide fields from users when you don’t want them to read the whole collection. This is a great webinar for anyone with security concerns.
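For a flavor of the views part, here is a hedged sketch using db.createView() (collection and field names are made up). The view projects away the sensitive field, and users can then be granted find on the view rather than on the underlying collection:

# Hypothetical "employees" collection in an "hr" database.
$ mongo hr --eval '
    db.createView("employees_public", "employees", [
        { $project: { salary: 0 } }   // hide the sensitive field
    ]);
'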

Thursday, June 27th, 2019 – Marcos Albe presents: Beyond Relational Databases: A Look Into MongoDB, Redis, and ClickHouse

We all love relational databases… until we use them for a purpose they are not best suited for. Queues, caches, catalogs, unstructured data, counters, and many other use cases could be solved with relational databases, but are better solved with alternatives. This webinar reviews goals, pros and cons, and use cases of some of these alternatives, looking at modern open source implementations. You will learn the basics of three database paradigms (document, key-value, and columnar store), find out when it’s appropriate to opt for one of these, and when you should choose relational databases instead.

Please Contact Us for More Information

We hope this brief update on Percona’s MongoDB capabilities has been useful. Please look out for our upcoming MongoDB software releases, webinars, and blogs.

Percona has extensive experience advising companies on the best way to configure, manage, and run their MongoDB databases. To learn more, please contact us at 1-888-316-9775 or 0-800-051-8984 in Europe or sales@percona.com.


Last Friday we announced the release of Percona Server for MongoDB 4.0.10-5. This can be downloaded from the Percona website or the Percona software repositories.

Percona Server for MongoDB is an enhanced, open source, and highly-scalable database. It is a fully compatible, drop-in replacement for MongoDB 4.0 Community Edition and doesn’t require any changes to MongoDB applications or code.

Our latest release supports multi-document ACID transactions and includes enhancements to Change Streams support.

At Percona we pride ourselves on adding new and exciting enterprise-level features to our software, not just duplicating the latest community version. We are also strongly focused on ensuring our software users have the tools they need to securely manage their data.

HashiCorp Vault Integration

As a result, we are excited to announce integration with HashiCorp Vault in our release of Percona Server for MongoDB 4.0.10-5.

Understanding who is accessing private information on your system can be a challenge. Regular password changes, safe storage, and detailed audit logs are essential to ensuring secure systems.

HashiCorp Vault is a product which manages secrets and protects sensitive data. It securely stores and tightly controls access to confidential information.

In previous versions of Percona Server for MongoDB, the data at rest encryption key was stored locally on the server inside the key file. Our integration with HashiCorp Vault now enables you to store the encryption key more securely inside the vault.
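As a hedged sketch of what this looks like in practice, mongod can be pointed at a Vault server through the documented data-at-rest encryption options (the hostname and paths below are placeholders; check the Data at Rest Encryption documentation for the exact option names):

# The encryption key now lives in Vault rather than in a local key file.
$ mongod --enableEncryption \
         --vaultServerName vault.example.com \
         --vaultPort 8200 \
         --vaultTokenFile /etc/mongod/vault.token \
         --vaultSecret secret/data/psmdb/mongod-key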

Further information on the key features and benefits of HashiCorp Vault can be found on the HashiCorp website. For more insight into Percona’s MongoDB capabilities, please look out for our upcoming software announcements, webinars, and blogs.

Percona has extensive experience advising companies on the best way to configure, manage, and run their MongoDB databases. To learn more, please contact us at 1-888-316-9775 or 0-800-051-8984 in Europe or sales@percona.com.


Percona is pleased to announce the early release of our latest software product Percona Backup for MongoDB 0.5.0 on June 17, 2019. The GA version is scheduled to be released later in 2019.

Percona Backup for MongoDB is a distributed, low-impact solution for consistent backups of MongoDB sharded clusters and replica sets. It is a tool for creating consistent backups across a MongoDB sharded cluster (or a single replica set), and for restoring those backups to a specific point in time. Percona Backup for MongoDB uses a distributed client/server architecture to perform backup/restore actions, and supports MongoDB Community Server version 3.6 or higher. The project was inspired by (and intends to replace) the Percona-Lab/mongodb_consistent_backup tool, and bugs can be reported through our bug tracking system.

New Features and Improvements
  • PBM-99: pbm-agent is deprecated on mongos
  • PBM-122: Include the replica set name in the output of pbmctl list nodes
  • PBM-139: Support listing backup status in the output of pbmctl list nodes
  • Percona Backup for MongoDB is released under the Apache License v2.0.

We look forward to sharing the GA version of Percona Backup for MongoDB later this year and appreciate any feedback that you might have on our 0.5.0 version in the meantime.
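If you want to kick the tires on 0.5.0, the workflow is driven from the pbmctl CLI. A hedged sketch follows (a pbm-agent running on each node and a reachable coordinator are assumed, and the exact subcommands and flags may change before GA):

$ pbmctl list nodes                        # agents, replica set names, and backup status
$ pbmctl run backup --description nightly  # take a consistent cluster-wide backup
$ pbmctl list backups                      # verify the backup is registered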


Percona announces the release of Percona Server for MongoDB 4.0.10-5 on June 13, 2019. Download the latest version from the Percona website or the Percona software repositories.

Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 4.0 Community Edition. It supports MongoDB 4.0 protocols and drivers.

Percona Server for MongoDB extends the functionality of the MongoDB 4.0 Community Edition by including the Percona Memory Engine storage engine, encrypted WiredTiger storage engine, audit logging, SASL authentication, hot backups, and enhanced query profiling. Percona Server for MongoDB requires no changes to MongoDB applications or code.
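One of those extras, hot backups, can be taken straight from the mongo shell via the documented createBackup admin command; a minimal sketch (the backup directory is a placeholder):

$ mongo admin --eval 'db.runCommand({ createBackup: 1, backupDir: "/data/hot_backup" })'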

Percona Server for MongoDB 4.0.10-5 introduces the support of HashiCorp Vault key management service. For more information, see Data at Rest Encryption in the documentation of Percona Server for MongoDB.

This release includes all features of MongoDB 4.0 Community Edition.

Note that the MMAPv1 storage engine is deprecated in MongoDB 4.0 Community Edition.

Percona Server for MongoDB 4.0.10-5 is based on MongoDB 4.0.10.


The Percona Server for MongoDB 4.0.10-5 release notes are available in the official documentation.


Red Hat Enterprise Linux 8 packages have been released into our repositories. Recent versions of Percona Server for MySQL, Percona XtraDB Cluster, Percona XtraBackup, Percona Server for MongoDB, Percona Toolkit, and the PMM Client can now be installed automatically via Red Hat’s dnf utility.

How to install RHEL 8

Visit https://www.redhat.com/en/technologies/linux-platforms/enterprise-linux and follow the instructions. If you have a paid subscription, you will be able to download the RHEL 8 DVD ISO and use it to install the operating system on a physical or virtual machine. You may also sign up for a free 30-day trial.

Once installed, you will need to register the instance with Red Hat using subscription-manager in order to update it. While logged in as a user with administrator privileges, issue the following commands from a shell prompt:

$ sudo subscription-manager register
  (provide Red Hat account Username and Password)
$ sudo subscription-manager attach --auto
$ sudo dnf update

How to install Percona packages for RHEL 8

After you have started the RHEL 8 instance and registered it with Red Hat, you can install the percona-release package, which allows you to configure the correct repositories for the product you want to install.

$ sudo dnf install https://repo.percona.com/yum/percona-release-latest.noarch.rpm

Installation instructions per product

Newer Percona products are separated into their own repositories in order to reduce dependency conflicts with other Percona software versions.  The percona-release script is used to configure the correct repositories for each product.

Percona Server 8.0.x

Percona Server 8.0.x is deployed into a separate repository.   Using the percona-release tool, set up the ps80 product.

Note:  Due to the modularity features built into RHEL 8’s dnf software package manager,  the mysql module must first be disabled before installing Percona Server 8.0.

$ sudo dnf module disable mysql
$ sudo percona-release setup ps80
$ sudo dnf install percona-server-server percona-server-client

If you want to install the MyRocks or TokuDB packages:

$ sudo dnf install percona-server-rocksdb
$ sudo dnf install percona-server-tokudb

After installation, you will need to start the server and use the generated temporary password to set a new one:

$ sudo systemctl start mysqld
$ sudo grep 'temporary password' /var/log/mysqld.log
$ mysqladmin -uroot -p password

Percona Server 5.7.x

Percona Server 5.7.x is deployed into the “original” repository location.   

Note: Due to the modularity features built into RHEL 8’s dnf software package manager, the mysql module must first be disabled before installing Percona Server 5.7.

$ sudo dnf module disable mysql
$ sudo percona-release setup ps57
$ sudo dnf install Percona-Server-server-57 Percona-Server-client-57

If you want to install the MyRocks or TokuDB packages:

$ sudo dnf install Percona-Server-rocksdb-57
$ sudo dnf install Percona-Server-tokudb-57

After installation, you will need to start the server and use the generated temporary password to set a new one:

$ sudo systemctl start mysqld
$ sudo grep 'temporary password' /var/log/mysqld.log
$ mysqladmin -uroot -p password

Percona XtraBackup 8.0.x

Percona XtraBackup 8.0.x is deployed into a separate tools repository. If you have already configured the repositories with the percona-release setup ps80 command, as in the Percona Server 8.0 instructions above, no additional repository configuration is required. If you are installing Percona XtraBackup 8.0 for use with MySQL Community 8.0, use the percona-release tool to enable the tools repository, which contains XtraBackup, Toolkit, the PMM Client, and other dependencies.

Note: Due to the modularity features built into RHEL 8’s dnf software package manager, the mysql module must first be disabled before installing Percona XtraBackup 8.0 alongside Percona Server 8.0.

To install and use with Percona Server 8.0

$ sudo dnf module disable mysql
$ sudo percona-release setup ps80
$ sudo dnf install percona-xtrabackup-80

To install and use with MySQL Community 8.0

$ sudo percona-release enable-only tools
$ sudo dnf install percona-xtrabackup-80

Percona XtraBackup 2.4.x

Percona XtraBackup 2.4.x is deployed into the “original” repository location.  This repository is enabled by default so no repository configuration is required.

$ sudo dnf install percona-xtrabackup-24

Percona XtraDB Cluster 5.7.x

Percona XtraDB Cluster 5.7.x is deployed into the “original” repository location.   This repository is enabled by default so no repository configuration is required.

Note: Due to the modularity features built into RHEL 8’s dnf software package manager, the mysql module must first be disabled before installing Percona XtraDB Cluster 5.7.

$ sudo dnf module disable mysql
$ sudo percona-release setup pxc57
$ sudo dnf install Percona-XtraDB-Cluster-57

Percona Server for MongoDB 4.0.x

Percona Server for MongoDB 4.0.x is deployed into a separate repository.   Using the percona-release tool, set up the psmdb40 product repositories.

$ sudo percona-release setup psmdb40
$ sudo dnf install percona-server-mongodb

Percona Toolkit 3.0.x

Percona Toolkit 3.0.x is deployed into both the “original” repository location and the new tools location.  This allows Percona Toolkit to be easily used with different products. If you have set up one of the server products above,  Percona Toolkit 3.0.x will be available.

Note: There is a known issue with Percona Toolkit and RHEL 8. Percona Toolkit uses the RHEL 8 perl-DBD-MySQL database driver, which is linked against the MariaDB C Connector. The version of this connector that ships with RHEL 8 does not support MySQL 8.0 SHA-2 (SHA256) authentication. Therefore, trying to connect to a Percona Server 8.0 or MySQL Community 8.0 instance that has this authentication mode enabled will fail. The SHA-2 authentication plugin is enabled by default. However, at the time of this writing, the MySQL Community 8.0 server that ships with RHEL 8 disables this authentication plugin to work around the incompatibility with the MariaDB C Connector. Percona Server 8.0 does not disable SHA-2 by default.

$ sudo dnf install percona-toolkit
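If a toolkit tool fails to connect for this reason, one hedged workaround is to switch the account it uses to the legacy authentication plugin (the user name and password below are placeholders):

$ mysql -uroot -p -e "ALTER USER 'ptuser'@'%' IDENTIFIED WITH mysql_native_password BY 'S3cret!';"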

PMM Client 1.x

PMM Client 1.x  is deployed into both the “original” repository location and the new tools location.  This allows PMM Client to be easily used with different products. If you have enabled one of the server products above,  PMM Client will be available. This “original” repository is enabled by default so no repository configuration is required.    

$ sudo dnf install pmm-client


We are pleased to announce the release of PMM 2 Beta!  PMM (Percona Monitoring and Management) is a free and open-source platform for managing and monitoring MySQL, MongoDB, and PostgreSQL performance.

  • Query Analytics:
    • MySQL and MongoDB – Slow log, PERFORMANCE_SCHEMA, and Profiler data sources
    • Support for large environments – by default, view all queries from all instances
    • Filtering – display only the results matching filters such as the schema name or the server instance
    • Sorting and more columns – now sort by any column.
    • Modify Columns – Add one or more columns for any field exposed by the data source
    • Sparklines – restyled sparklines targeted at making data representation more accurate
  • Labels – Prometheus now supports auto-discovered and custom labels
  • Inventory Overview Dashboard – Displays the agents, services, and nodes which are registered with PMM Server
  • Environment Overview Dashboard – See issues at a glance across multiple servers
  • API – View versions and list hosts using the API
  • MySQL, MongoDB, and PostgreSQL Metrics – Visualize database metrics over time
  • pmm-agent – Provides secure remote management of the exporter processes and data collectors on the client

PMM 2 Beta is still a work in progress – you may encounter some bugs and missing features. We are aware of a number of issues, but please report any and all that you find to Percona’s JIRA.

This release is not recommended for Production environments.

PMM 2 is designed to be used as a new installation – please don’t try to upgrade your existing PMM 1 environment.
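A fresh installation is straightforward with Docker; here is a hedged sketch (the beta image tag is an assumption; check the perconalab/pmm-server repository on Docker Hub for the current one):

# Create a data container for /srv, then run the PMM 2 Beta server.
$ docker create -v /srv --name pmm-data perconalab/pmm-server:2.0.0-beta1 /bin/true
$ docker run -d -p 80:80 -p 443:443 --volumes-from pmm-data \
      --name pmm-server perconalab/pmm-server:2.0.0-beta1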

Query Analytics Dashboard

Query Analytics Dashboard now defaults to displaying all queries on each of the systems that are configured for MySQL PERFORMANCE_SCHEMA, Slow Log, and MongoDB Profiler, and includes comprehensive filtering capabilities.

Query Analytics Overview

You’ll recognize some of the common elements in PMM 2 Query Analytics, such as the Load, Count, and Latency columns. However, there are new elements, such as the filter box and more arrows on the columns.

Query Detail

Query Analytics continues to deliver detailed information regarding individual query performance.

Filter and Search By

There is a filtering panel on the left, or you can use the search bar to set filters using key:value syntax. For example, if I’m interested in just the queries related to the mysql-sl2 server, I can type d_server:mysql-sl2.

Sort by any column

This is a much-requested feature from PMM Query Analytics users, and we’re glad to announce that you can now sort by any column! Just click the small arrow to the right of the column name.

Sparklines

As you may have already noticed, we have changed the sparkline representation. New sparklines are not points-based lines, but are interval-based, and look like a staircase line with flat values for each of the displayed periods.

We also position a single sparkline for only the left-most column and render numeric values for all remaining columns.

Add extra columns

Now you can add a column for each additional field which is exposed by the data source. For example, you can add Rows Examined by clicking the + sign and typing or selecting from the available list of fields.

MySQL Query Analytics Slow Log source

We’ve increased our MySQL support to include both PERFORMANCE_SCHEMA and Slow Log sources.


A couple of weeks ago I presented at Percona University São Paulo about the new features in PostgreSQL that allow the deployment of simple shards. I’ve tried to summarize the main points in this post, as well as providing an introductory overview of sharding itself. Please note I haven’t included any third-party extensions that provide sharding for PostgreSQL in my discussion below.

Partitioning in PostgreSQL

In a nutshell, until not long ago there wasn’t a dedicated, native feature in PostgreSQL for table partitioning. Not that that prevented people from doing it anyway: the PostgreSQL community is very creative. There’s a table inheritance feature in PostgreSQL that allows the creation of child tables with the same structure as a parent table. That, combined with the employment of proper constraints in each child table along with the right set of triggers in the parent table, has provided practical “table partitioning” in PostgreSQL for years (and still works). Here’s an example:

Using table inheritance

CREATE TABLE temperature (
  id BIGSERIAL PRIMARY KEY NOT NULL,
  city_id INT NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  temp DECIMAL(5,2) NOT NULL
);

Figure 1a. Main (or parent) table

CREATE TABLE temperature_201901 (CHECK (timestamp >= DATE '2019-01-01' AND timestamp < DATE '2019-02-01')) INHERITS (temperature);
CREATE TABLE temperature_201902 (CHECK (timestamp >= DATE '2019-02-01' AND timestamp < DATE '2019-03-01')) INHERITS (temperature);
CREATE TABLE temperature_201903 (CHECK (timestamp >= DATE '2019-03-01' AND timestamp < DATE '2019-04-01')) INHERITS (temperature);

Figure 1b. Child tables inherit the structure of the parent table and are limited by constraints

CREATE OR REPLACE FUNCTION temperature_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.timestamp >= DATE '2019-01-01' AND NEW.timestamp < DATE '2019-02-01' ) THEN INSERT INTO temperature_201901 VALUES (NEW.*);
    ELSIF ( NEW.timestamp >= DATE '2019-02-01' AND NEW.timestamp < DATE '2019-03-01' ) THEN INSERT INTO temperature_201902 VALUES (NEW.*);
    ELSIF ( NEW.timestamp >= DATE '2019-03-01' AND NEW.timestamp < DATE '2019-04-01' ) THEN INSERT INTO temperature_201903 VALUES (NEW.*);
    ELSE RAISE EXCEPTION 'Date out of range!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

Figure 1c. A function that controls in which child table a new entry should be added according to the timestamp field

CREATE TRIGGER insert_temperature_trigger
    BEFORE INSERT ON temperature
    FOR EACH ROW EXECUTE PROCEDURE temperature_insert_trigger();

Figure 1d. A trigger is added to the parent table that calls the function above when an INSERT is performed

The biggest drawbacks of such an implementation were the amount of manual work needed to maintain such an environment (even though a certain level of automation can be achieved through third-party extensions such as pg_partman) and the lack of optimization/support for “distributed” queries. The PostgreSQL optimizer wasn’t advanced enough to have a good understanding of partitions at the time, though there were workarounds, such as employing constraint exclusion.
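To illustrate the constraint exclusion workaround, here is a quick psql check (note that constraint_exclusion = partition is already the default in recent releases, and plan details will vary):

$ psql -c "SET constraint_exclusion = partition;
           EXPLAIN SELECT * FROM temperature
           WHERE timestamp >= DATE '2019-02-01' AND timestamp < DATE '2019-03-01';"
# The plan should touch only the parent table and temperature_201902,
# skipping the other child tables thanks to their CHECK constraints.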

Declarative partitioning

About a year and a half ago, PostgreSQL 10 was released with a bunch of new features, among them native support for table partitioning through the new declarative partitioning feature. Here’s how we could partition the same temperature table using this new method:

CREATE TABLE temperature (
  id BIGSERIAL NOT NULL,
  city_id INT NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  temp DECIMAL(5,2) NOT NULL
) PARTITION BY RANGE (timestamp);

Figure 2a. Main table structure for a partitioned table

CREATE TABLE temperature_201901 PARTITION OF temperature FOR VALUES FROM ('2019-01-01') TO ('2019-02-01');
CREATE TABLE temperature_201902 PARTITION OF temperature FOR VALUES FROM ('2019-02-01') TO ('2019-03-01');
CREATE TABLE temperature_201903 PARTITION OF temperature FOR VALUES FROM ('2019-03-01') TO ('2019-04-01');

Figure 2b. Tables defined as partitions of the main table; with declarative partitioning, there was no need for triggers anymore.

It still lacked the optimization and flexibility needed to be considered a complete partitioning solution. It wasn’t possible, for example, to perform an UPDATE that would result in moving a row from one partition to a different one, but the foundation had been laid. Fast forward another year, and PostgreSQL 11 builds on top of this, delivering additional features like:

  • the possibility to define a default partition, to which any entry that wouldn’t fit a corresponding partition is added
  • having indexes added to the main table “replicated” to the underlying partitions, which improved declarative partitioning usability
  • support for Foreign Keys

These are just a few of the features that led to a more mature partitioning solution.
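Here is a short sketch of the first two of those additions against our temperature table (a minimal example, psql assumed):

$ psql <<'SQL'
-- rows that match no existing partition now land here instead of erroring out
CREATE TABLE temperature_default PARTITION OF temperature DEFAULT;
-- an index created on the parent is cascaded to every partition
CREATE INDEX ON temperature (timestamp);
SQL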

Sharding in PostgreSQL

By now you might reasonably be questioning my premise, arguing that partitioning is not sharding, at least not in the sense and context you would have expected this post to cover. In fact, PostgreSQL has implemented sharding on top of partitioning by allowing any given partition of a partitioned table to be hosted by a remote server. The basis for this is PostgreSQL’s Foreign Data Wrapper (FDW) support, which has been part of the core of PostgreSQL for a long time. While technically possible to implement with the table inheritance + triggers approach, we just couldn’t make practical use of it for sharding. Declarative partitioning allowed for much better integration of these pieces, making sharding – partitioned tables hosted by remote servers – more of a reality in PostgreSQL.

CREATE TABLE temperature_201904 (
  id BIGSERIAL NOT NULL,
  city_id INT NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  temp DECIMAL(5,2) NOT NULL
);

Figure 3a. On the remote server we create a “partition” – nothing but a simple table

CREATE EXTENSION postgres_fdw;
GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw to app_user;
CREATE SERVER shard02 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (dbname 'postgres', host 'shard02', port '5432');
CREATE USER MAPPING for app_user SERVER shard02
    OPTIONS (user 'fdw_user', password 'secret');

Figure 3b. On the local server the preparatory steps involve loading the postgres_fdw extension, allowing our local application user to use that extension, creating an entry to access the remote server, and finally mapping that user with a user in the remote server (fdw_user) that has local access to the table we’ll use as a remote partition.

CREATE FOREIGN TABLE temperature_201904 PARTITION OF temperature
    FOR VALUES FROM ('2019-04-01') TO ('2019-05-01')
    SERVER shard02;

Figure 3c. Now it’s simply a matter of creating a proper partition of our main table on the local server that will be linked to the table of the same name on the remote server

You can read more about postgres_fdw in “Foreign Data Wrappers in PostgreSQL” and “A closer look at postgres_fdw”.

When does it make sense to partition a table?

There are several principal reasons to partition a table:

  1. When a table grows so big that searching it becomes impractical even with the help of indexes (which will invariably become too big as well).
  2. When data management is such that the target data is often the most recently added and/or older data is constantly being purged/archived, or even not being searched anymore (at least not as often).
  3. If you are loading data from different sources and maintaining it as a data warehouse for reporting and analytics.
  4. For less expensive archiving or purging of massive data that avoids exclusive locks on the entire table.

When should we resort to sharding?

Here are a couple of classic cases:

  1. To scale out (horizontally), when even after partitioning a table the amount of data is too great or too complex to be processed by a single server.
  2. Use cases where the data in a big table can be divided into two or more segments that would benefit the majority of the search patterns. A common example used to describe a scenario like this is that of a company whose customers are evenly spread across the United States and where searches of a target table involve the customer ZIP code. One shard could then host the entries of customers located on the East Coast and another those of customers on the West Coast.

Note, though, that this is by no means an exhaustive list.

How should we shard the data?

With sharding (in this context) being “distributed” partitioning, the essence of a successful (performant) sharded environment lies in choosing the right shard key – and by “right” I mean one that will distribute your data across the shards in a way that will benefit most of your queries. In the example above, using the customer ZIP code as the shard key makes sense if the application will more often issue queries that hit one shard (East) or the other (West). However, if most queries filtered by, say, birth date, then all queries would need to be run through all shards to recover the full result set. This could easily backfire on performance with the shard approach, by not selecting the right shard key or simply by having such a heterogeneous workload that no shard key would be able to satisfy it.

It only ever makes sense to shard if the nature of the queries involving the target table(s) is such that distributed processing will be the norm and will constitute an advantage far greater than any overhead caused by the minority of queries that rely on JOINs involving multiple shards. Due to the distributed nature of sharding, such queries will necessarily perform worse compared to having all the data hosted on the same server.

Why not simply rely on replication or clustering?

Sharding should be considered in those situations where you can’t efficiently break down a big table through data normalization or use an alternative approach, and maintaining it on a single server is too demanding. The table is then partitioned and the partitions distributed across different servers to spread the load. It doesn’t need to be one partition per shard; often a single shard will host a number of partitions.

Note how sharding differs from traditional “share all” database replication and clustering environments: you may use, for instance, a dedicated PostgreSQL server to host a single partition from a single table and nothing else. However, these data scaling technologies may well complement each other: a PostgreSQL database may host a shard with part of a big table as well as replicate smaller tables that are often used for some sort of consultation (read-only), such as a price list, through logical replication.

How does sharding in PostgreSQL relate to sharding in MongoDB®?

MongoDB® tackles the matter of managing big collections straight through sharding: there is no concept of local partitioning of collections in MongoDB. In fact, the whole MongoDB scaling strategy is based on sharding, which takes a central place in the database architecture. As such, the sharding process has been made as transparent to the application as possible: all a DBA has to do is define the shard key.

Instead of connecting to a reference database server, the application connects to an auxiliary router server named mongos, which processes the queries and requests the necessary information from the respective shard(s). The routers know which shard contains what because they maintain a copy of the metadata that maps chunks of data to shards, which they get from a config server, another important and independent component of a MongoDB sharded cluster. Together, they also play a role in maintaining good data distribution across the shards, actively splitting and migrating chunks of data between servers as needed.
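For comparison with the PostgreSQL steps shown earlier, the MongoDB side really does reduce to declaring the shard key through mongos; a minimal sketch (host, database, and collection names are illustrative):

$ mongo --host mongos01:27017 --eval '
    sh.enableSharding("sales");
    sh.shardCollection("sales.customers", { zip: 1 });   // zip is the shard key
'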

In PostgreSQL, the application connects to and queries the main database server. There isn’t an intermediary router such as mongos, but PostgreSQL’s query planner will process the query and create an execution plan. When data requested from a partitioned table is found on a remote server, PostgreSQL will request the data from it, as shown in the EXPLAIN output below:

…
   Remote SQL: UPDATE public.emp SET sal = $2 WHERE ctid = $1
   ->  Nested Loop  (cost=100.00..300.71 rows=669 width=118)
     	Output: emp.empno, emp.ename, emp.job, emp.mgr, emp.hiredate, (emp.sal * '1.1'::double precision), emp.comm, emp.deptno, emp.ctid, salgrade.ctid
     	Join Filter: ((emp.sal > (salgrade.losal)::double precision) AND (emp.sal < (salgrade.hisal)::double precision))
     	->  Foreign Scan on public.emp  (cost=100.00..128.06 rows=602 width=112)
           	Output: emp.empno, emp.ename, emp.job, emp.mgr, emp.hiredate, emp.sal, emp.comm, emp.deptno, emp.ctid
…

Figure 4: excerpt of an EXPLAIN plan that involves processing a query in a remote server

Note the mention of “Remote SQL” in the plan above. A lot of optimizations have been made to the execution of remote queries in PostgreSQL 10 and 11, which have helped mature and improve the sharding solution. Among them is support for having grouping and aggregation operations executed on the remote server itself (“push down”) rather than recovering all rows and having them processed locally.

What is missing in PostgreSQL implementation?

There is, however, still room for improvement. In terms of remote execution, reports from the community indicate that not all queries are performing as they should. For example, in some cases the PostgreSQL planner does not perform a full push-down, resulting in shards transferring more data than required. Parallel scheduling of queries that touch multiple shards is not yet implemented: for now, the execution takes place sequentially, one shard at a time, which takes longer to complete. When it comes to the maintenance of partitioned and sharded environments, changes in the structure of partitions are still complicated and not very practical. For example, when you add a new partition to a partitioned table with an appointed default partition, you may need to detach the default partition first if it contains rows that would now fit in the new partition, manually move those rows to the new partition, and finally re-attach the default partition.

But that is all part of a maturing technology. We’re looking forward to PostgreSQL 12 and what it will bring on the partitioning and sharding fronts.


Image based on photos by Leonardo Quatrocchi from Pexels


In this blog post, I’ll walk through using Docker containers and discuss the advantages of using them to run multiple versions of Percona Server for MongoDB (PSMDB) on the same server. This is particularly useful for local testing. Lots of companies have started to use containers for their applications due to the advantages of this model and its hassle-free deployment. I’ll also show you how to configure variables for MongoDB® when using Docker containers. Hopefully, this will be a useful introduction that helps you start working with Docker yourself. Let’s get into the topic…

Experiments and testing…

It has always been fun to work with test cases for our databases. When I do that, I generally use VMs, Docker containers, or a test machine – hmm! I know what you think… your boss should give you one if he wants you to solve problems by testing. There are still more things you could try. Here at Percona, that might mean the Percona Server for MongoDB Operator (see the early release here: https://www.percona.com/blog/2019/02/21/percona-server-for-mongodb-operator-0-2-1-early-access-release-is-now-available/), a lightweight, packaged MongoDB instance that can be deployed and used with Kubernetes + minikube, or with Docker. You can expect more information about how this might work from our folks very soon!

What is Docker?

Figure: basic Docker architecture diagram.
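The payoff described in the introduction, running several PSMDB versions side by side on one host, is easy to sketch (the image tags are assumptions; check the percona/percona-server-mongodb repository on Docker Hub for available tags):

# Each container publishes mongod on its own host port.
$ docker run -d --name psmdb34 -p 27017:27017 percona/percona-server-mongodb:3.4
$ docker run -d --name psmdb40 -p 27018:27017 percona/percona-server-mongodb:4.0
$ mongo --port 27018 --eval 'db.version()'   # talks to the 4.0 container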


Please join Percona’s Product Manager Michael Coburn as he presents his talk Monitoring MongoDB with Percona Monitoring and Management (PMM) on May 16th, 2019, at 10:00 AM PDT (UTC-7) / 1:00 PM EDT (UTC-4).

Register Now

In this talk, learn how to monitor MongoDB using Percona Monitoring and Management (PMM) so that you can:

– Gain greater visibility of performance and bottlenecks in MongoDB
– Consolidate your MongoDB servers into the same monitoring platform you already use for MySQL and PostgreSQL
– Respond more quickly and efficiently to Severity 1 issues

We’ll also show how to use PMM’s native support for MongoDB so that you can have MongoDB integrated in only minutes!
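As a hedged sketch of how quick that integration is with the PMM 1.x client (the server address and MongoDB URI below are placeholders for your environment):

$ pmm-admin config --server pmm.example.com
$ pmm-admin add mongodb --uri mongodb://localhost:27017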

In order to learn more, watch our webinar.
