In this presentation, we will discuss how to create custom rules when the default rules are not enough for the application. Have you ever needed to give a user a more permissive rule just because that user wanted to run a specific command?
Also, we will discuss how to use views to hide fields from users when we don’t want them to read the whole collection. If you have concerns about security, come to this talk.
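As a rough illustration of both ideas, the sketch below builds the command documents for a view that hides a field and for a custom role that can only read that view. It assumes that "custom rules" map onto MongoDB user-defined roles; the database, collection, view, role, and field names are all hypothetical, and the helper functions are not part of any driver.

```python
# Hypothetical names throughout: database "shop", collection "customers",
# view "customers_public", role "publicCustomerReader", hidden field "ssn".

def build_view_command(view, source, hidden_fields):
    """Build a `create` command document for a view that excludes some fields."""
    projection = {field: 0 for field in hidden_fields}
    return {
        "create": view,
        "viewOn": source,
        "pipeline": [{"$project": projection}],
    }

def build_role_command(role, db, collection, actions):
    """Build a `createRole` command granting actions on one namespace only."""
    return {
        "createRole": role,
        "privileges": [
            {"resource": {"db": db, "collection": collection},
             "actions": list(actions)},
        ],
        "roles": [],
    }

view_cmd = build_view_command("customers_public", "customers", ["ssn"])
role_cmd = build_role_command(
    "publicCustomerReader", "shop", "customers_public", ["find"]
)
print(view_cmd)
print(role_cmd)
```

Against a live server, documents like these could be passed to a driver's generic command call (for example `db.command(...)` in PyMongo). A user holding only this role can query the view but not the underlying collection.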
Those are all very important, but to me making MongoDB easier to use and removing technical debt from the code base is even more exciting. These ‘small print’ developments are the ones that DBAs, site reliability, and support engineers need the most.
MMAPv1 storage engine removed
Dearly beloved, we are here today to mourn the MMAPv1 storage engine. SERVER-35112 reads: “156 changed files with 89 additions and 33,895 deletions.”
When you are troubleshooting slow performance and need to find the commands that are the main troublemakers, you can now use grep, awk, sort, and uniq -c so much more easily. I expect that in time this will be leveraged to make mplotqueries run much faster (4.2+ log files only).
The queryHash will also be included in currentOp output and saved in system.profile profiling sample documents.
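The grep, sort, and uniq -c idea can be sketched in a few lines of Python. This is only an illustration: the sample log lines are invented and simplified, and the assumption that the hash appears as `queryHash:` followed by eight hexadecimal characters should be checked against your own 4.2 log files.

```python
import re
from collections import Counter

# Assumed token format: "queryHash:" followed by 8 uppercase hex characters.
QUERY_HASH = re.compile(r"\bqueryHash:([0-9A-F]{8})\b")

def count_query_hashes(lines):
    """Count how often each queryHash value appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = QUERY_HASH.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Invented, simplified log lines for illustration only.
sample_log = [
    "COMMAND [conn1] command test.orders planSummary: COLLSCAN queryHash:A1B2C3D4 120ms",
    "COMMAND [conn2] command test.users planSummary: IXSCAN queryHash:0F9E8D7C 101ms",
    "COMMAND [conn3] command test.orders planSummary: COLLSCAN queryHash:A1B2C3D4 340ms",
    "NETWORK [listener] connection accepted",
]

counts = count_query_hashes(sample_log)
# Most frequent query shapes first, like `sort | uniq -c | sort -rn`.
for query_hash, n in counts.most_common():
    print(n, query_hash)
```

Grouping by hash rather than by the full command text means queries that differ only in their literal values are counted together.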
Autosplitting thread removed from mongos nodes
“Rather than mongos tracking chunk size and dictating to a shard when a chunk should be split, the primary node of the shard will now track the chunk size, providing more consistent splitting behavior.”
This will possibly surprise many users, but the responsibility of running the balancer and finding large chunks and splitting them was originally programmed to be run by a mongos node, rather than in one config server or some fixed shard node. All the mongos nodes had an extra thread running the code for this necessary sharding maintenance; all would race to take the cluster’s distributed lock and be the one mongos node that would be allowed to proceed.
There could be contention for the balancer lock, which is one problem. Another problem is that the sudden death of a mongos node might leave the balancer locked indefinitely (a “stale” lock). Both of these were resolved when the balancer logic was migrated to the primary config server in v3.4. Debugging a balancer lock issue would no longer involve looking through every mongos log.
However, SERVER-34448 marks the final link in a chain of tickets that removes the mongos nodes from these duties altogether. Shards will now search for their own large chunks and, when they find them, split them and update the chunk ranges in the config db.
Transactions are a feature of RDBMS systems, but MongoDB is a document-oriented, non-RDBMS database. It is widely known for how simple it makes importing data and sharding that data across many servers. MongoDB 4.0 introduced a new feature, multi-document transactions, for replica set environments, and we will explore how an application can adapt to use it. This webinar will also examine how a transaction in MongoDB is achieved, and we will discuss other exciting news from the launch of MongoDB version 4.2.
Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.6 Community Edition. It supports MongoDB 3.6 protocols and drivers.
Percona Server for MongoDB extends Community Edition functionality by including the Percona Memory Engine storage engine, as well as several enterprise-grade features. It also includes the MongoRocks storage engine, which is now deprecated. Percona Server for MongoDB requires no changes to MongoDB applications or code.
Percona Server for MongoDB 3.6.13-3.3 introduces the support of HashiCorp Vault key management service. For more information, see
Percona’s MongoDB Tech Lead Akira Kurogane takes a look at MongoDB’s 4.2 release.
Initial thoughts? Some great! Some not so compelling.
Including distributed transactions is a great accomplishment, making MongoDB the only popular NoSQL distributed database to plug the gap with this fundamental feature, previously only available in RDBMS.
There is tight use of the WiredTiger storage engine API, implementation of logical clocks, sharding logic enhancement, and so many changes in various dimensions. But, maybe most importantly, there is a big change in the cost dimension for consumers of the existing transaction-supporting distributed data products. Distributed processing is not an add-on feature in MongoDB and the sharded transactions released this week are not an add-on Enterprise-only feature either.
I wouldn’t be surprised if some customers of proprietary RDBMSs start calling their account executives to ask whether license renewal fees are going to be ramped down in the coming years. They could say: “We’re thinking about migrating to a DB with better pricing, performance, availability, and flexibility, one that just overcame its last feature gap. I understand that migrations are expensive, but then again so are your annually-recurring, per-node, database software licenses.”
To go back to the technical points, my favorite part of the MongoDB World 2019 Morning Keynotes was the clear demonstration of how distributed transactions appear in the oplog. You can see this 32 minutes into the keynote.
Multiverse databases have seen recent advances, and I remember wishing for them long ago, before any were implemented as public projects.
Although MongoDB’s field-level encryption (FLE) isn’t that sort of implementation, it achieves a similar business goal: a database-internalized safety catch that prevents one user’s document data from being revealed to another, without making separate database collections for each user. I deduce the implementation is a collection of keys (as small or as large as you might like) to encrypt documents independently of other documents, with application rules by database object namespace and/or user id matching. I haven’t tried it hands-on yet, but undoubtedly the DBA’s burden is going to be key management. Sadly we weren’t given a preview of what that looks like, as we had been with the other hands-on demos in the morning keynote.
(A) Third-party search engine integration
I think MongoDB has absolutely done the right thing here – by not trying to further enhance full-text search within the database server itself.
Speaking from experience as a search engine developer, I know the inner loop of a search engine’s algorithm is a very different business to a database’s. Having a pure search engine run in a different process, presumably on a separate server, is just good sense for the following reasons.
Which algorithm you choose varies dramatically according to what you value in relevancy. Do popular search terms in the last 1, 6, and 24hrs get a boost? Do you need phrase detection? Automatic detection and suppression of keyword-stuffing content? Inclusion of non-European languages? There are so many different sorts of search that the demand would overwhelm MongoDB’s core server development if they tried to take it all on.
Also, there would be a big impact on performance. Search index (re)building is compute-intensive, to put it mildly. If it were within the mongod process, database operation latency would be volatile while the search engine reindexes.
In my opinion, providing search queries through the same MongoDB driver interface is totally the right way to go, so kudos for that! Caveat: only as long as the syntax design is right – though it looked correct in the on-stage demonstration.
Arguably you could have the same thing right now if you just (mis)use one of the popular open-source search engines as your database, but the performance for doing typical database operations won’t be as high.
Lucene is the only search engine supported. MongoDB did not announce a generic interface to integrate with an external index-making service. So, owners of in-house search solutions who have excellent, irreplaceable relevancy will not be able to integrate them with MongoDB using this new feature. For them, goals such as consolidation of data feeds, or getting combined database documents and search matches in a single query/request, will remain just dreams for now.
I also wonder how accessible and modifiable the configuration of the Lucene server will be. There was a claim that you will get the “full power of Lucene,” but that is immediately false if it is unconfigurable. And there are other pressing questions. When you change something in the Lucene configuration, the search indexes (and hence document ‘hits’) will typically change and be rebuilt over, say, hours. Is it a full downtime situation? Or is it more like the indexes are dropped and will reappear after a background index build? I look forward to getting clarification on this.
Server-side document updates
As Eliot Horowitz put it succinctly on Tuesday morning:
“There is one thing you haven’t been able to do [with a MongoDB update command] before though, and that is set the value of A to value of B + C.”
Thanks to long-running development this is now possible, by both the classic update command:
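As a minimal sketch of such an update, assuming the pipeline-style update syntax of MongoDB 4.2 and borrowing the field names a, b, and c from the quote, the update document might look like the one below. The small evaluator is a hypothetical local stand-in for the server, handling only `$set` with `$add`, so that the effect can be seen without a running database.

```python
# Update expressed as an aggregation pipeline (MongoDB 4.2+ syntax):
# set field "a" to the sum of fields "b" and "c" of the same document.
pipeline = [{"$set": {"a": {"$add": ["$b", "$c"]}}}]

def apply_pipeline_update(doc, pipeline):
    """Hypothetical local stand-in for the server.

    Supports only $set stages whose values are {"$add": ["$field", ...]}.
    """
    current = dict(doc)
    for stage in pipeline:
        updated = dict(current)
        for field, expr in stage["$set"].items():
            # "$b" refers to the value of field "b" in the stage's input document.
            updated[field] = sum(current[path[1:]] for path in expr["$add"])
        current = updated
    return current

doc = {"_id": 1, "b": 2, "c": 3}
updated = apply_pipeline_update(doc, pipeline)
print(updated)
```

With a driver that supports pipeline updates (PyMongo 3.9 or later, for example), the same `pipeline` list would be passed as the update argument, as in `collection.update_many({}, pipeline)`.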
And also (much more impressively) in the aggregation pipeline.
(Image: a Soyuz T control panel, which is more soothing on the eyes than the above.)
There are pros and cons. The primary con is that the aggregation pipeline syntax is very verbose. It would be an unreasonable expectation that people can create statements like the two examples above on the first try, or even on the third. Even when you do learn to create the commands you want, you will not retain that memory in fluent recall capacity for very long, and you’ll be back to the manual pages every time.
The pro (at least for me) is that it makes it easier to picture how the server processes the command. With a language like SQL that abstracts over implementation details, you know that you don’t know how it is making the access to table/collection data. It was MongoDB’s open source and open JIRA ticket information in the early years that provided, for me, the sudden break away from that ‘learned helplessness’ as an RDBMS user.
Come back soon for the sequel post: “Diving into the small-print of MongoDB 4.2 features (which I am much more excited about!).”
Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 3.4 Community Edition. It supports MongoDB 3.4 protocols and drivers.
Percona Server for MongoDB extends Community Edition functionality by including the Percona Memory Engine storage engine, as well as several enterprise-grade features:
The first GA version of Percona Server for MongoDB was released on December 14, 2015. Since then we have regularly released updates, training, and bug fixes, ensuring our users are fully informed and have the most up-to-date software possible.
Our latest version, Percona Server for MongoDB 4.0.10-5, includes HashiCorp Vault integration for added security. You can read our recent blog for more information on the new and improved features it contains.
Percona Backup for MongoDB
As part of our commitment to MongoDB, we continue to create features and software components that help users achieve optimal database performance and security.
Having observed a business need, we set about creating an open source MongoDB backup tool that gives users enhanced database disaster recovery abilities. As a result, on June 17, 2019, we were excited to announce the early release of our latest software product, Percona Backup for MongoDB 0.5.0.
Percona Backup for MongoDB is a free, open source backup solution for consistent backups of MongoDB sharded clusters and replica sets. It enables you to manage your own backups without third-party involvement or costly licenses.
The GA version of Percona Backup for MongoDB is scheduled to be released later in 2019.
Percona Monitoring and Management (PMM) for MongoDB
You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis to ensure that your data works as efficiently as possible.
Percona Monitoring and Management can be used on-premises and in the cloud with providers such as Amazon Web Services (AWS), Google Cloud, Microsoft Azure, and others.
We will be releasing PMM 2, a greatly enhanced version of our PMM software later in 2019. Further details can be found here.
Percona Toolkit for MongoDB
Percona Toolkit is a collection of advanced open source command-line tools, developed and used by the Percona technical staff, engineered to perform a variety of MongoDB server and system tasks that are too difficult or complex to perform manually. This frees up your DBAs to focus on work that helps you achieve your business goals.
These tools are ideal alternatives to private or “one-off” scripts because they are professionally developed, formally tested, and fully documented. They are also fully self-contained, so installation is quick and easy, and no libraries are installed.
MongoDB User Services
In addition to our own open source versions of MongoDB software, we offer a range of services, including Support and Consulting.
Many companies have mission-critical applications which depend on their MongoDB database environment. But what happens if your database goes down or isn’t running at the optimum level? Percona brings world-class database expertise and open source values to provide a comprehensive, responsive, and cost-effective plan to help your business succeed.
Percona services are available to help your business succeed, whether you rely on Percona Server for MongoDB or MongoDB Community.
MongoDB Webinars and Updates
One of our core values is to keep users informed and up-to-date on software and market changes. Consequently, we provide regular webinars and updates on a variety of topics. Past webinars can be accessed via our website.
We have three upcoming MongoDB-themed webinars led by Percona experts:
Transactions are a feature of RDBMS systems, but MongoDB is a document-oriented non-RDBMS database. In MongoDB 4.0, a new multi-document transaction feature has been introduced in a replica set environment. This webinar will explore how a transaction in MongoDB is achieved.
Adamo will discuss how to create custom rules when the default rules are not enough for the application, and how to use views to hide fields from users when you don’t want them to read the whole collection. This is a great webinar for anyone with security concerns.
We all love relational databases… until we use them for a purpose they are not best suited for. Queues, caches, catalogs, unstructured data, counters, and many other use cases could be solved with relational databases, but are better solved with alternatives. This webinar reviews goals, pros and cons, and use cases of some of these alternatives, looking at modern open source implementations. You will learn the basics of three database paradigms (document, key-value, and columnar store), find out when it’s appropriate to opt for one of these, and when you should choose relational databases instead.
Please Contact us for More Information
We hope this brief update on Percona’s MongoDB capabilities has been useful. Please look out for our upcoming MongoDB software releases, webinars, and blogs.
Percona Server for MongoDB is an enhanced, open source, and highly-scalable database. It is a fully compatible, drop-in replacement for MongoDB 4.0 Community Edition and doesn’t require any changes to MongoDB applications or code.
At Percona we pride ourselves on adding new and exciting enterprise-level features to our software, not just duplicating the latest community version. We are also strongly focused on ensuring our software users have the tools they need to securely manage their data.
HashiCorp Vault Integration
As a result, we are excited to announce integration with HashiCorp Vault in our release of Percona Server for MongoDB 4.0.10-5.
Understanding who is accessing private information on your system can be a challenge. Regular password changes, safe storage, and detailed audit logs are essential to ensuring secure systems.
HashiCorp Vault is a product which manages secrets and protects sensitive data. It securely stores and tightly controls access to confidential information.
In previous versions of Percona Server for MongoDB, the data at rest encryption key was stored locally on the server inside the key file. Our integration with HashiCorp Vault now enables you to store the encryption key more securely inside the vault.
Further information on the key features and benefits of HashiCorp Vault can be found here. For more insight into Percona’s MongoDB capabilities please look out for our upcoming software announcements, webinars, and blogs.
Percona is pleased to announce the early release of our latest software product Percona Backup for MongoDB 0.5.0 on June 17, 2019. The GA version is scheduled to be released later in 2019.
Percona Backup for MongoDB is a distributed, low-impact tool for creating consistent backups across a MongoDB sharded cluster (or a single replica set), and for restoring those backups to a specific point in time. It uses a distributed client/server architecture to perform backup/restore actions. The project was inspired by (and intends to replace) an earlier backup tool. It supports MongoDB Community Server version 3.6 or higher. Please report any bugs you encounter in our bug tracking system.
Percona Server for MongoDB is an enhanced, open source, and highly-scalable database that is a fully-compatible, drop-in replacement for MongoDB 4.0 Community Edition. It supports MongoDB 4.0 protocols and drivers.
Percona Server for MongoDB 4.0.10-5 introduces the support of HashiCorp Vault key management service. For more information, see Data at Rest Encryption in the documentation of Percona Server for MongoDB.