Efficient Techniques for Fuzzy and Partial Matching in MongoDB
I learn as I go along
by
3y ago
Abstract: This blog post describes a number of techniques in MongoDB for efficiently finding documents that share a number of attributes with a supplied query without being an exact match. This concept of "fuzzy" searching lets users avoid the risk of failing to find important information because of slight differences in how it was entered. Introduction: Where users are able to enter data items manually, rather than choosing from a list of options, there is a reasonable probability that multiple users will enter data i ...
Calculating Correlation inside MongoDB
I learn as I go along
by
3y ago
I've recently been pondering the idea of a library of statistical and heuristic functions that run inside MongoDB using the aggregation pipeline. After all, if we can avoid pulling data out of the database, that must help performance. As a little experiment, here is the correlation coefficient of two fields using Pearson's rho. It's broken down into individual variables to make it easier to read, rather than written as one huge piece of JavaScript. That's usually the best way to write pipelines. //Pearsons Rho as a pipeline testdata = [{x:1,y:2}, {x:2,y:3}, ...
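The post's own pipeline is cut off above, so what follows is not the author's code. It is a minimal sketch in plain JavaScript of the arithmetic such a pipeline has to perform: accumulate six running sums (the kind of values a `$group` stage with `$sum` accumulators could produce), then combine them into Pearson's coefficient. The names `accumulate`, `pearson` and `sample`, and the sample data itself, are assumptions made up for illustration.

```javascript
// Hypothetical sample data, invented for this sketch
// (not the post's truncated test data).
const sample = [
  { x: 1, y: 2 },
  { x: 2, y: 3 },
  { x: 3, y: 5 },
  { x: 4, y: 4 },
];

// Build the running sums in a single pass over the documents.
function accumulate(docs) {
  const sums = { n: 0, sumX: 0, sumY: 0, sumXY: 0, sumX2: 0, sumY2: 0 };
  for (const { x, y } of docs) {
    sums.n += 1;
    sums.sumX += x;
    sums.sumY += y;
    sums.sumXY += x * y;
    sums.sumX2 += x * x;
    sums.sumY2 += y * y;
  }
  return sums;
}

// Pearson's coefficient from the sums:
// (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
function pearson(docs) {
  const s = accumulate(docs);
  const numerator = s.n * s.sumXY - s.sumX * s.sumY;
  const denominator = Math.sqrt(
    (s.n * s.sumX2 - s.sumX ** 2) * (s.n * s.sumY2 - s.sumY ** 2)
  );
  // Undefined when either field has zero variance (or there is no data).
  return denominator === 0 ? null : numerator / denominator;
}

console.log(pearson(sample));
```

Keeping each sum as a named intermediate value mirrors the post's advice to break a pipeline into individual variables rather than one large expression.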
MongoDB Queryable Backups - Time Travel in the Database
I learn as I go along
by
3y ago
MongoDB maintains a statement-based transaction log of all write operations, called the oplog. This is used to keep high-availability replicas in sync with the master copy. The Backup Agent ships this, in encrypted slices, to the Backup Server every minute. The Backup Server stores these slices in a database called the Oplog Store, then replays them into a copy of the database on the Backup Server called the Head Database. Every few hours it stops replaying them, looks at what has changed in the binary files of the Head Database and saves those changed ...
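The "time travel" idea described above, a saved copy plus a replayable log, can be illustrated with a toy model. This is a sketch under heavy assumptions, not how the Backup Server is implemented: the entry shape `{ ts, op, id, doc }` is hypothetical (only loosely modelled on oplog operation codes `i`, `u`, `d`), updates are treated as whole-document replacements, and `replay` is an invented name.

```javascript
// Rebuild the state at time `untilTs` from a snapshot plus an ordered log.
// snapshot: Map of id -> document; oplog: array sorted by ascending ts.
function replay(snapshot, oplog, untilTs) {
  const state = new Map(snapshot); // copy, so the snapshot stays untouched
  for (const entry of oplog) {
    if (entry.ts > untilTs) break; // later writes are not replayed
    if (entry.op === "i" || entry.op === "u") {
      state.set(entry.id, entry.doc); // insert, or replace on update
    } else if (entry.op === "d") {
      state.delete(entry.id);
    }
  }
  return state;
}

// Hypothetical snapshot and log, invented for illustration.
const snapshot = new Map([[1, { name: "a" }]]);
const oplog = [
  { ts: 10, op: "i", id: 2, doc: { name: "new" } },
  { ts: 20, op: "u", id: 1, doc: { name: "b" } },
  { ts: 30, op: "d", id: 2 },
];

// State as it was at ts = 20: document 2 exists, document 1 has been updated.
console.log(replay(snapshot, oplog, 20));
```

Choosing a different `untilTs` yields the database as it stood at that moment, which is what makes a backup of this shape queryable at arbitrary points in time.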
Twelve Steps to MongoDB Enlightenment
I learn as I go along
by
3y ago
1. You install the database with a single command, connect with a second and add your first document with a third - this seems really easy.
2. You write a performance test in the single-threaded JavaScript shell and are underwhelmed - then try POCDriver and marvel at the difference.
3. You use conditional update and findOneAndUpdate to make sequences, locks and queues.
4. You discover you cannot update an array in two different ways simultaneously - you raise a JIRA ticket.
5. You find your first use for a truly dynamic schema - then wonder how you can index it.
6. You discover the aggregation pipeline and map ...
Connecting to and Authenticating with MongoDB from Java using x509
I learn as I go along
by
3y ago
Introduction: I recently worked with a MongoDB customer who wanted to do encryption in flight correctly: SSL/TLS, x509, mutual certificate authentication between clients and servers, the full monty. We also created service accounts for applications, which would authenticate using x509, as opposed to human administrators, who would present a valid certificate to establish a TLS connection and then authenticate with their username and password from LDAP/Active Directory. As an added bonus, we configured MongoDB auditing to audit the deliberate activities of human admins whilst ...
MongoDB is not a Javascript Database
I learn as I go along
by
3y ago
Sometimes people refer to MongoDB as a Javascript/JSON database. It's not: its internal format is BSON, which is a serialised binary representation of objects. Unlike Javascript, everything in MongoDB is stored strongly typed and quickly traversable. That being said, people do access MongoDB with duck-typed languages like PHP, Perl, Python and, yes, Javascript, and the Mongo shell is a Javascript REPL, so it can certainly seem very Javascript focussed. Why does this matter, and what's this blog post about? Well, it matters because one of the things I want from a ...
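To make the "strongly typed binary" point concrete, here is a minimal sketch of a BSON encoder that handles only two element types from the BSON specification: `0x01` (64-bit double) and `0x10` (32-bit integer). It is an illustration, not a usable driver component; `encodeBson`, `toHex` and the `{ name, type, value }` field shape are assumptions invented for this example.

```javascript
// Encode a flat list of explicitly typed fields as a BSON document:
// int32 total length, then elements (type byte, cstring name, value),
// then a terminating 0x00. All numbers are little-endian.
function encodeBson(fields) {
  const encoder = new TextEncoder();
  const parts = [];
  for (const { name, type, value } of fields) {
    if (type !== "int32" && type !== "double") {
      throw new Error(`unsupported type: ${type}`);
    }
    const key = encoder.encode(name);
    const size = type === "int32" ? 4 : 8;
    const element = new Uint8Array(1 + key.length + 1 + size);
    element[0] = type === "int32" ? 0x10 : 0x01; // type tag stored with the value
    element.set(key, 1); // the byte after the key stays 0: cstring terminator
    const view = new DataView(element.buffer);
    const offset = key.length + 2;
    if (type === "int32") view.setInt32(offset, value, true);
    else view.setFloat64(offset, value, true);
    parts.push(element);
  }
  const total = 4 + parts.reduce((n, p) => n + p.length, 0) + 1;
  const out = new Uint8Array(total); // last byte stays 0: document terminator
  new DataView(out.buffer).setInt32(0, total, true);
  let pos = 4;
  for (const p of parts) {
    out.set(p, pos);
    pos += p.length;
  }
  return out;
}

const toHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");

// The same number stored as two different types yields different bytes.
console.log(toHex(encodeBson([{ name: "a", type: "int32", value: 1 }])));
console.log(toHex(encodeBson([{ name: "a", type: "double", value: 1 }])));
```

In JavaScript both values are simply the number `1`; in the encoded form the type tag and width differ, which is the distinction the post draws between the shell's language and the database's storage format.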
How Wired Tiger opens up the option of MongoDB for Big Data Analytics.
I learn as I go along
by
3y ago
Introduction: Using the Wired Tiger engine for MongoDB brings big improvements to concurrency although, as I blogged last time, not in every case unless you understand how to take advantage of it. However, the data compression in Wired Tiger is another real win, with the cost of enterprise disk being so high and high-performance cloud storage and SSDs not being far behind. What, though, if you want to use MongoDB as a data analytics platform? What if you have a lot of data (I'm going to avoid calling it big data, because then everyone just discusses what big means)? But what if you have a lot ...
Building a Wired Tiger FIFO Queue in MongoDB.
I learn as I go along
by
3y ago
Summary – Wired Tiger is much faster than MMAP but it does need you to start thinking about some new things. Don’t just accept 5 times faster when it can be 10 or 20 times with a little thought. Here's how to improve the speed of queueing code. Wired Tiger is new, and whilst it is nearly always faster than MMAP as a storage engine, it does have a few things you need to take into account when writing high-performance code for it. Here is one example, building a queue. This is a real example from a ...
MongoDB Full Text Search - just got 66% more usable.
I learn as I go along
by
3y ago
Last time I blogged, I spoke about index compression. When you use the Wired Tiger storage engine in MongoDB 3.0 you get compressed indexes - and, better yet, indexes that don't need to be decompressed in RAM: they stay compressed and reduce the RAM footprint. Why does this matter? Because for a database system to work well, you need enough RAM to hold your indexes. MongoDB 2.4 added a beta Full Text Search (FTS) capability; in 2.6 it became a GA release which, whilst it doesn't have all the bells and whistles of a dedicated FTS indexing engine like Elasticsearch, has enough t ...
Wired Tiger - how to reduce your MongoDB hosting costs 10x
I learn as I go along
by
3y ago
Forgive me father, it’s been a long time since I last blogged - I've been giving in to temptation and getting to grips with MongoDB 3.0 and Wired Tiger Storage (WT) - and I've learned some things I'd like to share. Much has been said about WT's ingestion speed and the fine-grained concurrency that allows it to rocket through mixed workloads. Playing with the POCLoader ( http://github.com/johnlpage/POCDriver ) I've seen some incredible throughput - occasionally into seven figures of transactions per second per server. However, I'm leaving all the speed posts to others - I want to talk ...