MongoDB 4.0: A Way Forward for NoSQL Databases

I have been following MongoDB for the past six months or so. MongoDB recently released version 4.0, which gave me another reason to visit MongoDB University and catch up on the latest updates. I finished the course last week, and I have to admit that the release has some really cool updates that will surely make MongoDB stand out against other databases in the long run. It was a huge release with over 100 new features.

Let's go through a few of the important updates.

ACID Transaction:

MongoDB has extended its ACID transaction support considerably. A transaction can now span multiple statements and multiple documents across one or more collections, so developers no longer need to restrict their logic to a single document. In 4.0, however, multi-document transactions work on replica sets only. By default, a transaction that runs for more than 60 seconds is automatically aborted and rolled back by the server, and an exception is returned to the driver. There is no hard limit on the number of documents that can be processed in a single transaction, but MongoDB recommends keeping it to around 1,000 documents per transaction. A transaction is also aborted automatically if the server comes under cache pressure. Write conflicts are handled so that consistency and integrity are preserved: if a document the transaction tries to write was modified by another operation after the transaction's snapshot was taken, the transaction aborts and the driver sees an error labelled TransientTransactionError, which signals that the whole transaction can be retried.
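The retry behaviour described above can be sketched in Python. This is a minimal sketch, not the official retry helper: it assumes `client` is a PyMongo `MongoClient` connected to a replica set, and the function names, the `bank` database and the `accounts` collection are all hypothetical. The error check is duck-typed (it looks for a `has_error_label` method on the exception) so the sketch can be read and exercised without the driver installed.

```python
import time


def run_transaction_with_retry(client, txn_body, max_attempts=3, backoff_seconds=0.1):
    """Run txn_body(session) inside a multi-document transaction.

    The whole transaction is retried when the error carries the
    TransientTransactionError label; any other error is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        with client.start_session() as session:
            try:
                # The transaction commits when this block exits normally
                # and aborts if an exception escapes it.
                with session.start_transaction():
                    return txn_body(session)
            except Exception as exc:
                has_label = getattr(exc, "has_error_label", None)
                transient = bool(has_label and has_label("TransientTransactionError"))
                if not transient or attempt == max_attempts:
                    raise
        # Simple linear backoff before the next attempt (illustrative choice).
        time.sleep(backoff_seconds * attempt)


def transfer(client, from_id, to_id, amount):
    """Hypothetical example: move `amount` between two account documents."""
    accounts = client["bank"]["accounts"]  # hypothetical database and collection

    def body(session):
        # Both updates share the session, so they commit or abort together.
        accounts.update_one({"_id": from_id}, {"$inc": {"balance": -amount}}, session=session)
        accounts.update_one({"_id": to_id}, {"$inc": {"balance": amount}}, session=session)
        return True

    return run_transaction_with_retry(client, body)
```

Keeping each transaction short and small, as recommended above, also keeps a retry cheap.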

Horizontal Scaling- Sharding:

Sharding is a key concept in MongoDB that gives developers the flexibility to handle large data sets at high throughput. The shard balancer has been improved significantly so that data is distributed evenly across the nodes automatically. MongoDB now lets us add or remove nodes from a sharded cluster elastically, in real time and without any outage for applications, and the balancer redistributes the data automatically. Slow operations are now logged on mongos (the query router) as well as on the respective mongod servers, and mongos can send a kill operation to each individual shard server to stop long-running operations. Two shard-related improvements that I personally liked:

1: If an operation dies on one shard node, mongos makes sure that its threads running on the other nodes are cleaned up, to avoid unnecessary CPU consumption.

2: From 4.0, all secondary reads are served from a snapshot, which means that secondary read operations no longer need to wait for writes replicated from the primary to be applied. This has also improved write performance, because writes on the primary with majority write concern can be acknowledged faster.

Security Improvements:

A lot of the planning for 4.0 seems to have focused on security. There are multiple improvements in transport encryption and authentication.

In transport layer security, MongoDB has dropped support for TLS 1.0 on systems where TLS 1.1+ is available. TLS 1.3 has also been approved, which will reduce handshake traffic because it needs only one round trip during the handshake and, with the help of session caching, a later reconnection to the same server can need zero round trips. MongoDB now also integrates with the native OS crypto libraries, such as OpenSSL on Linux and Secure Channel on Windows. TLS can cover all communication paths: between clients, drivers and servers, as well as internal cluster traffic between shard servers.

For authentication, MongoDB has added SCRAM-SHA-256, which uses the SHA-256 hash function. Safeguards have been put in place so that SHA-1 can be explicitly forbidden without exception. The user upgrade is straightforward: existing users continue to use SHA-1 until you choose to upgrade them, while newly created users get SHA-256 by default. Once all users are upgraded, SHA-1 can be disabled completely to close any security loophole. MongoDB has also removed the MONGODB-CR authentication mechanism.
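To give a feel for what SCRAM-SHA-256 computes, here is a small sketch of the key derivation defined in RFC 5802 and RFC 7677, using only the Python standard library. It is an illustration of the mechanism, not MongoDB's implementation: the function name is hypothetical, the iteration count is an illustrative value, and the SASLprep normalization step is skipped on the assumption that the password is plain ASCII.

```python
import hashlib
import hmac


def derive_scram_sha256_keys(password, salt, iterations=15000):
    """Derive (stored_key, server_key) the way SCRAM-SHA-256 defines them.

    Assumption: `password` is plain ASCII, so SASLprep is not applied.
    `salt` is raw bytes; `iterations` is an illustrative default.
    """
    # SaltedPassword := Hi(password, salt, i), which is PBKDF2 with HMAC-SHA-256.
    salted_password = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # ClientKey := HMAC(SaltedPassword, "Client Key")
    client_key = hmac.new(salted_password, b"Client Key", hashlib.sha256).digest()
    # StoredKey := H(ClientKey); the server stores this, never the password.
    stored_key = hashlib.sha256(client_key).digest()
    # ServerKey := HMAC(SaltedPassword, "Server Key")
    server_key = hmac.new(salted_password, b"Server Key", hashlib.sha256).digest()
    return stored_key, server_key
```

The point of the design is that the server keeps only salted, iterated hashes, so a leaked credential store does not directly reveal passwords.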

Aggregation Framework:

Several new type conversion operators have been added in this release, such as “$convert”, which converts a value to a specified type, and “$toDouble”, which converts a value to a double. “$convert” also supports a fallback value (onError) if the conversion fails or is not supported, and another fallback (onNull) if the input is null or missing.
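As a rough illustration of the conversion operators, the sketch below builds an aggregation pipeline as plain Python data and runs it through PyMongo's `aggregate`. The field names (`price`, `weight`), the fallback values and both function names are hypothetical; `collection` is assumed to be a PyMongo collection.

```python
def build_conversion_pipeline():
    """Build a pipeline that converts two hypothetical fields to double."""
    return [
        {
            "$addFields": {
                # $convert with explicit fallbacks for bad and missing input.
                "priceAsDouble": {
                    "$convert": {
                        "input": "$price",
                        "to": "double",
                        "onError": -1.0,  # used when the value cannot be converted
                        "onNull": 0.0,    # used when the field is null or missing
                    }
                },
                # Shorthand form; it has no fallback, so bad input raises an error.
                "weightAsDouble": {"$toDouble": "$weight"},
            }
        }
    ]


def convert_prices(collection):
    """Run the pipeline on a PyMongo collection and return the documents."""
    return list(collection.aggregate(build_conversion_pipeline()))
```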

Date conversion has been improved as well. “$dateFromParts” now accepts values outside the normal range for a date part and carries or subtracts the difference into the other parts. “$dateFromString”, which converts a string to a date, now accepts a format option and also supports onError and onNull fallback values.
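A short sketch of the “$dateFromString” options mentioned above, again as a pipeline built in Python. The field names (`orderedAtText`, `orderedAt`), the date format and the function name are hypothetical; returning null on failure is just one possible fallback.

```python
def build_date_parsing_pipeline():
    """Build a pipeline that parses a hypothetical day-month-year text field."""
    return [
        {
            "$addFields": {
                "orderedAt": {
                    "$dateFromString": {
                        "dateString": "$orderedAtText",
                        "format": "%d-%m-%Y",  # e.g. "25-12-2018"
                        "onError": None,  # null instead of an error for unparsable text
                        "onNull": None,   # null when the field is null or missing
                    }
                }
            }
        }
    ]
```

The pipeline can be passed as-is to `collection.aggregate(...)` on a PyMongo collection.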

MongoDB also introduces new string operators in aggregation: “$trim”, “$ltrim” and “$rtrim” have been added for string parsing and formatting. We pass an input expression and, optionally, the set of characters we want removed. If no characters are specified, whitespace is removed by default.

These new features are a great help to data analysts working with badly formatted data.
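The trim operators could be used along these lines. This is a sketch with hypothetical field names (`name`, `sku`, `code`) and a hypothetical function name; the characters being stripped are only examples.

```python
def build_trim_pipeline():
    """Build a $project stage that cleans up three hypothetical string fields."""
    return [
        {
            "$project": {
                # No "chars" given: leading and trailing whitespace is removed.
                "name": {"$trim": {"input": "$name"}},
                # Strip leading zeros only.
                "sku": {"$ltrim": {"input": "$sku", "chars": "0"}},
                # Strip any trailing dash, underscore or space.
                "code": {"$rtrim": {"input": "$code", "chars": "-_ "}},
            }
        }
    ]
```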

Change Stream Improvements:

Change streams were added in version 3.6 and let applications subscribe to a stream of data changes and notifications. MongoDB now allows a change stream to be opened at the database level to watch changes in all non-system collections. A change stream can also be opened at the cluster level, but it has to be opened against the “admin” database.

A new argument called “startAtOperationTime” has been added to specify a start time for a change stream, regardless of the current time. The time can be in the past as well: any cluster time is accepted as long as it is not older than the oldest entry in your oplog. Change event messages now also return two more attributes: the transaction number and the logical session ID.
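A database-level change stream with a start time might look like the sketch below. It assumes `db` is a PyMongo `Database` on a replica set or sharded cluster and that `start_time`, if given, is a BSON timestamp; PyMongo exposes the argument as `start_at_operation_time`. The function name, the `$match` filter and the `max_events` cut-off are illustrative choices, not part of the feature.

```python
def watch_database(db, start_time=None, handle=print, max_events=None):
    """Open a database-level change stream and pass each event to `handle`.

    Returns the number of events handled.
    """
    options = {}
    if start_time is not None:
        # Must not be older than the oldest entry in the oplog.
        options["start_at_operation_time"] = start_time
    # Only react to inserts and updates; the filter is an illustrative choice.
    pipeline = [{"$match": {"operationType": {"$in": ["insert", "update"]}}}]
    seen = 0
    with db.watch(pipeline, **options) as stream:
        for event in stream:
            handle(event)
            seen += 1
            if max_events is not None and seen >= max_events:
                break
    return seen
```

Without `max_events` the loop keeps waiting for new changes, which is usually what a long-running listener wants.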

Compass Enhancement:

This is the section I liked the most. The MongoDB team has done a significant amount of work to provide the best visualization experience, especially for more business-oriented users. A crisp and precise view of the schema is now available, with stats such as frequency, types and range of values. Business users can now import and export data in CSV format. JSON import/export is also available.

Technical users get a nice view of key server performance metrics and a visual explain plan to analyse long-running queries. Document validation rules and index management can be handled directly from the front end.

Compass now comes with an aggregation pipeline builder that gives data analysts a comprehensive view of their data. The builder provides a user interface to aggregate data and visualize the results stage by stage, with the option to turn individual stages on and off. The final pipeline code can then be retrieved and used in a project. The export-to-language feature converts queries into driver languages such as Python and Java.

Compass is now also available in two additional editions. The Isolated edition only accepts TLS-encrypted TCP connections. The Read Only edition restricts users to read-only operations.

Enhancement In Atlas:

Atlas now supports global sharding, which means documents are placed in the zone nearest to where they are used, so access to documents is geographically local. The BI Connector can now read from both primary and secondary nodes, which gives us a great opportunity to put secondaries to use. Atlas also provides a real-time performance panel, from which slow-running operations can be killed directly. Users can now explore data from Atlas as well, and CRUD operations are supported directly from the front end. The connection string has been shortened with the help of SRV records. Data migration can be done in a staging layer first and, after validation, moved into the main production layer. A test failover button triggers an election so that failover capability can be verified. Cross-project recovery is now possible. Bringing your own LDAP and a private KMS for security is now supported, and storage can scale automatically.

There are many more enhancements and bug fixes in this version; you can find them all in the MongoDB 4.0 release notes. Thanks a lot for reading, and I hope you liked this article. It's my first article on Medium, with many more to come. Any suggestions or feedback will be highly appreciated. Thanks to MongoDB University for providing an excellent platform for learning.
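For the shortened connection string, the SRV form uses the `mongodb+srv://` scheme with a single host name instead of a list of servers. The helper below is a small sketch for building one; the function name, the host `cluster0.example.mongodb.net`, the default database and the query options are placeholders, not values from Atlas.

```python
from urllib.parse import quote_plus


def build_srv_uri(user, password, host, database="test"):
    """Build a mongodb+srv:// connection string.

    The user name and password are percent-encoded so that characters
    such as '@' or '/' do not break the URI.
    """
    return "mongodb+srv://{}:{}@{}/{}?retryWrites=true&w=majority".format(
        quote_plus(user), quote_plus(password), host, database
    )


# Example with placeholder values:
uri = build_srv_uri("app", "p@ss/word", "cluster0.example.mongodb.net")
```

The resulting string can be passed to a driver's client constructor, which resolves the individual cluster members from DNS.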

A data science enthusiast, passionate about cloud technology. AWS, Azure, GCP and MongoDB certified developer/architect. Interested in Google Cloud Engineering.