Real-time MongoDB Replication in minutes

Complete hands-on example, with code & instructions, to run data synchronization between two Mongo instances with Grainite.
Kunaal Kumar
6 min to read

In the real-time database synchronization blog, we talked about the challenges developers face in leveraging traditional tools to achieve data synchronization between multiple data sources.  We also provided an overview of how Grainite simplifies database synchronization and the benefits it provides.

This new blog provides a step-by-step example of synchronizing data between two MongoDB instances using Grainite. By the end of this blog, you will have:

  • Brought up the Mongo and Grainite Docker instances
  • Added dummy data to test the two-way replication
  • Connected your own Mongo instances to replicate actual data in real time

Prior to running this example, if you would like a high-level overview of Grainite, you can read the conceptual model in our documentation. 

This blog is part of a series where we provide step-by-step examples of performing data synchronization between various sources and sinks. In this blog, we cover Mongo to Mongo. In upcoming blogs, we will cover SQL Server, Salesforce, Kafka, and others.


This demo has been developed using the Java programming language. To run this successfully, you should have the following software packages installed on your Mac/Linux machine.

You can refer to the environment setup in our documentation for additional instructions.

Install gx CLI

gx is a command line interface (CLI) tool provided by Grainite. While it is recommended to review the capabilities of this tool, it is not required to walk through the demo. 

  1. Download the gx CLI installer script from here -
  2. Run the installer script.

chmod +x && sudo ./                                                       
  3. gx should have been installed at /usr/local/bin. To confirm, run the following:

gx --version                                                                                         
  4. Set the core pattern.

sudo sysctl -w kernel.core_pattern=/home/grainite/dx/cores/core.%e.%p.%t                             

Install Java

Java is required to build the app using mvnw, as well as to run the app.

Install docker and docker-compose

Docker Compose is required to launch Grainite along with the two Mongo instances. Follow the instructions on this page to install Docker Desktop or Docker Engine.

Running the demo

Download the package

There are multiple Grainite applications available in the samples repository. You can download the samples directory from GitHub using the following command:

git clone                                                         

The samples/cdc/mongo_cdc directory contains the files used in this demo.

cd samples/cdc/mongo_cdc                                                                             

Start the Docker containers

Run the following command to start Grainite and the two Mongo instances.

docker compose pull && docker compose up -d                                                           

Verify that the containers have been created without any errors (for example, with docker compose ps). The mongo_one_init and mongo_two_init containers are used to initialize the Mongo replica sets for both instances. This step will not be required when you connect your own MongoDB instances.


Build and load the application

  1. Run the following command to build the application:

 ./mvnw clean package                                                                                 
  2. Once the application has been built, run the following command to load the Grainite application:

gx load -c app.yaml                                                                                   
  3. Verify the app has been successfully loaded:

gx app ls                                                                                             

Connect to Mongo instances

Grainite is frequently used in situations where other services or products are already running. Often, it is necessary to either obtain data from (or send data to) those products. Instead of building integrations for those products into the core Grainite product, we've introduced Extensions. Extensions are individual packages that contain Tasks, Handlers, and other common code needed to integrate with those products.  

Grainite contains various extensions for Azure, Debezium, JDBC, Kafka, SQL Server, and Salesforce. You can read about Grainite Extensions in our documentation.  

Run the following commands to have the tasks poll both the Mongo instances for changes:

gx task start mongo_cdc_one -c app.yaml                                                               

gx task start mongo_cdc_two -c app.yaml       																											 

To confirm that both tasks are working, use gx mon (monitoring command in gx CLI) to see the counters.

gx mon																																															 

The Message Flow section indicates how many events were appended to each topic, and how messages flow between topics and tables.

The Action Status section indicates how many times an action was invoked.

For example, mongo_cdc_one_doWork and mongo_cdc_two_doWork indicate that these actions were invoked 39 and 28 times, respectively. You will see that the doWork action is invoked frequently as it polls for changes on the Mongo instances. The startTask (used to start a task) and startTaskInstance (used to start a task instance) action counters indicate how many times those methods were called.

Finally, we have the App Counters section which contains app-specific counters. These counters can be included by users in any app by using the counters/gauges API. This app's cdc_controller_created counter indicates how many times a controller was created to poll the Mongo instances. Since there are two tasks polling each Mongo instance, the count is 2. The task_execution_status gauge indicates the status of a task. Use curl localhost:5064/export-dashboard | grep "task_execution_status" to see the label for this gauge. The task_instance_start counter indicates the number of times a task instance was started.
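Conceptually, each of these tasks behaves like the doWork loop described above: on every invocation it asks the source for changes recorded after the last position it saw, applies them to the target, and checkpoints the new position. The following is a minimal, Grainite-independent sketch of that pattern in Python; all names and structures here are illustrative, not the actual extension API.

```python
# Illustrative sketch of a change-polling loop, similar in spirit to what a
# CDC task's doWork action does. Names and structure are hypothetical.

def do_work(source, sink, state):
    """Apply changes recorded after the saved position; return how many."""
    position = state.get("position", 0)
    changes = [c for c in source if c["seq"] > position]   # poll for new changes
    for change in changes:
        sink[change["key"]] = change["value"]              # apply to the target
        state["position"] = change["seq"]                  # checkpoint progress
    return len(changes)

# Simulated change feed (standing in for a Mongo change stream).
feed = [
    {"seq": 1, "key": "cust:1", "value": {"name": "Ada"}},
    {"seq": 2, "key": "cust:2", "value": {"name": "Bo"}},
]
target, state = {}, {}
applied = do_work(feed, target, state)        # first poll applies both changes
applied_again = do_work(feed, target, state)  # second poll finds nothing new
```

Because the position is checkpointed, repeated polls are idempotent: a second call with no new changes applies nothing, which is why the doWork counters can climb steadily without re-applying old data.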

Add entries to the Mongo instances

Run the following commands to add some dummy data to the databases.

  1. Run the following to add some dummy data to mongo_one's test db and customers collection:

./ mongodb://localhost:27017 test customers 10 10																						   
  2. Confirm records were inserted into mongo_one by running the following command to dump the collection:

docker exec -it mongo_one mongosh --eval "db.getSiblingDB('test').getCollection('customers').find()" 
  3. Additionally, confirm records were synced into mongo_two by running the following command to dump the collection:

docker exec -it mongo_two mongosh --eval "db.getSiblingDB('test').getCollection('customers').find()" 
  4. Run the following to add some dummy data to mongo_two's test db and customers collection:

./ mongodb://localhost:27018 test customers 10 10																							 
  5. Confirm records were inserted into mongo_two by running the following command to dump the collection:

docker exec -it mongo_two mongosh --eval "db.getSiblingDB('test').getCollection('customers').find()" 
  6. Additionally, confirm records were synced into mongo_one by running the following command to dump the collection:

docker exec -it mongo_one mongosh --eval "db.getSiblingDB('test').getCollection('customers').find()" 
  7. Run the Verify program to confirm all records have been synced:

java -cp target/mongocdc-jar-with-dependencies.jar org.samples.mongocdc.Verify
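The source of the Verify program is in the sample repository, but the check it performs amounts to confirming that both collections hold the same documents. Here is a hedged sketch of that kind of comparison in Python, using plain dicts keyed by _id in place of the two customers collections (this is an illustration of the idea, not the sample's actual code):

```python
def collections_in_sync(coll_a, coll_b):
    """Compare two {_id: doc} maps; return (synced, missing_in_b, missing_in_a)."""
    ids_a, ids_b = set(coll_a), set(coll_b)
    missing_in_b = ids_a - ids_b            # inserted in A, not yet replicated
    missing_in_a = ids_b - ids_a            # inserted in B, not yet replicated
    synced = not missing_in_b and not missing_in_a and all(
        coll_a[i] == coll_b[i] for i in ids_a   # same ids AND same contents
    )
    return synced, missing_in_b, missing_in_a

mongo_one = {1: {"name": "Ada"}, 2: {"name": "Bo"}}
mongo_two = {1: {"name": "Ada"}, 2: {"name": "Bo"}}
ok, _, _ = collections_in_sync(mongo_one, mongo_two)          # in sync

mongo_two.pop(2)  # simulate a record that has not replicated yet
ok2, missing, _ = collections_in_sync(mongo_one, mongo_two)   # id 2 is missing
```

If the two sides disagree, it is usually because replication is still catching up; re-running the check after a short wait typically resolves it.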

Note: To get the number of documents in a collection, you can run the following:

For mongo_one

docker exec -it mongo_one mongosh --eval "db.getSiblingDB('test').getCollection('customers').countDocuments()"                                 

For mongo_two

docker exec -it mongo_two mongosh --eval "db.getSiblingDB('test').getCollection('customers').countDocuments()"

Connecting to your own Mongo Instances

To connect to your own Mongo instances, change the connection strings in app.yaml to point to your Mongo instances, and update the database and collection names accordingly.

YAML file
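As a rough illustration of the kind of edits involved, the change looks something like the following. The key names below are placeholders; check the sample's own app.yaml for the real structure.

```yaml
# Hypothetical shape -- consult the sample's app.yaml for the actual key names.
tasks:
  - task_name: mongo_cdc_one
    config:
      connection_string: "mongodb://your-host-1:27017"   # your first instance
      database: test          # your database name
      collection: customers   # your collection name
  - task_name: mongo_cdc_two
    config:
      connection_string: "mongodb://your-host-2:27017"   # your second instance
      database: test
      collection: customers
```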


While this example focused on real-time replication between two Mongo instances, Grainite makes it easy to move data between any two database instances. Grainite supports multiple extensions that can help move data from sources such as Apache Kafka, Azure, Debezium, Salesforce, and SQL Server and replicate it to downstream databases and data warehouses. In addition to making replication real-time, Grainite automatically handles failures, retries, and scaling, and enables complex, stateful transformations.

Additional samples demonstrating data replication from MongoDB to SQL Server, SQL Server to SQL Server, SQL Server to Kafka, and Salesforce to SQL Server will be added to the GitHub repository soon.

Try Grainite for free

Get started now
Takes only 5 mins to get started
Test drive the platform with sample applications
Dedicated slack channel for expert engagement