MongoDB Space Usage

A common question we’re asked here at ObjectRocket is how we calculate space usage for our instances. This page explains in more detail how we determine how much space your instance is using, and what we do and don’t charge for.

MongoDB Plans

On our MongoDB platform, a plan represents the maximum amount of data you can store on an instance without needing to take action. As you may have noticed when building an instance for the first time, you choose a size from fixed increments, such as 5GB, 20GB, or 50GB. This makes it easy to plan out scaling in advance, as you’ll know exactly how much capacity will be added as you grow.

If the instance is sharded, the plan size is multiplied by the number of shards currently attached to the instance to give the maximum amount of space available. If the instance is a Replica Set, the plan size itself is the maximum available storage space.

Note

Maximum gigabytes in plan is abbreviated as mgp from here on.

The formula below can be used to determine the mgp for all MongoDB instance types:

mgp = plansize * n (where n is the number of shards; n = 1 for a Replica Set)

Using the above, we know that a 5GB sharded instance with 3 shards will have an mgp of 15GB.
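
As a quick illustration, the mgp formula can be sketched in Python. The function name and parameters here are hypothetical, and treating a Replica Set as a single shard (n = 1) is an assumption drawn from the description above rather than an official API.

```python
def max_gigabytes_in_plan(plan_size_gb, shards=1):
    """Return mgp = plansize * n.

    Assumption: a Replica Set is modelled as n = 1, so its mgp
    is simply the plan size.
    """
    if shards < 1:
        raise ValueError("shards must be at least 1")
    return plan_size_gb * shards


# A 5GB sharded instance with 3 shards.
print(max_gigabytes_in_plan(5, 3))  # 15

# A 20GB Replica Set.
print(max_gigabytes_in_plan(20))  # 20
```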

Getting into more detail, all data used for determining space usage is pulled directly from MongoDB using db.stats(), run against each individual database on each shard. The calculation differs slightly depending on whether you’re using WiredTiger or MMAPv1.

WiredTiger

If the instance is running on WiredTiger, two fields from the dbStats output are used: dbStats.storageSize (dbs) and dbStats.indexSize (dbi). The formulas below show how consumed space is summed per database and per Replica Set for WiredTiger.

dbs1 + dbs2 + … + dbsn = Document Storage for instance (ids)
dbi1 + dbi2 + … + dbin = Index Storage for instance (iis)
dbs + dbi = Storage used for a single database (sdb)
sdb1 + sdb2 + … + sdbn = Aggregated consumed space within Replica Set (asc)

Once we’ve calculated how much space each database and index takes up, we use that information to get the total space an instance is using:

asc1 + asc2 + … + ascn = Total Consumed Space on all Replica Sets (tsc)
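
The WiredTiger formulas above can be sketched in Python as follows. This is a minimal illustration, not our actual tooling: the function names are hypothetical, the input dictionaries are assumed to look like db.stats() output with sizes in bytes (the default scale), and the sample numbers are made up.

```python
from typing import Dict, List


def database_usage(stats: Dict) -> int:
    # sdb = dbs + dbi (storageSize plus indexSize for one database)
    return stats["storageSize"] + stats["indexSize"]


def replica_set_usage(db_stats: List[Dict]) -> int:
    # asc = sdb1 + sdb2 + ... + sdbn for one Replica Set (shard)
    return sum(database_usage(stats) for stats in db_stats)


def total_usage(replica_sets: List[List[Dict]]) -> int:
    # tsc = asc1 + asc2 + ... + ascn across all Replica Sets
    return sum(replica_set_usage(db_stats) for db_stats in replica_sets)


# Hypothetical dbStats results for two shards.
shard_a = [
    {"storageSize": 4096, "indexSize": 1024},
    {"storageSize": 8192, "indexSize": 2048},
]
shard_b = [
    {"storageSize": 2048, "indexSize": 512},
]

print(replica_set_usage(shard_a))        # 15360
print(total_usage([shard_a, shard_b]))   # 17920
```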

_images/wiredtiger_spaceusage.png

MMAPv1

As MMAPv1 is a simpler engine, we only need dbStats.fileSize (dfs). While MMAPv1 also reports dbStats.storageSize and dbStats.indexSize, they’re only really useful for roughly gauging the level of fragmentation within the instance.

dfs1 + dfs2 + … + dfsn = Aggregated consumed space within Replica Set (asc)
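
For MMAPv1 the same idea reduces to summing a single field. The sketch below is illustrative only: the function name is hypothetical, the dictionaries are assumed to mirror db.stats() output in bytes, and the sample values are invented.

```python
from typing import Dict, List


def mmapv1_replica_set_usage(db_stats: List[Dict]) -> int:
    # asc = dfs1 + dfs2 + ... + dfsn, where dfs is dbStats.fileSize
    return sum(stats["fileSize"] for stats in db_stats)


# Hypothetical dbStats results for two databases on one Replica Set.
sample_stats = [
    {"fileSize": 67108864},
    {"fileSize": 134217728},
]

print(mmapv1_replica_set_usage(sample_stats))  # 201326592
```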

_images/mmap_spaceusage.png

What we don’t calculate

As there are a few portions of MongoDB that we handle for you, like replication, journaling, namespace size, and our own administrative databases, we remove the following from any space calculations.

  • Oplog size (local)
  • Journal size
  • Namespace size
  • The admin and config databases
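
One way to picture the database exclusions is a simple name filter applied before any sizes are summed. This is a sketch under assumptions: the helper and constant names are hypothetical, the "db" key is assumed to carry the database name as in db.stats() output, and journal and namespace exclusions are not modelled here.

```python
from typing import Dict, List

# Databases left out of space calculations, per the list above.
# "local" holds the oplog; "admin" and "config" are administrative.
EXCLUDED_DATABASES = {"local", "admin", "config"}


def counted_stats(db_stats: List[Dict]) -> List[Dict]:
    # Keep only the databases that count toward space usage.
    return [s for s in db_stats if s["db"] not in EXCLUDED_DATABASES]


# Hypothetical dbStats results for one Replica Set.
sample = [
    {"db": "admin", "storageSize": 100, "indexSize": 10},
    {"db": "local", "storageSize": 5000, "indexSize": 0},
    {"db": "app", "storageSize": 4096, "indexSize": 1024},
]

print([s["db"] for s in counted_stats(sample)])  # ['app']
```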

As always, if you have any more questions or would like us to go into more detail, we’ll be happy to help if you reach out to our support team!