Write Concern
Write concern describes the level of acknowledgment requested from MongoDB for write operations to a standalone mongod, to replica sets, or to sharded clusters. In sharded clusters, mongos instances pass the write concern on to the shards.
For multi-document transactions, you set the write concern at the transaction level, not at the individual operation level. Do not explicitly set the write concern for individual write operations in a transaction.
Write Concern Specification
Write concern can include the following fields:
{ w: <value>, j: <boolean>, wtimeout: <number> }
- the w option to request acknowledgment that the write operation has propagated to a specified number of mongod instances or to mongod instances with specified tags,
- the j option to request acknowledgment that the write operation has been written to the on-disk journal, and
- the wtimeout option to specify a time limit to prevent write operations from blocking indefinitely.
w Option
The w option requests acknowledgment that the write operation has propagated to a specified number of mongod instances or to mongod instances with specified tags.
Using the w option, the following w: <value> write concerns are available:
<number>
Requests acknowledgment that the write operation has propagated to the specified number of mongod instances. For example:
w: 1
Requests acknowledgment that the write operation has propagated to the standalone mongod or the primary in a replica set. w: 1 is the default write concern for MongoDB.
w: 0
Requests no acknowledgment of the write operation. However, w: 0 may return information about socket exceptions and networking errors to the application.
If you specify w: 0 but include j: true, the j: true prevails to request acknowledgment from the standalone mongod or the primary of a replica set.
A w value greater than 1 requires acknowledgment from the primary and from as many additional data-bearing secondaries as are needed to meet the specified write concern. For example, consider a 3-member replica set with no arbiters. Specifying w: 2 would require acknowledgment from the primary and one of the secondaries. Specifying w: 3 would require acknowledgment from the primary and both secondaries.
Hidden, delayed, and priority 0 members can acknowledge w: <number> write operations. Delayed secondaries can return write acknowledgment no earlier than the configured slaveDelay.
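As a rough illustration of the w: <number> rule, the following plain JavaScript sketch checks whether a numeric w can ever be met by a given set of members. The helper canSatisfyNumericW and the members array are hypothetical and are not part of the mongo shell or any driver. The sketch assumes only that arbiters, which hold no data, cannot acknowledge writes.

```javascript
// Hypothetical helper, not part of the mongo shell or any driver.
// members: array shaped like rs.conf().members (only arbiterOnly is read).
function canSatisfyNumericW(members, w) {
  // Arbiters hold no data, so they cannot acknowledge writes.
  const dataBearing = members.filter((m) => !m.arbiterOnly).length;
  return w <= dataBearing;
}

// The 3-member replica set with no arbiters from the example above.
const threeMembers = [{ _id: 0 }, { _id: 1 }, { _id: 2 }];

console.log(canSatisfyNumericW(threeMembers, 2)); // primary + one secondary
console.log(canSatisfyNumericW(threeMembers, 3)); // primary + both secondaries
console.log(canSatisfyNumericW(threeMembers, 5)); // more than the set can provide
```

A check like this only tells you whether the write concern is achievable in principle. Whether a particular write is acknowledged in time still depends on the state of the members.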
majority
Requests acknowledgment that write operations have propagated to the majority of data-bearing voting members (i.e. members[n].votes is greater than 0 and members[n].arbiterOnly is false).
For example, consider a replica set with 3 data-bearing voting members. "majority" write concern requires acknowledgment from two out of three members, specifically the primary and one secondary. If you later scaled the replica set to 5 data-bearing voting members, "majority" would require acknowledgment from three out of five members, specifically the primary and two secondaries.
Hidden, delayed, and priority 0 members can acknowledge w: <number> write operations. Delayed secondaries can return write acknowledgment no earlier than the configured slaveDelay.
After the write operation returns with a w: "majority" acknowledgment to the client, the client can read the result of that write with a "majority" readConcern.
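The arithmetic behind "majority" can be sketched in plain JavaScript. The helper majorityAckCount below is hypothetical. It applies the definition given above literally (members with votes greater than 0 and arbiterOnly false) and is not the server's implementation. It assumes that a member with no votes field has the default of 1 vote.

```javascript
// Hypothetical helper that applies the definition above literally;
// it is not the server's implementation.
// members: array shaped like rs.conf().members.
function majorityAckCount(members) {
  const dataBearingVoting = members.filter((m) => {
    const votes = m.votes === undefined ? 1 : m.votes; // assumption: votes defaults to 1
    return votes > 0 && !m.arbiterOnly;
  }).length;
  // A majority is more than half of the counted members.
  return Math.floor(dataBearingVoting / 2) + 1;
}

// Helper to build a set of n plain data-bearing voting members.
const makeMembers = (n) => Array.from({ length: n }, (_, i) => ({ _id: i, votes: 1 }));

console.log(majorityAckCount(makeMembers(3))); // two of three, as in the example above
console.log(majorityAckCount(makeMembers(5))); // three of five
```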
<tag set>
Requests acknowledgment that the write operations have propagated to a replica set member with the specified tag.
j Option
The j option requests acknowledgment from MongoDB that the write operation has been written to the on-disk journal.
If j: true, the write concern requests acknowledgment that the mongod instances specified by w: <value> have written to the on-disk journal. j: true does not by itself guarantee that the write will not be rolled back due to replica set primary failover.
Specifying a write concern that includes j: true to a mongod instance that is running without journaling produces an error.
Note: If journaling is enabled, w: "majority" may imply j: true. The writeConcernMajorityJournalDefault replica set configuration setting determines the behavior.
wtimeout
This option specifies a time limit, in milliseconds, for the write concern. wtimeout is only applicable for w values greater than 1.
wtimeout causes write operations to return with an error after the specified limit, even if the required write concern will eventually succeed. When these write operations return, MongoDB does not undo successful data modifications performed before the write concern exceeded the wtimeout time limit.
If you do not specify the wtimeout option and the level of write concern is unachievable, the write operation will block indefinitely. Specifying a wtimeout value of 0 is equivalent to a write concern without the wtimeout option.
Acknowledgment Behavior
The w option and the j option determine when mongod instances acknowledge write operations.
Standalone
A standalone mongod acknowledges a write operation either after applying the write in memory or after writing to the on-disk journal. The following table lists the acknowledgment behavior for a standalone and the relevant write concerns:
| | j is unspecified | j: true | j: false |
|---|---|---|---|
| w: 1 | In memory | On-disk journal | In memory |
| w: "majority" | On-disk journal if running with journaling | On-disk journal | In memory |
With writeConcernMajorityJournalDefault set to false, MongoDB does not wait for w: "majority" writes to be written to the on-disk journal before acknowledging the writes. As such, majority write operations could possibly roll back in the event of a transient loss (e.g. crash and restart) of a majority of nodes in a given replica set.
Replica Sets
The value specified to w determines the number of replica set members that must acknowledge the write before returning success. For each eligible replica set member, the j option determines whether the member acknowledges writes after applying the write operation in memory or after writing to the on-disk journal.
w: "majority"
Any data-bearing voting member of the replica set can contribute to write acknowledgment of "majority" write operations.
The following lists when the member can acknowledge the write based on the j value:
j is unspecified
Acknowledgment depends on the value of writeConcernMajorityJournalDefault:
- If true, acknowledgment requires writing operation to on-disk journal (j: true). writeConcernMajorityJournalDefault defaults to true.
- If false, acknowledgment requires writing operation in memory (j: false).
j: true
Acknowledgment requires writing operation to on-disk journal.
j: false
Acknowledgment requires writing operation in memory.
With writeConcernMajorityJournalDefault set to false, MongoDB does not wait for w: "majority" writes to be written to the on-disk journal before acknowledging the writes. As such, majority write operations could possibly roll back in the event of a transient loss (e.g. crash and restart) of a majority of nodes in a given replica set.
Note: Hidden, delayed, and priority 0 members can acknowledge w: <number> write operations. Delayed secondaries can return write acknowledgment no earlier than the configured slaveDelay.
w: <number>
Any data-bearing member of the replica set can contribute to write acknowledgment of w: <number> write operations.
The following table lists when the member can acknowledge the write based on the j value:
| j value | When the member acknowledges |
|---|---|
| j is unspecified | Acknowledgment requires writing operation in memory (j: false). |
| j: true | Acknowledgment requires writing operation to on-disk journal. |
| j: false | Acknowledgment requires writing operation in memory. |
Hidden, delayed, and priority 0 members can acknowledge w: <number> write operations. Delayed secondaries can return write acknowledgment no earlier than the configured slaveDelay.
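The replica set rules for w: "majority" and w: <number> can be condensed into a small decision function. The following plain JavaScript sketch is illustrative only. The helper waitsForJournal is hypothetical, it covers only the two w forms described in this section, and it treats writeConcernMajorityJournalDefault as a plain boolean argument that defaults to true.

```javascript
// Hypothetical helper summarizing the replica set rules above.
// Returns true when a member must write to the on-disk journal before
// acknowledging, and false when applying the write in memory is enough.
function waitsForJournal(writeConcern, writeConcernMajorityJournalDefault = true) {
  const { w, j } = writeConcern;
  if (w !== "majority" && typeof w !== "number") {
    // Custom (tag-based) modes are outside the scope of this sketch.
    throw new TypeError('only w: <number> and w: "majority" are covered');
  }
  if (j !== undefined) return j; // an explicit j always decides
  if (w === "majority") return writeConcernMajorityJournalDefault;
  return false; // w: <number> with j unspecified behaves like j: false
}

console.log(waitsForJournal({ w: "majority" }));           // follows the default setting
console.log(waitsForJournal({ w: "majority" }, false));    // default overridden to false
console.log(waitsForJournal({ w: 2 }));                    // in memory
console.log(waitsForJournal({ w: 2, j: true }));           // on-disk journal
```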
Examples
db.inventory.insert(
{ sku: "abcdxyz", qty : 100, category: "Clothing" },
{ writeConcern: { w: 5, j: true, wtimeout: 5000 } }
)
WriteResult({
"nInserted" : 1,
"writeConcernError" : {
"code" : 100,
"codeName" : "CannotSatisfyWriteConcern",
"errmsg" : "Not enough data-bearing nodes"
}
})
Verify Write Operations to Replica Sets
The following operation passes the writeConcern option to the insert() method. The operation specifies "majority" write concern and a 5-second timeout using the wtimeout write concern parameter so that the operation does not block indefinitely.
db.products.insert(
{ item: "envelopes", qty : 100, type: "Clasp" },
{ writeConcern: { w: "majority" , wtimeout: 5000 } }
)
WriteResult({ "nInserted" : 1 })
Modify Default Write Concern
You can modify the default write concern for a replica set by setting the settings.getLastErrorDefaults setting in the replica set configuration. The following sequence of commands creates a configuration that waits for the write operation to complete on a majority of the voting members before returning:
cfg = rs.conf()
cfg.settings.getLastErrorDefaults
{ "w": 1, "wtimeout": 0 }
cfg.settings.getLastErrorDefaults = { w: "majority", wtimeout: 5000 }
rs.reconfig(cfg)
{
"ok": 1,
"operationTime": Timestamp(1551430486, 1),
"$clusterTime": {
"clusterTime": Timestamp(1551430486, 1),
"signature": {
"hash": BinData(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId": NumberLong(0)
}
}
}
cfg.settings.getLastErrorDefaults
{ "w": "majority", "wtimeout": 5000 }
Custom Write Concerns
You can tag the members of replica sets and use the resulting tag sets to create custom write concerns.
Configure Replica Set Tag Sets
Tag sets let you customize write concern and read preferences for a replica set. MongoDB stores tag sets in the replica set configuration object, which is the document returned by rs.conf(), in the members[n].tags embedded document.
Differences Between Read Preferences and Write Concerns
Custom read preferences and write concerns evaluate tag sets in different ways:
- Read preferences consider the value of a tag when selecting a member to read from.
- Write concerns do not use the value of a tag to select a member except to consider whether or not the value is unique.
For example, a tag set for a read operation may resemble the following document:
{ "disk": "ssd", "use": "reporting" }
To fulfill such a read operation, a member would need to have both of these tags. Any of the following tag sets would satisfy this requirement:
{ "disk": "ssd", "use": "reporting" }
{ "disk": "ssd", "use": "reporting", "rack": "a" }
{ "disk": "ssd", "use": "reporting", "rack": "d" }
{ "disk": "ssd", "use": "reporting", "mem": "r"}
The following tag sets would not be able to fulfill this query:
{ "disk": "ssd" }
{ "use": "reporting" }
{ "disk": "ssd", "use": "production" }
{ "disk": "ssd", "use": "production", "rack": "k" }
{ "disk": "spinning", "use": "reporting", "mem": "32" }
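The matching rule for read operations described above can be sketched in plain JavaScript: a member qualifies only if it carries every requested tag with the same value, and any extra tags on the member are ignored. The helper memberMatchesTagSet is hypothetical and is not a mongo shell or driver function.

```javascript
// Hypothetical helper illustrating the matching rule described above.
// memberTags: the tags document of one member.
// requestedTags: the tag set of the read operation.
function memberMatchesTagSet(memberTags, requestedTags) {
  // Every requested tag must be present on the member with an equal value.
  return Object.entries(requestedTags).every(
    ([tag, value]) => memberTags[tag] === value
  );
}

const requested = { disk: "ssd", use: "reporting" };

// Extra tags such as "rack" do not prevent a match.
console.log(memberMatchesTagSet({ disk: "ssd", use: "reporting", rack: "a" }, requested));
// A differing value for "use" prevents a match.
console.log(memberMatchesTagSet({ disk: "ssd", use: "production" }, requested));
```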
Add Tag Sets to a Replica Set
You could add tag sets to the members of a three-member replica set with the following command sequence in the mongo shell:
conf = rs.conf()
conf.members[0].tags = { "dc": "east", "use": "production" }
conf.members[1].tags = { "dc": "east", "use": "reporting" }
conf.members[2].tags = { "use": "production" }
rs.reconfig(conf)
{
"ok": 1,
"operationTime": Timestamp(1551430993, 1),
"$clusterTime": {
"clusterTime": Timestamp(1551430993, 1),
"signature": {
"hash": BinData(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId": NumberLong(0)
}
}
}
In tag sets, all tag values must be strings.
Custom Multi-Datacenter Write Concerns
Given a three-member replica set with members in two data centers:
- a facility VA tagged dc_va
- a facility GTO tagged dc_gto
Create a custom write concern to require confirmation from two data centers using replica set tags, using the following sequence of operations in the mongo shell:
- Create a replica set configuration JavaScript object conf:
conf = rs.conf()
- Add tags to the replica set members reflecting their locations:
conf.members[0].tags = { "dc_va": "rack1"}
conf.members[1].tags = { "dc_va": "rack2"}
conf.members[2].tags = { "dc_gto": "rack1"}
rs.reconfig(conf)
{
"ok": 1,
"operationTime": Timestamp(1551431137, 1),
"$clusterTime": {
"clusterTime": Timestamp(1551431137, 1),
"signature": {
"hash": BinData(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId": NumberLong(0)
}
}
}
- Create a custom settings.getLastErrorModes setting to ensure that the write operation will propagate to at least one member of each facility:
conf.settings.getLastErrorModes
{}
conf.settings = { getLastErrorModes: { MultipleDC : { "dc_va": 1, "dc_gto": 1 } } }
rs.reconfig(conf)
{
"ok": 1,
"operationTime": Timestamp(1551431211, 2),
"$clusterTime": {
"clusterTime": Timestamp(1551431211, 2),
"signature": {
"hash": BinData(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId": NumberLong(0)
}
}
}
rs.conf().settings.getLastErrorModes
{ "MultipleDC": { "dc_va": 1, "dc_gto": 1 } }
To ensure that a write operation propagates to at least one member of the set in both data centers, use the MultipleDC write concern mode as follows:
db.users.insert( { id: "xyz", status: "A" }, { writeConcern: { w: "MultipleDC" } } )
WriteResult({ "nInserted": 1 })
Tag Sets and Custom Write Concern Behavior
The numeric value in the custom getLastErrorModes write concern refers to the number of unique tag values (in the associated replica set tag) required to satisfy the write concern.
For example, given the following tag set configuration:
conf = rs.conf()
conf.members[0].tags = { "dc": "east", "production": "node-1" }
conf.members[1].tags = { "dc": "east", "production": "node-2" }
conf.members[2].tags = { "dc": "east", "production": "node-3" }
rs.reconfig(conf)
The custom write concern productionWriteConcern defined below is satisfied if the write propagates to all three replica set members, because across the three members the production tag contains three unique values:
conf.settings = {
getLastErrorModes: {
productionWriteConcern : { "production": 3 }
}
}
However, the following custom write concern dcWriteConcern can never succeed:
conf.settings = {
getLastErrorModes: {
dcWriteConcern : { "dc": 3 } // this will never succeed
}
}
This is because the dc tag does not contain three unique values, but rather a single tag/value repeated three times across the replica set members (i.e. {"dc": "east"}). Therefore the custom write concern setting of {"dc": 3} will never be satisfied.
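The unique-value rule can be checked with a short plain JavaScript sketch. The helper isModeSatisfied is hypothetical and only models the counting described above: for each tag in the mode, it counts the distinct values of that tag across the members that have acknowledged the write. Passing all members of the set shows whether the mode can ever be satisfied.

```javascript
// Hypothetical helper modeling the counting rule described above.
// acknowledgingMembers: members (with a tags document) that have the write.
// mode: one entry of getLastErrorModes, e.g. { production: 3 }.
function isModeSatisfied(acknowledgingMembers, mode) {
  return Object.entries(mode).every(([tag, required]) => {
    const uniqueValues = new Set(
      acknowledgingMembers
        .map((m) => (m.tags || {})[tag])
        .filter((v) => v !== undefined)
    );
    return uniqueValues.size >= required;
  });
}

// The tag configuration from the example above.
const taggedMembers = [
  { tags: { dc: "east", production: "node-1" } },
  { tags: { dc: "east", production: "node-2" } },
  { tags: { dc: "east", production: "node-3" } },
];

// Three distinct "production" values exist, so this mode is achievable.
console.log(isModeSatisfied(taggedMembers, { production: 3 }));
// Only one distinct "dc" value exists, so this mode can never be met.
console.log(isModeSatisfied(taggedMembers, { dc: 3 }));
```

The same function also describes the MultipleDC mode from the earlier section: { "dc_va": 1, "dc_gto": 1 } is met once at least one member carrying a dc_va tag and one carrying a dc_gto tag have acknowledged the write.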