Sharding Reference

mongo --host ip-172-31-91-37.ec2.internal:27017

MongoDB shell version: 3.2.20
connecting to: ip-172-31-91-37.ec2.internal:27017/test
Server has startup warnings:
2018-05-31T19:06:08.302+0000 I CONTROL  [main]
2018-05-31T19:06:08.308+0000 I CONTROL  [main] ** WARNING: Access control is not enabled for the database.
2018-05-31T19:06:08.308+0000 I CONTROL  [main] **          Read and write access to data and configuration is unrestricted.
2018-05-31T19:06:08.308+0000 I CONTROL  [main]
mongos>

Seeing the Current State

mongos> sh.status()

config.locks collection empty or missing. be sure you are connected to a mongos
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("5b10479f5412e9e45a4a2ba9")
  }
  shards:
        {  "_id" : "myshard_0",  "host" : "myshard_0/ip-172-31-91-37.ec2.internal:27000,ip-172-31-91-37.ec2.internal:27001,ip-172-31-93-223.ec2.internal:27000",  "state" : 1 }
        {  "_id" : "myshard_1",  "host" : "myshard_1/ip-172-31-91-37.ec2.internal:27002,ip-172-31-93-223.ec2.internal:27001,ip-172-31-93-223.ec2.internal:27002",  "state" : 1 }
  active mongoses:
        "3.6.5" : 2
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                No recent migrations
  databases:

Enable sharding for a database

mongos> sh.enableSharding('myShardedDB')

{
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1527836346, 6),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1527836346, 6)
}

Shard a collection

mongos> sh.shardCollection('myShardedDB.people', {language: 1})

{
        "collectionsharded" : "myShardedDB.people",
        "collectionUUID" : BinData(4,"N3DGiUQrRVSgGsqerfFFmA=="),
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1527836576, 15),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1527836576, 15)
}

Add some data to our database

mkdir -p /home/hadoop/MongoDB
cd /home/hadoop/MongoDB

wget https://www.dropbox.com/s/qav1gra0dp031zx/chapter_2_mock_data.csv

mongoimport -h ip-172-31-91-37.ec2.internal:27017 --type csv --headerline -d myShardedDB -c people chapter_2_mock_data.csv

2018-06-01T06:56:59.315+0000    connected to: ip-172-31-91-37.ec2.internal:27017
2018-06-01T06:57:02.302+0000    [#####################...] myShardedDB.people   3.30MB/3.66MB (90.1%)
2018-06-01T06:57:02.642+0000    [########################] myShardedDB.people   3.66MB/3.66MB (100.0%)
2018-06-01T06:57:02.642+0000    imported 100000 documents

Inspect the data distribution

mongo --host ip-172-31-91-37.ec2.internal:27017

mongos> sh.status()
config.locks collection empty or missing. be sure you are connected to a mongos
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("5b10479f5412e9e45a4a2ba9")
  }
  shards:
        {  "_id" : "myshard_0",  "host" : "myshard_0/ip-172-31-91-37.ec2.internal:27000,ip-172-31-91-37.ec2.internal:27001,ip-172-31-93-223.ec2.internal:27000",  "state" : 1,  "tags" : [ "Zone0", "Zone2" ] }
        {  "_id" : "myshard_1",  "host" : "myshard_1/ip-172-31-91-37.ec2.internal:27002,ip-172-31-93-223.ec2.internal:27001,ip-172-31-93-223.ec2.internal:27002",  "state" : 1,  "tags" : [ "Zone1", "Zone2" ] }
  active mongoses:
        "3.6.5" : 2
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Collections with active migrations:
                myShardedDB.people started at Fri Jun 01 2018 07:16:04 GMT+0000 (UTC)
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                1 : Success
  databases:
        {  "_id" : "myShardedDB",  "primary" : "myshard_0",  "partitioned" : true }
                myShardedDB.people
                        shard key: { "language" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                myshard_0       2
                                myshard_1       1
                        { "language" : { "$minKey" : 1 } } -->> { "language" : "Irish Gaelic" } on : myshard_0 Timestamp(2, 1)
                        { "language" : "Irish Gaelic" } -->> { "language" : "Norwegian" } on : myshard_1 Timestamp(2, 0)
                        { "language" : "Norwegian" } -->> { "language" : { "$maxKey" : 1 } } on : myshard_0 Timestamp(1, 3)
                         tag: Zone0  { "language" : { "$minKey" : 1 } } -->> { "language" : "Irish Gaelic" }
                         tag: Zone0  { "language" : "Irish Gaelic" } -->> { "language" : "Norwegian" }
                         tag: Zone0  { "language" : "Norwegian" } -->> { "language" : { "$maxKey" : 1 } }

Fetch some records from a single shard

mongos> use myShardedDB
switched to db myShardedDB

mongos> db.people.find({ "language" : "Norwegian" }).explain()

{
        "queryPlanner" : {
                "mongosPlannerVersion" : 1,
                "winningPlan" : {
                        "stage" : "SINGLE_SHARD",
                        "shards" : [
                                {
                                        "shardName" : "myshard_0",
                                        "connectionString" : "myshard_0/ip-172-31-91-37.ec2.internal:27000,ip-172-31-91-37.ec2.internal:27001,ip-172-31-93-223.ec2.internal:27000",
                                        "serverInfo" : {
                                                "host" : "ip-172-31-91-37",
                                                "port" : 27000,
                                                "version" : "3.6.5",
                                                "gitVersion" : "a20ecd3e3a174162052ff99913bc2ca9a839d618"
                                        },
                                        "plannerVersion" : 1,
                                        "namespace" : "myShardedDB.people",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {
                                                "language" : {
                                                        "$eq" : "Norwegian"
                                                }
                                        },
                                        "winningPlan" : {
                                                "stage" : "FETCH",
                                                "inputStage" : {
                                                        "stage" : "SHARDING_FILTER",
                                                        "inputStage" : {
                                                                "stage" : "IXSCAN",
                                                                "keyPattern" : {
                                                                        "language" : 1
                                                                },
                                                                "indexName" : "language_1",
                                                                "isMultiKey" : false,
                                                                "multiKeyPaths" : {
                                                                        "language" : [ ]
                                                                },
                                                                "isUnique" : false,
                                                                "isSparse" : false,
                                                                "isPartial" : false,
                                                                "indexVersion" : 2,
                                                                "direction" : "forward",
                                                                "indexBounds" : {
                                                                        "language" : [
                                                                                "[\"Norwegian\", \"Norwegian\"]"
                                                                        ]
                                                                }
                                                        }
                                                }
                                        },
                                        "rejectedPlans" : [ ]
                                }
                        ]
                }
        },
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1527836740, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1527836740, 1)
}

Fetch records from multiple shards

db.people.find({ "language": {"$in": ["Norwegian", "Arabic"]} }).explain()

results matching ""

    No results matching ""