How $ Operators Use Indexes

Some queries can use indexes more efficiently than others; some queries cannot use indexes at all. This section covers how various query operators are handled by MongoDB.

INEFFICIENT OPERATORS

In general, negation is inefficient. $ne queries can use an index, but not very well. They must look at all the index entries other than the one specified by $ne, so they basically have to scan the entire index. For example, for a collection with an index on the field named i, here are the index ranges traversed for such a query:

db.example.find({"i" : {"$ne" : 3}}).explain()

{
    "queryPlanner" : {
        ...,
        "parsedQuery" : {
            "i" : {
                "$ne" : "3"
            }
        },
        "winningPlan" : {
        {
            ...,
            "indexBounds" : {
                "i" : [
                    [
                        {
                            "$minElement" : 1
                        },
                        3
                    ],
                    [
                        3,
                        {
                            "$maxElement" : 1
                        }
                    ]
                ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        ...,
    }
}

This query looks at all index entries less than 3 and all index entries greater than 3. This can be efficient if a large swath of your collection is 3, but otherwise it must check almost everything.

$not can sometimes use an index but often does not know how. It can reverse basic ranges ({"key" : {"$lt" : 7}} becomes {"key" : {"$gte" : 7}}) and regular expressions. However, most other queries with $not will fall back to doing a table scan. $nin always uses a table scan.

If you need to perform one of these types of queries quickly, figure out if there’s another clause that you could add to the query that could use an index to filter the result set down to a small number of documents before MongoDB attempts to do nonindexed matching.

RANGES

Compound indexes can help MongoDB efficiently execute queries with multiple clauses. When designing an index with multiple fields, put fields that will be used in exact matches first (e.g., "x" : 1) and ranges last (e.g., "y": {"$gt" : 3, "$lt" : 5}). This allows the query to find an exact value for the first index key and then search within that for a second index range. For example, suppose we were querying for a specific age and a range of usernames using an {"age" : 1, "username" : 1} index. We would get fairly exact index bounds:

db.users.find({"age" : 47, "username" : {"$gt" : "user5", "$lt" : "user8"}}).explain('executionStats')

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "test.users",
    "indexFilterSet": false,
    "parsedQuery": {
      "$and": [
        {
          "age": {
            "$eq": 47
          }
        },
        {
          "username": {
            "$lt": "user8"
          }
        },
        {
          "username": {
            "$gt": "user5"
          }
        }
      ]
    },
    "winningPlan": {
      "stage": "FETCH",
      "inputStage": {
        "stage": "IXSCAN",
        "keyPattern": {
          "age": 1,
          "username": 1
        },
        "indexName": "age_1_username_1",
        "isMultiKey": false,
        "multiKeyPaths": {
          "age": [],
          "username": []
        },
        "isUnique": false,
        "isSparse": false,
        "isPartial": false,
        "indexVersion": 2,
        "direction": "forward",
        "indexBounds": {
          "age": ["[47.0, 47.0]"],
          "username": ["(\"user5\", \"user8\")"]
        }
      }
    },
    "rejectedPlans": [
      {
        "stage": "FETCH",
        "filter": {
          "age": {
            "$eq": 47
          }
        },
        "inputStage": {
          "stage": "IXSCAN",
          "keyPattern": {
            "username": 1
          },
          "indexName": "username_1",
          "isMultiKey": false,
          "multiKeyPaths": {
            "username": []
          },
          "isUnique": false,
          "isSparse": false,
          "isPartial": false,
          "indexVersion": 2,
          "direction": "forward",
          "indexBounds": {
            "username": ["(\"user5\", \"user8\")"]
          }
        }
      }
    ]
  },
  "executionStats": {
    "executionSuccess": true,
    "nReturned": 2742,
    "executionTimeMillis": 5,
    "totalKeysExamined": 2742,
    "totalDocsExamined": 2742,
    "executionStages": {
      "stage": "FETCH",
      "nReturned": 2742,
      "executionTimeMillisEstimate": 0,
      "works": 2743,
      "advanced": 2742,
      "needTime": 0,
      "needYield": 0,
      "saveState": 23,
      "restoreState": 23,
      "isEOF": 1,
      "invalidates": 0,
      "docsExamined": 2742,
      "alreadyHasObj": 0,
      "inputStage": {
        "stage": "IXSCAN",
        "nReturned": 2742,
        "executionTimeMillisEstimate": 0,
        "works": 2743,
        "advanced": 2742,
        "needTime": 0,
        "needYield": 0,
        "saveState": 23,
        "restoreState": 23,
        "isEOF": 1,
        "invalidates": 0,
        "keyPattern": {
          "age": 1,
          "username": 1
        },
        "indexName": "age_1_username_1",
        "isMultiKey": false,
        "multiKeyPaths": {
          "age": [],
          "username": []
        },
        "isUnique": false,
        "isSparse": false,
        "isPartial": false,
        "indexVersion": 2,
        "direction": "forward",
        "indexBounds": {
          "age": ["[47.0, 47.0]"],
          "username": ["(\"user5\", \"user8\")"]
        },
        "keysExamined": 2742,
        "seeks": 1,
        "dupsTested": 0,
        "dupsDropped": 0,
        "seenInvalidated": 0
      }
    }
  },
  "serverInfo": {
    "host": "eoinbrazil-laptop-osx",
    "port": 27017,
    "version": "4.0.12",
    "gitVersion": "5776e3cbf9e7afe86e6b29e22520ffb6766e95d4"
  },
  "ok": 1
}

The query goes directly to "age" : 47 and then searches within that for usernames between "user5" and "user8".

Conversely, suppose we use an index on {"username" : 1, "age" : 1}. This changes the query plan, as the query must look at all users between "user5" and "user8" and pick out the ones with "age" : 47:

db.users.find({"age" : 47, "username" : {"$gt" : "user5", "$lt" : "user8"}}).explain('executionStats')

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "test.users",
    "indexFilterSet": false,
    "parsedQuery": {
      "$and": [
        {
          "age": {
            "$eq": 47
          }
        },
        {
          "username": {
            "$lt": "user8"
          }
        },
        {
          "username": {
            "$gt": "user5"
          }
        }
      ]
    },
    "winningPlan": {
      "stage": "FETCH",
      "filter": {
        "age": {
          "$eq": 47
        }
      },
      "inputStage": {
        "stage": "IXSCAN",
        "keyPattern": {
          "username": 1
        },
        "indexName": "username_1",
        "isMultiKey": false,
        "multiKeyPaths": {
          "username": []
        },
        "isUnique": false,
        "isSparse": false,
        "isPartial": false,
        "indexVersion": 2,
        "direction": "forward",
        "indexBounds": {
          "username": ["(\"user5\", \"user8\")"]
        }
      }
    },
    "rejectedPlans": [
      {
        "stage": "FETCH",
        "inputStage": {
          "stage": "IXSCAN",
          "keyPattern": {
            "username": 1,
            "age": 1
          },
          "indexName": "username_1_age_1",
          "isMultiKey": false,
          "multiKeyPaths": {
            "username": [],
            "age": []
          },
          "isUnique": false,
          "isSparse": false,
          "isPartial": false,
          "indexVersion": 2,
          "direction": "forward",
          "indexBounds": {
            "username": ["(\"user5\", \"user8\")"],
            "age": ["[47.0, 47.0]"]
          }
        }
      }
    ]
  },
  "executionStats": {
    "executionSuccess": true,
    "nReturned": 2742,
    "executionTimeMillis": 369,
    "totalKeysExamined": 333332,
    "totalDocsExamined": 333332,
    "executionStages": {
      "stage": "FETCH",
      "filter": {
        "age": {
          "$eq": 47
        }
      },
      "nReturned": 2742,
      "executionTimeMillisEstimate": 312,
      "works": 333333,
      "advanced": 2742,
      "needTime": 330590,
      "needYield": 0,
      "saveState": 2697,
      "restoreState": 2697,
      "isEOF": 1,
      "invalidates": 0,
      "docsExamined": 333332,
      "alreadyHasObj": 0,
      "inputStage": {
        "stage": "IXSCAN",
        "nReturned": 333332,
        "executionTimeMillisEstimate": 117,
        "works": 333333,
        "advanced": 333332,
        "needTime": 0,
        "needYield": 0,
        "saveState": 2697,
        "restoreState": 2697,
        "isEOF": 1,
        "invalidates": 0,
        "keyPattern": {
          "username": 1
        },
        "indexName": "username_1",
        "isMultiKey": false,
        "multiKeyPaths": {
          "username": []
        },
        "isUnique": false,
        "isSparse": false,
        "isPartial": false,
        "indexVersion": 2,
        "direction": "forward",
        "indexBounds": {
          "username": ["(\"user5\", \"user8\")"]
        },
        "keysExamined": 333332,
        "seeks": 1,
        "dupsTested": 0,
        "dupsDropped": 0,
        "seenInvalidated": 0
      }
    }
  },
  "serverInfo": {
    "host": "eoinbrazil-laptop-osx",
    "port": 27017,
    "version": "4.0.12",
    "gitVersion": "5776e3cbf9e7afe86e6b29e22520ffb6766e95d4"
  },
  "ok": 1
}

This forces MongoDB to scan 100 times the number of index entries as using the previous index would. Using two ranges in a query basically always forces this less-efficient query plan.

OR QUERIES

As of this writing, MongoDB can only use one index per query. That is, if you create one index on {"x" : 1} and another index on {"y" : 1} and then do a query on {"x" : 123, "y" : 456}, MongoDB will use one of the indexes you created, not both. The only exception to this rule is "$or". "$or" can use one index per "$or" clause, as "$or" performs two queries and then merges the results:

db.foo.find({"$or" : [{"x" : 123}, {"y" : 456}]}).explain()

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "foo.foo",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$or" : [
                {
                    "x" : {
                        "$eq" : 123
                    }
                },
                {
                    "y" : {
                        "$eq" : 456
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "SUBPLAN",
            "inputStage" : {
                "stage" : "FETCH",
                "inputStage" : {
                    "stage" : "OR",
                    "inputStages" : [
                        {
                            "stage" : "IXSCAN",
                            "keyPattern" : {
                                "x" : 1
                            },
                            "indexName" : "x_1",
                            "isMultiKey" : false,
                            "multiKeyPaths" : {
                                "x" : [ ]
                            },
                            "isUnique" : false,
                            "isSparse" : false,
                            "isPartial" : false,
                            "indexVersion" : 2,
                            "direction" : "forward",
                            "indexBounds" : {
                                "x" : [
                                    "[123.0, 123.0]"
                                ]
                            }
                        },
                        {
                            "stage" : "IXSCAN",
                            "keyPattern" : {
                                "y" : 1
                            },
                            "indexName" : "y_1",
                            "isMultiKey" : false,
                            "multiKeyPaths" : {
                                "y" : [ ]
                            },
                            "isUnique" : false,
                            "isSparse" : false,
                            "isPartial" : false,
                            "indexVersion" : 2,
                            "direction" : "forward",
                            "indexBounds" : {
                                "y" : [
                                    "[456.0, 456.0]"
                                ]
                            }
                        }
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
    ...,
     },
    "ok" : 1
}

As you can see, this explain required two separate queries on the two indexes (as indicated by the two IXSCAN stages). In general, doing two queries and merging the results is much less efficient than doing a single query; thus, whenever possible, prefer "$in" to "$or".

If you must use an "$or", keep in mind that MongoDB needs to look through the results of both queries and remove any duplicates (documents that matched more than one "$or" clause).

When running "$in" queries there is no way, other than sorting, to control the order of documents returned. For example, {"x" : {"$in" : [1, 2, 3]}} will return documents in the same order as {"x" : {"$in" : [3, 2, 1]}}.

How $ Operators Use Indexes

How $ Operators Use Indexes

INEFFICIENT OPERATORS

RANGES

OR QUERIES

results matching ""

No results matching ""