patternMinor
MongoDB: is order guaranteed deep into the aggregation pipeline?
Viewed 0 times
aggregationordertheintomongodbdeepguaranteedpipeline
Problem
I'm trying to get the "top" 2 items for any number of foreign keys, sorted by some key. What I have is the following, which appears to work:
My question is: does MongoDB guarantee that the sorting I apply in the second step in my aggregation via
The data I'm seeing indicates that MongoDB is currently behaving this way, but I'm wondering if this is spec'd or if MongoDB is free to return the items array in a different order if the optimizer ever decides it would be advantageous to do so.
db.getCollection('col1').aggregate([
{$match: {fk: {$in: [1, 2]}}}
{$sort: {name: -1}},
{$group: {_id: "$fk", items: {$push: "$ROOT"} }},
{$project: {items: {$slice: ["$items", 2]} }}
])My question is: does MongoDB guarantee that the sorting I apply in the second step in my aggregation via
$sort will be maintained when I get down to the $slice inside my $project in the last step of my aggregation?The data I'm seeing indicates that MongoDB is currently behaving this way, but I'm wondering if this is spec'd or if MongoDB is free to return the items array in a different order if the optimizer ever decides it would be advantageous to do so.
Solution
While the accepted answer is correct in that
If you look at the comments on the Jira Ticket linked it clears it up there to. Basically it says that
So
So I guess for example if you have the following data
Running the example query
You would get
which returns the first 2 names in descending order
But if you changed the sort by to be on
you could see something like
where you can see that
$group stage will not preserver order, I think it left some important information out and can be misleading.If you look at the comments on the Jira Ticket linked it clears it up there to. Basically it says that
$group maintains the order within the 'groups' but order will not be guaranteed for stages after the group stage.So
$items will be sorted by name: -1 in the example and calling $slice 2 on it in the $project stage will result in the first 2 items sorted by name.So I guess for example if you have the following data
[
{ fk: 1, name: 'James' },
{ fk: 2, name: 'Zelda' },
{ fk: 2, name: 'Victor' },
{ fk: 1, name: 'Becky' },
{ fk: 1, name: 'Carlos' },
{ fk: 2, name: 'Katie' },
]
Running the example query
db.col1.aggregate([
{ $match: { fk: { $in: [1, 2] } } },
{ $sort: { name: -1 } },
{ $group: { _id: "$fk", items: { $push: "$$ROOT" } } },
{ $project: { items: {$slice: ["$items", 2] } } }
])
You would get
{ _id: 2, items: [ { fk: 2, name: "Zelda" }, { fk: 2, name: "Victor" }] }
{ _id: 1, items: [ { fk: 1, name: "James" }, { fk: 1, name: "Carlos" }] }which returns the first 2 names in descending order
But if you changed the sort by to be on
fkdb.col1.aggregate([
{ $match: { fk: { $in: [1, 2] } } },
{ $sort: { fk: -1 } },
{ $group: { _id: "$fk", items: { $push: "$$ROOT" } } },
{ $project: { items: {$slice: ["$items", 2] } } }
])
you could see something like
{ _id: 1, items: [ { fk: 1, name: "James" }, { fk: 1, name: "Becky" } ] }
{ _id: 2, items: [ { fk: 2, name: "Zelda" }, { fk: 2, name: "Victor" } ] }
where you can see that
_id (which is fk) is not sorted in descending order. If you wanted it to be sorted by _id you could just add that sorting after the $group stageCode Snippets
{ _id: 2, items: [ { fk: 2, name: "Zelda" }, { fk: 2, name: "Victor" }] }
{ _id: 1, items: [ { fk: 1, name: "James" }, { fk: 1, name: "Carlos" }] }Context
StackExchange Database Administrators Q#197434, answer score: 4
Revisions (0)
No revisions yet.