如何计算mongodb聚合数组内的统计信息?



我正在mongodb上构建一个简单的调查应用程序,最难的部分是使用mongodb的聚合框架获得统计数据。

这是一个提交的调查答案。answers属性是一个数组,包含每个问题的答案。

{
uid:'xxxx',
surveyId:'xxxxx',
answers:[
{   answer: A },    // question 1 ‘s answer
{   answer: B },    // question 2 ‘s answer 
{   answer: C }      // question 3 ‘s answer 
]
}

我想要的最终结果是知道调查的总体统计数据。

Question 1:
A  50%  B 40% C 10%
Question 2:
A  60%  B 40% C 0%

mongodb聚合的棘手部分是如何处理'数组'

一个简单的选择是$unwind$group:

db.collection.aggregate([
{$unwind: {path: "$answers", includeArrayIndex: "index"}},
{$group: {
_id: {
surveyId: "$surveyId",
index: "$index"
},
A: {$sum: {$cond: [{$eq: ["$answers.answer", "A"]}, 1, 0]}},
B: {$sum: {$cond: [{$eq: ["$answers.answer", "B"]}, 1, 0]}},
C: {$sum: {$cond: [{$eq: ["$answers.answer", "C"]}, 1, 0]}},
all: {$sum: 1}
}},
{$project: {
A: {$multiply: [{$divide: ["$A", "$all"]}, 100]},
B: {$multiply: [{$divide: ["$B", "$all"]}, 100]},
C: {$multiply: [{$divide: ["$C", "$all"]}, 100]},
question: {$add: ["$_id.index", 1]},
surveyId: "$_id.surveyId",
_id: 0
}}
])

看看它在操场的例子中是如何工作的

或者更通用的方法可以使用$group两次而不知道每个问题的答案选项:

db.collection.aggregate([
{$unwind: {path: "$answers", includeArrayIndex: "index"}},
{$group: {
_id: {
surveyId: "$surveyId",
index: "$index",
res: "$answers.answer"
},
count: {$sum: 1}
}},
{$group: {
_id: {
surveyId: "$_id.surveyId",
index: "$_id.index"
},
data: {$push: {answer: "$_id.res", count: "$count"}},
all: {$sum: "$count"}
}},
{$project: {
data: {
$map: {
input: "$data",
in: {
answer: "$$this.answer",
percent: {$multiply: [{$divide: ["$$this.count", "$all"]}, 100]}
}
}
},
question: {$add: ["$_id.index", 1]},
surveyId: "$_id.surveyId",
_id: 0
}}
])

看看它在操场的例子中是如何工作的

> db.collection.find({},{"_id":0});
< { uid: 'xxxx',
surveyId: 'xxx',
answers: [ { answer: 'A' }, { answer: 'B' }, { answer: 'C' } ] }
{ uid: 'xxxx',
surveyId: 'xxx',
answers: [ { answer: 'B' }, { answer: 'C' }, { answer: 'C' } ] }
{ uid: 'xxxx',
surveyId: 'xxx',
answers: [ { answer: 'B' }, { answer: 'B' }, { answer: 'B' } ] }
{ uid: 'xxxx',
surveyId: 'xxx',
answers: [ { answer: 'B' }, { answer: 'A' }, { answer: 'C' } ] }
{ uid: 'xxxx',
surveyId: 'xxx',
answers: [ { answer: 'B' }, { answer: 'B' }, { answer: 'C' } ] }

使用自定义$accumulator,

> db.collection.aggregate([
{
$group: {
"_id": "$surveyId",
"statistics": {
$accumulator: {
initArgs: [3],                        //number of questions as argument for the init function
init: function(questions){            //initialise 3 arrays for each question, with all counts set to 0
let answerArray = new Array();
for(i=1;i<=questions;i++){
answerArray.push({"question":i,"A":0,"B":0,"C":0,"total":0});
}
return answerArray;
},
accumulateArgs: ["$answers"],         //answers list as argument for accumulate function
accumulate: function(state, answers) {            //increment the counter/total for each answer
for(i=0;i<answers.length;i++){
switch(answers[i].answer){
case "A": state[i].A = state[i].A+1; break;
case "B": state[i].B = state[i].B+1; break;
case "C": state[i].C = state[i].C+1; break;
}
state[i].total = state[i].total+1
}
return state;
},
merge: function(state1, state2) {                 //merge function to merge the intermediate results
for(i=0;i<state.length;i++){
state[i].A = state1[i].A+state2[i].A
state[i].B = state1[i].B+state2[i].B
state[i].C = state1[i].C+state2[i].C
state[i].total = state1[i].total+state2[i].total
}
return state;
},
finalize: function(state){                        //finalize the result by calculating % ratio
for(i=0;i<state.length;i++){
state[i].A = state[i].A*100/state[i].total;
state[i].B = state[i].B*100/state[i].total;
state[i].C = state[i].C*100/state[i].total;
delete state[i].total;
}
return state;
},
lang: "js"
}
}
}
}
]);
< {
"_id" : "xxx",
"statistics" : [ 
{
"question" : 1.0, "A" : 20.0, "B" : 80.0, "C" : 0.0
}, 
{
"question" : 2.0, "A" : 20.0, "B" : 60.0, "C" : 20.0
}, 
{
"question" : 3.0, "A" : 0.0, "B" : 20.0, "C" : 80.0
}
]
}

最新更新