我已经开始学习MongoDB并遇到了一个问题。我有一个集合名称为server_logs.
包含以下列(SOURCE_SERVER, SOURCE_PORT, DESTINATION_PORT, DESTINATION_SERVER, MBYTES)。
我需要SOURCE_SERVER与总金额到兆字节转移到每个SOURCE_SERVER。(但这里还有一点是,如果任何source_server存在,也存在于target_server中,那么它们的MBYTES也将添加到每个source_server中)。
例如:我有下面的表结构
SOURCE S_PORT DEST D_PORT MBYTES
1)server1 446 server2 555 10MB
2)server3 226 server1 666 2MB
3)server1 446 server3 226 5MB
我需要下面的结果:
Server1 17MB
Server3 7MB
我已经在mysql中创建了一个查询,以计算每兆位元组的数据传输到该源的顶级源。它工作得很好,我通过这个查询在MYSQL中得到所需的结果。
SELECT SOURCE, DEST, sum( logs.MBYTES )+(
SELECT SUM(log.MBYTES) as sum
from logs as log
where logs.DEST=log.SOURCE
) AS MBYTES
我想在MongoDB这个查询。请帮. .
Thanks in advance.
虽然这种"自连接"类型的查询对您如何使用MongoDB执行此操作似乎不太明显,但它可以使用聚合框架完成,但只需要在您的思维中做一点改变。
在MongoDB中使用这种形式的数据,这仍然非常像原始的SQL源:
{
"source" : "server1",
"s_port" : 446,
"dest" : "server2",
"d_port" : 555,
"transferMB" : 10
},
{
"source" : "server3",
"s_port" : 226,
"dest" : "server1",
"d_port" : 666,
"transferMB" : 2
},
{
"source" : "server1",
"s_port" : 446,
"dest" : "server3",
"d_port" : 226,
"transferMB" : 5
}
使用2.6版本的MongoDB,您的查询将看起来像这样:
db.logs.aggregate([
// Project a "type" tag in order to transform, then unwind
{ "$project": {
"source": 1,
"dest": 1,
"transferMB": 1,
"type": { "$cond": [ 1,[ "source", "dest" ],0] }
}},
{ "$unwind": "$type" },
// Map the "source" and "dest" servers onto the type, keep the source
{ "$project": {
"type": 1,
"tag": { "$cond": [
{ "$eq": [ "$type", "source" ] },
"$source",
"$dest"
]},
"mbytes": "$transferMB",
"source": 1
}},
// Group for totals, keep an array of the "source" for each
{ "$group": {
"_id": "$tag",
"mbytes": { "$sum": "$mbytes" },
"source": { "$addToSet": "$source" }
}},
// Unwind that array
{ "$unwind": "$source" },
// Is our grouped tag one on the sources? Inner join simulate
{ "$project": {
"mbytes": 1,
"matched": { "$eq": [ "$source", "$_id" ] }
}},
// Filter the results that did not match
{ "$match": { "matched": true }},
// Discard duplicates for each server tag
{ "$group": {
"_id": "$_id",
"mbytes": { "$first": "$mbytes" }
}}
])
对于2.6及以上版本,您可以获得一些额外的操作符来简化此操作,或者至少使用不同的操作符:
db.logs.aggregate([
// Project a "type" tag in order to transform, then unwind
{ "$project": {
"source": 1,
"dest": 1,
"transferMB": 1,
"type": { "$literal": [ "source", "dest" ] }
}},
{ "$unwind": "$type" },
// Map the "source" and "dest" servers onto the type, keep the source
{ "$project": {
"type": 1,
"tag": { "$cond": [
{ "$eq": [ "$type", "source" ] },
"$source",
"$dest"
]},
"mbytes": "$transferMB",
"source": 1
}},
// Group for totals, keep an array of the "source" for each
{ "$group": {
"_id": "$tag",
"mbytes": { "$sum": "$mbytes" },
"source": { "$addToSet": "$source" }
}},
// Co-erce the server tag into an array ( of one element )
{ "$group": {
"_id": "$_id",
"mbytes": { "$first": "$mbytes" },
"source": { "$first": "$source" },
"tags": { "$push": "$_id" }
}},
// User set intersection to find common element count of arrays
{ "$project": {
"mbytes": 1,
"matched": { "$size": {
"$setIntersection": [
"$source",
"$tags"
]
}}
}},
// Filter those that had nothing in common
{ "$match": { "matched": { "$gt": 0 } }},
// Remove the un-required field
{ "$project": { "mbytes": 1 }}
])
两种形式都会产生结果:
{ "_id" : "server1", "mbytes" : 17 }
{ "_id" : "server3", "mbytes" : 7 }
两者的一般原则是,通过保留有效的"源"服务器列表,您可以"过滤"合并的结果,以便只有那些被列为源的服务器将记录其总传输。
所以你可以使用一些技巧来"重塑"、"组合"one_answers"过滤"你的文档,以得到你想要的结果。
请阅读更多关于聚合操作符的信息,文档中的SQL到aggregation映射图也值得一看,它可以让您对转换常用操作有一些了解。
甚至可以在Stack Overflow上浏览聚合框架标签,找到一些有趣的转换操作。
您可以使用聚合框架:
db.logs.aggregate([
{$group:{_id:"$SOURCE",MBYTES:{$sum:"$MBYTES"}}}
])
假设MBYTES
字段中只有数值。所以结果是:
{
_id: server1,
MBYTES: 17
},
{
_id: server3,
MBYTES: 7
}
如果你必须计算服务器出现在DEST字段,你应该使用map-reduce方法:
var mapF = function(){
emit(this.SOURCE,this.MBYTES);
emit(this.DEST,this.MBYTES);
}
var reduceF = function(serverId,mbytesValues){
var reduced = {
server: serverId,
mbytes: 0
};
mbytesValues.forEach(function(value) {
reduced.mbytes += value;
});
return reduced;
}
db.logs.mapReduce(mapF,reduceF,{out:"server_stats"});
之后,您可以在server_stats集合中找到结果。