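I'm trying to find a MongoDB script that will look at a collection in which the same document has multiple records, and give me only the latest version of each document as the result set.
I can't explain it in English any better than the above, but perhaps the small piece of SQL below explains it further. I want each document for every transaction_reference, but only the version with the latest date (object_creation_date).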
select
    t.transaction_reference,
    t.transaction_date,
    t.object_creation_date,
    t.transaction_sale_value
from MyTable t
inner join (
    select
        transaction_reference,
        max(object_creation_date) as MaxDate
    from MyTable
    group by transaction_reference
) tm
    on t.transaction_reference = tm.transaction_reference
    and t.object_creation_date = tm.MaxDate
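The reason there are multiple versions of the same document is that I want to store every iteration of a transaction. The first time I receive a document it may have a transaction_status of UNPAID; then I receive the same transaction again, and this time transaction_status is PAID.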
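Some analyses will sum over all unique transactions, whereas others may measure the time between a document with status UNPAID and the next one with status PAID.
As requested, here are two documents: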
{
"_id": {
"$oid": "579aa337f36d2808839a05e8"
},
"object_class": "Goods & Services Transaction",
"object_category": "Revenue",
"object_type": "Transaction",
"object_origin": "Sage One",
"object_origin_category": "Bookkeeping",
"object_creation_date": "2016-07-05T00:00:00.201Z",
"party_uuid": "dfa1e80a-5521-11e6-beb8-9e71128cae77",
"connection_uuid": "b945bd7c-7988-4d2a-92f5-8b50ab218e00",
"transaction_reference": "SI-1",
"transaction_status": "UNPAID",
"transaction_date": "2016-06-16T00:00:00.201Z",
"transaction_due_date": "2016-07-15T00:00:00.201Z",
"transaction_currency": "GBP",
"goods_and_services": [
{
"item_identifier": "PROD01",
"item_name": "Product One",
"item_quantity": 1,
"item_gross_unit_sale_value": 1800,
"item_revenue_category": "Sales Revenue",
"item_net_unit_cost_value": null,
"item_net_unit_sale_value": 1500,
"item_unit_tax_value": 300,
"item_net_total_sale_value": 1500,
"item_gross_total_sale_value": 1800,
"item_tax_value": 300
}
],
"transaction_gross_value": 1800,
"transaction_gross_curr_value": 1800,
"transaction_net_value": 1500,
"transaction_cost_value": null,
"transaction_payments_value": null,
"transaction_payment_extras_value": null,
"transaction_tax_value": 300,
"party": {
"customer": {
"customer_identifier": "11",
"customer_name": "KP"
}
}
}
And the second version, now paid:
{
"_id": {
"$oid": "579aa387f36d2808839a05ee"
},
"object_class": "Goods & Services Transaction",
"object_category": "Revenue",
"object_type": "Transaction",
"object_origin": "Sage One",
"object_origin_category": "Bookkeeping",
"object_creation_date": "2016-07-16T00:00:00.201Z",
"party_uuid": "dfa1e80a-5521-11e6-beb8-9e71128cae77",
"connection_uuid": "b945bd7c-7988-4d2a-92f5-8b50ab218e00",
"transaction_reference": "SI-1",
"transaction_status": "PAID",
"transaction_date": "2016-06-16T00:00:00.201Z",
"transaction_due_date": "2016-07-15T00:00:00.201Z",
"transaction_currency": "GBP",
"goods_and_services": [
{
"item_identifier": "PROD01",
"item_name": "Product One",
"item_quantity": 1,
"item_gross_unit_sale_value": 1800,
"item_revenue_category": "Sales Revenue",
"item_net_unit_cost_value": null,
"item_net_unit_sale_value": 1500,
"item_unit_tax_value": 300,
"item_net_total_sale_value": 1500,
"item_gross_total_sale_value": 1800,
"item_tax_value": 300
}
],
"transaction_gross_value": 1800,
"transaction_gross_curr_value": 1800,
"transaction_net_value": 1500,
"transaction_cost_value": null,
"transaction_payments_value": null,
"transaction_payment_extras_value": null,
"transaction_tax_value": 300,
"party": {
"customer": {
"customer_identifier": "11",
"customer_name": "KP"
}
}
}
Thanks for your support, Matt
If I understand the question correctly, you can use something like this:
db.getCollection('yourTransactionsCollection').aggregate([
    {
        // Newest version first within each transaction_reference
        $sort: {
            "transaction_reference": 1,
            "object_creation_date": -1
        }
    },
    {
        // $first takes each value from the newest version in the group
        $group: {
            _id: "$transaction_reference",
            "transaction_date": { $first: "$transaction_date" },
            "object_creation_date": { $first: "$object_creation_date" },
            "transaction_sale_value": { $first: "$transaction_sale_value" }
        }
    }
])
which outputs something like the following:
{
    "_id" : "SI-1",
    "transaction_date" : "2016-06-16T00:00:00.201Z",
    "object_creation_date" : "2016-07-16T00:00:00.201Z",
    "transaction_sale_value" : null
}
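If you want the whole latest document rather than a hand-picked list of fields, one possible variation is to keep the full document in the group with $$ROOT and then promote it back to the top level. This is only a sketch: the field name latest and the variable name latestVersionPipeline are my own, the collection name is the same placeholder as above, and $replaceRoot needs MongoDB 3.4 or later. It also relies on object_creation_date sorting chronologically, which holds for the sample documents because the dates are ISO 8601 strings in one consistent format.

```javascript
// Sketch: return the entire newest document per transaction_reference.
const latestVersionPipeline = [
  // Newest version first within each transaction_reference.
  { $sort: { transaction_reference: 1, object_creation_date: -1 } },
  // $$ROOT is the full input document; $first keeps the newest one per group.
  // "latest" is an illustrative field name, not something from the question.
  { $group: { _id: "$transaction_reference", latest: { $first: "$$ROOT" } } },
  // Promote the stored document back to the top level (MongoDB 3.4+).
  { $replaceRoot: { newRoot: "$latest" } }
];

// `db` only exists inside the mongo shell; the guard keeps this loadable elsewhere.
if (typeof db !== "undefined") {
  db.getCollection('yourTransactionsCollection')
    .aggregate(latestVersionPipeline)
    .forEach(printjson);
}
```

Be aware that holding full documents in $group uses more memory than picking a few fields, so this may need tuning on a large collection.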
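Note that you could change the $sort to include only object_creation_date, but I included both transaction_reference and object_creation_date because I think it makes sense to create a compound index on them rather than on the creation date alone. Adjust it to match your indexes so that the $sort can use one.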
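As a rough sketch of the compound index I have in mind (the variable name versionIndexSpec is mine and the collection name is still a placeholder), the key order and directions mirror the $sort stage, so a $sort at the start of the pipeline should be able to use it:

```javascript
// Same keys, order and directions as the $sort stage.
const versionIndexSpec = { transaction_reference: 1, object_creation_date: -1 };

// `db` only exists inside the mongo shell; the guard keeps this loadable elsewhere.
if (typeof db !== "undefined") {
  db.getCollection('yourTransactionsCollection').createIndex(versionIndexSpec);
}
```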
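Also, the documents have no transaction_sale_value field, which is why the result shows null for it. Perhaps you left it out, or it just isn't in your sample documents, but I think you get the idea and can adjust it to your needs.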
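For the other analysis mentioned in the question, the time between an UNPAID version and the next PAID one, here is a minimal client-side sketch in plain JavaScript. It assumes the versions of interest have already been fetched into an array; the helper name unpaidToPaidMillis is hypothetical, and the sample array is a trimmed copy of the two documents from the question. It is not the only way to do this and could also be expressed as an aggregation.

```javascript
// Hypothetical helper: for each transaction_reference, the milliseconds from
// its earliest UNPAID version to the first PAID version created after it.
function unpaidToPaidMillis(docs) {
  // Work on a copy sorted oldest-first by creation date.
  const sorted = docs.slice().sort(
    (a, b) => Date.parse(a.object_creation_date) - Date.parse(b.object_creation_date)
  );
  const unpaidAt = new Map(); // reference -> timestamp of first UNPAID version
  const elapsed = new Map();  // reference -> ms until the next PAID version
  for (const doc of sorted) {
    const ref = doc.transaction_reference;
    const ts = Date.parse(doc.object_creation_date);
    if (doc.transaction_status === "UNPAID" && !unpaidAt.has(ref)) {
      unpaidAt.set(ref, ts);
    } else if (doc.transaction_status === "PAID" && unpaidAt.has(ref) && !elapsed.has(ref)) {
      elapsed.set(ref, ts - unpaidAt.get(ref));
    }
  }
  return elapsed;
}

// Trimmed copies of the two sample documents from the question.
const sampleVersions = [
  { transaction_reference: "SI-1", transaction_status: "PAID", object_creation_date: "2016-07-16T00:00:00.201Z" },
  { transaction_reference: "SI-1", transaction_status: "UNPAID", object_creation_date: "2016-07-05T00:00:00.201Z" }
];
const elapsedDays = unpaidToPaidMillis(sampleVersions).get("SI-1") / (24 * 60 * 60 * 1000);
console.log(elapsedDays); // 11
```

References that never reach PAID are simply absent from the returned map, which keeps them out of any average you compute afterwards.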