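I have been trying to normalize this nested JSON for two days now without getting any usable output. I have gone the extra mile and read the documentation, but nothing has worked. I would appreciate an extra pair of hands and some expertise. Thank you.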
Here is the list of fields I want in the output:
- clusterTime
- documentKey
- fullDocument.buzName
- fullDocument.dealType
- fullDocument.Companies
- fullDocument.Sub_companies
- fullDocument.Directories
- fullDocument.term
- operationType
{'_id': {'_data': '826019AB3C000000012B0'},
'clusterTime': Timestamp(1612294972, 1),
'documentKey': {'_id': ObjectId('5b7cfc0172cb100011ddadfb')},
'fullDocument': {'buzName': 'Market',
'v': 29,
'_id': ObjectId('5b7cfc0172cb100011ddadfb'),
'addersValues': [],
'contractType': 'C & I',
'Volume': 54.04637908572,
'VolumeUOM': 'MWh',
'dealType': 'New',
'Companies': [{'Number': '002093834',
'DC': 'flex',
'_id': ObjectId('5b555555cb100011dde53e'),
'FlowEnd': datetime.datetime(2015, 10, 21, 0, 0),
'FlowStart': datetime.datetime(2014, 10, 22, 0, 0),
'profile': 'SERVICE_NORTH',
'rateCode': 'DC'}],
'government': False,
'heatRate': None,
'Sub_companies': [{'Number': '1000002093834',
'DC': 'easy',
'_id': ObjectId('5b555555cb100011dde53e'),
'FlowEnd': datetime.datetime(2015, 10, 21, 0, 0),
'FlowStart': datetime.datetime(2014, 10, 22, 0, 0),
'profile': 'SERVICE_NORTH',
'rateCode': 'DC'}],
'heatRateIndex': None,
'heatRateUOM': None,
'TexasTIndex': None,
'MaxEnd': datetime.datetime(2015, 10, 21, 0, 0),
'originaldate': datetime.datetime(2018, 10, 12, 5, 0),
'originator': 'Luke',
'parentCustomerName': None,
'Directories': ['TEXAS_AdminFeeInc_PT', 'TXLZ_PT'],
'term': 12,
'voluntaryRECPerc': None,
'voluntaryRECs': False,
'westRTIndex': None},
'operationType': 'update'}
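This is what I have tried so far. I use an aggregation pipeline to isolate each embedded object, starting with Companies, and I hope to do the same for Sub_companies and Directories and then join the results on documentKey.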
import pandas as pd
# The array is projected as "Companies" so that it matches the record_path passed to json_normalize
for output in watchdeal.aggregate([{"$project": {"_id": 0, "operationType": 1, "clusterTime": 1, "documentKey": 1, "fullDocument": {"buzName": 1}, "Companies": "$fullDocument.Companies"}}]):
    print(pd.json_normalize(output, "Companies", ["clusterTime", "documentKey", "operationType", ["fullDocument", "buzName"]]))
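To make the target shape concrete, here is a minimal sketch of the flattening done by hand in plain Python rather than with an aggregation pipeline. It is only an illustration under some assumptions: the `event` dict is a trimmed stand-in for the document above (the `ObjectId` and `Timestamp` values are replaced with plain strings and tuples so it runs without pymongo), `flatten_event` is a hypothetical helper name, and combining Companies, Sub_companies and Directories as a cross product is one possible reading of "join on documentKey".

```python
from itertools import product

# Trimmed stand-in for the change event above. ObjectId/Timestamp values are
# replaced by plain strings/tuples (an assumption, so no pymongo is needed).
event = {
    "clusterTime": (1612294972, 1),
    "documentKey": {"_id": "5b7cfc0172cb100011ddadfb"},
    "fullDocument": {
        "buzName": "Market",
        "dealType": "New",
        "term": 12,
        "Companies": [{"Number": "002093834", "DC": "flex"}],
        "Sub_companies": [{"Number": "1000002093834", "DC": "easy"}],
        "Directories": ["TEXAS_AdminFeeInc_PT", "TXLZ_PT"],
    },
    "operationType": "update",
}


def flatten_event(event):
    """Return one flat dict per (company, sub-company, directory) combination.

    Hypothetical helper: the column names mirror the dotted field list.
    """
    doc = event["fullDocument"]
    base = {
        "clusterTime": event["clusterTime"],
        "documentKey": event["documentKey"]["_id"],
        "operationType": event["operationType"],
        "fullDocument.buzName": doc.get("buzName"),
        "fullDocument.dealType": doc.get("dealType"),
        "fullDocument.term": doc.get("term"),
    }
    # Replace empty or missing lists with a single placeholder so the
    # parent-level fields still produce a row.
    companies = doc.get("Companies") or [{}]
    subs = doc.get("Sub_companies") or [{}]
    directories = doc.get("Directories") or [None]

    rows = []
    for company, sub, directory in product(companies, subs, directories):
        row = dict(base)
        row.update({f"Companies.{k}": v for k, v in company.items()})
        row.update({f"Sub_companies.{k}": v for k, v in sub.items()})
        row["Directories"] = directory
        rows.append(row)
    return rows


rows = flatten_event(event)
for row in rows:
    print(row)
```

The resulting list of flat dicts can be handed to `pd.DataFrame(rows)` to get a table. If a cross product is not what is wanted, the three lists could instead be flattened into three separate tables that each carry `documentKey` and then be merged on that column.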