I have a DataFrame, df1, that looks like this. The date format is M/D/Y.

date
1/1/2001    100
2/1/2001    101
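OK, this does what you asked. However, your own data is inconsistent. Your text says you add the number to the inventory, but on the first day you subtract the number, while you add it on the remaining days.
So, given "x1.csv":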
date,inventory
1/1/2001,100
2/1/2001,101
3/1/2001,103
and "x2.csv":
date,update
1/1/2001,2
1/2/2001,3
1/3/2001,-2
1/4/2001,0
2/1/2001,5
2/2/2001,3
2/3/2001,-10
3/1/2001,0
3/2/2001,0
the Python code:

import csv
import pandas as pd

f1 = csv.reader( open('x1.csv'))
f2 = csv.reader( open('x2.csv'))

# Read the first file into a dictionary mapping date -> inventory level.

inventory = {}
for row in f1:
    if row[0] != 'date':
        inventory[row[0]] = int(row[1])

# Now process each line in the detail list.

current = 0
rows = []
for row in f2:
    # Skip the header row.
    if row[0] == 'date':
        continue
    # A date listed in the first file resets the running level.
    if row[0] in inventory:
        current = inventory[row[0]]
    # Record the level before applying this day's update.
    rows.append( (row[0],str(current)) )
    current += int(row[1])

df = pd.DataFrame( rows, columns=['date','level'])
print(df)
produces the following output:

       date level
0  1/1/2001   100
1  1/2/2001   102
2  1/3/2001   105
3  1/4/2001   103
4  2/1/2001   101
5  2/2/2001   106
6  2/3/2001   109
7  3/1/2001   103
8  3/2/2001   103
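
For comparison, the same running-level logic can probably also be written with pandas alone. This is only a sketch, not the approach used above: the function name `running_levels`, the inline CSV strings (copies of the two files shown earlier), and the choice to return integer levels instead of strings are assumptions made for illustration. It also assumes each date appears at most once in the inventory file.

```python
import io

import pandas as pd

# Inline copies of the x1.csv and x2.csv contents shown above,
# so the example runs without any files on disk.
X1 = """date,inventory
1/1/2001,100
2/1/2001,101
3/1/2001,103
"""
X2 = """date,update
1/1/2001,2
1/2/2001,3
1/3/2001,-2
1/4/2001,0
2/1/2001,5
2/2/2001,3
2/3/2001,-10
3/1/2001,0
3/2/2001,0
"""


def running_levels(base: pd.DataFrame, updates: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: level at the start of each day, before that day's update."""
    # Attach the reset value to the rows whose date appears in the base table.
    merged = updates.merge(base, on="date", how="left")
    # Every row that carries a reset value starts a new block.
    block = merged["inventory"].notna().cumsum()
    # Sum of the updates applied earlier within the same block.
    prior = merged.groupby(block)["update"].cumsum() - merged["update"]
    # Rows before the first reset start from 0, like `current = 0` in the loop.
    start = merged["inventory"].ffill().fillna(0)
    return pd.DataFrame({"date": merged["date"], "level": (start + prior).astype(int)})


base = pd.read_csv(io.StringIO(X1))
updates = pd.read_csv(io.StringIO(X2))
levels = running_levels(base, updates)
print(levels)
```

With the sample files this gives the same levels as the loop version. The dates are compared as plain strings, just like the dictionary lookup in the loop, so both files need to spell a date the same way.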
The exact logic is not clear, but for the overall merging logic you can use merge on the monthly period:
df2['a'] += df2.merge(df1,
    left_on=pd.to_datetime(df2['date'], dayfirst=False).dt.to_period('M'),
    right_on=pd.to_datetime(df1['date'], dayfirst=False).dt.to_period('M'),
    how='left', suffixes=('_', '')
)['a']
Output:

       date    a
0  1/1/2001  102
1  1/2/2001  103
2  1/3/2001   98
3  1/4/2001  100
Intermediate result of the merge:

df2.merge(df1,
    left_on=pd.to_datetime(df2['date'], dayfirst=False).dt.to_period('M'),
    right_on=pd.to_datetime(df1['date'], dayfirst=False).dt.to_period('M'),
    how='left', suffixes=('_', '')
)['a']

0    100
1    100
2    100
3    100
Name: a, dtype: int64
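
To make the month-period merge easier to try, here is a self-contained sketch. The contents of `df1` and `df2` are assumptions reconstructed from the outputs shown above, and the helper `month_key` and the `month` and `a_base` column names are hypothetical. It uses a named key column instead of passing arrays to `left_on`/`right_on`, which is a variation on the code above and not necessarily better.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown above.
df1 = pd.DataFrame({"date": ["1/1/2001", "2/1/2001"], "a": [100, 101]})
df2 = pd.DataFrame({
    "date": ["1/1/2001", "1/2/2001", "1/3/2001", "1/4/2001"],
    "a": [2, 3, -2, 0],
})


def month_key(dates: pd.Series) -> pd.Series:
    """Parse M/D/Y strings and reduce them to a monthly period."""
    return pd.to_datetime(dates, format="%m/%d/%Y").dt.to_period("M")


# One base value per month, taken from df1.
base = df1.assign(month=month_key(df1["date"]))[["month", "a"]]

# The left merge keeps every row of df2 in its original order.
# validate="many_to_one" raises if df1 has two rows in the same month.
merged = df2.assign(month=month_key(df2["date"])).merge(
    base, on="month", how="left", suffixes=("", "_base"), validate="many_to_one"
)

# Add the month's base value to each row of df2.
result = df2.assign(a=df2["a"] + merged["a_base"])
print(result)
```

If a month in `df2` has no match in `df1`, `a_base` is NaN for those rows and the sum becomes NaN as well, so you may want to decide how to fill such gaps before adding.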