比较两个不同的熊猫数据帧并删除行 Python



我正在努力解决以下问题。我有两个数据帧 df1 和 df2,并希望通过列比较它们 运输 选择然后从 df1 中选择国家/地区并 drom 每个国家/地区定义的日期,如下面的代码所示。如果这样做,我会收到以下错误消息:

值错误:只能比较相同标记的系列对象

代码如下所示:

from pandas.tseries.holiday import (
AbstractHolidayCalendar, EasterMonday,
GoodFriday, Holiday, next_monday,
Easter, nearest_workday, Day, USMartinLutherKingJr,
USPresidentsDay, USMemorialDay, USLaborDay,
USThanksgivingDay)
class GermanHoliday(AbstractHolidayCalendar):
    rules = [
             Holiday('New Years Day', month=1, day=1, observance=next_monday),
             GoodFriday,
             EasterMonday,
             Holiday('Reformation Day', year=2017, month=10, day=31, observance=nearest_workday),
             Holiday('Labour Day', month=5, day=1, observance=nearest_workday),
             Holiday('Whit Monday', month=1, day=1, offset=[Easter(), Day(50)]),
             Holiday('Day of German Unity', month=10, day=3, observance=nearest_workday),
             Holiday('Christmas Day', month=12, day=25, observance=nearest_workday),
             Holiday('Boxing Day',month=12, day=26, observance=nearest_workday) 
    ]
class USHolidays(AbstractHolidayCalendar):
    rules = [
             Holiday('NewYearsDay', month=1, day=1, observance=nearest_workday),
             USMartinLutherKingJr,
             USPresidentsDay,
             GoodFriday,
             USMemorialDay,
             Holiday('USIndependenceDay', month=7, day=4, observance=nearest_workday),
             USLaborDay,
             USThanksgivingDay,
             Holiday('Christmas Day', month=12, day=25, observance=nearest_workday)
             ]
calendarGermany = GermanHoliday()
calendarUS = USHolidays()
holidaysGermany = calendarGermany .holidays().to_pydatetime()
holidaysUS = calendarUS .holidays().to_pydatetime()
qry = "Transportation in @df1.ticker and Date not in @holidaysGermany "
df2 = df2.query(qry)

数据框 df1 和 df2 的结构如下:

DF1:

0    transportation  country
1    ICE             Germany
2    Lufthansa       Germany
3    SIXT            Germany
4    TGV             France
5    Air France      France
6    Alamo           France
7    National        USA
8    Amtrak          USA
9    Delta           USA

DF2:

   Date         transportation price
0  2015-12-21   ICE            81.9924
1  2015-12-22   ICE            81.5173
2  2015-12-23   ICE            83.5015
3  2015-12-24   ICE            83.5015
4  2015-12-25   ICE            83.5015
5  2015-12-28   ICE            83.0357
6  2015-12-29   ICE            84.6286
7  2015-12-30   ICE            83.7250
8  2015-12-31   ICE            83.7250
9  2016-01-01   ICE            83.7250
10 2015-12-21   National       127.3900
11 2015-12-22   National       129.0000
12 2015-12-23   National       131.8800
13 2015-12-24   National       131.8800
14 2015-12-25   National       131.8800
15 2015-12-28   National       130.0300
16 2015-12-29   National       132.1700
...

最终结果应如下所示:

DF2:

   Date         transportation price
0  2015-12-21   ICE            81.9924
1  2015-12-22   ICE            81.5173
2  2015-12-23   ICE            83.5015
3  2015-12-24   ICE            83.5015
4  2015-12-28   ICE            83.0357
5  2015-12-29   ICE            84.6286
6  2015-12-30   ICE            83.7250
7  2015-12-31   ICE            83.7250
8  2016-01-01   ICE            83.7250
9  2015-12-21   National       127.3900
10 2015-12-22   National       129.0000
11 2015-12-23   National       131.8800
12 2015-12-26   National       131.8800
13 2015-12-28   National       130.0300
14 2015-12-29   National       132.1700
...

IIUC 你可以这样做:

In [197]: qry = "transportation in @df1.transportation and 
     ...:        Date not in ['2015-12-24','2015-12-25']"
In [198]: df2.query(qry)
Out[198]:
         Date transportation     price
0  2015-12-21            ICE   81.9924
1  2015-12-22            ICE   81.5173
2  2015-12-23            ICE   83.5015
5  2015-12-28            ICE   83.0357
6  2015-12-29            ICE   84.6286
7  2015-12-30            ICE   83.7250
8  2015-12-31            ICE   83.7250
9  2016-01-01            ICE   83.7250
10 2015-12-21       National  127.3900
11 2015-12-22       National  129.0000
12 2015-12-23       National  131.8800
15 2015-12-28       National  130.0300
16 2015-12-29       National  132.1700

最新更新