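I have a DataFrame with 6 columns (not counting the index). Two of those columns are the relevant inputs to a function that returns two outputs, and I want to insert those outputs into the original DataFrame as new columns.
I am following toto_tico's answer here. For convenience, I am reproducing it (slightly modified):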
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10], "C": [10, 10, 10], "D": [1, 1, 1]})

def fab(row):
    return row['A'] * row['B'], row['A'] + row['B']

df['newcolumn'], df['newcolumn2'] = zip(*df.apply(fab, axis=1))
This code works without any problem. My own code, however, does not. My DataFrame has the following structure:
         Date  Station  Insolation  Daily Total  Temperature(avg)  Latitude
0  2011-01-01  Aksaray         1.7      72927.6         -0.025000   38.3705
1  2011-01-02  Aksaray         5.6     145874.7          2.541667   38.3705
2  2011-01-03  Aksaray         6.3     147197.8          6.666667   38.3705
3  2011-01-04  Aksaray         2.9     100350.9          5.312500   38.3705
4  2011-01-05  Aksaray         0.7      42138.7          4.639130   38.3705
The function I apply takes a row as input and returns two values based on the latitude and the date. Here is the function:
import numpy as np

def h0(row):
    # Get a row from a dataframe, give back H0 and daylength
    # Leap year must be taken into account
    # row['Latitude'] and row['Date'] are relevant inputs
    # phi is taken in degrees, all angles are assumed to be degrees as well in formulas
    # numpy defaults to radians however...
    gsc = 1367
    phi = np.deg2rad(row['Latitude'])
    date = row['Date']
    year = pd.DatetimeIndex([date]).year[0]
    month = pd.DatetimeIndex([date]).month[0]
    day = pd.DatetimeIndex([date]).day[0]
    if year % 4 == 0:
        B = (day-1) * (360/366)
    else:
        B = (day-1) * (360/365)
    B = np.deg2rad(B)
    delta = (0.006918 - 0.399912*np.cos(B) + 0.070257*np.sin(B)
             - 0.006758*np.cos(2*B) + 0.000907*np.sin(2*B)
             - 0.002697*np.cos(3*B) + 0.00148*np.sin(3*B))
    ws = np.arccos(-np.tan(phi) * np.tan(delta))
    daylength = (2/15) * np.rad2deg(ws)
    if year % 4 == 0:
        dayangle = np.deg2rad(360*day/366)
    else:
        dayangle = np.deg2rad(360*day/365)
    h0 = (24*3600*gsc/np.pi) * (1 + 0.033*np.cos(dayangle)) * (np.cos(phi)*np.cos(delta)*np.sin(ws) +
                                                              ws*np.sin(phi)*np.sin(delta))
    return h0, daylength
When I use

ak['h0'], ak['N'] = zip(*ak.apply(h0, axis=1))

I get the error: Shape of passed values is (1816, 2), indices imply (1816, 6)
I can't find what is wrong with my code. Can you help?
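So, as stated in my earlier comment, if you want to create several new columns in a DataFrame from several existing columns, you can create new fields on the row Series inside the h0 function.
Here is an oversimplified example to show what I mean: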
>>> def simple_func(row):
...     row['new_column1'] = row.lat * 1000
...     row['year'] = row.date.year
...     row['month'] = row.date.month
...     row['day'] = row.date.day
...     return row
...
>>> df
        date   lat
0 2018-01-29  1000
1 2018-01-30  5000
>>> df.date
0   2018-01-29
1   2018-01-30
Name: date, dtype: datetime64[ns]
>>> df.apply(simple_func, axis=1)
        date   lat  new_column1  year  month  day
0 2018-01-29  1000      1000000  2018      1   29
1 2018-01-30  5000      5000000  2018      1   30
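The session above does not show how df was built. A minimal runnable sketch of the same idea follows; the DataFrame construction and the defensive row.copy() are assumptions added for illustration, not part of the original session.

```python
import pandas as pd

# Assumed construction of the two-row frame shown in the session.
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-01-29", "2018-01-30"]),
    "lat": [1000, 5000],
})


def simple_func(row):
    # Copy first so the row object handed over by apply is not modified.
    row = row.copy()
    # Assigning to a new label adds a field to the row Series.
    row["new_column1"] = row["lat"] * 1000
    row["year"] = row["date"].year
    row["month"] = row["date"].month
    row["day"] = row["date"].day
    return row


# Each returned Series becomes one row of the resulting DataFrame.
result = df.apply(simple_func, axis=1)
print(result)
```

Because every call returns a Series with the same labels, apply assembles them into a DataFrame whose columns are the original ones plus the new fields.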
In your case, inside the h0 function, set row['h0'] = h0 and row['N'] = daylength, and then return row. The line that calls the function on the DataFrame then becomes ak = ak.apply(h0, axis=1)
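Put together, the change could look roughly like the sketch below. The formulas are carried over from the question unchanged (including its simple leap-year test and its use of the day of the month); the sample rows, the GSC constant name, the days_in_year helper variable, and the row.copy() call are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

GSC = 1367  # solar constant value used in the question


def h0(row):
    # Copy first so the row object handed over by apply is not modified.
    row = row.copy()
    phi = np.deg2rad(row["Latitude"])
    date = pd.Timestamp(row["Date"])
    # Same leap-year test and day value as in the question's code.
    days_in_year = 366 if date.year % 4 == 0 else 365
    B = np.deg2rad((date.day - 1) * (360 / days_in_year))
    delta = (0.006918 - 0.399912 * np.cos(B) + 0.070257 * np.sin(B)
             - 0.006758 * np.cos(2 * B) + 0.000907 * np.sin(2 * B)
             - 0.002697 * np.cos(3 * B) + 0.00148 * np.sin(3 * B))
    ws = np.arccos(-np.tan(phi) * np.tan(delta))
    dayangle = np.deg2rad(360 * date.day / days_in_year)
    # Store both results on the row instead of returning a tuple.
    row["h0"] = ((24 * 3600 * GSC / np.pi) * (1 + 0.033 * np.cos(dayangle))
                 * (np.cos(phi) * np.cos(delta) * np.sin(ws)
                    + ws * np.sin(phi) * np.sin(delta)))
    row["N"] = (2 / 15) * np.rad2deg(ws)
    return row


# Hypothetical two-row sample with only the columns the function needs.
ak = pd.DataFrame({
    "Date": pd.to_datetime(["2011-01-01", "2011-01-02"]),
    "Station": ["Aksaray", "Aksaray"],
    "Latitude": [38.3705, 38.3705],
})
ak = ak.apply(h0, axis=1)
print(ak[["Date", "h0", "N"]])
```

Since the function now returns the whole row, there is no need for zip or tuple unpacking; the result of apply already contains the original columns together with h0 and N.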