Applying a multi-input, multi-output function to a pandas DataFrame raises a shape exception



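I have a DataFrame with 6 columns (not counting the index). Two of those columns are the relevant inputs to a function, and the function has two outputs. I want to insert those outputs into the original DataFrame as new columns.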

I am following toto_tico's answer here. For convenience, I am copying it (slightly modified):

    import pandas as pd
    df = pd.DataFrame({"A": [10,20,30], "B": [20, 30, 10], "C": [10, 10, 10], "D": [1, 1, 1]})
    def fab(row):                                                  
        return row['A'] * row['B'], row['A'] + row['B']
    df['newcolumn'], df['newcolumn2'] = zip(*df.apply(fab, axis=1))

This code works without problems. However, my code does not. My DataFrame has the following structure:

        Date  Station  Insolation  Daily Total  Temperature(avg)  Latitude
0 2011-01-01  Aksaray         1.7      72927.6         -0.025000   38.3705
1 2011-01-02  Aksaray         5.6     145874.7          2.541667   38.3705
2 2011-01-03  Aksaray         6.3     147197.8          6.666667   38.3705
3 2011-01-04  Aksaray         2.9     100350.9          5.312500   38.3705
4 2011-01-05  Aksaray         0.7      42138.7          4.639130   38.3705

The function I apply takes a row as input and returns two values based on the latitude and the date. Here is the function:

    import numpy as np
    import pandas as pd

    def h0(row):
        # Get a row from a dataframe, give back H0 and daylength
        # Leap year must be taken into account

        # row['Latitude'] and row['Date'] are relevant inputs

        # phi is taken in degrees, all angles are assumed to be degrees as well in formulas
        # numpy defaults to radians however...

        gsc = 1367
        phi = np.deg2rad(row['Latitude'])
        date = row['Date']

        year = pd.DatetimeIndex([date]).year[0]
        month = pd.DatetimeIndex([date]).month[0]
        day = pd.DatetimeIndex([date]).day[0]

        if year % 4 == 0:
            B = (day-1) * (360/366)
        else:
            B = (day-1) * (360/365)

        B = np.deg2rad(B)
        delta = (0.006918 - 0.399912*np.cos(B) + 0.070257*np.sin(B)
                 - 0.006758*np.cos(2*B) + 0.000907*np.sin(2*B)
                 - 0.002697*np.cos(3*B) + 0.00148*np.sin(3*B))

        ws = np.arccos(-np.tan(phi) * np.tan(delta))
        daylenght = (2/15) * np.rad2deg(ws)

        if year % 4 == 0:
            dayangle = np.deg2rad(360*day/366)
        else:
            dayangle = np.deg2rad(360*day/365)

        h0 = (24*3600*gsc/np.pi) * (1 + 0.033*np.cos(dayangle)) * (
            np.cos(phi)*np.cos(delta)*np.sin(ws) + ws*np.sin(phi)*np.sin(delta))

        return h0, daylenght

When I use

    ak['h0'], ak['N'] = zip(*ak.apply(h0, axis=1))

I get the error: Shape of passed values is (1816, 2), indices imply (1816, 6)

I cannot find what is wrong with my code. Can you help?

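As I said in my earlier comment, if you want to create several new columns in a DataFrame from several existing columns, you can create the new fields on the row Series inside the h0 function and return the whole row.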

Here is an oversimplified example to show what I mean:

    >>> def simple_func(row):
    ...     row['new_column1'] = row.lat * 1000
    ...     row['year'] = row.date.year
    ...     row['month'] = row.date.month
    ...     row['day'] = row.date.day
    ...     return row
    ...
    >>> df
            date   lat
    0 2018-01-29  1000
    1 2018-01-30  5000
    >>> df.date
    0   2018-01-29
    1   2018-01-30
    Name: date, dtype: datetime64[ns]
    >>> df.apply(simple_func, axis=1)
            date   lat  new_column1  year  month  day
    0 2018-01-29  1000      1000000  2018      1   29
    1 2018-01-30  5000      5000000  2018      1   30

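In your case, inside the h0 function, set row['h0'] = h0 and row['N'] = daylenght (the day-length variable, spelled as in your function), and then return row. The line that calls the function on the DataFrame then becomes ak = ak.apply(h0, axis=1).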
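To make that concrete, here is a minimal sketch on a two-row stand-in for ak. The frame contents and the names add_outputs and two_outputs are made up for illustration, and the two calculations are placeholders, not the solar formulas from the question. The second half shows an alternative that is not part of the suggestion above: keeping the tuple return and passing result_type="expand" to apply, which should be available in pandas 0.23 and later.

```python
import pandas as pd

# Hypothetical two-row stand-in for the `ak` frame from the question.
ak = pd.DataFrame({
    "Date": pd.to_datetime(["2011-01-01", "2011-01-02"]),
    "Station": ["Aksaray", "Aksaray"],
    "Latitude": [38.3705, 38.3705],
})

def add_outputs(row):
    # Placeholder calculations (NOT the formulas from the question):
    # they only show two outputs being derived from two input columns.
    row["h0"] = row["Latitude"] * 2
    row["N"] = row["Date"].day
    return row

# Approach described above: the function returns the whole row,
# now carrying the two extra fields, and apply rebuilds the frame.
ak_rows = ak.apply(add_outputs, axis=1)

def two_outputs(row):
    # Same placeholder calculations, returned as a tuple.
    return row["Latitude"] * 2, row["Date"].day

# Alternative (an assumption, not from the answer): let pandas expand
# the returned tuples into columns, then name and join them.
expanded = ak.apply(two_outputs, axis=1, result_type="expand")
expanded.columns = ["h0", "N"]
ak_expanded = ak.join(expanded)

print(ak_rows)
print(ak_expanded)
```

Both variants should leave the original columns in place and add h0 and N next to them. Returning the full row copies every existing column once per row, so the tuple variant may be the lighter option on a larger frame.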
