Pandas:基于另一个数据框架的行创建一个数据框架



我有一个问题,一个比较复杂的问题。

我有一个像这样的数据帧:

commodity_name first_delivery_date last_delivery_date  last_trading_date   tenor   delivery_window new_tenor    Vol
<chr>          <dttm>              <dttm>              <dttm>              <chr>   <chr>           <chr>      <int>
1 oil            2021-06-01 00:00:00 2021-06-30 00:00:00 2021-04-30 00:00:00 month   Jun 21          Jun 21     29000
2 gold           2022-03-01 00:00:00 2022-03-31 00:00:00 2022-02-28 00:00:00 month   Mar 22          Mar 22      -800
3 oil            2021-07-01 00:00:00 2021-07-31 00:00:00 2021-05-31 00:00:00 month   Jul 21          Jul 21    -21000
4 gold           2021-09-01 00:00:00 2021-09-30 00:00:00 2021-08-31 00:00:00 month   Sep 21          Sep 21      1100
5 gold           2021-02-01 00:00:00 2021-02-28 00:00:00 2021-01-29 00:00:00 month   Feb 21          Feb 21     -3000
6 depower        2021-01-01 00:00:00 2021-01-31 00:00:00 2020-12-30 00:00:00 quarter Jan 21          Q1 21         -3
7 oil            2022-04-01 00:00:00 2022-04-30 00:00:00 2022-02-28 00:00:00 month   Apr 22          Apr 22     23000
8 czpower        2023-02-01 00:00:00 2023-02-28 00:00:00 2023-01-30 00:00:00 quarter Feb 23          Q1 23         26
9 oil            2021-02-01 00:00:00 2021-02-28 00:00:00 2020-12-31 00:00:00 quarter Feb 21          Q1 21     -17000
10 gold           2021-05-01 00:00:00 2021-05-31 00:00:00 2021-04-30 00:00:00 month   May 21          May 21      2400

我想从它创建另一个数据框架,基于以下条件:

  • For YearYY,如果new_tenor在旧数据帧中为Q1 YY:在新数据帧中创建三行,其中new_tenor分别为Jan YY,Feb YYMar YY。所有其他变量保持不变;
  • 如果旧数据帧中的new_tenorQ2 YY:在新数据帧中创建三行,其中new_tenor分别为Apr YYMay YYJun YY。所有其他变量保持不变;
  • 如果旧数据帧中的new_tenorQ3 YY:在新数据帧中创建三行,其中new_tenor分别为Jul YYAug YYSep YY。所有其他变量保持不变;
  • 如果旧数据帧中的new_tenorQ4 YY:在新数据帧中创建三行,其中new_tenor分别为Oct YYNov YYDec YY。所有其他变量保持不变;
  • 如果旧数据帧中的new_tenorCal YY:在新数据帧中创建6行,其中new_tenor分别为Jan YY+1Feb YY+1Mar YY+1Q2 YY+1Q3 YY+1Q4 YY+1。所有其他变量保持不变;

这个问题很简单,主要取决于YY的值,其他的在新数据帧中和旧数据帧中保持不变。

我试着用下面的代码解决这个问题:

my_df = []

for index, row in ss.iterrows():

# d = row["NewTenor"].split()
# year = d[1]

print(year)
if "Cal" in row["NewTenor"]:

# Go to next year

# Add Jan, Feb, and Mar

temp_1 = row

temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Jan','21':'22','22':'23','23':'24'})

temp_2 = row

temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Cal':'Feb','21':'22','22':'23','23':'24'})

temp_3 = row

temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Cal':'Mar','21':'22','22':'23','23':'24'})

# Add Q2, Q3, and Q4

temp_4 = row

temp_4['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q2','21':'22','22':'23','23':'24'})

temp_5 = row

temp_5['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q3','21':'22','22':'23','23':'24'})

temp_6 = row

temp_6['NewTenor'] = temp_1['NewTenor'].replace({'Cal':'Q4','21':'22','22':'23','23':'24'})

# Append to data frame

my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)
my_df.append(temp_4)
my_df.append(temp_5)
my_df.append(temp_6)

elif "Q1" in row["NewTenor"]:

# Add Jan, Feb, and Mar

temp_1 = row

temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q1':'Jan'})

temp_2 = row

temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q1':'Feb'})

temp_3 = row

temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q1':'Mar'})

# Append to data frame

my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)


elif "Q2" in row["NewTenor"]:

# Add Apr, May, and Jun

temp_1 = row

temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q2':'Apr'})

temp_2 = row

temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q2':'May'})

temp_3 = row

temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q2':'Jun'})


# Append to data frame

my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)

elif "Q3" in row["NewTenor"]:

# Add Jul, Aug, and Sep

temp_1 = row

temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q3':'Jul'})

temp_2 = row

temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q3':'Aug'})

temp_3 = row

temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q3':'Sep'})

# Append to data frame

my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)

else :

# Add Oct, Nov, and Dec

temp_1 = row

temp_1['NewTenor'] = temp_1['NewTenor'].replace({'Q4':'Oct'})

temp_2 = row

temp_2['NewTenor'] = temp_2['NewTenor'].replace({'Q4':'Nov'})

temp_3 = row

temp_3['NewTenor'] = temp_3['NewTenor'].replace({'Q4':'Dec'})

# Append to data frame

my_df.append(temp_1)
my_df.append(temp_2)
my_df.append(temp_3)


my_df = pd.DataFrame(my_df)

这并不复杂,它总是给我错误。

谁能帮我创建新的数据框架?提前谢谢你。

如果我没理解错的话:

def split_tenor(tenor):
start, year = tenor.split(" ")
if start == "Cal":
months = ["Jan", "Feb", "Mar", "Q2", "Q3", "Q4"]
year = int(year) + 1
elif start == "Q1":
months = ["Jan", "Feb", "Mar"]
elif start == "Q2":
months = ["Apr", "May", "Jun"]
elif start == "Q3":
months = ["Jul", "Aug", "Sep"]
elif start == "Q4":
months = ["Oct", "Nov", "Dec"]
else:
return tenor
return [f"{m} {year}" for m in months]
df["new_tenor"] = df["new_tenor"].apply(split_tenor)
df.explode("new_tenor")

相关内容

  • 没有找到相关文章

最新更新