如何用另一个DataFrame的值替换pandas DataFrame?



我有2个dataframe (signal_dfprice_df),可以使用以下代码生成。

import pandas as pd
import numpy as np
signal_df = pd.DataFrame({
'long':[
True ,True, np.nan, True, np.nan
],
'short':[
np.nan, np.nan, True, np.nan, True  
],
'date':[
'2020-01-01 19:15', '2020-01-01 20:00', '2020-01-01 22:15', '2020-01-01 22:45', '2020-01-02 00:30',
],
})

# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(signal_df['date'])
# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)
signal_df = signal_df.set_index(datetime_index)
signal_df.drop('date',axis=1,inplace=True)
print(signal_df)
price_df = pd.DataFrame({
'close_price':[
30,  2,  3, 29,
15,  6, 19, 56,
9 , 38, 41, 12,
23, 14, 15, 16,
38, 18, 19, 20,
21, 22, 23, 33,
25, 26, 10, 28
],
'date':[
'2020-01-01 19:00', '2020-01-01 19:15', '2020-01-01 19:30', '2020-01-01 19:45', 
'2020-01-01 20:00', '2020-01-01 20:15', '2020-01-01 20:30', '2020-01-01 20:45', 
'2020-01-01 21:00', '2020-01-01 21:15', '2020-01-01 21:30', '2020-01-01 21:45', 
'2020-01-01 22:00', '2020-01-01 22:15', '2020-01-01 22:30', '2020-01-01 22:45', 
'2020-01-01 23:00', '2020-01-01 23:15', '2020-01-01 23:30', '2020-01-01 23:45', 
'2020-01-02 00:00', '2020-01-02 00:15', '2020-01-02 00:30', '2020-01-02 00:45', 
'2020-01-02 01:00', '2020-01-02 01:15', '2020-01-02 01:30', '2020-01-02 01:45', 
]
})
# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(price_df['date'])
# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)
price_df = price_df.set_index(datetime_index)
price_df.drop('date',axis=1,inplace=True)
print(price_df)

我如何创建第三个DataFrame (new_df),看起来像下面?

new_df
long 
2020-01-01 19:15:00  2
2020-01-01 20:00:00  15
2020-01-01 22:15:00   NaN
2020-01-01 22:45:00  16
2020-01-02 00:30:00   NaN

new_df首先在signal_dflong列中查找True的值,然后用price_df中的相应值替换它们。

您可以尝试masksignal_dflong列中的True值

out = (signal_df['long'].mask(signal_df['long'].eq(True),
price_df.loc[signal_df.index, 'close_price'])
.to_frame())
print(out)
long
2020-01-01 19:15:00    2
2020-01-01 20:00:00   15
2020-01-01 22:15:00  NaN
2020-01-01 22:45:00   16
2020-01-02 00:30:00  NaN

你可以这样做:

(price_df.loc[signal_df.where(signal_df.long).index].mask(
signal_df.long.ne(True)).rename(columns={'close_price':'long'}))

输出:

long
2020-01-01 19:15:00          2.0
2020-01-01 20:00:00         15.0
2020-01-01 22:15:00          NaN
2020-01-01 22:45:00         16.0
2020-01-02 00:30:00          NaN

最新更新