Python: splitting a stock option symbol field into three fields



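I have been struggling with this problem for a while. I am a hobbyist, self-taught developer and nowhere near intermediate level. My for loop seems to work fine until I try to add the results to the DataFrame using if, elif, else statements.

Instead of updating each row, the for loop sets every record in the column to the same value.

Why is that?

contract_date, contract_type and strike_price should have different values on each row.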

from numpy import dtype
import pandas as pd
import requests
import urllib.parse
from datetime import datetime
from dateutil import tz

s = requests.Session()


headers = {       #match headers on API request
'Accept':'*',
'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
}
#print('Enter a Ticker to Pull Data From')
#ticker = input()
ticker = 'SPY'

tickerurl = f'https://cdn.cboe.com/api/global/delayed_quotes/options/{ticker}.json'
data = s.get(tickerurl).json()

lookupdata = data['data']['options']
df = pd.json_normalize(data['data']['options'])
df['ticker'] = ticker

for contract in lookupdata:
    y = contract['option'].replace(ticker,'')
    contract_date = f'20{y[0:2]}-{y[2:4]}-{y[4:6]}'
    z = y.replace(y[0:6],'')
    contract_type = "Call" if z[0] == 'C' else "Put"
    strikeprice = z.replace(z[0],'')
    strike_price = float(strikeprice)/1000
    df['contract_date'] = contract_date
    df['contract_type'] = contract_type
    df['strike_price'] = strike_price

print(df)

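What is the correct way to update each row of the DataFrame rather than the whole column?

I have tried a bunch of variations of the following, and I keep getting either the same result or an infinite loop that never appends to the DataFrame correctly.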

from numpy import dtype
import pandas as pd
import requests
import urllib.parse
from datetime import datetime
from dateutil import tz

s = requests.Session()


headers = {       #match headers on API request
'Accept':'*',
'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
}
#print('Enter a Ticker to Pull Data From')
#ticker = input()
ticker = 'SPY'

tickerurl = f'https://cdn.cboe.com/api/global/delayed_quotes/options/{ticker}.json'
data = s.get(tickerurl).json()

lookupdata = data['data']['options']
df = pd.json_normalize(data['data']['options'])
df['ticker'] = ticker
df2 = pd.DataFrame()  
for contract in lookupdata:
    y = contract['option'].replace(ticker,'')
    contract_date = f'20{y[0:2]}-{y[2:4]}-{y[4:6]}'
    z = y.replace(y[0:6],'')
    contract_type = "Call" if z[0] == 'C' else "Put"
    strikeprice = z.replace(z[0],'')
    strike_price = float(strikeprice)/1000
    df['contract_date'] = contract_date
    df['contract_type'] = contract_type
    df['strike_price'] = strike_price
    df2 = df.append(df2)

print(df2)
# tried the following as well:
#    df2.append(df)
#df2 = pd.concat(df2)
#print(df2)
# gives the error TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"

#trying the following gives a key error on contract_date

#    df2.append(df)
#    df2 = pd.concat(df2['contract_date']['contract_type'][strike_price])
#print(df2)
#   df2.append(df)
#    df2 = #pd.concat((df2['contract_date'],df2['contract_type'],df2['strike_price']), #ignore_index=True)
#print(df2)

I think you are making this much harder than it needs to be. You have a df created by pd.json_normalize(), which contains a column like this:

import pandas as pd
data = {'option': ['SPY220829C00310000','SPY220829P00310000']}
df = pd.DataFrame(data)
print(df)
               option
0  SPY220829C00310000
1  SPY220829P00310000

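On Wikipedia you can find the standard format of the "chunks" that make up these codes. What you want to do is turn those chunks into a regular expression pattern, then use pd.Series.str.extract to retrieve the individual chunks and assign them to separate columns.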

# read as (string 1-6 letters)(6 digits)(cap C or P)(rest of digits)
pattern = r'([A-Z]{1,6})(\d{6})([CP])(\d+)'
df[['ticker','contract_date','contract_type','strike_price']] = \
    df.option.str.extract(pattern, expand=True)
print(df)
               option ticker contract_date contract_type strike_price
0  SPY220829C00310000    SPY        220829             C     00310000
1  SPY220829P00310000    SPY        220829             P     00310000
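
As a variation on the extraction step, here is a minimal sketch that uses named groups, so that str.extract labels the resulting columns itself. It also flags rows that do not match the pattern. The names named_pattern, parts and unmatched, and the 'not-a-symbol' row, are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Sample data; the third row is a deliberately invalid symbol (illustrative only).
df = pd.DataFrame(
    {'option': ['SPY220829C00310000', 'SPY220829P00310000', 'not-a-symbol']}
)

# Same pattern as above, but each chunk is a named group.
named_pattern = (
    r'(?P<ticker>[A-Z]{1,6})'
    r'(?P<contract_date>\d{6})'
    r'(?P<contract_type>[CP])'
    r'(?P<strike_price>\d+)'
)

# With named groups, the extracted columns take the group names.
parts = df['option'].str.extract(named_pattern)

# Rows that did not match come back as NaN in every extracted column.
unmatched = parts['ticker'].isna()

print(parts)
print(df.loc[unmatched, 'option'])
```

Checking for unmatched rows before converting types can make a malformed symbol easier to spot than a later conversion error.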

Next, you can change the format of the newly created columns:

df.contract_date = pd.to_datetime(df.contract_date, format='%y%m%d')
df.contract_type = df.contract_type.map({'P':'Put','C':'Call'})
df.strike_price = df.strike_price.astype(float)/1000
print(df)
               option ticker contract_date contract_type  strike_price
0  SPY220829C00310000    SPY    2022-08-29          Call         310.0
1  SPY220829P00310000    SPY    2022-08-29           Put         310.0
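
As for why the original loop filled every row with the same value: assigning a single value to a DataFrame column broadcasts it to all rows, so each pass of the loop overwrites the whole column and only the last contract's values survive. The sketch below reproduces that on two sample rows and contrasts it with assigning one value per row. It assumes the ticker is a known prefix of each symbol, as in the question; broadcast_result and aligned_result are illustrative names.

```python
import pandas as pd

ticker = 'SPY'
df = pd.DataFrame({'option': ['SPY220829C00310000', 'SPY220829P00310000']})

# The C/P flag sits right after the ticker and the 6-digit date.
flag_pos = len(ticker) + 6

# Scalar assignment inside a loop: every iteration overwrites the whole column.
for symbol in df['option']:
    df['contract_type'] = 'Call' if symbol[flag_pos] == 'C' else 'Put'
broadcast_result = df['contract_type'].tolist()
print(broadcast_result)  # ['Put', 'Put'] (only the last row's value remains)

# Building one value per row and assigning the list keeps rows aligned.
df['contract_type'] = [
    'Call' if symbol[flag_pos] == 'C' else 'Put' for symbol in df['option']
]
aligned_result = df['contract_type'].tolist()
print(aligned_result)  # ['Call', 'Put']
```

The vectorized str.extract approach above does the same row-aligned assignment without an explicit Python loop.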
