Find and compare stock prices on specific days of each month

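The code I'm trying to write is meant to figure out which days of the month have historically been the best days to buy and sell a stock. The stock I'm specifically looking at is UVXY. I'm trying to find which days have historically been the monthly low and which have been the monthly high, and then average them. So far my code doesn't work because in some months the 20th or the 10th is not a trading day. The actual string would be much longer, with many more dates, but I'm willing to use yfinance to get the historical prices; I'm just not sure how it works. Thanks!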

from bs4 import BeautifulSoup

content = """
<pre style="word-wrap: break-word; white-space: pre-wrap;">
Fri 09-24-2021      22.22     22.27     20.38      20.49    47101392 
Thu 09-23-2021      22.52      22.63      21.32      21.48    48145436 
Wed 09-22-2021      24.88      25.37      22.88      23.68    59917888 
Tue 09-21-2021      26.03      28.18      25.20      25.86    73069928 
Mon 09-20-2021      26.26      30.81      25.36      27.31   104578920 
Fri 09-17-2021      21.56      23.58      21.33      23.48    61526336 
Thu 09-16-2021      21.91      22.66      21.04      21.38    42485960 
....
Wed 12-07-2016    9150.00    9390.00    8780.00    9270.00       37485 
Tue 12-06-2016    9530.00    9660.00    9130.00    9210.00       27220
</pre>""" 
soup = BeautifulSoup(content, "html.parser")
stuff = soup.find('pre').text
lines = stuff.split("\n")
listOfStuff = []
openPriceOfTrades = []
closePriceOfTrades = []
difference = []

for line in lines:
    if line[7:9] == "20":
        closePriceOfTrades.append(line[20:-46])
    if line[7:9] == "10":
        openPriceOfTrades.append(line[20:-46])
difference = []   # initialization of result list
for i in range(len(openPriceOfTrades)-1):
    print(len(openPriceOfTrades))
    difference.append(float(closePriceOfTrades[i])-float(openPriceOfTrades[i]))
print(difference)

You should learn pandas.DataFrame.

First, I would remove the lines containing <pre> so that only the data remains.

content = """
<pre style="word-wrap: break-word; white-space: pre-wrap;">
Fri 09-24-2021      22.22     22.27     20.38      20.49    47101392 
Thu 09-23-2021      22.52      22.63      21.32      21.48    48145436 
Wed 09-22-2021      24.88      25.37      22.88      23.68    59917888 
Tue 09-21-2021      26.03      28.18      25.20      25.86    73069928 
Mon 09-20-2021      26.26      30.81      25.36      27.31   104578920 
Fri 09-17-2021      21.56      23.58      21.33      23.48    61526336 
Thu 09-16-2021      21.91      22.66      21.04      21.38    42485960 
Wed 12-07-2016    9150.00    9390.00    8780.00    9270.00       37485 
Tue 12-06-2016    9530.00    9660.00    9130.00    9210.00       27220
</pre>"""
# remove lines with `<>`
content = '\n'.join(line for line in content.split('\n') if not line.startswith('<')).strip()
print(content)

The text then looks like a CSV file with whitespace as the separator, so you can use io to simulate a file in memory and read it

import pandas as pd
import io
df = pd.read_csv(io.StringIO(content), sep=r'\s+', names=['day', 'date', 'A', 'B', 'C', 'D', 'volumen'])

Then you can create a date-day column

df['date-day'] = df['date'].str[3:5]
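As an alternative to slicing the string, the date column can be parsed into real datetimes, which gives the day of the month as an integer. This is a minimal sketch, assuming the MM-DD-YYYY format shown in the data; the two-row sample is cut down from the rows above, and the column name `day_of_month` is made up for illustration.

```python
import io
import pandas as pd

# Two rows taken from the data above, without the <pre> lines
sample = """Mon 09-20-2021 26.26 30.81 25.36 27.31 104578920
Wed 12-07-2016 9150.00 9390.00 8780.00 9270.00 37485"""

df = pd.read_csv(io.StringIO(sample), sep=r'\s+',
                 names=['day', 'date', 'A', 'B', 'C', 'D', 'volumen'])

# Parse MM-DD-YYYY strings into datetimes (format assumed from the data)
df['date'] = pd.to_datetime(df['date'], format='%m-%d-%Y')

# Day of the month as an integer (hypothetical column name)
df['day_of_month'] = df['date'].dt.day
print(df[['date', 'day_of_month']])
```

With an integer column, comparisons such as `df['day_of_month'] <= 20` become possible, which string slices do not support reliably.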

Then select all rows where the date-day column is 20 and compute the average (mean)

day_20 = df[ df['date-day'] == '20' ]
# numeric_only=True skips the text columns (needed in pandas 2.0+)
print(day_20.mean(numeric_only=True))
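An exact match on the 20th finds nothing in months where that date falls on a weekend or holiday, which is the problem described in the question. One possible workaround, sketched here under the assumption that "the last trading day on or before the 20th" is an acceptable substitute, is to pick that row in each month. The November prices, the helper name `last_on_or_before`, and the fallback rule itself are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Small sample: the September rows come from the data above, the
# November rows are made up. 2021-11-20 was a Saturday, so it is absent.
df = pd.DataFrame({
    'date': pd.to_datetime(['2021-09-17', '2021-09-20', '2021-09-21',
                            '2021-11-18', '2021-11-19', '2021-11-22']),
    'D': [23.48, 27.31, 25.86, 15.10, 15.40, 16.00],
})

def last_on_or_before(frame, target_day):
    """Per calendar month, return the latest row whose day is <= target_day."""
    eligible = frame[frame['date'].dt.day <= target_day].copy()
    eligible['month'] = eligible['date'].dt.to_period('M')
    # After sorting by date, the last row in each month is the latest eligible day
    picked = eligible.sort_values('date').groupby('month').tail(1)
    return picked.reset_index(drop=True)

day_20 = last_on_or_before(df, 20)
print(day_20)
print('mean "D":', day_20['D'].mean())
```

Here September contributes the 20th itself, while November falls back to the 19th.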

Or you can use groupby to work with all days at once.

for value, group in df.groupby('date-day'):
    print('--- date-day:', value, '---')
    #print(group.mean())
    print('mean "A":', group['A'].mean())
    print('mean "B":', group['B'].mean())
    print('mean "C":', group['C'].mean())
    print('mean "D":', group['D'].mean())
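The same per-day averages can also be computed in a single call by selecting the numeric columns on the grouped object, which returns one table instead of printed lines. This is a short sketch using the column names from the answer; the 08-20-2021 row is made up so that one group has more than one value.

```python
import pandas as pd

# Two rows from the data above plus one made-up row (08-20-2021)
df = pd.DataFrame({
    'date': ['09-20-2021', '08-20-2021', '09-21-2021'],
    'A': [26.26, 24.00, 26.03],
    'D': [27.31, 25.00, 25.86],
})
df['date-day'] = df['date'].str[3:5]

# One row per day of the month, one column per averaged price
means = df.groupby('date-day')[['A', 'D']].mean()
print(means)
```

The result is itself a DataFrame, so it can be sorted or filtered further, for example with `means.sort_values('D')`.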

Full working code:

content = """
<pre style="word-wrap: break-word; white-space: pre-wrap;">
Fri 09-24-2021      22.22     22.27     20.38      20.49    47101392 
Thu 09-23-2021      22.52      22.63      21.32      21.48    48145436 
Wed 09-22-2021      24.88      25.37      22.88      23.68    59917888 
Tue 09-21-2021      26.03      28.18      25.20      25.86    73069928 
Mon 09-20-2021      26.26      30.81      25.36      27.31   104578920 
Fri 09-17-2021      21.56      23.58      21.33      23.48    61526336 
Thu 09-16-2021      21.91      22.66      21.04      21.38    42485960 
Wed 12-07-2016    9150.00    9390.00    8780.00    9270.00       37485 
Tue 12-06-2016    9530.00    9660.00    9130.00    9210.00       27220
</pre>"""
# remove lines with `<>`
content = '\n'.join(line for line in content.split('\n') if not line.startswith('<')).strip()
import pandas as pd
import io
df = pd.read_csv(io.StringIO(content), sep=r'\s+', names=['day', 'date', 'A', 'B', 'C', 'D', 'volumen'])
df['date-day'] = df['date'].str[3:5]
print(df)
day_20 = df[ df['date-day'] == '20' ]
print(day_20)
for value, group in df.groupby('date-day'):
    print('--- date-day:', value, '---')
    #print(group.mean())
    print('mean "A":', group['A'].mean())
    print('mean "B":', group['B'].mean())
    print('mean "C":', group['C'].mean())
    print('mean "D":', group['D'].mean())

Result:

   day        date        A        B        C        D    volumen date-day
0  Fri  09-24-2021    22.22    22.27    20.38    20.49   47101392       24
1  Thu  09-23-2021    22.52    22.63    21.32    21.48   48145436       23
2  Wed  09-22-2021    24.88    25.37    22.88    23.68   59917888       22
3  Tue  09-21-2021    26.03    28.18    25.20    25.86   73069928       21
4  Mon  09-20-2021    26.26    30.81    25.36    27.31  104578920       20
5  Fri  09-17-2021    21.56    23.58    21.33    23.48   61526336       17
6  Thu  09-16-2021    21.91    22.66    21.04    21.38   42485960       16
7  Wed  12-07-2016  9150.00  9390.00  8780.00  9270.00      37485       07
8  Tue  12-06-2016  9530.00  9660.00  9130.00  9210.00      27220       06
   day        date      A      B      C      D    volumen date-day
4  Mon  09-20-2021  26.26  30.81  25.36  27.31  104578920       20
--- date-day: 06 ---
mean "A": 9530.0
mean "B": 9660.0
mean "C": 9130.0
mean "D": 9210.0
--- date-day: 07 ---
mean "A": 9150.0
mean "B": 9390.0
mean "C": 8780.0
mean "D": 9270.0
--- date-day: 16 ---
mean "A": 21.91
mean "B": 22.66
mean "C": 21.04
mean "D": 21.38
--- date-day: 17 ---
mean "A": 21.56
mean "B": 23.58
mean "C": 21.33
mean "D": 23.48
--- date-day: 20 ---
mean "A": 26.26
mean "B": 30.81
mean "C": 25.36
mean "D": 27.31
--- date-day: 21 ---
mean "A": 26.03
mean "B": 28.18
mean "C": 25.2
mean "D": 25.86
--- date-day: 22 ---
mean "A": 24.88
mean "B": 25.37
mean "C": 22.88
mean "D": 23.68
--- date-day: 23 ---
mean "A": 22.52
mean "B": 22.63
mean "C": 21.32
mean "D": 21.48
--- date-day: 24 ---
mean "A": 22.22
mean "B": 22.27
mean "C": 20.38
mean "D": 20.49

If you use yfinance, you get the data directly as a pandas.DataFrame.
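A DataFrame from yfinance is typically indexed by date with columns such as `Low` and `High`, so the question's goal (which day of the month tends to be the monthly low or high) can be sketched with groupby. The code below uses a small made-up frame in that shape so that it runs without network access. The commented `yf.download` call and the flat (non-MultiIndex) column names are assumptions to check against the installed yfinance version, and the helper name `extreme_days` is hypothetical.

```python
import pandas as pd

# With yfinance the frame might come from something like:
#   import yfinance as yf
#   prices = yf.download("UVXY", start="2016-12-01", end="2021-09-25")
# (assumption: check the column layout for your yfinance version)
# Made-up prices on real weekdays, shaped like a daily price table:
prices = pd.DataFrame(
    {'Low': [21.0, 20.5, 22.0, 18.0, 18.5, 19.5],
     'High': [23.0, 22.0, 25.0, 20.0, 21.0, 22.5]},
    index=pd.to_datetime(['2021-08-02', '2021-08-10', '2021-08-20',
                          '2021-09-01', '2021-09-10', '2021-09-20']),
)

def extreme_days(frame):
    """For each calendar month, find the day of the month on which
    the lowest Low and the highest High occurred."""
    month = frame.index.to_period('M')
    low_dates = frame.groupby(month)['Low'].idxmin()     # date of the monthly low
    high_dates = frame.groupby(month)['High'].idxmax()   # date of the monthly high
    return pd.DataFrame({
        'low_day': pd.to_datetime(low_dates).dt.day,
        'high_day': pd.to_datetime(high_dates).dt.day,
    })

days = extreme_days(prices)
print(days)
print('average low day:', days['low_day'].mean())
print('average high day:', days['high_day'].mean())
```

Because this works on whichever trading days exist in each month, it does not depend on the 10th or the 20th being a trading day.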
