How to make an API call for each item in a list

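I am trying to get price data for each company in a list of tuples [(company_name, symbol)]. In this example, I am using the TD Ameritrade API.

I have the same problem with Reddit. The only difference is that there I am trying to retrieve all the comments for each post 'id'. With the Reddit code, however, I pull the IDs from a pandas df rather than from a list.

Here is where I am right now:

This is the TD Ameritrade API code: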

async def run_app(symbols):
    # empty list to append dataframes to
    all_dfs = []
    for sym, name in symbols:
        # gets the price history data
        rdata = asyncio.get_price_data(symbol=sym, period='10', periodType='day', frequency='minute', frequencyType='1')
        # prepares data for pandas df
        can = unpack_ph_data(rdata, 'candles', 'symbol')
        # function for creating a pandas df
        df = create_ph_df(can)
        # append to all_dfs
        all_dfs.append(df)
    return rdata

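My idea is to use a for statement that runs each step for every item in the symbols list. At first I tried it without asyncio; then I saw a similar example (though without an API), so I thought I would try it.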
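A minimal sketch of the per-item loop described above. The real API call is replaced by a hypothetical stand-in, fetch_price_data, so the example runs without credentials; that function and the dict it returns are assumptions for illustration, not part of the TD Ameritrade API.

```python
import asyncio


async def fetch_price_data(symbol):
    """Hypothetical stand-in for a real price-history request."""
    await asyncio.sleep(0)  # placeholder for the network round trip
    return {"symbol": symbol, "candles": []}


async def run_app(symbols):
    # symbols is a list of (company_name, symbol) tuples
    tasks = [fetch_price_data(sym) for _name, sym in symbols]
    # gather awaits all calls concurrently and keeps the input order
    return await asyncio.gather(*tasks)


symbols = [("Apple", "AAPL"), ("Microsoft", "MSFT")]
results = asyncio.run(run_app(symbols))
print([r["symbol"] for r in results])
```

In a notebook, where an event loop is usually already running, `await run_app(symbols)` would likely be needed instead of `asyncio.run`.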

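I am trying something similar with the praw package. For that one, though, I extract data from each row of a pandas df, and I run into the same problem.

I have a function that takes a specified subreddit and returns all the data in a pandas df: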

def get_subreddit_data(subreddit="all", limit=25):
    """
    :param subreddit: Which subreddit to get top posts.
    :param limit: number of desired posts (This will be for setting the limit
        after .hot(limit=num_of_posts) and eventually determined from user input.
        Default 25)
    :returns: top posts in subreddit or default (top posts on reddit).
        data is returned in pandas df with the following columns:
        >>> title, score, id, subreddit, url, comments, selftext, created <<<
    """
    # empty list to insert data to:
    posts = []
    # variable for data
    top_posts = reddit.subreddit(subreddit).hot(limit=limit)  # limit and subreddit params
    # FOR loop to append data to posts
    for post in top_posts:
        posts.append([post.title, post.score, post.id, post.subreddit, post.url, post.num_comments, post.selftext, post.created])
    # Create df
    df = pd.DataFrame(posts, columns=["title", "score", "id", "subreddit", "url", "comments", "selftext", "created"])
    return df

df = get_subreddit_data(subreddit, limit) works fine and returns a pandas df.

This is where I run into the problem:

IDs = []
for ID in [df["id"]]:
    IDs.append(ID)  # Add IDs to ID list

def return_comments_for(ID_list):
    # empty list to append comments to
    _comments = []
    """
    :param ID_list: list of post IDs
    :returns: list of comments for each post ID
    in ID_list
    """
    # for loop to extract each ID one by one
    for ID in ID_list:
        # Create submission instance
        submission = reddit.submission(id=ID)
        submission.comments.replace_more(limit=None)
        for comment in submission.comments.list():
            _comments.append(comment.body)

comments = return_comments_for(IDs)

That did not work, so I tried it without creating a function, using a queue instead:

# Empty list for all IDs
queue = []
IDs = [df["id"]]  # get IDs from DF
for i in IDs:
    queue.append(i)  # Add IDs to queue
# list to append comments to
_comments = []
while queue:
    # pop item index 0 and assign to ID
    ID = queue.pop(0)
    # create submission instance for ID
    submission = reddit.submission(id=ID)
    submission.comments.replace_more(limit=None)
    # for each comment in submission instance
    for comment in submission.comments.list():
        _comments.append(comment.body)  # append to main comment list

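This is not the only way I have tried to use a queue or stack. I tried many different approaches that I can no longer remember. Either way, none of them worked, so I am missing something.

This is the full error message I get every time, no matter what I try.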

ValueError                                Traceback (most recent call last)
<ipython-input-9-2cddf98f54ba> in <module>
20 
21 
---> 22 comments = return_comments_for(IDs)
23 print(comments)
<ipython-input-9-2cddf98f54ba> in return_comments_for(ID_list)
14     for ID in ID_list:
15         # Create submission instance
---> 16         submission = reddit.submission(id=ID)
17         submission.comments.replace_more(limit=None)
18         for comment in submission.comments.list():
C:\ProgramData\Anaconda3\lib\site-packages\praw\reddit.py in submission(self, id, url)
847 
848         """
--> 849         return models.Submission(self, id=id, url=url)
C:\ProgramData\Anaconda3\lib\site-packages\praw\models\reddit\submission.py in __init__(self, reddit, id, url, _data)
532 
533         """
--> 534         if (id, url, _data).count(None) != 2:
535             raise TypeError("Exactly one of `id`, `url`, or `_data` must be provided.")
536         self.comment_limit = 2048
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1476 
1477     def __nonzero__(self):
-> 1478         raise ValueError(
1479             f"The truth value of a {type(self).__name__} is ambiguous. "
1480             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

After

IDs = []
for ID in [df["id"]]:
    IDs.append(ID)  # Add IDs to ID list

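IDs is not a list of individual post IDs but a list of pandas objects: it contains exactly one Series, namely df["id"]. The loop therefore runs once, reddit.submission(id=ID) receives the whole Series, and praw's check (id, url, _data).count(None) compares that Series to None, which raises the ValueError shown in the traceback.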
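The effect can be reproduced without pandas or praw. The sketch below uses a hypothetical FakeSeries class that imitates only two Series behaviours (element-wise == and a truth value that raises); the class and the post IDs are made up for illustration. It shows that wrapping the column in a list yields a single element, and that a tuple's count(None) fails on such an object in the same way as the check in the traceback.

```python
class FakeSeries:
    """Hypothetical stand-in that mimics two pandas Series behaviours."""

    def __init__(self, values):
        self.values = list(values)

    def __iter__(self):
        return iter(self.values)

    def __eq__(self, other):
        # element-wise comparison that returns another FakeSeries
        return FakeSeries(v == other for v in self.values)

    def __bool__(self):
        raise ValueError("The truth value of a FakeSeries is ambiguous.")


column = FakeSeries(["abc123", "def456", "ghi789"])

wrapped = [item for item in [column]]  # loops once, over the whole column
unwrapped = [item for item in column]  # loops over each individual ID

try:
    # same shape as the check in praw's Submission.__init__
    (wrapped[0], None, None).count(None)
    error = None
except ValueError as exc:
    error = str(exc)

print(len(wrapped), unwrapped, error)
```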

# This does what you were trying to do:
IDs = []
for ID in df["id"]:
    IDs.append(ID)

# Which can be shortened to
IDs = list(df["id"])
# But I think just passing the Series to your function should work fine:
comments = return_comments_for(df["id"])

The other real bug is that comments will still be None after this fix: return_comments_for has no return statement, so it implicitly returns None.
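A sketch of the function with the missing return added. To keep it runnable offline, the praw client is replaced by a hypothetical FakeReddit stub that exposes only the calls used in the question (submission, comments.replace_more, comments.list); the stub, the extra reddit parameter, and the sample data are assumptions for illustration.

```python
from types import SimpleNamespace


class FakeReddit:
    """Hypothetical stand-in for a praw.Reddit instance (no network)."""

    def __init__(self, comments_by_id):
        self._comments_by_id = comments_by_id

    def submission(self, id):
        bodies = self._comments_by_id.get(id, [])
        comments = [SimpleNamespace(body=b) for b in bodies]
        forest = SimpleNamespace(
            replace_more=lambda limit=None: [],  # nothing to expand in the stub
            list=lambda: comments,
        )
        return SimpleNamespace(comments=forest)


def return_comments_for(reddit, id_list):
    """Collect the body of every comment for each post ID in id_list."""
    comments = []
    for post_id in id_list:
        submission = reddit.submission(id=post_id)
        submission.comments.replace_more(limit=None)
        for comment in submission.comments.list():
            comments.append(comment.body)
    return comments  # without this line the function returns None


reddit = FakeReddit({"abc123": ["first", "second"], "def456": ["third"]})
comments = return_comments_for(reddit, ["abc123", "def456"])
print(comments)
```

With a real praw.Reddit instance, the same function body should apply unchanged; only the stub would be dropped.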

The way you are trying to extract the list of IDs is wrong. Simply do the following:

IDs = df["id"].values.tolist()
