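I'm new to numpy and pandas, and I'm trying to write this code to create a pandas Series. For each index in the Series, I want to randomly pick a random number of interests (1 to 3 in this case) from the list below, without duplicates. If possible, I'd also like to find ways to improve the code.
Thanks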
import numpy as np
import pandas as pd

def random_interests(num):
    interests = [1, 2, 3, 4, 5, 6]
    stu_interests = []
    for n in range(num):
        stu_interests.append(np.random.choice(interests, np.random.randint(1, 4), replace=False))
    rand_interests = pd.Series(stu_interests)
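Here is one approach. Using apply avoids the explicit for loop, although apply still calls the function once per row, so don't expect a large speedup. You also don't need the separate stu_interests variable, which consumes more memory when num is large.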
def random_interests(num):
    interests = [1, 2, 3, 4, 5, 6]
    rand_interests = pd.DataFrame(np.nan, index=[x for x in range(num)], columns=['random_list'])
    rand_interests['random_list'] = rand_interests['random_list'].apply(lambda x: np.random.choice(interests, np.random.randint(1, 4), replace=False))
    return rand_interests['random_list']
You have to add a return to the function, otherwise you get nothing back from it. With a return added at the bottom, your code becomes:
def random_interests(num):
    interests = [1, 2, 3, 4, 5, 6]
    stu_interests = []
    for n in range(num):
        stu_interests.append(np.random.choice(interests, np.random.randint(1, 4), replace=False))
    rand_interests = pd.Series(stu_interests)
    return rand_interests
Running it with num=5 gives, for example:
random_interests(5)
0    [5, 6, 4]
1          [5]
2    [1, 6, 4]
3    [3, 4, 1]
4       [1, 2]
dtype: object
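Since the output is random, eyeballing five rows doesn't prove much. A quick sanity check, as a minimal sketch: it repeats the function above so the snippet runs on its own, and the names sizes_ok and no_repeats are just illustrative. It confirms that every entry holds 1 to 3 interests and that no entry repeats one.

```python
import numpy as np
import pandas as pd


def random_interests(num):
    interests = [1, 2, 3, 4, 5, 6]
    stu_interests = []
    for n in range(num):
        # randint(1, 4) draws 1, 2, or 3 (the upper bound is exclusive)
        stu_interests.append(
            np.random.choice(interests, np.random.randint(1, 4), replace=False)
        )
    return pd.Series(stu_interests)


result = random_interests(1000)

# Number of interests picked for each row
sizes = result.apply(len)

# Every row should hold between 1 and 3 interests (between() is inclusive)
sizes_ok = bool(sizes.between(1, 3).all())

# replace=False should guarantee no interest appears twice within a row
no_repeats = bool(result.apply(lambda a: len(set(a)) == len(a)).all())

print(sizes_ok, no_repeats)  # True True
```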
Or as a one-liner:
pd.Series([np.random.choice(
[1, 2, 3, 4, 5, 6], np.random.randint(1, 4), replace=False)
for i in range(num)])
Output:
0       [3, 6]
1    [4, 2, 1]
2       [6, 5]
3          [3]
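If you want the same result on every run (handy while debugging), one option is to wrap the one-liner in a function that uses a seeded NumPy Generator instead of the global np.random state. This is only a sketch: the function name random_interests_seeded and the seed parameter are made up for illustration and are not part of the answers above.

```python
import numpy as np
import pandas as pd


def random_interests_seeded(num, seed=None):
    # Hypothetical variant: a local Generator, so a fixed seed reproduces the draw
    rng = np.random.default_rng(seed)
    interests = [1, 2, 3, 4, 5, 6]
    return pd.Series(
        [
            # integers(1, 4) draws 1, 2, or 3 (upper bound exclusive by default)
            rng.choice(interests, size=rng.integers(1, 4), replace=False)
            for _ in range(num)
        ]
    )


a = random_interests_seeded(5, seed=0)
b = random_interests_seeded(5, seed=0)

# Same seed, same sequence of draws, so the two Series match row by row
same = all(np.array_equal(x, y) for x, y in zip(a, b))
print(same)  # True
```

Leaving seed as None keeps the behaviour random from run to run, as in the original code.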