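How to avoid coding an if statement twice

I mostly program on my own, so nobody reviews my code. I suspect I have picked up a bunch of bad habits.

The code I am pasting here works, but I would like to hear about other solutions.

I created a dictionary called teams_shots. I iterate over a pandas DataFrame in which each row holds the names of the away team and the home team. I want to keep track of the shots of every team that appears in the DataFrame. That is why I check whether home_team_name or away_team_name has no entry in the dictionary yet; if it does not, I create one.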

for index, match in df.iterrows():
    if match['home_team_name'] not in teams_shots:
        # we have to set up an entry in the dictionary
        teams_shots[match['home_team_name']] = []
        teams_shots[match['home_team_name']].append(match['home_team_shots'])
        home_shots_avg.append(None)
    else:
        home_shots_avg.append(np.mean(teams_shots[match['home_team_name']]))
        teams_shots[match['home_team_name']].append(match['home_team_shots'])
    if match['away_team_name'] not in teams_shots:
        teams_shots[match['away_team_name']] = []
        teams_shots[match['away_team_name']].append(match['away_team_shots'])
        away_shots_avg.append(None)
    else:
        away_shots_avg.append(np.mean(teams_shots[match['away_team_name']]))
        teams_shots[match['away_team_name']].append(match['away_team_shots'])

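As you can see, almost the same code is written twice, which is not a sign of good programming. I thought about using an or operator in the if statement, but one of the two entries might already exist, and I would overwrite it.

In this case, I think an extra for loop should do the job: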

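# Pair each name column with its shots column and the list that collects its averages.
sides = [('home_team_name', 'home_team_shots', home_shots_avg),
         ('away_team_name', 'away_team_shots', away_shots_avg)]
for index, match in df.iterrows():
    for name, shots, shots_avg in sides:
        team = match[name]
        if team not in teams_shots:
            # we have to set up an entry in the dictionary
            teams_shots[team] = []
            shots_avg.append(None)
        else:
            shots_avg.append(np.mean(teams_shots[team]))
        # record the current shots only after the average has been taken
        teams_shots[team].append(match[shots])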

But there is probably a way to handle this in a vectorized fashion.
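One possible vectorized version is sketched below. It is only a sketch under a few assumptions: the column names are the ones from the question, the function name prior_shot_averages and the helper columns row and side are made up for illustration, and a team's first appearance produces NaN instead of None. The idea is to stack the home and away sides into one long table, then use a grouped cumulative sum and count to get the mean of each team's earlier shots.

```python
import numpy as np
import pandas as pd


def prior_shot_averages(df):
    """Return (home_avg, away_avg): the mean of each team's shots in earlier rows."""
    n = len(df)
    # Long format: one line per (match row, side). 'row' and 'side' are helper columns.
    long = pd.DataFrame({
        'row': np.tile(np.arange(n), 2),
        'side': np.repeat(['home', 'away'], n),
        'team': np.concatenate([df['home_team_name'].to_numpy(),
                                df['away_team_name'].to_numpy()]),
        'shots': np.concatenate([df['home_team_shots'].to_numpy(),
                                 df['away_team_shots'].to_numpy()]),
    })
    # Restore match order; the stable sort keeps home before away within a row.
    long = long.sort_values('row', kind='stable')

    grouped = long.groupby('team')['shots']
    prior_sum = grouped.cumsum() - long['shots']  # shots before this appearance
    prior_count = grouped.cumcount()              # appearances before this one
    # Mask a zero count so that a first appearance yields NaN.
    long['avg'] = prior_sum / prior_count.where(prior_count > 0)

    wide = long.pivot(index='row', columns='side', values='avg')
    return wide['home'].to_numpy(), wide['away'].to_numpy()


# Small made-up example, not data from the question.
example = pd.DataFrame({
    'home_team_name': ['A', 'B', 'A'],
    'away_team_name': ['B', 'A', 'C'],
    'home_team_shots': [10, 6, 12],
    'away_team_shots': [4, 8, 2],
})
home_avg, away_avg = prior_shot_averages(example)
print(home_avg, away_avg)
```

Whether this is worth it over the loop depends on the size of the DataFrame; for a few hundred rows the iterrows version is probably easier to read.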

I would use get as a quick lookup. It does not raise a KeyError, and its default return value of None is treated as False in a truth test.

for index, match in df.iterrows():
    home, away, home_shots, away_shots = (match['home_team_name'],
                                          match['away_team_name'],
                                          match['home_team_shots'],
                                          match['away_team_shots'])

    if not teams_shots.get(home):
        # No need to separately allocate the list; start it with the first value
        teams_shots[home] = [home_shots]
        home_shots_avg.append(None)
    else:
        home_shots_avg.append(np.mean(teams_shots[home]))
        teams_shots[home].append(home_shots)
    if not teams_shots.get(away):
        teams_shots[away] = [away_shots]
        away_shots_avg.append(None)
    else:
        away_shots_avg.append(np.mean(teams_shots[away]))
        teams_shots[away].append(away_shots)
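To get rid of the remaining duplication between the home and away branches, the get-based check could be moved into a small helper. This is a minimal sketch, not the only way to do it: record_and_average is a hypothetical name, and the matches list is made-up sample data standing in for the DataFrame rows.

```python
import numpy as np


def record_and_average(teams_shots, team, shots):
    """Return the mean of the team's earlier shots (None on first sight), then store shots."""
    previous = teams_shots.get(team)  # None when the team has no entry yet
    if not previous:
        # First appearance: create the list with the current value in it.
        teams_shots[team] = [shots]
        return None
    average = float(np.mean(previous))
    previous.append(shots)
    return average


teams_shots = {}
home_shots_avg, away_shots_avg = [], []

# (home, away, home_shots, away_shots); made-up rows for illustration.
matches = [('A', 'B', 10, 4), ('B', 'A', 6, 8), ('A', 'C', 12, 2)]
for home, away, home_shots, away_shots in matches:
    home_shots_avg.append(record_and_average(teams_shots, home, home_shots))
    away_shots_avg.append(record_and_average(teams_shots, away, away_shots))
```

With a DataFrame, the loop body stays the same and only the source of home, away, home_shots and away_shots changes.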
