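pandas vectorized methods let you do a lot in a single line, which leads to lines that are longer than usual. How do I reconcile the PEP 8 guidelines with long pandas lines?
PEP 8 recommends limiting lines to 79 characters (72 for comments and docstrings).
A pandas line can look like this: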
df['VALUE_EXPRESSED'] = np.where((df['TEST_HOSPITAL_CONCEPT_NAME_CLEAN']=='EO AUTOMATED ABS') & (df['UNIT_AS_EXPECTED']=='cells/mcl'),df['VALUE_EXPRESSED']*1000,df['VALUE_EXPRESSED'] )
or
query = df.groupby(['TEST_HOSPITAL_CONCEPT_NAME_CLEAN', 'UNIT_AS_EXPECTED_TRANSFORMED', 'NUMERATOR','DENOMINATOR']).size().reset_index(name='COUNT')
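I cannot change the column names, and I think using variables to shorten the names would make the code less explicit and harder to read.
What you are referring to is method chaining.
There are a few ways to break things up:
- Put the whole expression in parentheses (as below)
- Use a backslash (\) for line continuation without parentheses
Example: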
query = (df
    .groupby(
        [
            'TEST_HOSPITAL_CONCEPT_NAME_CLEAN',
            'UNIT_AS_EXPECTED_TRANSFORMED',
            'NUMERATOR',
            'DENOMINATOR'
        ]
    )
    .size()
    .reset_index(name='COUNT')
)
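For comparison, the backslash option from the list above might look like the sketch below. The sample rows are made up for illustration; only the column names come from the question. Note that PEP 8 generally prefers the parenthesized form, since a backslash breaks silently if anything (even a space) follows it on the line.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    'TEST_HOSPITAL_CONCEPT_NAME_CLEAN': [
        'EO AUTOMATED ABS', 'EO AUTOMATED ABS', 'HGB',
    ],
    'UNIT_AS_EXPECTED_TRANSFORMED': ['cells/mcl', 'cells/mcl', 'g/dl'],
    'NUMERATOR': ['cells', 'cells', 'g'],
    'DENOMINATOR': ['mcl', 'mcl', 'dl'],
})

# Backslash continuation: every line break outside brackets needs a
# trailing "\". The list itself can wrap freely because it sits inside [].
query = df \
    .groupby([
        'TEST_HOSPITAL_CONCEPT_NAME_CLEAN',
        'UNIT_AS_EXPECTED_TRANSFORMED',
        'NUMERATOR',
        'DENOMINATOR',
    ]) \
    .size() \
    .reset_index(name='COUNT')

print(query)
```

Both forms build the same DataFrame; the choice is purely about layout.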
Also consider pulling very long subexpressions out into intermediate variables. For example, you could rewrite your line:
df['VALUE_EXPRESSED'] = np.where((df['TEST_HOSPITAL_CONCEPT_NAME_CLEAN']=='EO AUTOMATED ABS') & (df['UNIT_AS_EXPECTED']=='cells/mcl'),df['VALUE_EXPRESSED']*1000,df['VALUE_EXPRESSED'] )
as:
cond = (
    (df['TEST_HOSPITAL_CONCEPT_NAME_CLEAN'] == 'EO AUTOMATED ABS') &
    (df['UNIT_AS_EXPECTED'] == 'cells/mcl')
)
df['VALUE_EXPRESSED'] = np.where(
    cond,
    df['VALUE_EXPRESSED'] * 1000,
    df['VALUE_EXPRESSED'],
)
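To try the refactored version end to end, here is a minimal runnable sketch. The three sample rows are invented for illustration; the column names and the condition are taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    'TEST_HOSPITAL_CONCEPT_NAME_CLEAN': [
        'EO AUTOMATED ABS', 'EO AUTOMATED ABS', 'HGB',
    ],
    'UNIT_AS_EXPECTED': ['cells/mcl', 'k/ul', 'cells/mcl'],
    'VALUE_EXPRESSED': [0.5, 0.25, 13.5],
})

# Boolean mask: True only where both the test name and the unit match.
cond = (
    (df['TEST_HOSPITAL_CONCEPT_NAME_CLEAN'] == 'EO AUTOMATED ABS') &
    (df['UNIT_AS_EXPECTED'] == 'cells/mcl')
)

# Scale matching rows by 1000 and leave every other row unchanged.
df['VALUE_EXPRESSED'] = np.where(
    cond,
    df['VALUE_EXPRESSED'] * 1000,
    df['VALUE_EXPRESSED'],
)

print(df['VALUE_EXPRESSED'].tolist())  # [500.0, 0.25, 13.5]
```

Only the first row satisfies both conditions, so it is the only value that gets scaled. Naming the mask `cond` keeps the full column names visible while leaving each line well under the limit.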