Pandas vs PEP style guide



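Pandas vectorized methods let you do a lot in a single line, which produces lines that are longer than usual. How do you reconcile the PEP guidelines with long Pandas lines?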

PEP 8 recommends a maximum line length of 79 characters for code (72 for comments and docstrings).

Pandas lines can look like this:

df['VALUE_EXPRESSED'] = np.where((df['TEST_HOSPITAL_CONCEPT_NAME_CLEAN']=='EO AUTOMATED ABS') & (df['UNIT_AS_EXPECTED']=='cells/mcl'),df['VALUE_EXPRESSED']*1000,df['VALUE_EXPRESSED'] )

query = df.groupby(['TEST_HOSPITAL_CONCEPT_NAME_CLEAN', 'UNIT_AS_EXPECTED_TRANSFORMED', 'NUMERATOR','DENOMINATOR']).size().reset_index(name='COUNT')

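I can't change the column names, and I think using variables to shorten the names would make the code less explicit and harder to read.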

What you are referring to is method chaining.

There are a couple of ways to break things up:

  • Wrap the whole expression in parentheses (as below)
  • Use \ for line continuation without parentheses

Example:

query = (df
    .groupby(
        [
            'TEST_HOSPITAL_CONCEPT_NAME_CLEAN',
            'UNIT_AS_EXPECTED_TRANSFORMED',
            'NUMERATOR',
            'DENOMINATOR'
        ]
    )
    .size()
    .reset_index(name='COUNT')
)
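
For comparison, the backslash-continuation option could look like the sketch below. The DataFrame contents are made-up sample data; only the column names come from the question. Note that PEP 8 prefers the parenthesized form, and a backslash breaks if any whitespace follows it on the line.

```python
import pandas as pd

# Hypothetical sample data, used only to make the example runnable.
df = pd.DataFrame({
    'TEST_HOSPITAL_CONCEPT_NAME_CLEAN': [
        'EO AUTOMATED ABS', 'EO AUTOMATED ABS', 'HGB',
    ],
    'UNIT_AS_EXPECTED_TRANSFORMED': ['cells/mcl', 'cells/mcl', 'g/dl'],
    'NUMERATOR': ['cells', 'cells', 'g'],
    'DENOMINATOR': ['mcl', 'mcl', 'dl'],
})

# Same chain as the parenthesized version, continued with backslashes.
# No backslash is needed inside the [...] list: brackets already allow
# line breaks.
query = df \
    .groupby([
        'TEST_HOSPITAL_CONCEPT_NAME_CLEAN',
        'UNIT_AS_EXPECTED_TRANSFORMED',
        'NUMERATOR',
        'DENOMINATOR',
    ]) \
    .size() \
    .reset_index(name='COUNT')

print(query)
```

Both forms build the same DataFrame; the choice is purely about layout.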

Also consider putting very long sub-expressions into intermediate variables. For example, you could rewrite your line:

df['VALUE_EXPRESSED'] = np.where((df['TEST_HOSPITAL_CONCEPT_NAME_CLEAN']=='EO AUTOMATED ABS') & (df['UNIT_AS_EXPECTED']=='cells/mcl'),df['VALUE_EXPRESSED']*1000,df['VALUE_EXPRESSED'] )

as:

cond = (
    (df['TEST_HOSPITAL_CONCEPT_NAME_CLEAN'] == 'EO AUTOMATED ABS') &
    (df['UNIT_AS_EXPECTED'] == 'cells/mcl')
)
df['VALUE_EXPRESSED'] = np.where(
    cond,
    df['VALUE_EXPRESSED'] * 1000,
    df['VALUE_EXPRESSED'],
)
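
As a quick check, the split version can be run on a small DataFrame. The values below are invented for illustration, and the `.loc` update at the end is an alternative spelling that was not part of the question, shown only as one possible shorter form.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'TEST_HOSPITAL_CONCEPT_NAME_CLEAN': [
        'EO AUTOMATED ABS', 'EO AUTOMATED ABS', 'HGB',
    ],
    'UNIT_AS_EXPECTED': ['cells/mcl', 'x10e3/mcl', 'cells/mcl'],
    'VALUE_EXPRESSED': [0.5, 0.3, 12.0],
})

# Boolean mask: True only where both the test name and the unit match.
cond = (
    (df['TEST_HOSPITAL_CONCEPT_NAME_CLEAN'] == 'EO AUTOMATED ABS') &
    (df['UNIT_AS_EXPECTED'] == 'cells/mcl')
)

# Scale matching rows by 1000, leave the others unchanged.
result = np.where(
    cond,
    df['VALUE_EXPRESSED'] * 1000,
    df['VALUE_EXPRESSED'],
)
print(result.tolist())

# Alternative: update only the matching rows in place (on a copy here).
alt = df.copy()
alt.loc[cond, 'VALUE_EXPRESSED'] *= 1000
print(alt['VALUE_EXPRESSED'].tolist())
```

Only the first row satisfies both conditions, so it is the only value that gets scaled.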
