I want to:
- Remove the words "ANOS" and "ANO";
- Replace "A" with "TO"; and
- Replace "<1ANO" with "0 to 1".
Example: "10 A 19 ANOS" becomes "10 TO 19".
import pandas as pd

data = pd.DataFrame({
    'FAIXA_ETARIA': ['10 A 19 ANOS', ' 20 A 29 ANOS', '30 A 39 ANOS', '40 A 49 ANOS',
                     '50 A 59 ANOS', ' 60 A 69 ANOS', '70 A 79 ANOS', '80 A 89 ANOS',
                     '<1ANO'],
    'Count': [3, 8, 28, 7, 15, 9, 3, 5, 3],
})
PS: My dataset has many columns, and I want this to be applied only to the column "FAIXA_ETARIA".
Thanks for your help!
Here is one possible approach:
(
    data["FAIXA_ETARIA"]
    .str.replace(r"ANO\w?", "", regex=True)      # "ANO" plus an optional single character ("ANOS")
    .str.replace("A", "TO", regex=False)         # the only "A" left is the range separator
    .str.replace(r"<\w+", "0 to 1", regex=True)  # "<" plus the characters after it, e.g. "<1"
    .str.strip()                                 # drop leftover leading/trailing spaces
)
Output:
0    10 TO 19
1    20 TO 29
2    30 TO 39
3    40 TO 49
4    50 TO 59
5    60 TO 69
6    70 TO 79
7    80 TO 89
8      0 to 1
Name: FAIXA_ETARIA, dtype: object
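Since the question asks for the change to touch only the FAIXA_ETARIA column, here is a self-contained sketch of that idea. It is a variation on the chain above, not the answerer's exact code: the helper name `clean_age_range` is made up for illustration, the patterns are anchored a little more tightly, and `regex=True` is passed explicitly because pandas 2.0 and later treat the pattern as a literal string by default.

```python
import pandas as pd

# Small sample frame; "Count" stands in for the other columns.
data = pd.DataFrame({
    "FAIXA_ETARIA": ["10 A 19 ANOS", " 20 A 29 ANOS", "<1ANO"],
    "Count": [3, 8, 3],
})

def clean_age_range(col: pd.Series) -> pd.Series:
    """Hypothetical helper, not part of pandas."""
    return (
        col.str.strip()                                          # remove stray spaces
        .str.replace(r"^<\s*1\s*ANOS?$", "0 to 1", regex=True)   # special case first
        .str.replace(r"\s*ANOS?$", "", regex=True)               # trailing ANO / ANOS
        .str.replace(r"\s+A\s+", " TO ", regex=True)             # standalone A only
    )

# Only this one column is reassigned; every other column is left as it was.
data["FAIXA_ETARIA"] = clean_age_range(data["FAIXA_ETARIA"])
print(data)
```

Handling the "<1ANO" case before stripping "ANO" keeps the three patterns from interfering with each other.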
You can iterate through the entries in the array and then use Python's replace()
method. Example:
message = "Hello there"
custom = message.replace("there", "kvratto")
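The result will be "Hello kvratto".
In your case, you have a dictionary-like object (a DataFrame), so you can use dictionaryname['columnname']
to get a specific column. You can store the result in a new variable and then work with it like an array.
I hope this helps!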
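As a rough sketch of that loop-and-replace idea, the snippet below cleans a plain list of strings with Python's built-in str.replace(). The function name `clean_entry` and the sample list are assumptions made for illustration. Note that the order matters: "ANOS" and "ANO" have to be removed before the separator "A" is replaced.

```python
def clean_entry(entry: str) -> str:
    """Illustrative helper: clean one age-range label with str.replace()."""
    text = entry.strip()
    if text == "<1ANO":
        return "0 to 1"                        # special case from the question
    text = text.replace("ANOS", "").replace("ANO", "")
    return text.replace(" A ", " TO ").strip() # only the standalone "A"

raw = ["10 A 19 ANOS", " 20 A 29 ANOS", "<1ANO"]

cleaned = []
for entry in raw:                              # iterate through the entries
    cleaned.append(clean_entry(entry))

print(cleaned)  # ['10 TO 19', '20 TO 29', '0 to 1']
```

With a DataFrame, the same function could be applied to a single column with something like data['FAIXA_ETARIA'].map(clean_entry).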
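Alternatively, extract only the numbers and join them with "to":
data['FAIXA_ETARIA'] = data['FAIXA_ETARIA'].str.findall(r'\d+').str.join(' to ')
# '<1ANO' contains a single number, so it comes out as '1'; patch that row
cond = data['FAIXA_ETARIA'] == '1'
data.loc[cond, 'FAIXA_ETARIA'] = '0 to 1'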
Output:
0    10 to 19
1    20 to 29
2    30 to 39
3    40 to 49
4    50 to 59
5    60 to 69
6    70 to 79
7    80 to 89
8      0 to 1
Name: FAIXA_ETARIA, dtype: object
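To see why the extra cond step is needed, it may help to look at what the digit extraction does to a single string. This sketch uses the standard-library re module instead of pandas, purely for illustration, and the helper name `digits_joined` is made up.

```python
import re

def digits_joined(text: str) -> str:
    # Illustrative helper: collect every run of digits, then join with " to ".
    return " to ".join(re.findall(r"\d+", text))

print(digits_joined("10 A 19 ANOS"))  # 10 to 19
print(digits_joined("<1ANO"))         # 1
```

A label with two numbers becomes a range directly, while "<1ANO" has only one number and so ends up as "1", which is the value the cond mask then rewrites to "0 to 1".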