I have a JSON file, a.json, with the following structure:
[
{ "name":"\n John\n ", "age": "30 \n ","car":" Bmw \n \n" },
{ "name":"\n Joe\n ", "age": "20 \n ","car":" mercedes \n \n" },
{ "name":"\n Alex\n ", "age": "18 \n ","car":" tesla \n \n" }
]
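I want to strip all whitespace and newline characters from every value. Here is my code:
import pandas as pd
df = pd.read_json('a.json')
df = df.replace(r'\n', '', regex=True)
This removes the newlines but not the whitespace, even though I also wrote: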
df.columns=df.columns.str.replace(' ','')
df.columns=df.columns.str.strip()
df.columns=df.columns.str.lstrip()
My output:
name age car
0 John 30 Bmw
1 Joe 20 mercedes
2 Alex 18 tesla
What should I do?
@chitown88's answer is probably faster, but if you want to use a regex, you can do this:
df.replace(r'(^\s+|\s+$)', '', regex=True, inplace=True)
Output:
name age car
0 John 30 Bmw
1 Joe 20 mercedes
2 Alex 18 tesla
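For reference, here is a self-contained sketch of the regex approach. It uses an inline copy of the sample data instead of reading a.json, and the names `data`, `trimmed` and `squeezed` are assumptions made for illustration. It also shows a broader pattern, `\s+`, which removes every whitespace character, including any inside a value, and that may or may not be what you want.

```python
import pandas as pd

# Inline stand-in for a.json (assumption: same shape as the question's file).
data = [
    {"name": "\n John\n ", "age": "30 \n ", "car": " Bmw \n \n"},
    {"name": "\n Joe\n ", "age": "20 \n ", "car": " mercedes \n \n"},
    {"name": "\n Alex\n ", "age": "18 \n ", "car": " tesla \n \n"},
]
df = pd.DataFrame(data)

# Trim leading/trailing whitespace (spaces, tabs, newlines) from every string cell.
trimmed = df.replace(r"^\s+|\s+$", "", regex=True)

# Remove *all* whitespace, including whitespace inside a value.
squeezed = df.replace(r"\s+", "", regex=True)

print(trimmed)
```

Without `inplace=True`, `replace` returns a new frame, so the original `df` stays available for comparison.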
You can use the pandas applymap function to iterate over all values:
import pandas as pd
df = pd.read_json('a.json')
df = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)
print(df)
Output:
name age car
0 John 30 Bmw
1 Joe 20 mercedes
2 Alex 18 tesla
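Note that `DataFrame.applymap` was deprecated in pandas 2.1 in favor of `DataFrame.map`. Below is a minimal sketch that should work on either side of that change. It assumes a small inline frame rather than a.json, and the `strip_if_str` helper name is hypothetical.

```python
import pandas as pd

# Small inline frame with one string column and one numeric column.
df = pd.DataFrame({"name": ["\n John\n ", "\n Joe\n "], "age": [30, 20]})


def strip_if_str(value):
    """Strip surrounding whitespace from strings; pass other values through."""
    return value.strip() if isinstance(value, str) else value


# DataFrame.map exists from pandas 2.1 onward; fall back to applymap before that.
elementwise = df.map if hasattr(df, "map") else df.applymap
df = elementwise(strip_if_str)
print(df)
```

The `isinstance` check is what keeps numeric cells such as `age` from raising on `.strip()`.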
Another very similar but more compact way is:
import pandas as pd
df = pd.read_json("a.json")
df_obj = df.select_dtypes(['object'])
df[df_obj.columns] = df_obj.apply(lambda x: x.str.strip())
print(df)
Output:
name age car
0 John 30 Bmw
1 Joe 20 mercedes
2 Alex 18 tesla
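Restricting the operation to string columns matters because the `.str` accessor raises on non-string columns. The sketch below uses a mixed frame, where the `score` column is an assumption added for illustration. It selects columns with `pandas.api.types.is_string_dtype` instead of `select_dtypes(['object'])`. This is a swapped-in selection method, chosen here so the example does not depend on how a particular pandas version stores strings.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

df = pd.DataFrame(
    {
        "name": ["\n John\n ", "\n Joe\n "],
        "car": [" Bmw \n \n", " tesla \n \n"],
        "score": [1.5, 2.0],  # numeric column that must be left alone
    }
)

# Keep only the columns that hold strings; numeric columns are skipped.
string_cols = [col for col in df.columns if is_string_dtype(df[col])]

# Strip each selected column with the vectorized .str accessor.
df[string_cols] = df[string_cols].apply(lambda s: s.str.strip())
print(df)
```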
One option is to clean the JSON itself with a list and dict comprehension:
import pandas as pd
data = [
{ "name":"\n John\n ", "age": "30 \n ","car":" Bmw \n \n" },
{ "name":"\n Joe\n ", "age": "20 \n ","car":" mercedes \n \n" },
{ "name":"\n Alex\n ", "age": "18 \n ","car":" tesla \n \n" }
]
data = [{k:v.strip() for k,v in each.items()} for each in data]
df = pd.DataFrame(data)
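One caveat: the comprehension above calls `v.strip()` on every value, so it raises `AttributeError` as soon as a value is not a string (a number or `null`, for example). Here is a hedged sketch that cleans the parsed JSON with the standard library only and leaves non-string values untouched. The inline `raw` string stands in for the contents of a.json, and `clean_record` is a hypothetical helper.

```python
import json

# Raw string so that \n reaches the JSON parser as an escape sequence.
raw = r'''[
  {"name": "\n John\n ", "age": "30 \n ", "car": " Bmw \n \n"},
  {"name": "\n Joe\n ", "age": 20, "car": null}
]'''


def clean_record(record):
    """Return a copy of record with surrounding whitespace stripped from string values."""
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in record.items()
    }


data = [clean_record(rec) for rec in json.loads(raw)]
print(data)
```

The cleaned list can then be passed to `pd.DataFrame(data)` exactly as before.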
Alternatively, you can loop over each column:
import pandas as pd
data = [
{ "name":"\n John\n ", "age": "30 \n ","car":" Bmw \n \n" },
{ "name":"\n Joe\n ", "age": "20 \n ","car":" mercedes \n \n" },
{ "name":"\n Alex\n ", "age": "18 \n ","car":" tesla \n \n" }
]
df = pd.DataFrame(data)
# Strip leading/trailing whitespace from the values of every column
for col in df.columns:
    df[col] = df[col].str.strip()
print(df)
Output:
name age car
0 John 30 Bmw
1 Joe 20 mercedes
2 Alex 18 tesla
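As a side note on why the attempt in the question had no effect: `df.columns.str.strip()` operates on the column labels, not on the cell values. The small sketch below shows the difference, using made-up labels that carry stray spaces (an assumption for illustration, since the question's labels are already clean).

```python
import pandas as pd

df = pd.DataFrame({" name ": ["\n John\n "], " car": [" Bmw \n \n"]})

# This cleans the column labels only; the cell values are unchanged.
df.columns = df.columns.str.strip()
labels_after = list(df.columns)
values_before = df["name"].tolist()

# This cleans the values themselves, one column at a time.
for col in df.columns:
    df[col] = df[col].str.strip()
values_after = df["name"].tolist()

print(labels_after, values_before, values_after)
```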