Suppose I have a DataFrame like this:

df = pd.DataFrame({0:['Hello World!']}) # here df could have more than one column of data as shown below
df = pd.DataFrame({0:['Hello World!'], 1:['Hello Mars!']}) # or df could have more than one row of data as shown below
df = pd.DataFrame({0:['Hello World!', 'Hello Mars!']})

I also have a list of column names like this:

new_col_names = ['a','b','c','d'] # here, len(new_col_names) might vary like below
new_col_names = ['a','b','c','d','e'] # but we can always be sure that the len(new_col_names) >= len(df.columns)

Given that, how can I replace the column names in df so that the result looks like this:

df = pd.DataFrame({0:['Hello World!']})
new_col_names = ['a','b','c','d']
# result would be like this
a               b               c               d
Hello World!    (empty string)  (empty string)  (empty string)

df = pd.DataFrame({0:['Hello World!'], 1:['Hello Mars!']}) 
new_col_names = ['a','b','c','d']
# result would be like this
a               b               c               d
Hello World!    Hello Mars!     (empty string)  (empty string)

df = pd.DataFrame({0:['Hello World!', 'Hello Mars!']})
new_col_names = ['a','b','c','d','e']
a               b               c               d               e
Hello World!    (empty string)  (empty string)  (empty string)  (empty string)
Hello Mars!     (empty string)  (empty string)  (empty string)  (empty string)

From reading StackOverflow answers, I have a vague idea that it might be something like the following:

df[new_col_names] = '' # but this returns KeyError
# or this
df.columns=new_col_names # but this returns ValueError: Length mismatch (of course)

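If someone could show me a way to overwrite the existing DataFrame column names while also adding new columns filled with empty strings in every row, I would greatly appreciate the help.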

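The idea is to build a dictionary from the existing column names with zip, rename only the existing columns, and then add all the new columns with DataFrame.reindex: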

df = pd.DataFrame({0:['Hello World!', 'Hello Mars!']})
new_col_names = ['a','b','c','d','e']
df1 = (df.rename(columns=dict(zip(df.columns, new_col_names)))
         .reindex(new_col_names, axis=1, fill_value=''))
print (df1)
              a b c d e
0  Hello World!        
1   Hello Mars!        

Without fill_value, the new columns are filled with NaN instead:

df1 = (df.rename(columns=dict(zip(df.columns, new_col_names)))
         .reindex(new_col_names, axis=1))
print (df1)
              a   b   c   d   e
0  Hello World! NaN NaN NaN NaN
1   Hello Mars! NaN NaN NaN NaN
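
The rename-then-reindex idea above can be wrapped in a small helper and checked against the three cases from the question. This is only a sketch: the function name `rename_and_pad` and its `fill_value` parameter are assumptions introduced for illustration, not part of pandas.

```python
import pandas as pd


def rename_and_pad(df, new_col_names, fill_value=''):
    # Hypothetical helper (the name is an assumption, not a pandas API).
    # zip stops at the shorter input, so only the existing columns are mapped.
    mapping = dict(zip(df.columns, new_col_names))
    # rename returns a new DataFrame; reindex then adds the labels that are
    # still missing as new columns filled with fill_value.
    return df.rename(columns=mapping).reindex(columns=new_col_names,
                                              fill_value=fill_value)


# The three cases from the question.
one_col = rename_and_pad(pd.DataFrame({0: ['Hello World!']}),
                         ['a', 'b', 'c', 'd'])
two_cols = rename_and_pad(pd.DataFrame({0: ['Hello World!'], 1: ['Hello Mars!']}),
                          ['a', 'b', 'c', 'd'])
two_rows = rename_and_pad(pd.DataFrame({0: ['Hello World!', 'Hello Mars!']}),
                          ['a', 'b', 'c', 'd', 'e'])
print(two_cols)
```

Because both rename and reindex return new objects, the original df is left untouched.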

Here is a function that does what you want.

I couldn't find a one-liner, but jezrael did: see his answer.

import pandas as pd
# function
def rename_add_col(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    c_len = len(df.columns)
    if c_len == len(cols):
        df.columns = cols
    else:
        df.columns = cols[:c_len]
        df = pd.concat([df, pd.DataFrame(columns=cols[c_len:])])
    return df
# create dataframe
t1 = pd.DataFrame({'a': ['1', '2', '3'], 'b': ['4', '5', '6'], 'c': ['7', '8', '9']})
    a   b   c
0   1   4   7
1   2   5   8
2   3   6   9
# call function
cols = ['d', 'e', 'f']
t1 = rename_add_col(t1, cols)
    d   e   f
0   1   4   7
1   2   5   8
2   3   6   9
# call function
cols = ['g', 'h', 'i', 'new1', 'new2']
t1 = rename_add_col(t1, cols)
    g   h   i   new1    new2
0   1   4   7    NaN     NaN
1   2   5   8    NaN     NaN
2   3   6   9    NaN     NaN
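
Note that the function above assigns to df.columns on the object it receives, so the caller's DataFrame is renamed as well, and the added columns hold NaN rather than the empty strings the question asks for. A possible variant that avoids both might look like the sketch below; the name `rename_add_col_empty` and the `fill` parameter are hypothetical, chosen only for this example.

```python
import pandas as pd


def rename_add_col_empty(df: pd.DataFrame, cols: list, fill='') -> pd.DataFrame:
    # Hypothetical variant of rename_add_col (name and `fill` are assumptions).
    out = df.copy()                 # work on a copy so the caller's frame is untouched
    c_len = len(out.columns)
    out.columns = cols[:c_len]      # rename the existing columns by position
    for name in cols[c_len:]:       # append each remaining name as a new column
        out[name] = fill
    return out


t1 = pd.DataFrame({'a': ['1', '2', '3'], 'b': ['4', '5', '6'], 'c': ['7', '8', '9']})
t2 = rename_add_col_empty(t1, ['g', 'h', 'i', 'new1', 'new2'])
print(t2)
```

Assigning the columns one by one in a loop is fine for a handful of names; for many new columns the reindex approach is likely the better fit.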

This might help you do it all in one go.

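Recreate another DataFrame from the old one with pd.DataFrame(), and add the new columns through the columns parameter by concatenating lists.

Note: this adds the new columns to match the index length, but with NaN values. The workaround is df.fillna(' '), or df.fillna('') if you want truly empty strings as in the question. It also keeps the existing column names, so renaming them is a separate step.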

pd.DataFrame(df.to_dict() , columns = list(df.columns)+['b','c'])
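
To cover the renaming part of the question as well, the constructor approach above could be combined with rename and fillna. The following is a sketch under that assumption, using the sample data from the question; the variable names `renamed` and `rebuilt` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({0: ['Hello World!', 'Hello Mars!']})
new_col_names = ['a', 'b', 'c', 'd', 'e']

# Rename the existing columns first; zip stops at the shorter sequence.
renamed = df.rename(columns=dict(zip(df.columns, new_col_names)))

# Rebuild from the dict form; names missing from the dict become NaN columns,
# which fillna('') then turns into empty strings.
rebuilt = pd.DataFrame(renamed.to_dict(), columns=new_col_names).fillna('')
print(rebuilt)
```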

Hope this helps! Cheers