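I am trying to insert a column named Price Label after the Price column in the AppleStore apps DataFrame. My approach is to iterate over the DataFrame and append a string ("Free" or "Not Free") to each app, depending on whether it has price = $0.00. The code I tried is below: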
for i, row in df.iterrows():
    price = row.Price.replace('$','')
    if price == '0.0':
        row.append("Free")
    else:
        row.append("Non-Free")
df[column].append("price_label") # in an attempt to add a header to column.
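But then I got the error message below. Can anyone tell me whether pandas has a specific way to attach a string to a DataFrame Series/column? As always, I appreciate the community's help. You guys are the best.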
TypeError                                 Traceback (most recent call last)
<ipython-input-191-c6a90a84d57b> in <module>
      6         row.append("Free")
      7     else:
----> 8         row.append("Non-Free")
      9
     10 df.head()

~\anaconda3\lib\site-packages\pandas\core\series.py in append(self, to_append, ignore_index, verify_integrity)
   2580             to_concat = [self, to_append]
   2581         return concat(
-> 2582             to_concat, ignore_index=ignore_index, verify_integrity=verify_integrity
   2583         )
   2584

~\anaconda3\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    279         verify_integrity=verify_integrity,
    280         copy=copy,
--> 281         sort=sort,
    282     )
    283

~\anaconda3\lib\site-packages\pandas\core\reshape\concat.py in __init__(self, objs, axis, join, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    355                 "only Series and DataFrame objs are valid".format(typ=type(obj))
    356             )
--> 357             raise TypeError(msg)
    358
    359         # consolidate

TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid
price_label = []
for i, row in df.iterrows():
    price = row.Price.replace('$','')
    if price == '0.0':
        price_label.append("Free")
    else:
        price_label.append("Non-Free")
Then:
df["price_label"] = price_label
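Note that `price == '0.0'` only matches when the text left after stripping `$` is exactly `0.0`, so a value such as `$0.00` would be labelled Non-Free. Below is a minimal sketch that compares numerically instead. It assumes the Price column holds strings like `$0.00`; the sample data is made up for illustration and the real AppleStore values may be formatted differently.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not the real AppleStore dataset).
df = pd.DataFrame({"App": ["A", "B", "C"], "Price": ["$0.00", "$1.99", "$0.0"]})

# Strip the dollar sign as a literal character (regex=False) and convert to float.
price = df["Price"].str.replace("$", "", regex=False).astype(float)

# Build the labels in a plain list, one entry per row.
price_label = []
for value in price:
    price_label.append("Free" if value == 0 else "Non-Free")

# Assign the whole list as a new column in one step.
df["price_label"] = price_label
```

Comparing floats means `$0.0` and `$0.00` are both treated as free.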
Try adding a new column with a default value, then update the rows where the price is 0:
df['price_label'] = 'Non-Free' # append a new column
df.loc[df['Price'] == '$0.0', 'price_label'] = 'Free' # set the price_label column to 'Free' where Price == '$0.0'
The second line of code filters using "boolean indexing":
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing
This article explains it in detail with an example:
https://appdividend.com/2019/01/25/pandas-boolean-indexing-example-python-tutorial/
Selecting by row and column label with loc: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-label
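To see what the boolean mask looks like on its own, here is a small self-contained sketch. The two-row DataFrame is made up, and it assumes free apps are stored as the exact string `$0.0`.

```python
import pandas as pd

# Hypothetical sample data; assumes free apps are stored as the string "$0.0".
df = pd.DataFrame({"App": ["A", "B"], "Price": ["$0.0", "$2.99"]})

# The comparison yields a boolean Series, one True/False per row.
mask = df["Price"] == "$0.0"

# Default label for every row, then overwrite only the rows where mask is True.
df["price_label"] = "Non-Free"
df.loc[mask, "price_label"] = "Free"
```

Here `mask` is True for the first row only, so only that row ends up labelled "Free".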
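You can use numpy.where(condition, x, y) to pick elements from x or y depending on the condition. You can then use the df.columns.get_loc method to get the position of the Price column, and specify the column order to rearrange the columns as needed.
Use this: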
import numpy as np
# --> make new column named `Price-Label`
df["Price-Label"] = np.where(df["Price"].eq("$0.0"), "Free", "Non-Free")
#--> get the location of `Price` column
price_col_loc = df.columns.get_loc("Price")
#--> Obtain the resulting dataframe by specifying the order of columns
#--> such that Price-Label column appear after the Price column
result = df[list(df.columns[:price_col_loc + 1]) + [df.columns[-1]] + list(df.columns[price_col_loc + 1:-1])]
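As an alternative to reordering the columns by hand, `DataFrame.insert` can place the new column at a given position directly. A minimal sketch on made-up sample data, again assuming free apps are stored as the string `$0.0`:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the Rating column is only there to show the ordering.
df = pd.DataFrame({"App": ["A", "B"], "Price": ["$0.0", "$2.99"], "Rating": [4.5, 3.0]})

# Compute the labels without adding them to the frame yet.
labels = np.where(df["Price"].eq("$0.0"), "Free", "Non-Free")

# Insert the new column immediately after Price; insert() modifies df in place.
df.insert(df.columns.get_loc("Price") + 1, "Price-Label", labels)
```

This changes `df` itself rather than building a separate `result` frame, so work on a copy if the original order is still needed.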