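Hi, this one really puzzles me. I ran the following command on a large DataFrame:
df.duplicated(subset=None, keep='first')
As far as I can tell, this matches the documentation:
DataFrame.duplicated(subset=None, keep='first')
The only difference is that I call it on df. Still, all I get is the following traceback: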
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-53-529f7b7a97fb> in <module>()
----> 1 df.duplicated(subset=None, keep='first')
/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in duplicated(self, subset, keep)
4383 vals = (col.values for name, col in self.iteritems()
4384 if name in subset)
-> 4385 labels, shape = map(list, zip(*map(f, vals)))
4386
4387 ids = get_group_index(labels, shape, sort=False, xnull=False)
/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in f(vals)
4364 def f(vals):
4365 labels, shape = algorithms.factorize(
-> 4366 vals, size_hint=min(len(self), _SIZE_HINT_LIMIT))
4367 return labels.astype('i8', copy=False), len(shape)
4368
/anaconda3/lib/python3.7/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
176 else:
177 kwargs[new_arg_name] = new_arg_value
--> 178 return func(*args, **kwargs)
179 return wrapper
180 return _deprecate_kwarg
/anaconda3/lib/python3.7/site-packages/pandas/core/algorithms.py in factorize(values, sort, order, na_sentinel, size_hint)
628 na_sentinel=na_sentinel,
629 size_hint=size_hint,
--> 630 na_value=na_value)
631
632 if sort and len(uniques) > 0:
/anaconda3/lib/python3.7/site-packages/pandas/core/algorithms.py in _factorize_array(values, na_sentinel, size_hint, na_value)
474 uniques = vec_klass()
475 labels = table.get_labels(values, uniques, 0, na_sentinel,
--> 476 na_value=na_value)
477
478 labels = _ensure_platform_int(labels)
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_labels()
TypeError: unhashable type: 'list'
What am I doing wrong?
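As far as I can tell, at least one column of your DataFrame contains lists, and lists cannot be hashed, either by Python or by pandas. duplicated() has to hash every value it compares, which is why the call fails with "unhashable type: 'list'". You may have run into the same error if you ever tried to use a list as a dictionary key. A simple workaround is to convert the lists to tuples, which are hashable.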
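
Here is a minimal sketch of that workaround. The column names and values are made up for illustration, since the original DataFrame is not shown, and to_hashable is a hypothetical helper, not part of pandas:

```python
import pandas as pd

# Hypothetical data: the "tags" column holds lists, which are unhashable.
df = pd.DataFrame({
    "id": [1, 2, 1],
    "tags": [["a", "b"], ["c"], ["a", "b"]],
})

def to_hashable(value):
    """Return a tuple for list values; leave every other value unchanged."""
    return tuple(value) if isinstance(value, list) else value

# Convert element-wise, one column at a time, into a new DataFrame.
hashable_df = df.apply(lambda col: col.map(to_hashable))

dupes = hashable_df.duplicated(subset=None, keep="first")
print(dupes.tolist())  # [False, False, True]
```

Note that tuple(value) only converts the outer list. If your lists contain other lists, the inner ones would also need converting before duplicated() can hash them.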