TypeError: unhashable type: 'slice' when removing characters from list values in Python?



I have a list of values that I extracted from a website using BeautifulSoup. It looks like this:

tables_values1 = soup.find_all('td',attrs={'class':'x1'})
print(tables_values1)

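Output: [123value1, 123value2, 123value3] (note that there are no ' or " quotes)

I am trying to cut off the first x characters using the following approach, which I also found on Stack Exchange: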

tables_values = [x[2:] for x in tables_values1]

However, this returns:

TypeError: unhashable type: 'slice'

Can anyone help me figure out why this happens and how to fix it? Many thanks!

Edit: Please let me know whether this is even a valid list!

Edit 3: As requested below, here is the exact repr:

[<td class="views-field views-field-field-category-value-2018">136          </td>, <td class="views-field views-field-field-category-value-2018">SFD          </td>, <td class="views-field views-field-field-category-value-2018">136          </td>, <td class="views-field views-field-field-category-value-2018">$33,657,146           </td>, <td class="views-field views-field-field-category-value-2018">9.7          </td>, <td class="views-field views-field-field-category-value-2018">$33,657,146           </td>, <td class="views-field views-field-field-category-value-2018">61          </td>, <td class="views-field views-field-field-category-value-2018">34          </td>, <td class="views-field views-field-field-category-value-2018">5          </td>, <td class="views-field views-field-field-category-value-2018">61          </td>, <td class="views-field views-field-field-category-value-2018">34          </td>, <td class="views-field views-field-field-category-value-2018">5          </td>, <td class="views-field views-field-field-category-value-2018">5          </td>, <td class="views-field views-field-field-category-value-2018">95          </td>]
<td class="views-field views-field-field-category-value-2018">136          </td>

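The items in that list are BeautifulSoup Tag objects, not strings, and you are trying to slice them as if they were strings. Indexing a Tag with square brackets looks up an HTML attribute by name in the tag's attribute dictionary, so x[2:] ends up using a slice object as a dictionary key, which is where the "unhashable type: 'slice'" error comes from. You really should work with them as tags rather than attempting string manipulation; for example, if you are trying to get the text between the tags, that would be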

contents = [x.string for x in tables_values1]

where the string attribute is a helper that returns the tag's single string child, if it has one.
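As a rough sketch of how that plays out, the snippet below parses a small, made-up HTML fragment (the markup and the class name x1 are assumptions for illustration, not the real page). It shows that .string keeps the trailing whitespace and returns None when a tag has more than one child, while get_text(strip=True) gives a plain, trimmed str that can be sliced the way the question intended.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<table><tr>
  <td class="x1">136          </td>
  <td class="x1">$33,657,146           </td>
  <td class="x1"><b>9</b>.7</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all("td", attrs={"class": "x1"})

# .string returns the single string child, or None if the tag
# has more than one child (as in the third cell).
raw = [cell.string for cell in cells]

# get_text(strip=True) collects all text inside the tag and
# strips surrounding whitespace, returning an ordinary str.
texts = [cell.get_text(strip=True) for cell in cells]

# Ordinary strings can be sliced without any error.
sliced = [t[2:] for t in texts]

print(raw)
print(texts)
print(sliced)
```

If some cells may contain nested markup, get_text is probably the safer choice, since .string would silently give None for them.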


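If you really do want to do this with string manipulation instead of going through the BeautifulSoup interface, you can convert the Tag objects to strings, which will include the <td class="..."></td> part: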

strings = [str(x) for x in tables_values1]

You can then slice the strings however you like.
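To give an idea of what slicing those strings might look like, here is a minimal sketch. The two sample strings are copied from the repr in the question and stand in for what str(x) would produce; inner_text is a hypothetical helper name, not part of BeautifulSoup. It assumes each string is a single tag with no nested markup and no ">" inside an attribute value.

```python
# Sample strings as str(x) would render two of the tags in the question.
strings = [
    '<td class="views-field views-field-field-category-value-2018">136          </td>',
    '<td class="views-field views-field-field-category-value-2018">$33,657,146           </td>',
]


def inner_text(markup: str) -> str:
    """Slice out whatever sits between the opening and closing tag."""
    start = markup.index(">") + 1  # just past the end of the opening tag
    end = markup.rindex("<")       # start of the closing tag
    return markup[start:end].strip()


values = [inner_text(s) for s in strings]
print(values)
```

This kind of slicing is brittle compared with the tag-based approach, since any change in the markup can shift the positions it relies on.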
