Reading a .txt file into a DataFrame with pandas

I am trying to read in a text file. The file contains the following input:

DE  01945   Ruhland Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.4576 13.8664 4
DE  01945   Tettau  Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.4333 13.7333 4
DE  01945   Grünewald   Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.4    14  4
DE  01945   Guteborn    Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.4167 13.9333 4
DE  01945   Kroppen Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.3833 13.8    4
DE  01945   Schwarzbach Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.45   13.9333 4
DE  01945   Hohenbocka  Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.431  14.0098 4
DE  01945   Lindenau    Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.4    13.7333 4
DE  01945   Hermsdorf   Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.4055 13.8937 4
DE  01968   Senftenberg Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.5252 14.0016 4
DE  01968   Schipkau Hörlitz    Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.5299 13.9508 
DE  01968   Schipkau    Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.5456 13.9121 4
DE  01979   Lauchhammer Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   51.4881 13.7662 4

My code looks like this:

import pandas as pd
data = pd.read_csv('DE.txt', sep=" ", header=None)

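I currently get the following error, which I can't get past:

ParserError: Error tokenizing data. C error: Expected 2 fields in line 11, saw 3

I suspect this is because the city name consists of two parts. How can I read the text file correctly?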
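
The error message itself points to the cause. With sep=" ", most lines apparently split into just 2 fields, which would mean the only space on those lines is the one inside "Landkreis Oberspreewald-Lausitz"; "Schipkau Hörlitz" on line 11 adds a second space and hence a third field. That suggests the columns are separated by tabs rather than spaces. Below is a minimal sketch under that assumption. The inline `sample` string is rebuilt from two of the rows above so the snippet runs on its own; the tab layout and the empty fields in it are an assumption, not something confirmed from the actual file.

```python
import io
import pandas as pd

# Two sample rows in the ASSUMED layout: fields separated by tabs,
# spaces only inside names, empty fields simply left blank.
sample = (
    "DE\t01945\tRuhland\tBrandenburg\tBB\t\t00\tLandkreis Oberspreewald-Lausitz\t12066\t51.4576\t13.8664\t4\n"
    "DE\t01968\tSchipkau Hörlitz\tBrandenburg\tBB\t\t00\tLandkreis Oberspreewald-Lausitz\t12066\t51.5299\t13.9508\t\n"
)

data = pd.read_csv(
    io.StringIO(sample),  # for the real file, pass "DE.txt" here instead
    sep="\t",             # split on tabs only, so "Schipkau Hörlitz" stays one field
    header=None,          # the file has no header row
    dtype=str,            # keep leading zeros in codes such as "01945"
)
print(data.shape)
```

dtype=str is optional: it preserves the leading zero in postal codes like 01945, at the cost of leaving the coordinates as strings that would need converting afterwards.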

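One way is to read the file as an ordinary text file, parse its contents into a dictionary, and then create the DataFrame from that: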

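import pandas as pd

columns = {}  # column index -> list of values (avoids shadowing the built-in dict)
with open("DE.txt", "r", encoding="utf-8") as file:  # utf-8 is an assumption
    for line in file:
        # Build the dictionary however you like from the values in each line.
        # This version assumes tab-separated fields and the same number of
        # fields on every line, and uses one dictionary key per column.
        for index, value in enumerate(line.rstrip("\n").split("\t")):
            columns.setdefault(index, []).append(value)
df = pd.DataFrame(data=columns)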

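Alternatively, if it is easier for you, you can build a two-dimensional list instead of a dictionary and create the DataFrame from it in the same way.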
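
The two-dimensional list variant could look roughly like this. It is only a sketch: `lines_to_frame` is a made-up helper name, the tab separator is an assumption about the file, and the inline sample stands in for DE.txt so the snippet runs by itself.

```python
import io
import pandas as pd

def lines_to_frame(lines, sep="\t"):
    """Split each non-blank line on `sep` and build a DataFrame from the rows."""
    # Each inner list is one row; strip only the newline so that
    # empty trailing fields are kept as empty strings.
    rows = [line.rstrip("\n").split(sep) for line in lines if line.strip()]
    return pd.DataFrame(rows)

# Stand-in for the real file, using the ASSUMED tab-separated layout.
sample = io.StringIO(
    "DE\t01945\tRuhland\tBrandenburg\tBB\t\t00\tLandkreis Oberspreewald-Lausitz\t12066\t51.4576\t13.8664\t4\n"
    "DE\t01968\tSchipkau Hörlitz\tBrandenburg\tBB\t\t00\tLandkreis Oberspreewald-Lausitz\t12066\t51.5299\t13.9508\t\n"
)

df = lines_to_frame(sample)
print(df.shape)
```

For the real file, open it with `open("DE.txt", encoding="utf-8")` (the encoding is again an assumption) and pass the file object to the helper. All values come back as strings, so numeric columns such as the coordinates would still need converting.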
