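I am having trouble importing a CSV file into Python. Every cell in the .csv file contains a normal value, but when the data is loaded into a DataFrame something goes wrong and null values appear, which makes the data unusable.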
import pandas as pd
excel_path
df=pd.read_csv(excel_path, error_bad_lines=False, sep=';',dtype='c')
print (df)
I also tried another approach, but the result is the same.
import csv
excel_path
with open(excel_path, 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
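Do you know how I could change the way the data is loaded into Python? I have checked the existing threads and tried different encodings. My CSV file contains numbers, strings, and dates.
The null values do not raise an error. The problem is that they appear at all. The CSV file contains ordinary strings and integers. I need this data, so I cannot simply skip the null values.
This is what the file looks like as a DataFrame: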
R Unnamed: 1 Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5
0 NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN
... .. ... ... ... ... ...
1326 NaN NaN NaN NaN NaN NaN
1327 NaN NaN NaN NaN NaN NaN
1328 NaN NaN NaN NaN NaN NaN
1329 NaN NaN NaN NaN NaN NaN
1330 NaN NaN NaN NaN NaN NaN
When I add dtype='c' to this line: df=pd.read_csv(excel_path, error_bad_lines=False), I get something like this:
R Unnamed: 1 Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5 Unnamed: 6
0 b'' b'' b'' b'' b'' b'' b''
1 b'' b'' b'' b'' b'' b'' b''
2 b'' b'' b'' b'' b'' b'' b''
3 b'' b'' b'' b'' b'' b'' b''
4 b'' b'' b'' b'' b'' b'' b''
... ... ... ... ... ... ... ...
1326 b'' b'' b'' b'' b'' b'' b''
1327 b'' b'' b'' b'' b'' b'' b''
1328 b'' b'' b'' b'' b'' b'' b''
1329 b'' b'' b'' b'' b'' b'' b''
1330 b'' b'' b'' b'' b'' b'' b''
My CSV file looks like this:
RNSP_ID;AUTHOR ID;PRODUCT FAMILY;REQUEST SCOPE;HC Prog;DRAWING NUMBER;VALIDITY;PART TYPE;DOCUMENT OF DEFINITION;CLASSIFICATION;ENGLISH DESIGNATION;FRENCH DESIGNATION;CREATION DATE;
RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;"CABLE SAFETY KIT;17/03/2015 13:38:23;
RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;"CABLE SAFETY KIT;17/03/2015 13:38:23;
RNSP11707;xa434956;Fasteners;Creation;H60;U311A;Serial;Normalised;NSA 551.33;4.8.1;"STUD;"FERMETURE RAPIDE;"VERSCHLUSSZAPFEN;"PASADOR DE CIERRE;19/03/2015 09:28:18;
RNSP11746;xa444992;Fasteners;Use of a new;H60;U...;Serial;Non Aero;ISO7070;4.7.1.1;"NUT HEXA;"ECROU;"NUSS;"NUT;27/03/2015 12:47:53;
RNSP11746;xa444992;Fasteners;Use of a new;H60;U...;Serial;Non Aero;ISO7071;4.7.1.1;"NUT HEXA;"ECROU;"NUSS;"NUT;27/03/2015 12:47:53;
RNSP11747;xa444992;Fasteners;Addition;H60;U...;Serial;Non Aero;DIN950;4.7.1.1;"HANDWHEELS;"VOLANTS;"HANDRADER;"HANDWHEELS;27/03/2015 13:19:24;
RNSP11749;xa444992;Fasteners;Addition;H60;U...;Serial;Non Aero;DIN934;4.2.1.1;"HEXAGONAL NUT;"HEXAGONAL NUT;"SECHSKANTMUTTER;"HEXAGONAL NUT;27/03/2015 13:48:24;
RNSP11749;xa444992;Fasteners;Addition;H60;U...;Serial;Non Aero;DIN934;4.2.1.1;"HEXAGONAL NUT;"HEXAGONAL NUT;"SECHSKANTMUTTER;"HEXAGONAL NUT;27/03/2015 13:48:24;
RNSP11750;xa444992;Fasteners;Addition;H10;U...;Serial;Non Aero;ISO7089;4.3.1;"WASHER, FLAT;"RONDELLE;"SCHEIBE, FLACH;"WASHER, FLAT;27/03/2015 14:01:53;
RNSP11750;xa444992;Fasteners;Addition;H10;U...;Serial;Non Aero;ISO7089;4.3.1;"WASHER, FLAT;"RONDELLE;"SCHEIBE, FLACH;"WASHER, FLAT;27/03/2015 14:01:53;
Thanks!
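I think your CSV file has two problems:
- as @TrentonMcKinney pointed out, the number of columns differs from the number of headers by two, and
- it contains stray, unpaired " characters (single double quotes), which confuse the pandas parser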
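Both problems can be confirmed without pandas. The following is only a minimal sketch: it counts the fields and the " characters on each line of a short excerpt copied from the question. The names SAMPLE and inspect_lines are made up for this illustration.

```python
# Excerpt copied from the question: the header line and the first data row.
SAMPLE = (
    'RNSP_ID;AUTHOR ID;PRODUCT FAMILY;REQUEST SCOPE;HC Prog;DRAWING NUMBER;'
    'VALIDITY;PART TYPE;DOCUMENT OF DEFINITION;CLASSIFICATION;'
    'ENGLISH DESIGNATION;FRENCH DESIGNATION;CREATION DATE;\n'
    'RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;'
    '4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;'
    '"CABLE SAFETY KIT;17/03/2015 13:38:23;\n'
)


def inspect_lines(text, sep=';'):
    """Return a (field_count, quote_count) pair for every line of text."""
    report = []
    for line in text.splitlines():
        # Ignore the single trailing separator that ends each line in this file.
        if line.endswith(sep):
            line = line[:-len(sep)]
        report.append((len(line.split(sep)), line.count('"')))
    return report


for number, (fields, quotes) in enumerate(inspect_lines(SAMPLE), start=1):
    print(f'line {number}: {fields} fields, {quotes} quote characters')
```

On this excerpt the header has 13 fields and the data row has 15, and the data row contains four " characters, none of which is closed before the next separator.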
I was able to parse it correctly by removing every " from the file and then importing it with the following command:
>>> df = pd.read_csv('20210108.csv',skiprows=[0],names=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],delimiter=';', index_col=False)
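Details:
- the " characters were removed from the CSV file
- the delimiter is set to ;
- I made pandas skip the first line (the headers do not line up with the data) and numbered the columns 1 to 15 instead (one of the missing headers is clearly a language, something like "<German?> DESIGNATION"; the other is not obvious, so I replaced them all with numbers)
- I also forced pandas to import without an index column, because the first column does not look like an index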
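If editing the file on disk is undesirable, the same steps can probably be done in memory. This is a sketch under assumptions, not a tested recipe for the full file: SAMPLE is a short excerpt from the question, and read_cleaned is a hypothetical helper that removes the " characters from the text and then calls pd.read_csv with the same arguments as above.

```python
import io

import pandas as pd

# Excerpt copied from the question: the header line and two data rows.
SAMPLE = (
    'RNSP_ID;AUTHOR ID;PRODUCT FAMILY;REQUEST SCOPE;HC Prog;DRAWING NUMBER;'
    'VALIDITY;PART TYPE;DOCUMENT OF DEFINITION;CLASSIFICATION;'
    'ENGLISH DESIGNATION;FRENCH DESIGNATION;CREATION DATE;\n'
    'RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;'
    '4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;'
    '"CABLE SAFETY KIT;17/03/2015 13:38:23;\n'
    'RNSP11707;xa434956;Fasteners;Creation;H60;U311A;Serial;Normalised;'
    'NSA 551.33;4.8.1;"STUD;"FERMETURE RAPIDE;"VERSCHLUSSZAPFEN;'
    '"PASADOR DE CIERRE;19/03/2015 09:28:18;\n'
)


def read_cleaned(text):
    """Drop every double quote from text, then parse it as in the answer."""
    cleaned = text.replace('"', '')
    return pd.read_csv(
        io.StringIO(cleaned),
        skiprows=[0],              # skip the header line that does not line up
        names=list(range(1, 16)),  # number the columns 1 to 15 instead
        delimiter=';',
        index_col=False,           # do not treat the first column as an index
    )


df = read_cleaned(SAMPLE)
print(df.shape)
```

For a real file, the text would first be read with open(path).read(), using whatever encoding the file actually has.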
Now the DataFrame looks reasonable:
>>> df
1 2 3 4 5 6 7
0 RNSP11701 G700895 Fasteners Selection H60 U533A Serial
1 RNSP11701 G700895 Fasteners Selection H60 U533A Serial
2 RNSP11707 xa434956 Fasteners Creation H60 U311A Serial
3 RNSP11746 xa444992 Fasteners Use of a new H60 U... Serial
4 RNSP11746 xa444992 Fasteners Use of a new H60 U... Serial
5 RNSP11747 xa444992 Fasteners Addition H60 U... Serial
6 RNSP11749 xa444992 Fasteners Addition H60 U... Serial
7 RNSP11749 xa444992 Fasteners Addition H60 U... Serial
8 RNSP11750 xa444992 Fasteners Addition H10 U... Serial
9 RNSP11750 xa444992 Fasteners Addition H10 U... Serial
8 9 10 11 12
0 Normalised AS3510 4.7.6 CABLE SAFETY KIT CABLE DE SECURITE
1 Normalised AS3510 4.7.6 CABLE SAFETY KIT CABLE DE SECURITE
2 Normalised NSA 551.33 4.8.1 STUD FERMETURE RAPIDE
3 Non Aero ISO7070 4.7.1.1 NUT HEXA ECROU
4 Non Aero ISO7071 4.7.1.1 NUT HEXA ECROU
5 Non Aero DIN950 4.7.1.1 HANDWHEELS VOLANTS
6 Non Aero DIN934 4.2.1.1 HEXAGONAL NUT HEXAGONAL NUT
7 Non Aero DIN934 4.2.1.1 HEXAGONAL NUT HEXAGONAL NUT
8 Non Aero ISO7089 4.3.1 WASHER, FLAT RONDELLE
9 Non Aero ISO7089 4.3.1 WASHER, FLAT RONDELLE
13 14 15
0 SICHERUNGSDRAHTKIT CABLE SAFETY KIT 17/03/2015 13:38:23
1 SICHERUNGSDRAHTKIT CABLE SAFETY KIT 17/03/2015 13:38:23
2 VERSCHLUSSZAPFEN PASADOR DE CIERRE 19/03/2015 09:28:18
3 NUSS NUT 27/03/2015 12:47:53
4 NUSS NUT 27/03/2015 12:47:53
5 HANDRADER HANDWHEELS 27/03/2015 13:19:24
6 SECHSKANTMUTTER HEXAGONAL NUT 27/03/2015 13:48:24
7 SECHSKANTMUTTER HEXAGONAL NUT 27/03/2015 13:48:24
8 SCHEIBE, FLACH WASHER, FLAT 27/03/2015 14:01:53
9 SCHEIBE, FLACH WASHER, FLAT 27/03/2015 14:01:53
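The csv module from the question can also be made to cope with this file, as far as I can tell. The attempt there used the default delimiter (a comma) and the default quoting rules. The sketch below instead passes delimiter=';' and quoting=csv.QUOTE_NONE, so that " is treated as an ordinary character, and strips it from each field afterwards. SAMPLE and parse_rows are names invented for this example.

```python
import csv
import io

# Excerpt copied from the question: the header line and the first data row.
SAMPLE = (
    'RNSP_ID;AUTHOR ID;PRODUCT FAMILY;REQUEST SCOPE;HC Prog;DRAWING NUMBER;'
    'VALIDITY;PART TYPE;DOCUMENT OF DEFINITION;CLASSIFICATION;'
    'ENGLISH DESIGNATION;FRENCH DESIGNATION;CREATION DATE;\n'
    'RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;'
    '4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;'
    '"CABLE SAFETY KIT;17/03/2015 13:38:23;\n'
)


def parse_rows(text, delimiter=';'):
    """Split text into rows of fields, ignoring quoting and trailing separators."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        quoting=csv.QUOTE_NONE,  # treat " as a normal character
    )
    rows = []
    for fields in reader:
        # The trailing separator on each line yields one empty last field.
        if fields and fields[-1] == '':
            fields = fields[:-1]
        rows.append([field.replace('"', '') for field in fields])
    return rows


rows = parse_rows(SAMPLE)
header, data = rows[0], rows[1:]
print(len(header), len(data[0]))
```

The header still has two names fewer than the data rows have fields, so it cannot be used as-is for column labels.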