[J] RYBEWYS
1老爷:
Lambertus #
»05.01.1979 Eindhoven40.31.01.2017 4»31.01.20274 <Gemeente Waaire
5740344641
AM-BE, T
D1NLD1574034464194NB44MK362D54] 1
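I have several thousand text files containing the kind of information shown in the sample above, and I want to write one CSV file for each text file. The text files contain driver's license information. I want to put each line into a separate column.
I am using this code: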
#list the files
filelist = os.listdir(targetdir)
#read them into pandas
df_list = [pd.read_csv(file, header = None, encoding='latin1') for file in filelist]
df_list.to_csv('D:/Athora/CSV/text.csv', index = None, encoding='latin1')
The code gives me the following error:
FileNotFoundError Traceback (most recent call last)
Input In [8], in <cell line: 5>()
3 filelist = os.listdir(targetdir)
4 #read them into pandas
----> 5 df_list = [pd.read_csv(file, header = None, encoding='latin1') for file in filelist]
6 df_list.to_csv('D:/Athora/CSV/text.csv', index = None, encoding='latin1')
Input In [8], in <listcomp>(.0)
3 filelist = os.listdir(targetdir)
4 #read them into pandas
----> 5 df_list = [pd.read_csv(file, header = None, encoding='latin1') for file in filelist]
6 df_list.to_csv('D:/Athora/CSV/text.csv', index = None, encoding='latin1')
File D:\Softwares\Anaconda\lib\site-packages\pandas\util\_decorators.py:311, in deprecate_nonkeyword_arguments.<locals>.decorate.<locals>.wrapper(*args, **kwargs)
305 if len(args) > num_allow_args:
306 warnings.warn(
307 msg.format(arguments=arguments),
308 FutureWarning,
309 stacklevel=stacklevel,
310 )
--> 311 return func(*args, **kwargs)
File D:\Softwares\Anaconda\lib\site-packages\pandas\io\parsers\readers.py:678, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, error_bad_lines, warn_bad_lines, on_bad_lines, delim_whitespace, low_memory, memory_map, float_precision, storage_options)
663 kwds_defaults = _refine_defaults_read(
664 dialect,
665 delimiter,
(...)
674 defaults={"delimiter": ","},
675 )
676 kwds.update(kwds_defaults)
--> 678 return _read(filepath_or_buffer, kwds)
File D:\Softwares\Anaconda\lib\site-packages\pandas\io\parsers\readers.py:575, in _read(filepath_or_buffer, kwds)
572 _validate_names(kwds.get("names", None))
574 # Create the parser.
--> 575 parser = TextFileReader(filepath_or_buffer, **kwds)
577 if chunksize or iterator:
578 return parser
File D:\Softwares\Anaconda\lib\site-packages\pandas\io\parsers\readers.py:932, in TextFileReader.__init__(self, f, engine, **kwds)
929 self.options["has_index_names"] = kwds["has_index_names"]
931 self.handles: IOHandles | None = None
--> 932 self._engine = self._make_engine(f, self.engine)
File D:\Softwares\Anaconda\lib\site-packages\pandas\io\parsers\readers.py:1216, in TextFileReader._make_engine(self, f, engine)
1212 mode = "rb"
1213 # error: No overload variant of "get_handle" matches argument types
1214 # "Union[str, PathLike[str], ReadCsvBuffer[bytes], ReadCsvBuffer[str]]"
1215 # , "str", "bool", "Any", "Any", "Any", "Any", "Any"
-> 1216 self.handles = get_handle( # type: ignore[call-overload]
1217 f,
1218 mode,
1219 encoding=self.options.get("encoding", None),
1220 compression=self.options.get("compression", None),
1221 memory_map=self.options.get("memory_map", False),
1222 is_text=is_text,
1223 errors=self.options.get("encoding_errors", "strict"),
1224 storage_options=self.options.get("storage_options", None),
1225 )
1226 assert self.handles is not None
1227 f = self.handles.handle
File D:\Softwares\Anaconda\lib\site-packages\pandas\io\common.py:786, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
781 elif isinstance(handle, str):
782 # Check whether the filename is to be opened in binary mode.
783 # Binary mode does not support 'encoding' and 'newline'.
784 if ioargs.encoding and "b" not in ioargs.mode:
785 # Encoding
--> 786 handle = open(
787 handle,
788 ioargs.mode,
789 encoding=ioargs.encoding,
790 errors=errors,
791 newline="",
792 )
793 else:
794 # Binary mode
795 handle = open(handle, ioargs.mode)
FileNotFoundError: [Errno 2] No such file or directory: 'DLcasper.jpg.txt'
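Can anyone tell me what I am doing wrong here? Also, if you have another solution, please let me know.
Python cannot find the file (hence the FileNotFoundError) because os.listdir(targetdir) returns bare file names, without the directory they live in. As a result,
df_list = [pd.read_csv(file, header = None, encoding='latin1') for file in filelist]
passes names such as 'DLcasper.jpg.txt' to pd.read_csv, which looks for them in the current working directory rather than in targetdir. If you add the path, the files should load without this error.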
df_list = [pd.read_csv(f'{targetdir}/{file}', header = None, encoding='latin1') for file in filelist]
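As a variation on the fix above, the full paths can be built once, before anything is read. This is a minimal sketch: the helper name list_text_files is hypothetical, and the '.txt' filter is an assumption that targetdir may also hold other files (the original images, for example).

```python
import os

def list_text_files(targetdir):
    """Return the full paths of the .txt files directly inside targetdir, sorted."""
    return sorted(
        os.path.join(targetdir, name)      # prepend the directory to each bare name
        for name in os.listdir(targetdir)  # os.listdir yields names only, no path
        if name.lower().endswith(".txt")   # assumption: only .txt files are wanted
    )
```

With this in place, filelist = list_text_files(targetdir) can be passed to the original list comprehension unchanged, since every entry is already a complete path.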
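While this fixes loading the files, the next line of your code will also raise an error. df_list is a plain Python list, and a list has no .to_csv method, so the call fails with an AttributeError. Even if it worked, it would write everything to one and the same .csv file rather than one CSV per text file. Instead, loop over the list and save each DataFrame separately.
for idx, item in enumerate(df_list):
    item.to_csv(f'test{idx}.csv', index=None, encoding='latin1')
The code above iterates over the list of loaded text files and exports each one to its own .csv file, using the loop counter (idx) to give every file a unique name.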
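Since the question also asks for each line to end up in its own column, here is a hedged alternative that skips pandas and uses only the standard library. It is a sketch under a few assumptions: the function names are hypothetical, blank lines are dropped, and each output CSV is named after its source file. Reading the text line by line also avoids splitting a line such as 'AM-BE, T' at its comma, which a CSV parser would do.

```python
import csv
from pathlib import Path

def lines_to_single_row_csv(txt_path, csv_path, encoding="latin1"):
    """Write the non-empty lines of txt_path as one CSV row, one line per column."""
    text = Path(txt_path).read_text(encoding=encoding)
    fields = [line.strip() for line in text.splitlines() if line.strip()]
    with open(csv_path, "w", newline="", encoding=encoding) as handle:
        csv.writer(handle).writerow(fields)  # fields containing commas get quoted
    return fields

def convert_directory(targetdir, outdir, encoding="latin1"):
    """Convert every .txt file in targetdir to a same-named .csv file in outdir."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for txt_path in sorted(Path(targetdir).glob("*.txt")):
        csv_path = outdir / (txt_path.stem + ".csv")  # 'x.jpg.txt' -> 'x.jpg.csv'
        lines_to_single_row_csv(txt_path, csv_path, encoding)
        written.append(csv_path)
    return written
```

Calling convert_directory(targetdir, 'D:/Athora/CSV') would then produce one single-row CSV per text file. Naming the output after the source file, rather than after a loop counter, keeps it obvious which license each CSV came from.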