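I need to use iconv to convert the character encoding of some files generated on Windows. Sometimes these files are very large, and iconv fails because it runs out of RAM. Googling, I found a script called iconv-chunks.pl, a Perl script that processes the file in chunks. It works well, but it creates temporary files in my /tmp folder. The problem is that this script runs automatically every day on many files, and even though it has the cleanup flag turned on, it leaves garbage behind in my /tmp directory.
The script I'm talking about is: https://code.google.com/p/clschool-team4/source/browse/trunk/iconv-chunks.pl?r=53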
#!/usr/bin/perl
our $CHUNK_SIZE = 1024 * 1024 * 100; # 100M
=head1 NAME
iconv-chunks - Process huge files with iconv
=head1 SYNOPSIS
iconv-chunks <filename> [iconv-options]
=head1 DESCRIPTION
The standard iconv program reads the entire input file into
memory, which doesn't work for large files (such as database exports).
This script is just a wrapper that processes the input file
in manageable chunks and writes it to standard output.
The first argument is the input filename (use - to specify standard input).
Anything else is passed through to iconv.
The real iconv needs to be somewhere in your PATH.
=head1 EXAMPLES
# Convert latin1 to utf-8:
./iconv-chunks database.txt -f latin1 -t utf-8 > out.txt
# Input filename of - means standard input:
./iconv-chunks - -f iso8859-1 -t utf8 < database.txt > out.txt
# More complex example, using compressed input/output to minimize disk use:
zcat database.txt.gz | ./iconv-chunks - -f iso8859-1 -t utf8 |
gzip - > database-utf.dump.gz
=head1 AUTHOR
Maurice Aubrey <maurice.aubrey+iconv@gmail.com>
=cut
# $Id: iconv-chunks 6 2007-08-20 21:14:55Z mla $
use strict;
use warnings;
use bytes;
use File::Temp qw/ tempfile /;
# iconv errors:
# iconv: unable to allocate buffer for input: Cannot allocate memory
# iconv: cannot open input file `database.txt': File too large
@ARGV >= 1 or die "Usage: $0 <inputfile> [iconv-options]\n";
my @options = splice @ARGV, 1;
my($oh, $tmp) = tempfile(undef, CLEANUP => 1);
# warn "Tempfile: $tmp\n";
my $iconv = "iconv @options $tmp";
sub iconv { system($iconv) == 0 or die "command '$iconv' failed: $!" }
my $size = 0;
# must read by line to ensure we don't split multi-byte character
while (<>) {
  $size += length $_;
  print $oh $_;
  if ($size >= $CHUNK_SIZE) {
    iconv;
    truncate $oh, 0 or die "truncate '$tmp' failed: $!";
    seek $oh, 0, 0 or die "seek on '$tmp' failed: $!";
    $size = 0;
  }
}
iconv if $size > 0;
Any help finding the problem, or a way to delete the temporary file once the script is done?
Regards
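Change
my($oh, $tmp) = tempfile(undef, CLEANUP => 1);
to
my($oh, $tmp) = tempfile(UNLINK => 1);
CLEANUP is the option that makes tempdir() remove a temporary directory at exit; it does not apply to files created by tempfile(). Note that passing undef as the first argument in order to use the default template is unnecessary.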
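To see the difference between the two options side by side, here is a minimal sketch, separate from iconv-chunks.pl, that creates one temporary file with UNLINK and one temporary directory with CLEANUP. The variable names and the sample text written to the file are only illustrative.

```perl
use strict;
use warnings;
use File::Temp qw/ tempfile tempdir /;

# UNLINK => 1 asks File::Temp to delete this file when the program exits.
# No template is passed, so the default template in the temp directory is used.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "sample chunk\n";
close $fh or die "close '$filename' failed: $!";

# CLEANUP => 1 belongs to tempdir(): the directory is removed at exit.
my $dir = tempdir(CLEANUP => 1);

# Both paths exist while the script is running.
print "temp file: $filename\n";
print "temp dir:  $dir\n";
```

After the script exits, neither path should remain under the temp directory. In iconv-chunks.pl the same idea applies to the single tempfile() call shown above.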