Counting the number of lines in a very large file throws a System.OutOfMemoryException


static void Main(string[] args)
{
    string TheDataFile = "";
    string ErrorMsg = "";
    string lngTransDate = "";
    ProcessDataFile ProcessTheDataFile = new ProcessDataFile();
    // Verbatim string (@"...") so the backslashes are not treated as escape sequences
    string TheFile = @"S:\MIS\Provider NPI file\Processed\npidata_20050523-20161009.csv";
    // Reads every line into an array at once; this is where the exception is thrown
    string[] lines = File.ReadAllLines(TheFile, Encoding.UTF8);
    Console.WriteLine(lines.Length.ToString());
    Console.ReadLine();
}

This throws an error because the file is very large (6 million lines). Is there a way to process a large file and count its lines?

Use a StreamReader:

string TheFile = @"S:\MIS\Provider NPI file\Processed\npidata_20050523-20161009.csv";
int count = 0;
using (System.IO.StreamReader sr = new System.IO.StreamReader(TheFile))
{
    // ReadLine returns null at end of file; only one line is held in memory at a time
    while (sr.ReadLine() != null)
    {
        count++;
    }
}

You need to evaluate the file lazily so that it is never loaded into memory in its entirety.

Helper method

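using System.Collections.Generic;
using System.IO;

public static class ToolsEx
{
    // Extension method on a file name: yields the file's lines one at a time
    public static IEnumerable<string> ReadAsLines(this string filename)
    {
        using (var streamReader = new StreamReader(filename))
        {
            while (!streamReader.EndOfStream)
            {
                yield return streamReader.ReadLine();
            }
        }
    }
}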

Usage

// Count() is a LINQ extension method, so this needs "using System.Linq;"
var lineCount = "yourfile.txt".ReadAsLines().Count();

According to this already accepted answer, the following should do it.

using System;
using System.IO;

namespace CountLinesInFiles_45194927
{
    class Program
    {
        static void Main(string[] args)
        {
            int counter = 0;
            // File.ReadLines enumerates lazily, unlike File.ReadAllLines
            foreach (var line in File.ReadLines(@"c:\Path\To\File.whatever"))
            {
                counter++;
            }
            Console.WriteLine(counter);
            Console.ReadLine();
        }
    }
}
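
If allocating a string for each of the 6 million lines turns out to be too slow, another option (a different technique from the answers above, not something any of them proposed) is to count newline bytes directly in fixed-size chunks. The following is only a sketch under assumptions: the class name LineCounter and the method CountLines are hypothetical, the file is assumed to be ASCII or UTF-8 with "\n" or "\r\n" line endings, and a file that uses a lone "\r" as its line terminator would be miscounted.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original answers.
public static class LineCounter
{
    // Counts '\n' bytes while reading the file in fixed-size chunks,
    // so memory use stays constant regardless of file size.
    // Assumes ASCII or UTF-8 content, where the byte 0x0A only ever means a line feed.
    public static long CountLines(string path, int bufferSize = 1 << 16)
    {
        long count = 0;
        byte last = (byte)'\n'; // treat an empty file as having zero lines
        byte[] buffer = new byte[bufferSize];

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, bufferSize,
                                           FileOptions.SequentialScan))
        {
            int read;
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] == (byte)'\n')
                    {
                        count++;
                    }
                }
                last = buffer[read - 1];
            }
        }

        // A final line without a trailing newline still counts as a line.
        if (last != (byte)'\n')
        {
            count++;
        }
        return count;
    }
}
```

It would be called as long lines = LineCounter.CountLines(TheFile);. The result is a long, so it also covers files with more lines than an int can hold. Whether this is actually faster than File.ReadLines on your disk is something to measure rather than assume.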
