PHP substr() is really slow when not taking the first part of the string

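I have a text file of about 5,000 lines, each about 200 characters long. Each line actually holds 6 different pieces of data, and I have been using substr() to split them apart. For example, on each line, characters 0-10 contain the Client#, characters 10-20 contain the Matter#, and so on. This all worked fine and ran faster than I needed.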

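My problem arose when my boss told me that the client numbers have 4 leading zeros and that they need to be stripped. So I thought, no problem: I'll just change my first substr() call from substr(0, 10) (start at 0 and take 10 characters) to substr(4, 6) (start at the 4th character and take only 6), which will skip the 4 leading zeros, and I'll be good to go.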

However, when I change substr(0, 10) to substr(4, 6), the process grinds to a halt and takes forever to complete. Why is that?

Here is a snippet of my code:

// open the file    
$file_matters = fopen($varStoredIn_matters,"r") or exit("Unable to open file!");
// run until the end of the file
while(!feof($file_matters))
{
    // place current line in temp variable
    $tempLine_matters = fgets($file_matters);
    // increment the matters line count
    $linecount_matters++;
    // break up each column
    $clientID = trim(substr($tempLine_matters, 0, 10)); // THIS ONE WORKS FINE
    //$clientID = trim(substr($tempLine_matters, 4, 6)); // THIS ONE MAKES THE PROCESS GRIND TO A HALT!!
    $matterID = trim(substr($tempLine_matters, 10, 10)); 
    //$matterID = trim(substr($tempLine_matters, 15, 5)); 
    $matterName = trim(substr($tempLine_matters, 20, 80)); 
    $subMatterName = trim(substr($tempLine_matters, 100, 80)); 
    $dateOpen = trim(substr($tempLine_matters, 180, 10)); 
    $orgAttorney = trim(substr($tempLine_matters, 190, 3)); 
    $bilAttorney = trim(substr($tempLine_matters, 193, 3)); 
    $resAttorney = trim(substr($tempLine_matters, 196, 3)); 
    //$tolCode = trim(substr($tempLine_matters, 200, 3)); 
    $tolCode = trim(substr($tempLine_matters, 200, 3)); 
    $dateClosed = trim(substr($tempLine_matters, 203, 10)); 
    // just does an insert into the DB using the variables above
}

I don't see why this would be so much slower, but you could take a look at unpack, which can extract your fixed-width record in one go:

 $fields = unpack('A10client/A10matter/A60name ...etc... ',$tempLine_matters);

I ran a quick benchmark with a record layout similar to your example and found unpack to be more than twice as fast as using 10 substr calls per iteration.
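To make the unpack suggestion concrete, here is a minimal sketch that covers only the first three columns from the question (client, matter, matter name). The function name `parseMatterLine`, the array keys, and the sample line are made up for illustration; the field widths are taken from the question's substr() offsets, and the leading zeros are dropped after unpacking rather than inside the format string.

```php
<?php
// Hypothetical helper (not from the original post): parse the first three
// fixed-width columns of one line with a single unpack() call.
function parseMatterLine(string $line): ?array
{
    // The three columns span 10 + 10 + 80 = 100 bytes; skip short lines
    // so that unpack() does not emit a warning.
    if (strlen($line) < 100) {
        return null;
    }

    // 'A' reads a space-padded string and strips trailing whitespace.
    // Leading spaces are NOT stripped, so trim() again if you need that.
    $fields = unpack('A10client/A10matter/A80name', $line);
    if ($fields === false) {
        return null;
    }

    // Drop the 4 leading zeros from the client number (assumes they are
    // always present, as described in the question).
    $fields['client'] = (string) substr($fields['client'], 4);

    return $fields;
}

// Made-up sample line built to match the layout above.
$line = '0000123456'
      . '0000000042'
      . str_pad('Acme v. Example', 80)
      . "\n";

$row = parseMatterLine($line);
print_r($row);
```

Any bytes after the last field in the format string are simply ignored by unpack, so the remaining columns can be added to the format one at a time.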

I would suggest profiling your code with xdebug to see where the difference really is.
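If setting up xdebug is not convenient, a rough timing check with microtime() can at least show whether the substr() call itself is responsible. This is only a sketch: the helper name `timeSubstr`, the iteration count, and the sample line are assumptions for illustration, and it measures nothing but the trim(substr()) call, not the file reading or the database insert.

```php
<?php
// Hypothetical helper (not from the original post): time repeated
// trim(substr()) calls on a single line for a given offset and length.
function timeSubstr(string $line, int $start, int $length, int $iterations): float
{
    $begin = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $value = trim(substr($line, $start, $length));
    }
    return microtime(true) - $begin;
}

// Made-up line of roughly the length described in the question.
$line = str_pad('0000123456', 213, 'x');

$full = timeSubstr($line, 0, 10, 5000); // the original call
$tail = timeSubstr($line, 4, 6, 5000);  // the call that skips the zeros

printf("substr(0, 10): %.6f s\n", $full);
printf("substr(4, 6):  %.6f s\n", $tail);
```

If the two timings come out close to each other, the slowdown is more likely somewhere else in the loop, which is what a profiler run would help pin down.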

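This is not a very optimized process, and you should probably give it some more thought. But if it works for now, that is what matters most. It might be faster if you get your value in two steps. For example: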

$clientID_bis = trim(substr($tempLine_matters, 0, 10));
$clientID = trim(substr($clientID_bis, 4, 6));
