PHP uncompress & unpack gives an array containing some (9 out of 10309) undefined values



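I have a binary data string that I uncompress and unpack into an array with PHP, using the following code (the full code of this PHP page is included at the bottom of this question):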

while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
    $glycopeptide[$hits] = $row[1];
    echo $row[4];
    // $row[4] contains the binaryString
    $mz = base64_decode($row[4]);
    $unc_mz = gzuncompress($mz);
    $max = strlen($unc_mz);
    $counter = 0;
    for ($i = 0; $i < $max; $i = $i+4) {
        $temp = substr($unc_mz,$i,4);
        $temp = unpack("f",$temp);
        $mz_array[$counter] = $temp[1];
        $counter++;
    }
    $hits++;
}

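I found that mz_array (the code above, X-coords) has 9 undefined values, all at the end, while int_array (similar code, Y-coords) also has 9 undefined values, but scattered throughout the array (not grouped, and not at the start or end).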

Here is an example from a small test block that I added to the page.

Test code:

for ($i = 0; $i < $counter; $i++) {
    echo $i;
    echo " -  ";
    echo $mz_array[$i];
    echo " - ";
    echo $int_array[$i];
    echo "<br/>";
}

A selection of the output (note the missing values):

671 - 274.20001220703 - 429
672 - 274.39999389648 -
673 - 274.60000610352 - 1098
-- skipping a few lines --
10299 - 2199.8000488281 - 0
10300 - 2200 - 0
10301 - - 0
10302 - - 0

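The strangest part is this: if I manually enter the string in the full/original code (see the bottom of the page), I get the undefined values, whereas if I paste the string returned by `echo $row[4]` (which contains the binaryString) into the code below, no undefined values are produced.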

<?php
$string = " /* Copy the string in the spoiler (on this page) here */ ";
$int = base64_decode($string);
$unc_int = gzuncompress($int);
$max = strlen($unc_int);
$counter = 0; 
$max_int = 0;
for ($i = 0; $i < $max; $i = $i + 4) {
    $temp = substr($unc_int,$i,4);
    $temp = unpack("f",$temp);
    $int_array[$counter] = $temp[1];
    echo $counter;
    echo " -- ";
    echo $int_array[$counter];
    echo "<br/>";
    $counter++;
}
?>

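Does anyone have any suggestions or ideas as to why this happens?

PS: Could someone add the gzuncompress tag? (I don't have the reputation for it.)

Edit 1

I have included a sample binaryString (warning: huge!)

Y-coords (retrieved by the `echo $row[4]` that is commented out in the code):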

eJztXQuQFMUZblQEQfEQUQSJYwRBSuLG+AAFHdmeryOPS2lRgEgNvgpLhUQkooYB5GXBLIaHwRQRwMpCqOuicJh4KjUoMnm98r2KUaKkXreBpSCXf3z3Lzu7Na3fn9nZhvqop2Nenu6//7fPpcdYjBgxYsSIESNGjBhVgFGkqmLOWPjk77tlNs5a35AY9kdnBrPo+3GMZ5tm87VmDULfXL/PnOwzkGfp0P7RL j2MurAaNB0l2OtJuDzVu+188XJnKmnYZ0a5DpZE7Fu3TVm/BOft5TZ4yqwvwf1u+ne89ea7wM/Pke5vvvI3jHfGGVA56xxEnTWKeCT/h588i5k3cEp1vyuxtQN+H/an5+ya9GuBf2t6zi+s2ZAHh/KC+V7BTDXoK/15Y03uxD3zo1urmoKND4efXb17tPqgTZz0Oy/+eaB2uMHYP1UKOjVc1gDuy30TUuz3fWP82Nuy1nZyZx4GXDgO9M/acXgPd9w+mvXFrx66P+TLGNBnXBbXBB4tWwldYhvFcGmI8GbRJ1xGPhFjvpQqvljDbZjv6jpan1pHfxcfjHyqm0Hra8vnraa+sGF68prmz4aeGtM3Y4wr3fd8I35nrxfZTU/g 88YK/NbORAJ6owvG31Y49mwT/JJFXPrjYQBZSrGaAj97UMTrfFYJ4wkD9GUYtcuLBHU7Z/rzoO3dtT1OgQ0Y42neNqyFuSjrOcuamNMo7KPzecHvR45MsYtwKZfAFh7GBS9lv0LbH+BzH+67v7KvcBEDdNrarV0hb77Ddy9VWT9AFmRfwJqtqR+9pF8MGq/iwl7PId1bY62DQcMGB+0hF5W1th8WIO/a7a0TsN+maDK+M9qdnynWao3H76fVAb87oN8K2/s3XNDKfE5jzc0aW7sRNuYhkKM7bFpNt23g/XhhfBp6RtDCoW/UXmjzkaPNefg8j2Jj9UWXmgN4z7gnT0N1IuTRZ5yle4Avdzi+HwK+/0md+eNmsLwxTsT8zHyMvGDOzlbujnagnagazi//cmcyOtq11c8aNDC5Ugb1/hvsmw5ps7GOl1dhycHs5fNS7EM26prjzJ/htz+rJ6tnLnZC/74ePczadhX0/LZp4VWY59OcB4IdmXqALaf+JPNUAj2dUQBvql3KbhqKx6ceMZVNnjGWT1qeY/jZ00Z/xTMP2bzkBPEudLA6ksaHfxVvf02HbaV3wfy2JJn+MeyvP+K5C8D3m5NMfRb3rfaYH+k+l/yseRXGuAT3XSHvI5tXjPk2Rz8Yi/kwl3GtiGyrxl6wff6FPr8oHK91v23D2dAX4/NDhW1Itxvbq8DPB/jwixM9bNodBT6Yg/EO5OHoRD7t6dgr/fenkSvNBM+RYvw6fHt1n0Iakk9GMXurL/e5uz3onYbQeMr2vbyrd5uTgtrbXn+xJgvJeUcbGPQuiAr5IF7VoHYF4Dg2M0jSTXBvHd/NaZoDiFscuWC6Ycz/LFGN8MXA/ityXyO30bZwmsQevJ2O8j/HM3Tlh3oI/BlcWtyJ43+uT7UFdCXn5ox418oDwle28OYQtQDo/xF4XebkWjCdj10kkOGu8VGON+4I3t+WfYS7P047sZv3X0j/M3pm/t1oQ9LflnwIby/g95kA0acuPIzEUdDrdzn9UIo8z0pcjWzmsHWCcgfcX2J8akNSxsx8oA5Bvydx4Tu7Pvd59PEkxvspZy3naywN3m7Dv7ejPGbEawR2ZLTi/oZxlnDZMgJkmMjbX1DvtNheb/WmMKlv/t3XJuSrvI2vQJy7gKN6eR7ebQhmGtBgwcwH/CtiAG3Ldh0KhpVd5C8KOV 
kEfW7p+fPGQaDbqTaPusgTiq1YS6LNkxnvS1uJ9q1ym3ctnToTzx4V0s7IYWR7X6OxK+Z3anvWdnIhV1Idp2i2nQfHPwsYY+XQg/y8buF039+0EdoMhaV4w1ar2uwt6bl+b4joHbDM36bfy7VNLa9jL09H3SdmV8f62aNKX/F5zvKH0ua6g5IbiX9+yBZKPiyWBaEhKihXBD+Xte2QIZn0OfOzEmj1qXzFSs3zUdEDMjWkWZ24AMURt59PmSIMDOJx9O6CGv/TKssO6oBTaigFSjILNDh7zCOJb2FBsZlHbIBscMsu4qjAm7Wvks3LniiKoUIO6W/lbUJXvIXf3uC79TDZk7/h0tWEZLvy9PskSkJ+2HvIbx4uq+RmPT ntbY2t ayxyF9XlX0kLUKOX2Nvxz6yJc2DeZE7EmvfIy2lwhbbrmp3DvX/D90iSzYAfp6xwxIuhoYfdh3T7ulmLKLLTVvW2THIxL0McrnE2cN5a1cFxDxrI+08ey759MsSOvSrF+AOCD4WtSfo0rGz9hks9c7h/e5oDa/G2Lwgq5KjgoWbZhnyg3XH7jsS3oDOrTFdZgxw1EPAnjH7h47sm/Avj0/Y6npjjOv89YJxkFx74g/rStgW4zThPwqb3PZvyRYQ+ykn20h3UezGpz6/ZJyA/slugX2jki15LN+9jk60wjfIwm+opA6QbBi23Z+X/KCei/23gbMJy6DPj9Rk7p3WPoFxXxfen6Z8p+BPoiPuV8gf/y8v6p4dplcm1EnsNhEyqQH8p4/Das0 EYlH5T0soW92jJQkzbWljEiZiTqPF3Gnk3j+39wUZ/GJsJ3/B7/P1zOWe2LZ9+ej40akEGkW9iupOv6GZANxjP+sW2KW6rw+dkS2Y7sXWNT0bgg4422wu+Mn0OWLi3y36B3lCMwzy2gVXeMays+H4r//8iOk9l7lmL+nrLiJsyR4s+P+Iy7aO9VaTwN5w8Rr4/aJ9QME7Ib2sSrpVcxDTYH3D9yvaN34QuwHibbsD6XSBjOSq+Ux5Fnzm+xzq0jJO5cZXiTI9hTX+J/iBLlc+4jNGvt+XJpfjtfFmjri9Ae/hz1ufyHr91MJs5S3+sse7Pgld+gX0HW8wgfgI9s/BxzPu5sAmdvkbD0WjDg2UX5S0S+2A9IK/351Iwal2IT78JxzMKdN6LeF s22fYX/tj/5ecPQFHaAO8K+Nng39p8ImNsGzZj/015WLOAjRxNhfxhUaxqDf2fgNulZZhd9hY1ivc+HbsQ/Qrq+7/BGFTYP+ncVnrPa234ScWdHrfMiPItkNbsBazTX5g/7zID6DL77BD45fNmh92oiLpLokWKH9kyxlgupzsKO6Tn4rAV+TBb6Wx0bTa5jb4A1GHvgXtBrVZ5elD8OistRTkbEzmZUl84UAyYbTVngbc9SnZdCtnkFdX1U6yLyRBMD9m0ftCPbJODcHcVGhV1eho2gfC5jweXaU43TNVlnEXDesGpYxvN6drytZwPyM+pwO14dkDc3P+TC34kyhk+1kyLvDTmpv4PxbgB/fUm2mXxGGvaugQKyGvrWSuZlkrEQ93UpLZ/iyx0YgOHfBuKPfA1 l/mQsHX+k3HfTFzQJ41vQk6+XVrswhVe+sABw+LigEu1c/ICdyVlLqKMOnfrP1jjRzSRpzA+8e5Dgc2oTnD4i21jAvep2sZE/pW4ds2L9TYbOg38x061wl9e70mfEvqr61NE3tf1CDWE0LwRcWAjSXORDvWJTMJe3AO6KqXsC+qBOtJ2Fs97HO7FcZwd/cJ+4h9hcvK8V5S+DND18FWvhYyfwz5jnL/ZWAbmWTv5WL7F0m7K8qYPNValnPmRvhOcyLm8anWeb3bdWEC8vntaY4m+WLvvvvkku31FPlhAbEaNyxaAD79U2VyUf0dF7kyMcfiOcDPMkLUzUOet5jZT53Cvj1tnc2u/B3Ovr80vakjAV67FeRD/TaP3RPB9URUHyFfFFnzWcxWg9Mg 
Qc0lrlDY9vWaOzygSk26qgUa4Wvag6A/n8VY3UATKITnz31SnAfnBXnZ5RiXxQzdaboVsfBrzuU/GA8z+mqwndmlLdST6Jv+zAxQjJ5+4VDlrLQddMuh7scMXgm2qQ06oFJd9ycFzsKWF/2nLA1qTxBl2/QjFn0E/GqNoE5ADscjPyN3XySAbky0N9nuqBeNF8Ab8TLU7F7QnnzO9n50LyMnifljv4aB9z8IYhb4IvgDFtVSbF6gOkHyDWYVzI70s4oW5OjOKe1VaSxwh9FOohiVft2eMA00eLD1HquzEfbAvzfmFPEl+LsVwnXLXJN9+RXs+SJC9+0PH3oIPpE5x8SVesnMAJ3Dv8/1uc30e+w1rSblD8W4Ar3WA/UcxXVFDGJE 9VQzSr1mi/bjgfFCtgOSQyBfcHa1NRXJh1KAUW/6Kxhou0dhJQ1Iy/nmxw6bsBNA7JJbPxXwvzNv8+i7w0HHwc5fh+zcxvtW23O7dPmaUOUjWtivYX+qxmoiHUg2bONNZj2cfw2CqowbOBcpd9L4pLmuqHTCSnZBDPs+n14EWHvgeihj+oPOLYWUWIsEWvjuMzp31kD78psq5Ko7EcKHKYF/jeGQa4/KXEnY97yJ+2wfU/gNz9OGU7rBY2NOj3F+p2RYme/Ax97oSbrK+eBR5Z1zN6mvJf+YuEZCFFLU4ENoc+C3/A1L7lu0twMPZPqpBqpWgbldGGLCru9I+X7pqSM77rUwFpj6arDXzkmPQb8ZNPjquTnFg7xRTeqVE7YzhUfsQBehcVnH MkfRLhs54Jxy+AXxFytlMWEpnKBx2TG8X/yGELKD+ldWytkPkCTvIB6l5wAYzbsy/S5Tywgr5BfTeAkdtZLabXRcUUY1UtpWzxCBZy+CM9YvzpS45H/0hLmu0XGKGon58e8fvP7ULaPC4d33EbmTBVwei7SEydyj86Ffb62x1Yek6JweRO6mWzRcAqjvSH8d+erG9z+ELt7j7BvT1iZ0zc8TqyZ9TH/X3e1zHNj9Zc7mkGBECezHovZKegMwXccMw9lud6nfDeVbEicH2WztXssuhwz/dpVRPZD+xYVdD7JNqSsq7rLp8xU13srwDWsRdG6q4qZHIh3cFHdZ4FsqAY

Full code:

<?php
require 'phplot/phplot.php';
$type = $_GET['type'];
$gp = $_GET['gp'];
$site = $_GET['site'];
$prec = $_GET['prec'];
$link = mysql_connect("localhost","reader","") or die (mysql_error());
mysql_select_db('leidenGlycoPeptide') or die ();
$query = sprintf("select precursor.mzValue, glycoPeptide.protein, binaryDataArray.arrayLength, binaryDataArray.encodedLength, binaryDataArray.arrayData, precursor.chargeState, run.pepMass, run.PepSeq from glycoPeptide, spectrum, binaryDataArray, run, precursor where run.glycoPeptide = glycoPeptide.id AND spectrum.run = run.id AND precursor.run = run.id AND binaryDataArray.spectrum = spectrum.id AND precursor.id = spectrum.precursor AND spectrum.spectrum like 'm/z' AND precursor.mzValue like '%s' and glycoPeptide.protein like '%s' and run.glycoSite like '%s' and run.glycoType like '%s' ORDER by glycoPeptide.protein, spectrum.spectrum",(string)$prec, (string)$gp, (string)$site, (string)$type);
$result = mysql_query($query);
$hits = 0;
while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
$charge = $row[5];
$pepmass = $row[6];
$pepseq = $row[7];
$glycopeptide[$hits] = $row[1];
/* Manually entering string here also gives undefined values */
/* $mz = " I was not able to include the mz string due to message size limit "; */
$mz = base64_decode($row[4]);
$unc_mz = gzuncompress($mz);
$max = strlen($unc_mz);
$counter = 0;
for ($i = 0; $i < $max; $i = $i+4) {
$temp = substr($unc_mz,$i,4);
$temp = unpack("f",$temp);
$mz_array[$counter] = $temp[1];
$counter++;
}
$hits++;
}
$query = sprintf("select precursor.mzValue, glycoPeptide.protein, binaryDataArray.arrayLength, binaryDataArray.encodedLength, binaryDataArray.arrayData from glycoPeptide, spectrum, binaryDataArray, run, precursor where run.glycoPeptide = glycoPeptide.id AND spectrum.run = run.id AND precursor.run = run.id AND binaryDataArray.spectrum = spectrum.id AND precursor.id = spectrum.precursor AND spectrum.spectrum like 'intensity' AND precursor.mzValue like '%s' and glycoPeptide.protein like '%s' and run.glycoSite like '%s' and run.glycoType like '%s' ORDER by glycoPeptide.protein, spectrum.spectrum",(string)$prec, (string)$gp, (string)$site, (string)$type);
$result = mysql_query($query);
while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
/* Manually entering string here also gives undefined values */
/* $int = " Copy the string from above in here "; */
$int = base64_decode($row[4]);
/* The result from this is the above binaryString */
/* echo $row[4]; */ 
$unc_int = gzuncompress($int);
$max = strlen($unc_int);
$counter = 0;
$max_int = 0;
for ($i = 0; $i < $max; $i = $i + 4) {
$temp= substr($unc_int,$i,4);
$temp = unpack("f",$temp);
$int_array[$counter] = $temp[1];
$counter++;
if ($temp[1] > $max_int) {
$max_int = $temp[1];
$counter++;
}
}
}
/* The following chunk is just to test the arrays */
for ($i = 0; $i < $counter; $i++) {
echo $i;
echo " -  ";
echo $mz_array[$i];
echo " - ";
echo $int_array[$i];
echo "<br/>";
}
for ($i = 0; $i < $counter; $i++) {
$plot_data[$i] = array('',$mz_array[$i],$int_array[$i]);
}
// Plot the regular spectrum
$width = 1024;
$height = 768;
$plot = new PHPlot($width,$height);
$plot->SetMarginsPixels(NULL,NULL,NULL,35);
$plot->SetPrintImage(False);
$plot->SetPlotType('thinbarline');
//$plot->SetXTitle('m/z Values');
$plot->SetXTickAnchor('400');
$plot->SetDataColors('red');
$plot->SetXTickIncrement('200');
$plot->SetXDataLabelPos('none');
$plot->SetYTitle('Intensity');
$plot->SetYTickAnchor('0');
//Might need to define this dynamically with nested if/else loops
$plot->SetYTickIncrement('100000');
$plot->SetDataType('data-data');
$plot->SetDataValues($plot_data);
$plot->SetTitle('Fragmentation Spectrum');
//$plot->DrawGraph();
mysql_close($link);
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Query result page</title>
<script src="jquery-1.9.1.min.js"></script>
</head>
<script>
var gp = '<?php echo htmlspecialchars($_GET['gp']); ?>';
$(document).ready(function() {
$('.button').click(function() {
window.open('http://www.uniprot.org/uniprot/?query='+gp+'+AND+organism:human&sort=score');
});
$('.XY').click(function() {
var mz_array = <?php echo json_encode($mz_array) ?>;
var int_array = <?php echo json_encode($int_array) ?>;
var table = 
"<table border='1'>"
+"<tr>"
+"<th>m/z</th>"
+"<th>intensity</th>"
+"</tr>";
var max = <?php echo $counter ?>;
for (var i = 0; i < max; i++) {
table += "<tr>"
+"<td>"+mz_array[i]+"</td>"
+"<td>"+int_array[i]+"</td>"
+"</tr>";
}
table += "</table>";
var disp = window.open();
$(disp.document.body).html(table); 
});
});
</script>
<body>
<p>The displayed spectrum belongs to <?php echo $gp ?> with a precursor [M+H] of <?php echo (($prec*$charge)-($charge+1)); ?>.<br>
The peptide belonging to this glycopeptide has a mass of <?php echo $pepmass ?> and sequence: <?php echo $pepseq ?>.<br>
<button class="button">Uniprot search</button> <button class="XY">Display XY data</button></p>
<img src="<?php echo $plot->EncodeImage();?>" alt="Plot Image">
</body>
</html>

This is overly complicated:

$max = strlen($unc_mz);
$counter = 0;
for ($i = 0; $i < $max; $i = $i+4) {
    $temp = substr($unc_mz,$i,4);
    $temp = unpack("f",$temp);
    $mz_array[$counter] = $temp[1];
    $counter++;
}

Use this instead:

$mz_array = array_values(unpack("f*", $unc_mz));
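
A minimal sketch of why the two forms give the same result, using made-up sample values rather than the real spectrum data. `unpack("f*", ...)` returns an array whose keys start at 1, which is why `array_values` is used to re-index from 0. The sketch assumes the zlib extension is available for `gzcompress` and `gzuncompress`.

```php
<?php
// Hypothetical sample values standing in for a real m/z array.
$sample = array(274.0, 274.5, 2200.0);

// Encode them the way the stored data appears to be encoded:
// 32-bit floats, zlib-compressed, then base64.
$binary = '';
foreach ($sample as $value) {
    $binary .= pack("f", $value);
}
$encoded = base64_encode(gzcompress($binary));

$unc = gzuncompress(base64_decode($encoded));

// Loop version: unpack one 4-byte float at a time.
$loop = array();
for ($i = 0; $i < strlen($unc); $i += 4) {
    $temp = unpack("f", substr($unc, $i, 4));
    $loop[] = $temp[1];
}

// Single-call version: unpack every float at once, then re-index from 0.
$oneCall = array_values(unpack("f*", $unc));

// Both arrays hold the same values under the same keys.
var_dump($loop === $oneCall);
```

Note that the `f` format uses the machine's native float size and byte order, so this only decodes correctly on a machine whose representation matches the one that produced the data.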

I was incrementing an index where I shouldn't have: the $counter++ inside if ($temp[1] > $max_int) { //stuff } bumped the index every time a new maximum was detected.
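
To illustrate the effect, here is a small sketch with made-up intensities (not the real data). Each time a value exceeds the running maximum, the index advances twice, so one slot is skipped and stays undefined. The skipped slots fall wherever a new maximum happens to occur, which would explain why they were scattered through int_array. The final $counter also ends up larger than the number of decoded values, which would explain the undefined entries at the end of mz_array in the test loop.

```php
<?php
// Hypothetical intensities; a new running maximum occurs at the 1st, 3rd and 5th value.
$values = array(5.0, 3.0, 8.0, 1.0, 9.0, 2.0);

$int_array = array();
$counter = 0;
$max_int = 0;
foreach ($values as $value) {
    $int_array[$counter] = $value;
    $counter++;
    if ($value > $max_int) {
        $max_int = $value;
        $counter++; // the stray increment: skips one index per new maximum
    }
}

// Collect the indices below $counter that were never assigned.
$missing = array();
for ($i = 0; $i < $counter; $i++) {
    if (!array_key_exists($i, $int_array)) {
        $missing[] = $i;
    }
}

echo "values stored: " . count($int_array) . "\n";
echo "final counter: " . $counter . "\n";
echo "missing indices: " . implode(", ", $missing) . "\n";
```

With these six values and three new maxima, the counter ends at 9 and indices 1, 4 and 7 are never written.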

The new code for int_array now looks like this (using Sectus's trick and max(array)):

while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
    $int = base64_decode($row[4]);
    $unc_int = gzuncompress($int);
    $int_array = array_values(unpack("f*",$unc_int));
    $max_int = max($int_array);
}

The following also works (if you don't want to use Sectus's trick):

while ($row = mysql_fetch_array($result, MYSQL_NUM)) {
    $int = base64_decode($row[4]);
    $unc_int = gzuncompress($int);
    $max = strlen($unc_int);
    $counter = 0;
    $max_int = 0;
    for ($i = 0; $i < $max; $i = $i + 4) {
        $temp = substr($unc_int,$i,4);
        $temp = unpack("f",$temp);
        $int_array[$counter] = $temp[1];
        $counter++;
        if ($temp[1] > $max_int) {
            $max_int = $temp[1];
        }
    }
}
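
As a further safeguard, the decoding could be wrapped in a small helper that checks each step, so that a problem like this shows up as an error rather than as silent gaps. This is only a sketch under assumptions: `decodeFloatArray` is a hypothetical name, and treating the `binaryDataArray.arrayLength` column (`$row[2]` in the queries) as the expected number of values is a guess about the schema that would need to be verified.

```php
<?php
/**
 * Hypothetical helper: decode a base64 + zlib string of 32-bit floats.
 * $expectedLength is optional; pass it only if the stored length is trusted.
 */
function decodeFloatArray($encoded, $expectedLength = null)
{
    // Strict mode makes base64_decode() return false on invalid input.
    $compressed = base64_decode($encoded, true);
    if ($compressed === false) {
        throw new RuntimeException("Input is not valid base64");
    }
    // gzuncompress() returns false (and emits a warning) on bad data.
    $raw = @gzuncompress($compressed);
    if ($raw === false) {
        throw new RuntimeException("Input is not valid zlib data");
    }
    if (strlen($raw) % 4 !== 0) {
        throw new RuntimeException("Byte length is not a multiple of 4");
    }
    $values = ($raw === '') ? array() : array_values(unpack("f*", $raw));
    if ($expectedLength !== null && count($values) !== (int) $expectedLength) {
        throw new RuntimeException(
            "Decoded " . count($values) . " values, expected " . $expectedLength
        );
    }
    return $values;
}

// Round trip with made-up values to show the helper in use.
$encoded = base64_encode(gzcompress(pack("f", 1.5) . pack("f", 2.5)));
$decoded = decodeFloatArray($encoded, 2);
print_r($decoded);
```

Inside the fetch loop this might be called as `decodeFloatArray($row[4], $row[2])`, again assuming that `arrayLength` holds the element count.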

I would also like to thank everyone who racked their brains over this.
