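The following code works, but it is about twice as slow as using a (Linux) pipe to feed the decompressed data to a (modified) version of the program. I need a steady stream inside the program that I can keep splitting on \n. Is there a way to do this with a (string?) stream or any other trick?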
#include <cstring>
#include <zlib.h>

// Read is defined elsewhere (not shown); it has char arrays `bases` and `scores`.
int main(int argc, char *argv[]) {
    static const int unzipBufferSize = 8192;
    long long int counter = 0;
    int i = 0, p = 0, n = 0;
    int offset = 0;
    char *end = NULL;
    char *begin = NULL;
    unsigned char unzipBuffer[unzipBufferSize + 1]; // +1 leaves room for the 0-char written below
    int unzippedBytes; // gzread() returns an int (negative on error)
    char *inFileName = argv[1];
    char buffer[200];
    buffer[0] = '\0';
    bool breaker = false;
    char pch[4][200];
    Read *aRead = new Read;
    gzFile inFileZ;
    inFileZ = gzopen(inFileName, "rb");
    while (true) {
        unzippedBytes = gzread(inFileZ, unzipBuffer, unzipBufferSize);
        if (unzippedBytes > 0) {
            unzipBuffer[unzippedBytes] = '\0'; // put a 0-char after the total buffer
            begin = (char*) &unzipBuffer[0]; // point to the address of the first char
            do {
                end = strchr(begin, (int)'\n'); // find the end of line
                if (end != NULL) *(end) = '\0'; // put 0-char to use it as a c-string
                pch[p][0] = '\0'; // put a 0-char to be able to strcat
                if (strlen(buffer) > 0) { // if buffer from previous iteration contains something
                    strcat(pch[p], buffer); // cat it to the p-th pch
                    buffer[0] = '\0'; // set buffer to null-string or ""
                }
                strcat(pch[p], begin); // put begin (or rest of line in case there was a buffer) into p-th pch
                if (end != NULL) { // see if it already points to something
                    begin = end + 1; // if so, advance begin to old end+1
                    p++;
                }
                if (p > 3) { // a 'read' contains 4 lines, so if p>3
                    strcat(aRead->bases, pch[1]); // we use line 2 and 4 as
                    strcat(aRead->scores, pch[3]); // bases and scores
                    // do things with the reads
                    aRead->bases[0] = '\0'; // put them back to 0-char
                    aRead->scores[0] = '\0';
                    p = 0; // start counting next 4 lines
                }
            }
            while (end != NULL);
            strcat(buffer, pch[p]); // move the left-over of unzipBuffer to buffer
        }
        else {
            break; // when no unzippedBytes, exit the loop
        }
    }
    gzclose(inFileZ);
    return 0;
}
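Your main problem is probably the standard C string library.
By using the strxxx() functions, you traverse the complete buffer several times for every line: first with strchr(), then with strlen(), then again with each strcat() call. Using the standard library is a good thing, but here it is simply inefficient.
Try to come up with something simpler that touches each character only once (the code only shows the principle, don't expect it to work):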
do
{
    do
    {
        *tp++ = *sp++;
    } while (sp < buffer_end && *sp != '\n');
    /* new line, do whatever it requires */
    ...
    /* reset tp to beginning of buffer */
} while (sp < buffer_end);
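To make the single-pass idea concrete, here is a minimal sketch that does not depend on zlib. The class name LineSplitter and its feed()/finish() methods are hypothetical and not from the original post. It assumes each chunk handed to feed() is whatever one gzread() call returned, and it carries an unfinished line over to the next chunk in a std::string instead of a fixed 200-byte buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper (not from the original post): splits data that arrives
// in arbitrary chunks into lines, looking at each byte exactly once.
class LineSplitter {
public:
    using Callback = std::function<void(const std::string&)>;

    explicit LineSplitter(Callback onLine) : onLine_(std::move(onLine)) {
        assert(onLine_ && "a line callback is required");
    }

    // Feed one chunk, e.g. the bytes returned by a single gzread() call.
    void feed(const char* data, std::size_t size) {
        const char* sp = data;
        const char* const end = data + size;
        while (sp < end) {
            if (*sp == '\n') {
                onLine_(line_);   // a complete line: hand it to the caller
                line_.clear();    // reset the line buffer for the next line
            } else {
                line_.push_back(*sp);
            }
            ++sp;
        }
    }

    // Call once at end of input to emit a last line that has no trailing '\n'.
    void finish() {
        if (!line_.empty()) {
            onLine_(line_);
            line_.clear();
        }
    }

private:
    Callback onLine_;
    std::string line_;  // partial line carried over between chunks
};
```

In the read loop this would be called as splitter.feed(reinterpret_cast<const char*>(unzipBuffer), unzippedBytes) after each successful gzread(), with the callback counting lines and treating every group of four as one read. Whether this actually closes the gap to the pipe version would have to be measured.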
I'm trying to get this to work, but all it does is give a segmentation fault at run time:
do {
    unzippedBytes = gzread(inFileZ, unzipBuffer, unzipBufferSize);
    if (unzippedBytes > 0) {
        while (*unzipBuffer < unzippedBytes) {
            *pch = *unzipBuffer++;
            cout << pch;
            i++;
        }
        i=0;
    }
    else break;
} while (true);
What am I doing wrong here?
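A few things in that loop look suspicious, assuming unzipBuffer and pch were changed to plain pointers (with the array declarations from the first listing it would not compile). The condition *unzipBuffer < unzippedBytes compares the value of the current byte with the byte count, not a position, so the pointer is not bounded by the chunk. cout << pch also expects a 0-terminated string, which is never written. A minimal sketch of a position-bounded copy follows; appendChunk is a hypothetical helper, and count stands in for the value returned by gzread().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): appends the first `count`
// bytes of `chunk` to `out`. `count` stands in for gzread()'s return value.
void appendChunk(const unsigned char* chunk, int count, std::string& out) {
    assert(count >= 0);
    // Bound the loop by position (i < count), not by the value of a byte.
    for (int i = 0; i < count; ++i) {
        out.push_back(static_cast<char>(chunk[i]));
    }
}
```

Because the bytes end up in a std::string, printing with cout << out is safe without adding a terminator by hand.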