SO_RCVTIMEO wakes up too early

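The Linux man page for SO_RCVTIMEO says:

Specify the receiving or sending timeouts until reporting an error... if an input or output function blocks for this period of time... [and] no data has been transferred and the timeout has been reached, then -1 is returned with errno set to EAGAIN or EWOULDBLOCK, or EINPROGRESS (for connect(2))

To me, this sounds like the I/O should wait at least SO_RCVTIMEO before returning execution to the caller. Meanwhile, over at the Open Group, they document the opposite:

Sets the timeout value that specifies the maximum amount of time an input function waits until it completes.

So which is it, minimum blocking time or maximum blocking time? The answer seems to be: yes. Here is what happens when I ask for a 0.500 s timeout on a Linux system: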

time: 0.497054 result: 0
time: 0.495352 result: 0
time: 0.504948 result: 0
time: 0.495119 result: 0
time: 0.507884 result: 0
time: 0.491892 result: 0
time: 0.500764 result: 0

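We see that the time is off, in this sample by as much as 7 to 8 ms, which is a long time to be off. The error occurs in both directions. Meanwhile, on Darwin: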

time: 0.500426 result: -1
time: 0.501144 result: -1
time: 0.500507 result: -1
time: 0.501119 result: -1
time: 0.501016 result: -1
time: 0.500540 result: -1
time: 0.500127 result: -1
time: 0.500815 result: -1
time: 0.500341 result: -1
time: 0.500871 result: -1
time: 0.500835 result: -1
time: 0.501138 result: -1
time: 0.501087 result: -1
time: 0.501153 result: -1
time: 0.501149 result: -1

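The error is much lower (~1 ms), but it is still there, and they clearly interpret 500 ms as a minimum time rather than a maximum.

Now some questions:

Is SO_RCVTIMEO supposed to be the minimum or the maximum duration for which the caller is blocked?
If it is the maximum duration, what is the minimum duration? Surely, when asked for a 500 ms timeout, the implementation is not free to choose a non-blocking read?
If it is the minimum duration, is Darwin wrong?
If I want to guarantee that I try to read for at least 500 ms, should I keep trying in a loop until 500 ms have elapsed? What is the "right way" to achieve "at least X ms" behavior?
Why is there so much variance from call to call on Linux? What is the source of the error?
Is there a better API I should be using to read from sockets?

The code I used to measure this: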

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h> /* bzero() */
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <time.h>
#include <fcntl.h>
#ifdef __MACH__
#include <mach/clock.h>
#include <mach/mach.h>
#endif
void error(const char *msg)
{
    perror(msg);
    exit(1);
}
struct timespec os_time() {
    struct timespec ts;
    #ifdef __MACH__ // OS X does not have clock_gettime, use clock_get_time
    clock_serv_t cclock;
    mach_timespec_t mts;
    host_get_clock_service(mach_host_self(), CALENDAR_CLOCK, &cclock);
    clock_get_time(cclock, &mts);
    mach_port_deallocate(mach_task_self(), cclock);
    ts.tv_sec = mts.tv_sec;
    ts.tv_nsec = mts.tv_nsec;
    #else
    clock_gettime(CLOCK_REALTIME, &ts);
    #endif
    return ts;
}
int main(int argc, char *argv[])
{
     int sockfd, newsockfd, portno;
     socklen_t clilen;
     char buffer[256];
     struct sockaddr_in serv_addr, cli_addr;
     int n;
     if (argc < 2) {
         fprintf(stderr,"ERROR, no port provided\n");
         exit(1);
     }
     sockfd = socket(AF_INET, SOCK_STREAM, 0);
     if (sockfd < 0)
        error("ERROR opening socket");
     bzero((char *) &serv_addr, sizeof(serv_addr));
     portno = atoi(argv[1]);
     serv_addr.sin_family = AF_INET;
     serv_addr.sin_addr.s_addr = INADDR_ANY;
     serv_addr.sin_port = htons(portno);
     if (bind(sockfd, (struct sockaddr *) &serv_addr,
              sizeof(serv_addr)) < 0)
              error("ERROR on binding");
     listen(sockfd,5);
     clilen = sizeof(cli_addr);
     newsockfd = accept(sockfd,
                 (struct sockaddr *) &cli_addr,
                 &clilen);
     if (newsockfd < 0)
          error("ERROR on accept");
     for (int i = 0; i < 100; i++) {
         struct timeval tv;
         tv.tv_sec = 0;
         tv.tv_usec = 500000;
         char buf[1];
         if (setsockopt(newsockfd, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv,sizeof(struct timeval)) != 0){
             error("setsockopt error");
         }
         struct timespec start = os_time();
         int result = recv(newsockfd,buf,1,0);
         struct timespec end = os_time();
         double end_time = (double)end.tv_sec + ((double)end.tv_nsec)/1.0E9;
         double start_time = (double)start.tv_sec + ((double)start.tv_nsec)/1.0E9;
         printf("time: %f result: %d\n",end_time-start_time, result);
     }
     return 0;
}

To reproduce:

clang test.c && ./a.out 5551 &
telnet localhost 5551
time: 0.497839 result: 0
time: 0.501052 result: 0
time: 0.498565 result: 0
time: 0.500741 result: 0
time: 0.500108 result: 0
time: 0.500244 result: 0
time: 0.499040 result: 0
time: 0.500212 result: 0
time: 0.500137 result: 0
time: 0.499920 result: 0
time: 0.500758 result: 0
time: 0.498068 result: 0

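To me, this sounds like the I/O should wait at least SO_RCVTIMEO before returning execution to the caller.

No. It should wait at most the timeout. If data is already present, or arrives before the timeout, the call returns at that point without waiting for the timeout to expire.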

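Meanwhile, over at the Open Group, they document the opposite:

Sets the timeout value that specifies the maximum amount of time an input function waits until it completes.

So which is it, minimum blocking time or maximum blocking time?

Maximum blocking time.

they clearly interpret 500 ms as a minimum time rather than a maximum.

Here you are raising and testing two different things: the resolution of the timer, and how quickly the operating system reschedules the thread after the timeout. Neither is specified.

Is SO_RCVTIMEO supposed to be the minimum or the maximum duration for which the caller is blocked?

Maximum, within the resolution of the operating system's timer, and possibly subject to further scheduling delays.

If it is the maximum duration, what is the minimum duration?

Zero.

Surely, when asked for a 500 ms timeout, the implementation is not free to choose a non-blocking read?

Of course it is. If data is already present in the socket receive buffer, recv() transfers that data and returns immediately. Why wait?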

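If it is the minimum duration, is Darwin wrong?

No, it just has a different resolution and rescheduling latency.

If I want to guarantee that I try to read for at least 500 ms, should I keep trying in a loop until 500 ms have elapsed? What is the "right way" to achieve "at least X ms" behavior?

You would have to do that with your own timer, but I don't see the point. If the data is already there, or arrives sooner, why on earth would you want to delay?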

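Why is there so much variance from call to call on Linux? What is the source of the error?

Timer jitter; rescheduling jitter. It's not a real-time operating system.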

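Is there a better API I should be using to read from sockets?

Define "better". Your expectations seem strange. This API has been good enough for everybody else for over 30 years.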
