pthread_cond_wait keeps crashing with a cryptic "The futex facility returned an unexpected error code" message



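I have a program with only 2 threads. One is the main thread; the second serves as a "music processor". The music processor initially sleeps on a condition variable by calling pthread_cond_wait. The main thread places the data to be processed by the other thread in shared variables and wakes that thread with pthread_cond_signal.

I built this program on an Ubuntu 16 system, where it ran fine. I then went on to build a GNU system with the latest Linux kernel and glibc 2.17, following the LFS 8.2 instructions, and that is the system I need to run the program on.

Running it on this system, the music processing thread always fails with the message "The futex facility returned an unexpected error code" when it calls pthread_cond_wait. What could cause this? I have looked everywhere and cannot find any explanation.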
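For reference, the handoff described above (one thread sleeping on a condition variable, the other publishing data and signalling) is usually written with a predicate that is checked in a loop. The following is only a minimal sketch of that pattern: the `Handoff` struct, `worker` and `run_handoff` are hypothetical names for illustration and are not taken from the program in the question. Note that the waiter releases the mutex with pthread_mutex_unlock once it is done.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical handoff structure; none of these names come from the question. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    bool            has_work;   /* predicate, protected by lock */
    int             payload;    /* data handed to the worker */
    int             consumed;   /* what the worker last read */
} Handoff;

static void *worker(void *arg)
{
    Handoff *h = arg;

    pthread_mutex_lock(&h->lock);
    while (!h->has_work)                    /* loop guards against spurious wakeups */
        pthread_cond_wait(&h->ready, &h->lock);
    h->consumed = h->payload;
    h->has_work = false;
    pthread_mutex_unlock(&h->lock);         /* release the mutex after the wait */
    return NULL;
}

/* Hands one value to a freshly created worker and returns what the worker read. */
static int run_handoff(int value)
{
    Handoff h = { .has_work = false, .payload = 0, .consumed = -1 };
    pthread_t tid;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.ready, NULL);
    int rc = pthread_create(&tid, NULL, worker, &h);
    assert(rc == 0);
    (void)rc;

    /* Publish the data and set the predicate under the lock, then signal. */
    pthread_mutex_lock(&h.lock);
    h.payload = value;
    h.has_work = true;
    pthread_cond_signal(&h.ready);
    pthread_mutex_unlock(&h.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&h.ready);
    pthread_mutex_destroy(&h.lock);
    return h.consumed;
}
```

Calling `run_handoff(42)` should return 42 whether the signal arrives before or after the worker starts waiting, because the worker tests `has_work` before it sleeps.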

EDIT

Here is the simplified code:

struct _audio_processor {
    /*  Other variables
        .
        .
        .
    */
    pthread_mutex_t     frameAdvanceLock,
                        queuedEffectsLock;
    pthread_cond_t      frameAdvanceBarrier;
} __attribute__((packed));
typedef struct _audio_processor * AudioProcessor;
static void * _playbackThreadBody ( register void * p )
{
    register AudioProcessor processor = NULL;
    processor = (AudioProcessor) p;
    while ( processor->audioThreadRunning ) {
        pthread_mutex_lock ( & processor->frameAdvanceLock );
        /* This is where it INVARIABLY fails. */
        pthread_cond_wait ( & processor->frameAdvanceBarrier, & processor->frameAdvanceLock );
        pthread_mutex_lock ( & processor->frameAdvanceLock );
        /*  Rest of the thread (stuff happens here that takes time).
            .
            .
            .
        */
    }
    return NULL;
};
AudioProcessor CreateAudioProcessor ( void )
{
    register AudioProcessor result = NULL;
    register int        status = -1;
    pthread_attr_t      attributes;
    result = & _mainAudioProcessor;
    /*
        Other variables initialized here.
        .
        .
        .
    */
    pthread_cond_init ( & result->frameAdvanceBarrier, NULL);
    pthread_mutex_init ( & result->frameAdvanceLock, NULL );
    pthread_mutex_init ( & result->queuedEffectsLock, NULL );
    pthread_attr_init ( & attributes );
    pthread_attr_setstacksize ( & attributes, 8192 );
    status = pthread_create ( & _audioProcessorThread, & attributes, _playbackThreadBody, result );
    sched_yield ();
    return result;
};
void AudioProcessorPlaybackMusic ( register const AudioProcessor processor )
{
    register int    status = -1;
    pthread_cond_signal ( & processor->frameAdvanceBarrier );
};

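Could this happen if the "pthread_cond_t" memory is not aligned on a DWORD boundary, e.g. when it is allocated on the heap? Code demonstrating this is provided below... Note: it seems the "pthread_mutex_t" also has to be on the heap for the problem to be observed?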

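It seems that "malloc()" guarantees alignment: Why does malloc() care about boundary alignments? (I sometimes got the error when using my own debug heap, which does not provide this guarantee.)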
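To make the malloc() point concrete, here is a small sketch that measures how far an address is from the alignment the compiler reports for pthread_cond_t. The helper names `cond_misalignment` and `misalignment_of_malloc_plus` are made up for this illustration. As far as I know, the Linux futex call works on 32-bit words that must be 4-byte aligned, which would fit the observation that offsets of 4 and 8 bytes still work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: number of bytes by which p misses the alignment
   that the compiler reports for pthread_cond_t. Zero means aligned. */
static size_t cond_misalignment(const void *p)
{
    return (size_t)((uintptr_t)p % alignof(pthread_cond_t));
}

/* Allocates room for a condition variable 'offset' bytes past a malloc()
   result and reports how misaligned that address would be. The condition
   variable itself is never created or used here. */
static size_t misalignment_of_malloc_plus(size_t offset)
{
    char *raw = malloc(offset + sizeof(pthread_cond_t));
    assert(raw != NULL);
    size_t result = cond_misalignment(raw + offset);
    free(raw);
    return result;
}
```

With a standard malloc(), `misalignment_of_malloc_plus(0)` should be 0, while an offset of 1 gives 1. A custom debug heap could be checked the same way by passing the addresses it returns to `cond_misalignment`.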

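I am not sure whether there is any alignment guarantee for locals/globals, although perhaps it can be specified as part of the struct definition? Structure alignment in GCC (should alignment be specified in typedef?) Certainly the example (https://linux.die.net/man/3/pthread_cond_init) has these as part of a larger structure, although perhaps alignment is again guaranteed there (modulo struct packing overrides).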
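On the "struct packing overrides" point: the struct in the question is declared with `__attribute__((packed))`, which removes padding, so a member that follows odd-sized fields can end up at an odd offset. Whether that actually happens there cannot be told from the elided fields, so the sketch below uses two hypothetical layouts (`PackedState` and `NaturalState`, not from the question) just to show the effect on a pthread_cond_t member.

```c
#include <assert.h>
#include <pthread.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical layout: packing removes the padding after 'flag'. */
struct PackedState {
    char           flag;
    pthread_cond_t cond;
} __attribute__((packed));

/* Same members, default layout: the compiler pads after 'flag'. */
struct NaturalState {
    char           flag;
    pthread_cond_t cond;
};

/* In the packed layout the condition variable starts right after the char. */
static_assert(offsetof(struct PackedState, cond) == 1,
              "packed layout places cond at offset 1");

static size_t packed_cond_offset(void)
{
    return offsetof(struct PackedState, cond);
}

static size_t natural_cond_offset(void)
{
    return offsetof(struct NaturalState, cond);
}
```

`packed_cond_offset()` is 1, while `natural_cond_offset()` is a multiple of `alignof(pthread_cond_t)`. Dropping the packed attribute, or moving the pthread members to the front of the struct, would be two ways to try this out on the real code.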

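--> Perhaps these structures are meant to be dynamically allocated? If so, the documentation should state that, or at least state the alignment requirement?

--> If the required alignment is not met, perhaps "pthread_cond_init()" (or the subsequent "wait" function) should raise an error early, rather than just dumping a cryptic message to the console and killing the process? (That said, the error only occurs when both structures are on the heap.)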
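Until the library does such a check itself, an application could do it in its own wrapper. This is only a sketch of the idea: `checked_cond_init` is a hypothetical helper, not part of pthreads, and it assumes that the alignment the compiler reports for pthread_cond_t is the one that matters.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumption: the type is at least as strictly aligned as a 32-bit futex word. */
static_assert(alignof(pthread_cond_t) >= 4,
              "expected pthread_cond_t to need at least 4-byte alignment");

/* Hypothetical wrapper: refuses a NULL or misaligned condition variable with
   EINVAL instead of letting a later wait abort the process. */
static int checked_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr)
{
    if (cond == NULL || (uintptr_t)cond % alignof(pthread_cond_t) != 0)
        return EINVAL;
    return pthread_cond_init(cond, attr);
}
```

A caller would use `checked_cond_init(&cv, NULL)` in place of `pthread_cond_init(&cv, NULL)` and treat a non-zero return as a setup error.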

Code demonstrating the error/success with different alignments:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

// This has to be aligned on a DWORD boundary, tested on Kubuntu 20.04:
int const alignmentRequirement = sizeof(int);
typedef long TPointerSize;  // (Assuming 64bit addresses.)
// Use malloc with its logic, however can apply an n-byte offset to the struct used below:
#define ALLOC_WITH_OFFSET(OFFSET)        ((pthread_cond_t*)((OFFSET) + (char*)malloc((OFFSET) + sizeof(pthread_cond_t))))

int main(void)
{
    // Uncomment only one of these lines to observe the behaviour:
    pthread_cond_t* condition_var = (pthread_cond_t*)malloc(sizeof(pthread_cond_t));   //-- (A) Works:  (I.e. waits forever.)  malloc() returns 'aligned' values.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(0);    //-- (B) Works:  Same as malloc, above.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(1);    //-- (C) FAILS:  Cryptic error message, process killed.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(2);    //-- (D) FAILS:  As above.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(3);    //-- (E) FAILS:  As above.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(4);    //-- (F) Works!! (I.e. waits forever.)  A whole number of "DWORD"s past the normal malloc() alignment.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(5);    //-- (G) FAILS:  As above.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(6);    //-- (H) FAILS:  As above.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(7);    //-- (I) FAILS:  As above.
    //    pthread_cond_t* condition_var = ALLOC_WITH_OFFSET(8);    //-- (J) Works!! (I.e. waits forever.)  A whole number of "DWORD"s past the normal malloc() alignment.
    printf("Created 'pthread_cond_t', address = 0x%lx,  mis-alignment = %ld bytes\n", (TPointerSize)condition_var, ((TPointerSize)condition_var) % alignmentRequirement);  // (Assuming 64bit addresses.)
    /*int res =*/ pthread_cond_init(condition_var, NULL);                  //-- Initialise before use.  (Check return value is zero.)
    // Use that to wait on a mutex...
    //    pthread_mutex_t mutex;                                           //-- Note that with the mutex as a local, none of the above cause the issue...
    pthread_mutex_t* mutex = (pthread_mutex_t*)malloc(sizeof(pthread_mutex_t));
    pthread_mutexattr_t    attr;
    /*int res =*/ pthread_mutexattr_init(&attr);                           //-- Replaces zeroing the attribute object with memset.  (Check return value is zero.)
    /*int res =*/ pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);   //-- (Check return value is zero.)
    /*int res =*/ pthread_mutex_init(mutex, &attr);                        //-- Initialise before locking.  (Check return value is zero.)
    printf("Waiting forever, this will kill the process when the error happens (or will wait forever if is ok - then you will have to kill the process):\n");
    /*int res =*/ pthread_mutex_lock(/*&*/mutex);                          //-- Grab the mutex first, always works.    (Check return value is zero.)
    // Error message is   "The futex facility returned an unexpected error code.":
    /*int res =*/ pthread_cond_wait(condition_var, /*&*/mutex);            //-- (Check return value is zero.)
    // Not showing cleanup/free of memory...
    return 0;
}
