C - Intercepting the Linux pthread_create function leads to JVM/SSH crashes



I am trying to intercept pthread_create on Ubuntu 14.04. The code looks like this:

#include <pthread.h>
#include <dlfcn.h>
#include <stddef.h>

struct thread_param {
    void *args;
    void *(*start_routine)(void *);
};

typedef int (*P_CREATE)(pthread_t *thread, const pthread_attr_t *attr,
                        void *(*start_routine)(void *), void *arg);

void *intermedia(void *arg) {
    struct thread_param *temp = (struct thread_param *)arg;
    // do some other things
    return temp->start_routine(temp->args);
}

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg) {
    static void *handle = NULL;
    static P_CREATE old_create = NULL;
    if (!handle) {
        handle = dlopen("libpthread.so.0", RTLD_LAZY);
        old_create = (P_CREATE)dlsym(handle, "pthread_create");
    }
    struct thread_param temp;
    temp.args = arg;
    temp.start_routine = start_routine;
    int result = old_create(thread, attr, intermedia, (void *)&temp);
//  int result = old_create(thread, attr, start_routine, arg);
    return result;
}

It works fine with my own pthread_create test cases (written in C). But when I use it with Hadoop on the JVM, it gives me an error report like this:

Starting namenodes on [ubuntu]
ubuntu: starting namenode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-namenode-ubuntu.out
ubuntu: starting datanode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-datanode-ubuntu.out
ubuntu: /home/yangyong/work/hadooptrace/hadoop-2.6.5/sbin/hadoop-daemon.sh: line 131:  7545 Aborted                 (core dumped) nohup nice -n 
$HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null
Starting secondary namenodes [0.0.0.0
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=7585, tid=140445258151680
#
# JRE version: OpenJDK Runtime Environment (7.0_121) (build 1.7.0_121-b00)
# Java VM: OpenJDK 64-Bit Server VM (24.121-b00 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 2.6.8
# Distribution: Ubuntu 14.04 LTS, package 7u121-2.6.8-1ubuntu0.14.04.1
# Problematic frame:
# C  0x0000000000000000
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/yangyong/work/hadooptrace/hadoop-2.6.5/hs_err_pid7585.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#]
A: ssh: Could not resolve hostname a: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
fatal: ssh: Could not resolve hostname fatal: Name or service not known
been: ssh: Could not resolve hostname been: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
#: ssh: Could not resolve hostname #: Name or service not known
^COpenJDK: ssh: Could not resolve hostname openjdk: Name or service not known
detected: ssh: Could not resolve hostname detected: Name or service not known
version:: ssh: Could not resolve hostname version:: Name or service not known
JRE: ssh: Could not resolve hostname jre: Name or service not known

Is there something wrong with my code? Or is it some protection mechanism of the JVM or SSH? Thanks.

This code leaves the child thread with an invalid arg value:

    struct thread_param temp;
    temp.args = arg;
    temp.start_routine = start_routine;
    int result = old_create(thread, attr, intermedia, (void *)&temp);
//  int result = old_create(thread, attr, start_routine, arg);
    return result;  // <-- temp and its contents are now invalid

temp is no longer guaranteed to exist by the time the new thread reads it: the parent's call to your pthread_create() may already have returned, invalidating everything the struct contained.
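One way to fix this (a sketch reusing the names from the question, not a tested drop-in) is to heap-allocate one parameter block per thread so it stays valid after the wrapper returns, and free it in intermedia once the values have been copied out:

    #include <pthread.h>
    #include <dlfcn.h>
    #include <stdlib.h>
    #include <errno.h>

    struct thread_param {
        void *args;
        void *(*start_routine)(void *);
    };

    typedef int (*P_CREATE)(pthread_t *, const pthread_attr_t *,
                            void *(*)(void *), void *);

    void *intermedia(void *arg) {
        struct thread_param *temp = (struct thread_param *)arg;
        void *(*routine)(void *) = temp->start_routine;  /* copy out first */
        void *args = temp->args;
        free(temp);             /* safe now: nothing reads temp after this */
        // do some other things
        return routine(args);
    }

    int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                       void *(*start_routine)(void *), void *arg) {
        static void *handle = NULL;
        static P_CREATE old_create = NULL;
        if (!handle) {          /* note: this lazy init is itself racy */
            handle = dlopen("libpthread.so.0", RTLD_LAZY);
            old_create = (P_CREATE)dlsym(handle, "pthread_create");
        }
        /* one heap block per thread: survives until intermedia frees it */
        struct thread_param *temp = malloc(sizeof *temp);
        if (temp == NULL)
            return EAGAIN;      /* resource failure, like pthread_create */
        temp->args = arg;
        temp->start_routine = start_routine;
        int result = old_create(thread, attr, intermedia, temp);
        if (result != 0)
            free(temp);         /* thread never started; avoid a leak */
        return result;
    }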

There are a bunch of problems in your code. I don't know which of them (if any) causes the problem you are seeing, but you should definitely fix them all.

First, enable core dumps (usually with ulimit -c unlimited), load the core into GDB, and see where the backtrace points.

Don't dlopen libpthread. Instead, you should be able to use dlsym(RTLD_NEXT, "pthread_create").
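For example (a sketch; on glibc, RTLD_NEXT requires _GNU_SOURCE to be defined before including dlfcn.h):

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <pthread.h>

    typedef int (*P_CREATE)(pthread_t *, const pthread_attr_t *,
                            void *(*)(void *), void *);

    int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                       void *(*start_routine)(void *), void *arg) {
        /* RTLD_NEXT resolves the symbol in the next object in the search
           order, i.e. the real pthread_create in libpthread, without
           having to dlopen the library yourself. */
        static P_CREATE old_create = NULL;
        if (old_create == NULL)
            old_create = (P_CREATE)dlsym(RTLD_NEXT, "pthread_create");
        return old_create(thread, attr, start_routine, arg);
    }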

The most likely source of trouble is that you hand the original arguments to the new thread through a variable on your wrapper's stack. If someone (such as the Java runtime) creates many threads in quick succession, those values get overwritten and mixed up before the new threads read them.
