Sqoop作业导入数据从SQL服务器卡住在地图0%



我有一个运行CDH5.0.2的伪分布式hadoop集群。我正在运行一个sqoop导入命令:

sudo -u sqoop sqoop import --connect "jdbc:sqlserver://x.x.x.x:1433;databaseName=yyyyy" --username x --password y --table table_name

我只是导入一个非常小的表,有12行和2列的测试。工作已经进行了半个小时了。在我的资源管理器上,映射器任务的状态被列为NEW,它们的状态被列为SCHEDULED。我不认为它会跑起来!

当我在纱线上列出作业时使用:

yarn application -list

我:

14/07/01 15:55:06 INFO client.RMProxy: Connecting to ResourceManager at host/x.x.x.x:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1
Application-Id      Application-Name        Application-Type          User           Queue                   State             Final-State             Progress                        Tracking-URL
application_1404252440376_0001      ActivityType.jar               MAPREDUCE         sqoop                   root.sqoop        RUNNING               UNDEFINED                   5%                  http://host:42583

这是我看到的Application Master日志。我怎么解决这个问题?

2014-07-01 15:14:12,880 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2014-07-01 15:14:12,885 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2014-07-01 15:14:12,888 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at host/x.x.x.x:8030
2014-07-01 15:14:12,973 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: 1024
2014-07-01 15:14:12,973 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.sqoop
2014-07-01 15:14:12,977 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2014-07-01 15:14:12,979 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 500
2014-07-01 15:14:12,985 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1404252440376_0001Job Transitioned from INITED to SETUP
2014-07-01 15:14:12,987 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2014-07-01 15:14:12,997 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1404252440376_0001Job Transitioned from SETUP to RUNNING
2014-07-01 15:14:13,018 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1404252440376_0001_m_000000 Task Transitioned from NEW to SCHEDULED
2014-07-01 15:14:13,019 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1404252440376_0001_m_000001 Task Transitioned from NEW to SCHEDULED
2014-07-01 15:14:13,019 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1404252440376_0001_m_000002 Task Transitioned from NEW to SCHEDULED
2014-07-01 15:14:13,019 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1404252440376_0001_m_000003 Task Transitioned from NEW to SCHEDULED
2014-07-01 15:14:13,021 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1404252440376_0001_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-07-01 15:14:13,021 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1404252440376_0001_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-07-01 15:14:13,021 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1404252440376_0001_m_000002_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-07-01 15:14:13,021 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1404252440376_0001_m_000003_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-07-01 15:14:13,022 INFO [Thread-51] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceReqt:1024
2014-07-01 15:14:13,066 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1404252440376_0001, File: hdfs://host:8020/user/sqoop/.staging/job_1404252440376_0001/job_1404252440376_0001_1.jhist
2014-07-01 15:14:13,976 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2014-07-01 15:14:14,054 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1404252440376_0001: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:0, vCores:0> knownNMs=1

发生在我身上的主要问题是没有足够的资源来执行sqoop

当我执行Sqoop以及其他YARN应用程序时,它通常没有足够的资源,因此映射任务总是停留在0%。我找到了驱动程序日志,日志的最后几行类似于:

2017-02-01 14:54:48,638 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,638 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,638 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000002_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,638 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000003_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,639 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000004_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,639 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000005_0 TaskAttempt Transitioned from NEW to UNASSIGNED

在此之后,没有记录任何内容,sqoop仍然为0%。

YARN没有其他运行时,Sqoop执行没有任何问题。

这不是最有帮助的答案,但我最终重新安装了整个东西。

相关内容

  • 没有找到相关文章

最新更新