Fixing the Install Hue bootstrap error on AWS EMR with Boto



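With the release of AMI 3.3.0, AWS supports Hue as an installable "application" in EMR, like Hive/Pig. Creating a cluster with Hue through the EMR web UI works fine for me, but when I add the Hue install bootstrap action through Boto, I get a non-deterministic error (the cluster crashes intermittently). I have tested the same configuration 4 times, and it crashed in 50% of the runs.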

In Boto, I add one extra bootstrap action, the same one that is added automatically when a cluster is created from the web UI with Hue enabled:

BootstrapAction('Install Hue', 's3://elasticmapreduce/libs/hue/install-hue', [])

The cluster then terminates with:

Terminated with errors: On the master instance (i-c6b7582a), 
bootstrap action 2 returned a non-zero return code

In the bootstrap action log:

Existing lock /var/run/yum.pid: another copy is running as pid 2007.
Another app is currently holding the yum lock; waiting for it to exit...
  The other application is: yum
    Memory : 22 M RSS (305 MB VSZ)
    Started: Tue Nov 11 21:00:12 2014 - 00:19 ago
    State : Sleeping, pid: 2007
Another app is currently holding the yum lock; waiting for it to exit...

This message repeats many times, and the log ends with a large stack trace:

Trying other mirror.
http://packages.ap-southeast-2.amazonaws.com/2014.09/main/20140901f63e/x86_64/repodata/repomd.xml?instance_id=i-c6b7582a&region=us-east-1: [Errno 12] Timeout on http://packages.ap-southeast-2.amazonaws.com/2014.09/main/20140901f63e/x86_64/repodata/repomd.xml?instance_id=i-c6b7582a&region=us-east-1: (28, 'Connection timed out after 10000 milliseconds')
Trying other mirror.
Traceback (most recent call last):
  File "/usr/bin/yum", line 29, in <module>
    yummain.user_main(sys.argv[1:], exit_code=True)
  File "/usr/share/yum-cli/yummain.py", line 355, in user_main
    errcode = main(args)
  File "/usr/share/yum-cli/yummain.py", line 174, in main
    result, resultmsgs = base.doCommands()
  File "/usr/share/yum-cli/cli.py", line 572, in doCommands
    return self.yum_cli_commands[self.basecmd].doCommand(self, self.basecmd, self.extcmds)
  File "/usr/share/yum-cli/yumcommands.py", line 432, in doCommand
    return base.installPkgs(extcmds, basecmd=basecmd)
  File "/usr/share/yum-cli/cli.py", line 968, in installPkgs
    txmbrs = self.install(pattern=arg)
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 4721, in install
    mypkgs = self.pkgSack.returnPackages(patterns=pats,
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 1069, in <lambda>
    pkgSack = property(fget=lambda self: self._getSacks(),
  File "/usr/lib/python2.6/site-packages/yum/__init__.py", line 774, in _getSacks
    self.repos.populateSack(which=repos)
  File "/usr/lib/python2.6/site-packages/yum/repos.py", line 383, in populateSack
    sack.populate(repo, mdtype, callback, cacheonly)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 250, in populate
    if self._check_db_version(repo, mydbtype):
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 342, in _check_db_version
    return repo._check_db_version(mdtype)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1520, in _check_db_version
    repoXML = self.repoXML
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1706, in <lambda>
    repoXML = property(fget=lambda self: self._getRepoXML(),
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1702, in _getRepoXML
    self._loadRepoXML(text=self.ui_id)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1693, in _loadRepoXML
    return self._groupLoadRepoXML(text, self._mdpolicy2mdtypes())
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1667, in _groupLoadRepoXML
    if self._commonLoadRepoXML(text):
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1495, in _commonLoadRepoXML
    self._revertOldRepoXML()
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1345, in _revertOldRepoXML
    os.rename(old_data['old_local'], old_data['local'])
OSError: [Errno 2] No such file or directory
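
When comparing failed and successful runs, the symptoms in this log (the repeated yum lock wait, the mirror timeout, and the final OSError) can be picked out mechanically. The following is a minimal sketch using only the Python standard library. The function name `classify_bootstrap_log` and the label strings are hypothetical and are not part of Boto or EMR; the patterns are taken only from the log lines quoted above, so other failures may not match.

```python
import re

# Patterns copied from the log excerpts quoted in this post. Other failure
# modes may look different, so treat this list as a starting point only.
SIGNATURES = [
    ("yum_lock_contention", re.compile(r"holding the yum lock")),
    ("mirror_timeout", re.compile(r"\[Errno 12\] Timeout on ")),
    ("yum_crash", re.compile(r"^OSError: \[Errno 2\]", re.MULTILINE)),
]


def classify_bootstrap_log(text):
    """Return the labels of all known signatures found in a bootstrap log."""
    return [label for label, pattern in SIGNATURES if pattern.search(text)]
```

Calling `classify_bootstrap_log` on the text of a downloaded bootstrap action log returns a list of labels; an empty list means none of the three signatures was found.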

In contrast, the bootstrap action log from a successful run contains a single line:

Warning: RPMDB altered outside of yum.

An example of installing and running Hue on EMR AMI 3.3

import boto.emr
from boto.emr.emrobject import InstanceGroup
from boto.emr.bootstrap_action import BootstrapAction
from boto.emr.step import ScriptRunnerStep


def get_bootstrap_actions():
    # Bootstrap action that installs Hue on the cluster.
    install_hue_action = BootstrapAction("Install Hue",
                                "s3n://us-east-1.elasticmapreduce/libs/hue/install-hue",
                                bootstrap_action_args=None)
    return [install_hue_action]


def get_startup_steps():
    # Step that runs the run-hue script once the cluster is up.
    runHueStep = ScriptRunnerStep(name="Run Hue",
                                        step_args=["s3n://us-east-1.elasticmapreduce/libs/hue/run-hue"])
    return [runHueStep]


def get_instance_groups():
    # This is just an example. An actual implementation will have core and task
    # instance groups as well. Choose the instance type, count, and bid price
    # carefully, as it can get expensive quickly.
    spotInstanceGroup = InstanceGroup()
    spotInstanceGroup.name = "Spot Instance Group Master"
    spotInstanceGroup.bidprice = "0.20"
    spotInstanceGroup.num_instances = 1
    spotInstanceGroup.market = "SPOT"
    spotInstanceGroup.type = "c3.2xlarge"
    spotInstanceGroup.role = "MASTER"
    # run_jobflow expects a list of instance groups.
    return [spotInstanceGroup]


# The helper functions above must be defined before this call.
conn = boto.emr.EmrConnection()
jobid = conn.run_jobflow(name="Hue Example", ami_version="3.3.0",
                         log_uri="s3n://your-log-path-here",
                         instance_groups=get_instance_groups(),
                         bootstrap_actions=get_bootstrap_actions(),
                         ec2_keyname="your-ec2-key-name",
                         steps=get_startup_steps())
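
Because the failure described in the question is intermittent, it can help to poll the job flow after `run_jobflow` returns and see whether it got past bootstrapping. Below is a rough sketch, assuming Boto 2's `EmrConnection.describe_jobflow` and the `state` and `laststatechangereason` attributes it sets on the returned job flow. The helper name `wait_until_settled`, its parameters, and the exact set of state names are illustrative assumptions, not part of Boto.

```python
import time

# Job flow states after which polling can stop. The names follow the job flow
# states reported by the EMR API that Boto 2 wraps; the exact set is an
# assumption, so adjust it if your cluster reports other states.
SETTLED_STATES = {"RUNNING", "WAITING", "COMPLETED", "TERMINATED", "FAILED"}


def wait_until_settled(conn, jobid, poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Poll a job flow until it leaves the startup phase; return (state, reason).

    `conn` is expected to be a boto.emr.EmrConnection, or any object with a
    compatible describe_jobflow(jobid) method.
    """
    for _ in range(max_polls):
        jobflow = conn.describe_jobflow(jobid)
        state = jobflow.state
        if state in SETTLED_STATES:
            # When present, laststatechangereason describes why the state
            # last changed (for example, a bootstrap action failure).
            return state, getattr(jobflow, "laststatechangereason", None)
        sleep(poll_seconds)
    raise RuntimeError("job flow %s did not settle after %d polls" % (jobid, max_polls))
```

With the `conn` and `jobid` from the example, `state, reason = wait_until_settled(conn, jobid)` blocks until the cluster is either usable or has terminated, which makes it easier to tell a failed bootstrap from a slow one.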
