Error running movie recommendations with Apache Mahout on HDInsight



I ran the following code, but received an error:

# The HDInsight cluster name.
$clusterName = "my-cluster-name"
Use-AzureHDInsightCluster $clusterName
# NOTE: The version number portion of the file path
# may change in future versions of HDInsight.
# So dynamically grab it using Hive.
$mahoutPath = Invoke-Hive -Query '!${env:COMSPEC} /c dir /b /s ${env:MAHOUT_HOME}\examples\target\*-job.jar' | where {$_.startswith("C:\apps\dist")}
$mahoutPath = $mahoutPath -replace "\\", "/"
$jarFile = "file:///$mahoutPath"
#
# If you are using an earlier version of HDInsight,
# set $jarFile to the jar file you
# uploaded.
# For example,
# $jarFile = "wasb:///example/jars/mahout-core-0.9-job.jar"
# The arguments for this job
# * input - the path to the data uploaded to HDInsight
# * output - the path to store output data
# * tempDir - the directory for temp files
$jobArguments = "-s", "SIMILARITY_COOCCURRENCE",
                "--input", "wasb:///u.data",
                "--output", "wasb:///example/out",
                "--tempDir", "wasb:///temp/mahout"
# Create the job definition
$jobDefinition = New-AzureHDInsightMapReduceJobDefinition `
  -JarFile $jarFile `
  -ClassName "org.apache.mahout.cf.taste.hadoop.item.RecommenderJob" `
  -Arguments $jobArguments
# Start the job
$job = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $jobDefinition
# Wait on the job to complete
Write-Host "Wait for the job to complete ..." -ForegroundColor Green
Wait-AzureHDInsightJob -Job $job
# Write out any error information
Write-Host "STDERR"
Get-AzureHDInsightJobOutput -Cluster $clusterName -JobId $job.JobId -StandardError

I uploaded the u.data file with Azure Storage Explorer to the root of the container that holds the HDInsight files.

The error occurs at this line:

PS C:\> $job = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $jobDefinition

Error:

Start-AzureHDInsightJob : Request failed with code:InternalServerError
Content:{"error":null}
At line:1 char:8
+ $job = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $jobDefiniti ...
+        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Start-AzureHDInsightJob], HttpLayerException
    + FullyQualifiedErrorId : Microsoft.WindowsAzure.Management.HDInsight.Framework.Core.Library.WebRequest.HttpLayerException,Microsoft.WindowsAzure.Management.HDInsight.Cmdlet.PSCmdlets.StartAzureHDInsightJobCmdlet

Any help would be sincerely appreciated.

Thanks

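This looks like a recent change to Hive/Templeton on the HDInsight cluster: it now returns a CRLF at the end of the file path. I changed the script to the following to fix it: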

$mahoutPath = Invoke-Hive -Query '!${env:COMSPEC} /c dir /b /s ${env:MAHOUT_HOME}\examples\target\*-job.jar' | where {$_.startswith("C:\apps\dist")}
$noCRLF = $mahoutPath -replace "`r`n", ""
$cleanedPath = $noCRLF -replace "\\", "/"
$jarFile = "file:///$cleanedPath"
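
To check the cleanup without a cluster, here is a minimal sketch that runs the same steps on a made-up string. The value of $sample is hypothetical and only stands in for what Invoke-Hive returns; the real path and version number will differ. It uses Trim() rather than the -replace for CRLF shown above, on the assumption that only leading or trailing whitespace needs to go:

```powershell
# Hypothetical stand-in for the Invoke-Hive output, including the
# trailing CRLF; the real path and version will differ on a cluster.
$sample = "C:\apps\dist\mahout-x.y\examples\target\mahout-examples-x.y-job.jar`r`n"

# Trim() removes leading and trailing whitespace, including CR and LF.
$trimmed = $sample.Trim()

# -replace takes a regular expression, so a literal backslash is "\\".
$cleanedPath = $trimmed -replace "\\", "/"
$jarFile = "file:///$cleanedPath"

# Print in brackets so any stray whitespace would be visible.
Write-Host "[$jarFile]"
```

If the closing bracket prints on the same line as the path, the jar URI no longer carries a line break.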
