I am trying to run the following code:
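import os
import gensim  # gensim.models.wrappers exists only in gensim 3.x (removed in 4.0)
os.environ.update({'MALLET_HOME': r'c:/mallet-2.0.8/'})
# Raw string, so the "\b" in "\bin" is not read as a backspace escape
mallet_path = r'C:\mallet-2.0.8\bin\mallet'
ldamallet = gensim.models.wrappers.LdaMallet(mallet_path, corpus=bow, num_topics=20, id2word=dictionary)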
However, I keep getting this error:
CalledProcessError: Command 'C:\mallet-2.0.8\bin\mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input C:\Users\Joshua\AppData\Local\Temp\98094d_corpus.txt --output C:\Users\Joshua\AppData\Local\Temp\98094d_corpus.mallet' returned non-zero exit status 1.
I was previously able to run this code on my laptop with the same directory layout, but it fails on my PC (where I am currently running Python).
Can anyone tell me what I am doing wrong?
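The CalledProcessError above only reports the exit status; the message that explains the failure (for example, that java could not be found) is printed by the command itself. A minimal sketch for capturing that output from Python follows. The helper name run_and_capture is hypothetical, and the demo command is a stand-in that merely exits with status 1; substitute the command from the error message to see what MALLET actually prints.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (exit_code, combined_output)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so nothing is lost
        text=True,                 # decode the output to str
    )
    return result.returncode, result.stdout


# Stand-in for the failing command: prints a message and exits with status 1.
demo_cmd = [sys.executable, "-c", "import sys; print('something went wrong'); sys.exit(1)"]
exit_code, output = run_and_capture(demo_cmd)
print("exit status:", exit_code)
print("output:", output.strip())
```

Running the failing command by hand in a Command Prompt window shows the same information without any Python involved.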
I ran into a similar error. Double-check that Java is installed and that the file path actually invokes java. I had to edit the mallet.bat file found in the MALLET folder C:\mallet\mallet-2.0.8\bin like this:
@echo off
rem This batch file serves as a wrapper for several
rem MALLET command line tools.
if not "%MALLET_HOME%" == "" goto gotMalletHome
echo MALLET requires an environment variable MALLET_HOME.
goto :eof
:gotMalletHome
set MALLET_CLASSPATH=C:\mallet\mallet-2.0.8\class;C:\mallet\mallet-2.0.8\lib\mallet-deps.jar
set MALLET_MEMORY=1G
set MALLET_ENCODING=UTF-8
set CMD=%1
shift
set CLASS=
if "%CMD%"=="import-dir" set CLASS=cc.mallet.classify.tui.Text2Vectors
if "%CMD%"=="import-file" set CLASS=cc.mallet.classify.tui.Csv2Vectors
if "%CMD%"=="import-svmlight" set CLASS=cc.mallet.classify.tui.SvmLight2Vectors
if "%CMD%"=="info" set CLASS=cc.mallet.classify.tui.Vectors2Info
if "%CMD%"=="train-classifier" set CLASS=cc.mallet.classify.tui.Vectors2Classify
if "%CMD%"=="classify-dir" set CLASS=cc.mallet.classify.tui.Text2Classify
if "%CMD%"=="classify-file" set CLASS=cc.mallet.classify.tui.Csv2Classify
if "%CMD%"=="classify-svmlight" set CLASS=cc.mallet.classify.tui.SvmLight2Classify
if "%CMD%"=="train-topics" set CLASS=cc.mallet.topics.tui.TopicTrainer
if "%CMD%"=="infer-topics" set CLASS=cc.mallet.topics.tui.InferTopics
if "%CMD%"=="evaluate-topics" set CLASS=cc.mallet.topics.tui.EvaluateTopics
if "%CMD%"=="prune" set CLASS=cc.mallet.classify.tui.Vectors2Vectors
if "%CMD%"=="split" set CLASS=cc.mallet.classify.tui.Vectors2Vectors
if "%CMD%"=="bulk-load" set CLASS=cc.mallet.util.BulkLoader
if "%CMD%"=="run" set CLASS=%1 & shift
if not "%CLASS%" == "" goto gotClass
echo Mallet 2.0 commands:
echo import-dir load the contents of a directory into mallet instances (one per file)
echo import-file load a single file into mallet instances (one per line)
echo import-svmlight load a single SVMLight format data file into mallet instances (one per line)
echo info get information about Mallet instances
echo train-classifier train a classifier from Mallet data files
echo classify-dir classify data from a single file with a saved classifier
echo classify-file classify the contents of a directory with a saved classifier
echo classify-svmlight classify data from a single file in SVMLight format
echo train-topics train a topic model from Mallet data files
echo infer-topics use a trained topic model to infer topics for new documents
echo evaluate-topics estimate the probability of new documents given a trained model
echo prune remove features based on frequency or information gain
echo split divide data into testing, training, and validation portions
echo bulk-load for big input files, efficiently prune vocabulary and import docs
echo Include --help with any option for more information
goto :eof
:gotClass
set MALLET_ARGS=
:getArg
if "%1"=="" goto run
set MALLET_ARGS=%MALLET_ARGS% %1
shift
goto getArg
:run
"C:\Program Files\Java\jdk-12\bin\java" -ea -Dfile.encoding=%MALLET_ENCODING% -classpath %MALLET_CLASSPATH% %CLASS% %MALLET_ARGS%
:eof
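Before hardcoding a Java location into the batch file, it may be worth checking from Python what `java` resolves to on the machine in question. This is only a sketch: resolve_executable is a hypothetical helper, and the fallback path is an assumption modeled on the JDK 12 location used in the batch file, so adjust it to whatever JDK is actually installed.

```python
import os
import shutil


def resolve_executable(name, fallbacks=()):
    """Return the path of name found on PATH, else the first existing fallback, else None."""
    found = shutil.which(name)
    if found:
        return found
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None


# Assumed fallback location; change it to match the installed JDK.
java_path = resolve_executable("java", [r"C:\Program Files\Java\jdk-12\bin\java.exe"])
print(java_path or "java not found on PATH or in the fallback locations")
```

If this prints a path, plain `java` in the batch file should work as well; if it reports that java was not found, either Java is not installed or its bin folder is missing from PATH.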
Then change the file path in Python to reflect this change:
mallet_path = 'C:/mallet/mallet-2.0.8/bin/mallet.bat'
Hope this helps :)
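A side note on why the forward-slash path above is a safe choice: in an ordinary (non-raw) Python string literal, a backslash followed by b is an escape sequence for a backspace character, so a Windows path segment such as \bin is silently altered. The short illustration below uses only the built-in string syntax; the variable names are arbitrary.

```python
plain = "\bin"   # "\b" is parsed as a single backspace character (0x08)
raw = r"\bin"    # raw string: the backslash and the "b" are kept as written

print(repr(plain))           # '\x08in'
print(repr(raw))             # '\\bin'
print(len(plain), len(raw))  # 3 4
```

Forward slashes, a raw string (r'...'), or doubled backslashes all avoid the problem.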