I am trying to generate recommendations with Apache Mahout, using MongoDB as the data source through MongoDBDataModel. My code is as follows:
import java.net.UnknownHostException;
import java.util.List;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import com.mongodb.MongoException;
public class usingMongo {
    public static void main(String[] args) throws UnknownHostException, MongoException, TasteException {
        final long startTime = System.nanoTime();
        // Build the data model from the MongoDB collection.
        MongoDBDataModel model = new MongoDBDataModel("AdamsLaptop", 27017,
                "test", "ratings100k", false, false, null);
        System.out.println("connected to mongo ");

        // User-based recommendations.
        UserSimilarity UserSim = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.5, UserSim, model);
        UserBasedRecommender UserRecommender = new GenericUserBasedRecommender(model, neighborhood, UserSim);
        List<RecommendedItem> UserRecommendations = UserRecommender.recommend(1, 3);
        for (RecommendedItem recommendation : UserRecommendations) {
            System.out.println("You may like movie " + recommendation.getItemID() + " as a user similar to you also rated it " + recommendation.getValue() + " USER");
        }

        // Item-based recommendations.
        ItemSimilarity ItemSim = new PearsonCorrelationSimilarity(model); // LogLikelihoodSimilarity(model);
        GenericItemBasedRecommender ItemRecommender = new GenericItemBasedRecommender(model, ItemSim);
        List<RecommendedItem> ItemRecommendations = ItemRecommender.recommend(1, 3);
        for (RecommendedItem recommendation : ItemRecommendations) {
            System.out.println("You may like movie " + recommendation.getItemID() + " as a user similar to you also rated it " + recommendation.getValue() + " ITEM");
        }

        // Elapsed time in nanoseconds.
        final long duration = System.nanoTime() - startTime;
        System.out.println(duration);
    }
}
I can't see where I am going wrong, but after many changes and a lot of trial and error the error message stays the same:
Exception in thread "main" java.lang.NullPointerException
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.getID(MongoDBDataModel.java:743)
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.buildModel(MongoDBDataModel.java:570)
at org.apache.mahout.cf.taste.impl.model.mongodb.MongoDBDataModel.<init>(MongoDBDataModel.java:245)
at recommender.usingMongo.main(usingMongo.java:24)
Any suggestions? Here is a sample of my data in MongoDB:
{ "_id" : ObjectId("56ddf61f5960960c333f3dcb"),"userId" : 1, "movieId" : 292, "rating" : 4, "timestamp" : 847116936 }
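I have successfully integrated MongoDB data into Mahout.
The structure of the data in MongoDB depends on the type of similarity algorithm you use.
User similarity
MongoDBDataModel datamodel = new MongoDBDataModel("127.0.0.1", 27017, "testing", "ratings", true, true, null);
Here user_id and item_id are integer values, preference is a float value, and created_at is a timestamp.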
SVDRecommender
Here user_id and item_id are MongoDB objects, preference is a float value, and created_at is a timestamp.
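The obvious first troubleshooting step is to check whether the MongoDB server is running. Judging by the exception, it is. I think the problem is your data structure.
Use user_id instead of userId, item_id instead of itemId (movieId in your sample), and preference instead of rating. I don't know whether this will make any difference. I followed one of the online tutorials, but I can't find it at the moment.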
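To illustrate the renaming suggested above, here is a minimal sketch in plain Java that rewrites one document, represented as a Map, from the question's field names to user_id, item_id, and preference. The class RatingFieldRenamer, the rename helper, and the mapping of timestamp to created_at are assumptions made for illustration. The sketch does not touch MongoDB or Mahout; a real migration would apply the same mapping to every document in the collection.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout or the MongoDB driver.
class RatingFieldRenamer {
    // Question's field names -> names suggested in the answer.
    // Mapping "timestamp" to "created_at" is an assumption.
    static final Map<String, String> RENAMES = Map.of(
            "userId", "user_id",
            "movieId", "item_id",
            "rating", "preference",
            "timestamp", "created_at");

    // Returns a copy of the document with renamed keys; unknown keys are kept as-is.
    static Map<String, Object> rename(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            Object value = e.getValue();
            // The answer describes the preference as a float value, so convert numeric ratings.
            if ("rating".equals(e.getKey()) && value instanceof Number) {
                value = ((Number) value).floatValue();
            }
            out.put(RENAMES.getOrDefault(e.getKey(), e.getKey()), value);
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample document from the question (without _id).
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("userId", 1);
        doc.put("movieId", 292);
        doc.put("rating", 4);
        doc.put("timestamp", 847116936);
        System.out.println(rename(doc));
    }
}
```

The copy-and-rename approach leaves the original document untouched, so the result can be inspected before anything is written back.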
It works, but it is too slow when I have more than 10000 users and 1000 items.
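The problem is that Mahout assumes default names for the fields in MongoDB that hold the user ID, item ID, and preference: user_id, item_id, and preference. So the solution may be either to use another MongoDBDataModel constructor that lets you pass the names of those fields in your MongoDB instance as arguments, or to redesign the collection schema.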
I hope this makes sense.
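One way to test this diagnosis before changing anything is to check a sample document for the field names mentioned above. The sketch below is plain Java with no MongoDB or Mahout dependency. RatingFieldCheck and missingFields are hypothetical names, and the expected field list simply repeats user_id, item_id, and preference from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic helper, not part of Mahout.
class RatingFieldCheck {
    // Field names taken from the answer above (assumed defaults).
    static final String[] EXPECTED = {"user_id", "item_id", "preference"};

    // Returns the expected field names that are absent (or null) in the document.
    static List<String> missingFields(Map<String, ?> doc) {
        List<String> missing = new ArrayList<>();
        for (String field : EXPECTED) {
            if (doc.get(field) == null) {
                missing.add(field);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Sample document from the question (without _id).
        Map<String, Object> sample = Map.of(
                "userId", 1,
                "movieId", 292,
                "rating", 4,
                "timestamp", 847116936);
        System.out.println("Missing: " + missingFields(sample));
    }
}
```

Run against the sample document from the question, it reports all three fields as missing. That is consistent with the field-name explanation, though it does not by itself prove that this is the cause of the NullPointerException.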