> I have a CSV file / tabular data in the following format:
UserId Item1 Item2
1 url1 url3
1 url4 url6
2 url2 url3
2 url2 url4
2 url4 url6
3 url4 url6
3 url2 url3
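So here I want to predict item2 for a particular user when the value of item1 is known. Can we use collaborative filtering for this? If yes, please guide :)
I believe it will work; you just need to work out how you decide whether a user is similar. The query below gives you a suggestions field based on the item2 values paired with item1 (and vice versa). It excludes items the user already has. You could of course do something more sophisticated, but here is something to start with: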
select *, ISNULL((SELECT STUFF
    ((SELECT ', ' + CAST(ITEM2 AS VARCHAR(10)) [text()] from
        ((select top 5 ISNULL(item2,'') item2, count(item2) as cnt
          from items as CountTable1
          where item1=Res.item1 and item2 is not null and len(item2) > 0
            and item2 not in (select item2 from items where id=Res.id UNION select item1 from items where id=Res.id)
          group by item2 order by cnt desc)
        UNION
        /* Below includes suggestions from item1 */
        (select top 5 ISNULL(item1,'') item1, count(item1) as cnt
          from items as CountTable2
          where item2=Res.item1 and item1 is not null and len(item1) > 0
            and item1 not in (select item1 from items where id=Res.id UNION select item2 from items where id=Res.id)
          group by item1 order by cnt desc))
        as Suggs where item1=Res.item1 FOR XML PATH('')
    , TYPE)
    .value('.','NVARCHAR(MAX)'),1,2,' ')
    List_Output)
,'') as Suggestions from items as Res
SQL Fiddle
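
For readers who would rather prototype outside the database, the same co-occurrence idea can be sketched in Python. This is a simplified illustration, not an exact port of the query above: it merges the counts from both directions into one ranking instead of taking the top 5 from each side and combining them. The names `ROWS` and `suggest`, and the `top_n` parameter, are made up for this example, and the rows are the sample data from the question.

```python
from collections import Counter

# Sample rows from the question: (user_id, item1, item2)
ROWS = [
    (1, "url1", "url3"),
    (1, "url4", "url6"),
    (2, "url2", "url3"),
    (2, "url2", "url4"),
    (2, "url4", "url6"),
    (3, "url4", "url6"),
    (3, "url2", "url3"),
]


def suggest(rows, user_id, item1, top_n=5):
    """Rank items paired with `item1` in any row, skipping items the user already has."""
    # Everything this user already has, in either column.
    seen = set()
    for uid, a, b in rows:
        if uid == user_id:
            seen.update((a, b))

    counts = Counter()
    for _, a, b in rows:
        # Count the partner of item1, whichever column item1 appears in.
        if a == item1 and b not in seen:
            counts[b] += 1
        elif b == item1 and a not in seen:
            counts[a] += 1

    # Most frequently paired items first.
    return [item for item, _ in counts.most_common(top_n)]
```

For a user with no history (say a hypothetical user 4), `suggest(ROWS, 4, "url2")` ranks url3 ahead of url4, because url3 is paired with url2 twice and url4 only once. For user 1 and `"url2"` the result is empty, since user 1 already has every item that has been paired with url2.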