TY - GEN
T1 - Improving collaborative filtering recommendations using external data
AU - Umyarov, Akhmed
AU - Tuzhilin, Alexander
PY - 2008/12/1
Y1 - 2008/12/1
N2 - This paper describes an approach for incorporating externally specified aggregate ratings information into certain types of collaborative filtering (CF) methods. For a statistical model-based CF approach, we formally showed that this additional aggregated information provides more accurate recommendations of individual items to individual users. Furthermore, theoretical insights gained from the analysis of this model-based method suggested a way to incorporate aggregate information into the heuristic itembased CF method. Both the model-based and the heuristic item-based CF methods were empirically tested on several datasets, and the experiments uniformly confirmed that the aggregate rating information indeed improves CF recommendations. These results also show the power of theory by demonstrating how the insights gained from theoretical developments can shed light on proper selection of good heuristic methods. We also showed the way to introduce scalability and parallelization into the estimation procedure and reported the running time for steps of the estimation procedure for large datasets.
AB - This paper describes an approach for incorporating externally specified aggregate ratings information into certain types of collaborative filtering (CF) methods. For a statistical model-based CF approach, we formally showed that this additional aggregated information provides more accurate recommendations of individual items to individual users. Furthermore, theoretical insights gained from the analysis of this model-based method suggested a way to incorporate aggregate information into the heuristic itembased CF method. Both the model-based and the heuristic item-based CF methods were empirically tested on several datasets, and the experiments uniformly confirmed that the aggregate rating information indeed improves CF recommendations. These results also show the power of theory by demonstrating how the insights gained from theoretical developments can shed light on proper selection of good heuristic methods. We also showed the way to introduce scalability and parallelization into the estimation procedure and reported the running time for steps of the estimation procedure for large datasets.
UR - http://www.scopus.com/inward/record.url?scp=67049167771&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67049167771&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2008.44
DO - 10.1109/ICDM.2008.44
M3 - Conference contribution
AN - SCOPUS:67049167771
SN - 9780769535029
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 618
EP - 627
BT - Proceedings - 8th IEEE International Conference on Data Mining, ICDM 2008
T2 - 8th IEEE International Conference on Data Mining, ICDM 2008
Y2 - 15 December 2008 through 19 December 2008
ER -