A High Performance Recommender System Package based on Collaborative Filtering

gorse

A High Performance Recommender System Package based on Collaborative Filtering for Go.

gorse is a recommender system engine implemented in the Go programming language. It is a package that provides the components needed to build a recommender system:

  • Data: Load data from built-in datasets or custom files.
  • Splitter: Split dataset by k-fold, ratio or leave-one-out.
  • Model: Recommendation models based on collaborative filtering, including matrix factorization, neighborhood-based methods, Slope One and Co-Clustering.
  • Evaluator: RMSE and MAE for the rating task; Precision, Recall, NDCG, MAP, MRR and AUC for the ranking task.
  • Parameter Search: Find best hyper-parameters using grid search or random search.
  • Persistence: Save and load trained models.
  • SIMD (Optional): Vector operations use AVX2 instructions, which are in theory 4 times faster than scalar instructions.
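
For reference, the two rating metrics named under Evaluator follow the standard definitions and can be sketched in a few lines of plain Go. This is an illustration only, not gorse's implementation; the helper names `rmse` and `mae` are hypothetical and are not part of the gorse API.

```go
package main

import (
	"fmt"
	"math"
)

// rmse returns the root mean squared error between predicted and true ratings.
// Hypothetical helper for illustration; both slices must have the same length.
func rmse(pred, truth []float64) float64 {
	sum := 0.0
	for i := range pred {
		d := pred[i] - truth[i]
		sum += d * d
	}
	return math.Sqrt(sum / float64(len(pred)))
}

// mae returns the mean absolute error between predicted and true ratings.
func mae(pred, truth []float64) float64 {
	sum := 0.0
	for i := range pred {
		sum += math.Abs(pred[i] - truth[i])
	}
	return sum / float64(len(pred))
}

func main() {
	pred := []float64{3.5, 4.0, 2.0}
	truth := []float64{4.0, 4.0, 1.0}
	fmt.Printf("RMSE = %.5f\n", rmse(pred, truth))
	fmt.Printf("MAE  = %.5f\n", mae(pred, truth))
}
```

RMSE penalizes large errors more heavily than MAE because each error is squared before averaging.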

Installation

go get github.com/zhenghaoz/gorse

Build

If the CPU supports AVX2 and FMA3 instructions, use the avx2 build tag to enable AVX2 support. For example:

go build -tags='avx2' example/benchmark_rating_ml_1m/main.go

Usage

Examples and tutorials can be found in the wiki. Let's get started with a simple example:

package main

import (
	"fmt"
	"github.com/zhenghaoz/gorse/base"
	"github.com/zhenghaoz/gorse/core"
	"github.com/zhenghaoz/gorse/model"
)

func main() {
	// Load dataset
	data := core.LoadDataFromBuiltIn("ml-100k")
	// Split dataset
	train, test := core.Split(data, 0.2)
	// Create model
	svd := model.NewSVD(base.Params{
		base.Lr:       0.007,
		base.NEpochs:  100,
		base.NFactors: 80,
		base.Reg:      0.1,
	})
	// Fit model
	svd.Fit(train)
	// Evaluate model
	fmt.Printf("RMSE = %.5f\n", core.RMSE(svd, test, nil))
	// Predict a rating
	fmt.Printf("Predict(4,8) = %.5f\n", svd.Predict(4, 8))
}

The output should be similar to:

RMSE = 0.91305
Predict(4,8) = 4.72873
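
The hyper-parameters passed to the model above (learning rate, number of epochs, number of factors, regularization) are the usual knobs of matrix factorization trained with stochastic gradient descent. The sketch below shows where each one enters the update rule. It is a simplified illustration that omits bias terms and is not gorse's implementation; `rating`, `dot` and `fitMF` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
)

// rating is one observed (user, item, value) triple.
type rating struct {
	user, item int
	value      float64
}

// dot returns the inner product of two equal-length vectors.
func dot(a, b []float64) float64 {
	s := 0.0
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// fitMF learns user factors p and item factors q with plain SGD.
// lr, nEpochs, nFactors and reg play the roles of the learning rate,
// epoch count, factor count and regularization strength.
func fitMF(data []rating, nUsers, nItems, nFactors, nEpochs int, lr, reg float64) (p, q [][]float64) {
	rng := rand.New(rand.NewSource(0)) // fixed seed for repeatable runs
	initFactors := func(n int) [][]float64 {
		m := make([][]float64, n)
		for i := range m {
			m[i] = make([]float64, nFactors)
			for f := range m[i] {
				m[i][f] = rng.NormFloat64() * 0.1 // small random start
			}
		}
		return m
	}
	p, q = initFactors(nUsers), initFactors(nItems)
	for epoch := 0; epoch < nEpochs; epoch++ {
		for _, r := range data {
			// Prediction error for this observed rating.
			e := r.value - dot(p[r.user], q[r.item])
			for f := 0; f < nFactors; f++ {
				pu, qi := p[r.user][f], q[r.item][f]
				// Gradient step on squared error with L2 regularization.
				p[r.user][f] += lr * (e*qi - reg*pu)
				q[r.item][f] += lr * (e*pu - reg*qi)
			}
		}
	}
	return p, q
}

func main() {
	data := []rating{{0, 0, 5}, {0, 1, 3}, {1, 0, 4}, {1, 2, 1}, {2, 1, 2}, {2, 2, 5}}
	p, q := fitMF(data, 3, 3, 2, 1000, 0.01, 0.02)
	fmt.Printf("Predict(0,0) = %.3f\n", dot(p[0], q[0]))
}
```

A larger regularization value shrinks the factors and trades training accuracy for better generalization; more factors increase model capacity and training time.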

More examples can be found in the example folder.

Benchmarks

All models are evaluated with 5-fold cross validation on a PC with an Intel(R) Core(TM) i5-4590 CPU (3.30GHz) and 16.0GB RAM. The scores listed are the best that gorse has achieved so far.
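
As a reminder of what 5-fold cross validation involves, the sketch below shuffles the row indices of a dataset and cuts them into k folds, each of which serves once as the test set while the remaining rows form the training set. It is a generic illustration; `fold` and `kFold` are hypothetical names and do not reflect gorse's own splitter API.

```go
package main

import (
	"fmt"
	"math/rand"
)

// fold holds the row indices of one train/test partition.
type fold struct {
	train, test []int
}

// kFold shuffles the indices 0..n-1 and cuts them into k folds.
// Hypothetical helper for illustration only.
func kFold(n, k int, seed int64) []fold {
	perm := rand.New(rand.NewSource(seed)).Perm(n)
	folds := make([]fold, k)
	for i := 0; i < k; i++ {
		lo, hi := i*n/k, (i+1)*n/k
		// Copy the slices so folds do not share backing arrays with perm.
		test := append([]int(nil), perm[lo:hi]...)
		train := append([]int(nil), perm[:lo]...)
		train = append(train, perm[hi:]...)
		folds[i] = fold{train: train, test: test}
	}
	return folds
}

func main() {
	for i, f := range kFold(10, 5, 0) {
		fmt.Printf("fold %d: %d train rows, %d test rows\n", i, len(f.train), len(f.test))
	}
}
```

The reported score for a model is then typically the average of the metric over the k test folds.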

  • Rating Prediction on MovieLens 1M (source)
Model         RMSE     MAE      Time     Time (AVX2)
SlopeOne      0.90683  0.71541  0:00:26
CoClustering  0.90701  0.71212  0:00:08
KNN           0.86462  0.67663  0:02:07
SVD           0.84252  0.66189  0:02:21  0:01:48
SVD++         0.84194  0.66156  0:03:39  0:02:47
  • Item Ranking on MovieLens 100K (source)
Model    Precision@10  Recall@10  MAP@10   NDCG@10  MRR@10   Time
ItemPop  0.19081       0.11584    0.05364  0.21785  0.40991  0:00:03
SVD-BPR  0.32083       0.20906    0.11848  0.37643  0.59818  0:00:13
WRMF     0.34727       0.23665    0.14550  0.41614  0.65439  0:00:14
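
For readers unfamiliar with ranking metrics, the sketch below computes precision, recall, NDCG and reciprocal rank at a cutoff k for a single user, using common textbook definitions with binary relevance. It is an illustration only: the function names are hypothetical, and gorse's exact definitions may differ in detail. Benchmark scores of this kind are normally averaged over all test users.

```go
package main

import (
	"fmt"
	"math"
)

// hitsAtK reports, for each of the first k ranked items, whether it is relevant.
// The relevant map is assumed to contain only true entries.
func hitsAtK(ranked []int, relevant map[int]bool, k int) []bool {
	if k > len(ranked) {
		k = len(ranked)
	}
	hits := make([]bool, k)
	for i := 0; i < k; i++ {
		hits[i] = relevant[ranked[i]]
	}
	return hits
}

// countHits returns the number of relevant items among the top k.
func countHits(ranked []int, relevant map[int]bool, k int) int {
	n := 0
	for _, hit := range hitsAtK(ranked, relevant, k) {
		if hit {
			n++
		}
	}
	return n
}

// precisionAtK is the fraction of the top k items that are relevant.
func precisionAtK(ranked []int, relevant map[int]bool, k int) float64 {
	return float64(countHits(ranked, relevant, k)) / float64(k)
}

// recallAtK is the fraction of relevant items that appear in the top k.
func recallAtK(ranked []int, relevant map[int]bool, k int) float64 {
	if len(relevant) == 0 {
		return 0
	}
	return float64(countHits(ranked, relevant, k)) / float64(len(relevant))
}

// ndcgAtK is the discounted cumulative gain divided by the best achievable gain.
func ndcgAtK(ranked []int, relevant map[int]bool, k int) float64 {
	dcg := 0.0
	for i, hit := range hitsAtK(ranked, relevant, k) {
		if hit {
			dcg += 1 / math.Log2(float64(i)+2)
		}
	}
	ideal := 0.0
	for i := 0; i < k && i < len(relevant); i++ {
		ideal += 1 / math.Log2(float64(i)+2)
	}
	if ideal == 0 {
		return 0
	}
	return dcg / ideal
}

// reciprocalRank is 1/rank of the first relevant item in the top k, or 0 if none.
func reciprocalRank(ranked []int, relevant map[int]bool, k int) float64 {
	for i, hit := range hitsAtK(ranked, relevant, k) {
		if hit {
			return 1 / float64(i+1)
		}
	}
	return 0
}

func main() {
	ranked := []int{3, 1, 4, 2, 5}                    // recommended items, best first
	relevant := map[int]bool{1: true, 2: true, 9: true} // held-out items the user liked
	fmt.Printf("Precision@3 = %.5f\n", precisionAtK(ranked, relevant, 3))
	fmt.Printf("Recall@3    = %.5f\n", recallAtK(ranked, relevant, 3))
	fmt.Printf("NDCG@3      = %.5f\n", ndcgAtK(ranked, relevant, 3))
	fmt.Printf("RR@3        = %.5f\n", reciprocalRank(ranked, relevant, 3))
}
```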
