gorse
A High Performance Recommender System Package based on Collaborative Filtering for Go.
gorse is a recommender system engine implemented in the Go programming language. It is a package that provides the components needed to build a recommender system:
- Data: Load data from built-in datasets or custom files.
- Splitter: Split dataset by k-fold, ratio or leave-one-out.
- Model: Recommendation models based on collaborative filtering, including matrix factorization, neighborhood-based methods, Slope One and Co-Clustering.
- Evaluator: RMSE and MAE for rating tasks; Precision, Recall, NDCG, MAP, MRR and AUC for ranking tasks.
- Parameter Search: Find the best hyper-parameters using grid search or random search.
- Persistence: Save a model to disk or load a saved model.
- SIMD (Optional): Vector operations can use AVX2 instructions, which are in theory up to 4 times faster than their scalar (one value at a time) equivalents.
Installation
go get github.com/zhenghaoz/gorse
Build
If the CPU supports AVX2 and FMA3 instructions, use the avx2 build tag to enable AVX2 support. For example:
go build -tags='avx2' example/benchmark_rating_ml_1m/main.go
Usage
Examples and tutorials can be found in the wiki. Let's get started with a simple example:
package main

import (
	"fmt"

	"github.com/zhenghaoz/gorse/base"
	"github.com/zhenghaoz/gorse/core"
	"github.com/zhenghaoz/gorse/model"
)

func main() {
	// Load dataset
	data := core.LoadDataFromBuiltIn("ml-100k")
	// Split dataset
	train, test := core.Split(data, 0.2)
	// Create model
	svd := model.NewSVD(base.Params{
		base.Lr:       0.007,
		base.NEpochs:  100,
		base.NFactors: 80,
		base.Reg:      0.1,
	})
	// Fit model
	svd.Fit(train)
	// Evaluate model
	fmt.Printf("RMSE = %.5f\n", core.RMSE(svd, test, nil))
	// Predict a rating
	fmt.Printf("Predict(4,8) = %.5f\n", svd.Predict(4, 8))
}
The output would be:
RMSE = 0.91305
Predict(4,8) = 4.72873
More examples can be found in the example folder.
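For readers wondering what the learning rate, regularization and factor count in the example control, the following is a rough sketch of a biased matrix factorization model for a single user and item. It assumes the common formulation (prediction = global mean + user bias + item bias + dot product of latent factors, trained by SGD); gorse's actual SVD implementation may differ, and the `svdSketch` type and its methods are hypothetical.

```go
package main

import "fmt"

// svdSketch holds hypothetical model state for one user and one item.
type svdSketch struct {
	globalMean         float64
	userBias, itemBias float64
	userFactors        []float64 // length = number of factors
	itemFactors        []float64
}

// predict returns globalMean + userBias + itemBias + <userFactors, itemFactors>.
func (m *svdSketch) predict() float64 {
	dot := 0.0
	for k := range m.userFactors {
		dot += m.userFactors[k] * m.itemFactors[k]
	}
	return m.globalMean + m.userBias + m.itemBias + dot
}

// sgdStep performs one stochastic gradient step on the squared error
// with L2 regularization, using learning rate lr and penalty reg.
func (m *svdSketch) sgdStep(rating, lr, reg float64) {
	diff := rating - m.predict()
	m.userBias += lr * (diff - reg*m.userBias)
	m.itemBias += lr * (diff - reg*m.itemBias)
	for k := range m.userFactors {
		pu, qi := m.userFactors[k], m.itemFactors[k]
		m.userFactors[k] += lr * (diff*qi - reg*pu)
		m.itemFactors[k] += lr * (diff*pu - reg*qi)
	}
}

func main() {
	m := &svdSketch{
		globalMean:  3.5,
		userFactors: []float64{0.1, -0.2},
		itemFactors: []float64{0.3, 0.1},
	}
	before := m.predict()
	// Repeatedly fit a single observed rating of 5.0.
	for epoch := 0; epoch < 100; epoch++ {
		m.sgdStep(5.0, 0.007, 0.1)
	}
	fmt.Printf("before = %.5f, after = %.5f\n", before, m.predict())
}
```

A real model keeps one bias and one factor vector per user and per item, and loops over all training ratings in every epoch.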
Benchmarks
All models are tested by 5-fold cross validation on a PC with an Intel(R) Core(TM) i5-4590 CPU (3.30GHz) and 16.0GB RAM. All scores are the best scores achieved by gorse so far.
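As a reminder of what k-fold cross validation involves, here is a small standalone sketch that shuffles sample indices and produces k train/test partitions. The `KFold` function is a hypothetical helper written for this illustration and is not gorse's splitter API.

```go
package main

import (
	"fmt"
	"math/rand"
)

// KFold shuffles the indices 0..n-1 using the given seed and returns k
// pairs of (train, test) index slices. Each index appears in exactly one
// test fold. Assumes 0 < k <= n.
func KFold(n, k int, seed int64) (trains, tests [][]int) {
	perm := rand.New(rand.NewSource(seed)).Perm(n)
	for fold := 0; fold < k; fold++ {
		lo, hi := fold*n/k, (fold+1)*n/k
		// Copy the slices so that folds do not share backing arrays.
		test := append([]int(nil), perm[lo:hi]...)
		train := append([]int(nil), perm[:lo]...)
		train = append(train, perm[hi:]...)
		trains = append(trains, train)
		tests = append(tests, test)
	}
	return trains, tests
}

func main() {
	trains, tests := KFold(10, 5, 0)
	for i := range tests {
		fmt.Printf("fold %d: train=%d test=%d\n", i, len(trains[i]), len(tests[i]))
	}
}
```

A model is then fitted on each train partition and scored on the matching test partition, and the per-fold scores are averaged.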
- Rating Prediction on MovieLens 1M (source)
Model | RMSE | MAE | Time | Time (AVX2) |
---|---|---|---|---|
SlopeOne | 0.90683 | 0.71541 | 0:00:26 | |
CoClustering | 0.90701 | 0.71212 | 0:00:08 | |
KNN | 0.86462 | 0.67663 | 0:02:07 | |
SVD | 0.84252 | 0.66189 | 0:02:21 | 0:01:48 |
SVD++ | 0.84194 | 0.66156 | 0:03:39 | 0:02:47 |
- Item Ranking on MovieLens 100K (source)
Model | Precision@k | Recall@k | MAP@k | NDCG@k | MRR@k | Time |
---|---|---|---|---|---|---|
ItemPop | 0.19081 | 0.11584 | 0.05364 | 0.21785 | 0.40991 | 0:00:03 |
SVD-BPR | 0.32083 | 0.20906 | 0.11848 | 0.37643 | 0.59818 | 0:00:13 |
WRMF | 0.34727 | 0.23665 | 0.14550 | 0.41614 | 0.65439 | 0:00:14 |
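The ranking metrics are computed per user from a ranked recommendation list and then averaged over users. The sketch below shows three of them for a single user; it is a standalone illustration with hypothetical function names, not gorse's evaluator API, and it assumes k > 0 and that the relevant set contains only `true` entries.

```go
package main

import "fmt"

// hitsAtK counts how many of the first k recommended items are relevant.
func hitsAtK(ranked []int, relevant map[int]bool, k int) int {
	if k > len(ranked) {
		k = len(ranked)
	}
	hits := 0
	for _, item := range ranked[:k] {
		if relevant[item] {
			hits++
		}
	}
	return hits
}

// PrecisionAtK is the fraction of the top k recommendations that are relevant.
func PrecisionAtK(ranked []int, relevant map[int]bool, k int) float64 {
	return float64(hitsAtK(ranked, relevant, k)) / float64(k)
}

// RecallAtK is the fraction of relevant items that appear in the top k.
func RecallAtK(ranked []int, relevant map[int]bool, k int) float64 {
	if len(relevant) == 0 {
		return 0
	}
	return float64(hitsAtK(ranked, relevant, k)) / float64(len(relevant))
}

// ReciprocalRankAtK is 1/rank of the first relevant item in the top k,
// or 0 if there is none. Averaging it over users gives MRR.
func ReciprocalRankAtK(ranked []int, relevant map[int]bool, k int) float64 {
	for i, item := range ranked {
		if i >= k {
			break
		}
		if relevant[item] {
			return 1 / float64(i+1)
		}
	}
	return 0
}

func main() {
	ranked := []int{10, 20, 30, 40, 50}                 // recommended item IDs, best first
	relevant := map[int]bool{20: true, 50: true, 60: true} // held-out items the user liked
	fmt.Printf("Precision@5 = %.3f\n", PrecisionAtK(ranked, relevant, 5))
	fmt.Printf("Recall@5    = %.3f\n", RecallAtK(ranked, relevant, 5))
	fmt.Printf("RR@5        = %.3f\n", ReciprocalRankAtK(ranked, relevant, 5))
}
```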