Collaborative Filtering Simulator

Explore the recommendation algorithm behind "customers who bought this also bought…" — collaborative filtering. Adjust the ratings matrix, neighbour count and similarity metric to see how user-based collaborative filtering predicts an unknown rating and who becomes your "taste neighbour", all in real time.

Parameters
Number of users: rows of the ratings matrix (synthetic users)
Number of items: columns of the ratings matrix (synthetic items)
Neighbour count k (users): number of similar users allowed to vote on the prediction
Similarity metric: how "taste closeness" between users is measured
Matrix sparsity (%): fraction of cells masked as "unknown" (empty)
Results
Predicted rating
Actual rating (hidden truth)
Prediction error (absolute)
Top neighbour similarity
Matrix sparsity (%)
Prediction quality
Ratings matrix — users × items

Rows are users, columns are items. Cell colour is the rating (1–5); hatched cells are unknown. The yellow outline marks the target cell to predict, blue outlines mark the chosen neighbour rows.

Prediction error vs neighbour count k
Each user's similarity to the target user
Theory & Key Formulas

$$\hat r_{u,i}=\bar r_u+\frac{\sum_{v\in N}\text{sim}(u,v)\,(r_{v,i}-\bar r_v)}{\sum_{v\in N}|\text{sim}(u,v)|}$$

Predicted rating $\hat r_{u,i}$ of user $u$ for item $i$. $N$ is the set of the $k$ most similar users who have rated item $i$, and $\bar r_u$ is the mean rating of user $u$.
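The prediction formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not the simulator's own code: the function name `predict_rating`, the use of `None` for unknown cells, and the precomputed `sims` list are all assumptions made for the example, and every user is assumed to have at least one known rating.

```python
def predict_rating(ratings, sims, u, i, k):
    """Mean-centred, similarity-weighted prediction of user u's rating for item i.

    ratings: list of rows (one per user); None marks an unknown rating.
    sims:    sims[v] is the similarity between user u and user v (assumed precomputed).
    """
    def mean(row):
        known = [r for r in row if r is not None]
        return sum(known) / len(known)

    # Candidates: every other user who has actually rated item i.
    candidates = [v for v in range(len(ratings))
                  if v != u and ratings[v][i] is not None]
    # N: the k candidates most similar to u.
    neighbours = sorted(candidates, key=lambda v: sims[v], reverse=True)[:k]

    # Numerator and denominator of the formula.
    num = sum(sims[v] * (ratings[v][i] - mean(ratings[v])) for v in neighbours)
    den = sum(abs(sims[v]) for v in neighbours)

    base = mean(ratings[u])
    # With no usable neighbours, fall back to the user's own mean rating.
    return base if den == 0 else base + num / den


ratings = [
    [5, 3, None],   # target user u = 0, item 2 unknown
    [4, 2, 4],
    [1, 5, 2],
]
sims = [1.0, 0.9, -0.5]  # illustrative similarities to user 0
print(round(predict_rating(ratings, sims, u=0, i=2, k=2), 2))
```

Note that the sketch does not clip the result to the 1–5 range; a real tool would likely do so before displaying it.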

$$\text{cos}(u,v)=\frac{\sum_{j} r_{u,j}\,r_{v,j}}{\sqrt{\sum_j r_{u,j}^2}\,\sqrt{\sum_j r_{v,j}^2}}$$

Cosine similarity. The index $j$ runs over the items both $u$ and $v$ have rated. The smaller the angle between the two rating vectors, the closer the value is to 1.
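As a rough sketch of the cosine formula, the function below restricts both rating vectors to the co-rated items before computing the angle. The name `cosine_similarity` and the `None`-for-unknown convention are assumptions for illustration only.

```python
import math

def cosine_similarity(row_u, row_v):
    """Cosine similarity over the items both users have rated (None = unknown)."""
    common = [(a, b) for a, b in zip(row_u, row_v)
              if a is not None and b is not None]
    if not common:
        return 0.0  # no shared items: treat as "no evidence of similarity"
    dot = sum(a * b for a, b in common)
    norm_u = math.sqrt(sum(a * a for a, _ in common))
    norm_v = math.sqrt(sum(b * b for _, b in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


print(cosine_similarity([5, 3, None], [4, 2, 4]))  # uses only the first two items
print(cosine_similarity([5, None], [1, 3]))        # a single shared item
```

The second call shares only one item, so the two one-element vectors point in the same direction regardless of how different the ratings are.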

$$\text{pear}(u,v)=\frac{\sum_{j}(r_{u,j}-\bar r_u)(r_{v,j}-\bar r_v)}{\sqrt{\sum_j(r_{u,j}-\bar r_u)^2}\,\sqrt{\sum_j(r_{v,j}-\bar r_v)^2}}$$

Pearson correlation. Because each user's mean rating is subtracted before correlating, it corrects for generous-versus-harsh rating habits.
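The Pearson formula can be sketched the same way. Here each user's mean is taken over all of that user's known ratings, matching the definition of $\bar r_u$ given earlier; some implementations use the mean over co-rated items only, so treat that choice, and the name `pearson_similarity`, as assumptions.

```python
import math

def pearson_similarity(row_u, row_v):
    """Pearson correlation over co-rated items, after subtracting each user's mean."""
    def mean(row):
        known = [r for r in row if r is not None]
        return sum(known) / len(known) if known else 0.0

    mu_u, mu_v = mean(row_u), mean(row_v)
    # Mean-centred deviations on the items both users have rated.
    common = [(a - mu_u, b - mu_v) for a, b in zip(row_u, row_v)
              if a is not None and b is not None]
    num = sum(du * dv for du, dv in common)
    den = (math.sqrt(sum(du * du for du, _ in common))
           * math.sqrt(sum(dv * dv for _, dv in common)))
    # A user whose shared ratings never deviate from their mean gives den == 0.
    return num / den if den else 0.0


generous = [5, 5, 4]  # always scores high
harsh = [2, 2, 1]     # same preferences, shifted three stars lower
print(pearson_similarity(generous, harsh))
```

The generous and harsh raters in the example rank the items identically, so mean-centring makes their deviations coincide even though their raw scores differ by three stars.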

What is the Collaborative Filtering Simulator?

🙋
On shopping sites you always see "customers who bought this also bought…". How is that produced? Is an AI reading the contents of the products?
🎓
No — and that is the interesting part. The algorithm behind it, "collaborative filtering", does not look at the product content at all. For a movie it uses nothing like "this is sci-fi" or "who stars in it". All it uses is the pattern of ratings everyone gave in the past. In short, it recommends through the collective wisdom of people's tastes.
🙋
Wait — without looking at the content, how can it know what to recommend for me?
🎓
Roughly speaking, it "finds people whose taste is close to yours and borrows their opinion". First it finds users whose past ratings closely resemble yours — your "taste neighbours". Then, if those neighbours rated some item highly, it predicts that you will probably like it too. Look at the ratings matrix on the left: rows are users, columns are items, and the colour is each person's rating. The yellow-outlined cell is the unknown rating we are about to predict.
🙋
I see! And how exactly do you measure "taste being close"?
🎓
You line up the items two users have both rated and turn "how well their ratings line up" into a number. The classic one is cosine similarity, which looks at the angle between the two rating vectors. The other, Pearson correlation, subtracts each person's average score first. That corrects for habits like "always giving a generous 5" versus "a harsh rater who gives mostly 2s". Switch the metric in the select box on the left and you will see the blue-outlined neighbours change.
🙋
Moving the neighbour-count k slider changes the prediction. Is a bigger k always better?
🎓
Not necessarily. k is "the number of neighbours that vote on the prediction". With k=1 you decide from the single most similar person, so if they happen to have an odd rating the prediction is far off. Raise k too high and even not-very-similar people join the vote, blurring the prediction. So you look at the "prediction error vs neighbour count k" chart and find the k where the error is smallest. In practice you pick the best k by a procedure called cross-validation.
🙋
When I push the sparsity slider up to 70%, the prediction goes quite wrong. What is happening there?
🎓
That is exactly the biggest weakness of collaborative filtering — sparsity. A real ratings matrix is more than 99% empty, because everyone rates only a tiny fraction of what they see. With many blanks, any two users share very few "items both rated", so similarity cannot be computed reliably. A brand-new user has zero ratings and can be compared with nobody — that is the "cold-start problem". Pushing the sparsity slider up is precisely a simulation of that difficulty.
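The effect of the sparsity slider can be imitated with a small sketch that masks cells at random and counts how many items two users still share. If each cell is hidden independently with probability $s$, a pair of users is expected to keep only a fraction $(1-s)^2$ of the items in common, which is 9% at 70% sparsity. The helper names `mask_matrix` and `common_items` are hypothetical, and the independent random masking is an assumption about how such a simulation might work.

```python
import random

def mask_matrix(full, sparsity, seed=0):
    """Return a copy of `full` with each cell replaced by None with probability `sparsity`."""
    rng = random.Random(seed)
    return [[None if rng.random() < sparsity else r for r in row] for row in full]

def common_items(row_u, row_v):
    """Number of items that both users have rated."""
    return sum(1 for a, b in zip(row_u, row_v) if a is not None and b is not None)


# Two synthetic users who have rated all 200 items before masking.
full = [[3] * 200, [4] * 200]
for s in (0.2, 0.7, 0.9):
    masked = mask_matrix(full, s)
    print(s, common_items(masked[0], masked[1]))
```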

Frequently Asked Questions

How does collaborative filtering differ from content-based recommendation?
Content-based recommendation uses the content of the item itself as features (for a movie: genre and cast; for a product: material and price) and recommends items whose content resembles what you liked before. Collaborative filtering never looks at the item content at all. Instead it uses only the pattern of past ratings: it finds other users whose rating tendencies resemble yours (your taste neighbours) and predicts from their ratings. Being able to recommend without knowing anything about the items is the strength of collaborative filtering.
Should I use cosine similarity or Pearson correlation?
Cosine similarity looks only at the angle between two users' rating vectors, which makes it simple and easy to implement. Pearson correlation subtracts each user's mean rating first (mean-centring) before correlating, so it corrects for rating habits such as a generous rater who always gives high scores versus a harsh rater. On real data with a mix of generous and harsh raters, Pearson correlation tends to be more accurate, but it becomes unstable when two users share too few common ratings. This tool lets you switch between both and compare the results.
How do I choose the neighbour count k?
The neighbour count k is the number of similar users allowed to vote on the prediction. A small k uses only a few very similar users, so if one of them is an outlier the prediction swings wildly. A large k is stable thanks to the majority vote, but lets less-similar users join in and blurs the prediction. In practice you search for the k that minimises prediction error via cross-validation. The prediction-error versus k chart in this tool lets you see which k gives the smallest error.
What are the cold-start problem and sparsity?
The cold-start problem is that a brand-new user or item has no past ratings at all, so there is nothing to compute similarity from and no recommendation can be made. Sparsity is the fact that real ratings matrices are typically more than 99% empty: users rate only a tiny fraction of the items they see, so any two users overlap on very few items and similarity cannot be computed reliably. These two are the biggest practical challenges that decide the accuracy of collaborative filtering.
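The search for a good k mentioned in the answers above can be sketched with leave-one-out validation, a simple form of cross-validation: hide each known rating in turn, predict it, and average the absolute error for every candidate k. All function names here are hypothetical, cosine similarity is chosen only to keep the sketch short, and the tiny matrix is made-up illustration data.

```python
import math

def mean(row):
    known = [r for r in row if r is not None]
    return sum(known) / len(known) if known else 0.0

def cosine(row_a, row_b):
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a is not None and b is not None]
    if not pairs:
        return 0.0
    dot = sum(a * b for a, b in pairs)
    na = math.sqrt(sum(a * a for a, _ in pairs))
    nb = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (na * nb) if na and nb else 0.0

def predict(ratings, u, i, k):
    """User-based prediction with the k most similar users who rated item i."""
    cands = [v for v in range(len(ratings)) if v != u and ratings[v][i] is not None]
    sims = {v: cosine(ratings[u], ratings[v]) for v in cands}
    top = sorted(cands, key=lambda v: sims[v], reverse=True)[:k]
    den = sum(abs(sims[v]) for v in top)
    if den == 0:
        return mean(ratings[u])
    num = sum(sims[v] * (ratings[v][i] - mean(ratings[v])) for v in top)
    return mean(ratings[u]) + num / den

def mae_for_each_k(ratings, k_values):
    """Leave-one-out mean absolute error for every k in k_values."""
    errors = {k: [] for k in k_values}
    for u, row in enumerate(ratings):
        for i, truth in enumerate(row):
            if truth is None:
                continue
            row[i] = None  # hide the rating we are about to predict
            for k in k_values:
                errors[k].append(abs(predict(ratings, u, i, k) - truth))
            row[i] = truth  # restore it before moving on
    return {k: sum(e) / len(e) for k, e in errors.items() if e}


ratings = [
    [5, 4, 1, None],
    [4, 5, 2, 1],
    [1, 2, 5, 4],
    [2, 1, 4, 5],
]
mae = mae_for_each_k(ratings, k_values=[1, 2, 3])
best_k = min(mae, key=mae.get)
print(mae, "best k:", best_k)
```

On a matrix this small the result is not meaningful in itself; the point is the shape of the procedure, which mirrors the prediction-error versus k chart.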

Real-World Applications

Product recommendations on e-commerce sites: "Customers who bought this also bought…", popularised by Amazon, is the most famous application of collaborative filtering. From a huge ratings matrix of purchase history, it finds people with buying patterns similar to yours and shows the products they often buy. Because it can recommend without reading the product description, it works as-is across categories as different as books, electronics and groceries.

Video and music streaming: Netflix's recommendations and Spotify's "Discover Weekly" are built around collaborative filtering, with viewing and playback history treated as a ratings matrix. The "Netflix Prize" competition, launched in 2006, offered one million dollars to the first team to cut the prediction error (RMSE) of Netflix's own system by 10%, and it triggered the rapid spread of a modern technique called matrix factorisation.

"People you may know" on social networks: Friend-request and follow recommendations are also a form of collaborative filtering that uses connection patterns between users in place of a ratings matrix. They find "people who share many friends with you" or "people who follow similar accounts". The collaborative-filtering idea — running purely on patterns of relationships rather than item content — applies directly here.

Matching news, jobs and ads: News-app article delivery, "jobs recommended for you" on job sites and ad targeting all use it too. By treating implicit feedback such as clicks, applications and dwell time as ratings, collaborative filtering can run even when users never explicitly give stars. Most real services adopt a hybrid recommender that combines collaborative filtering with content-based methods.

Common Misconceptions and Pitfalls

A big misconception is that collaborative filtering is a fair algorithm that gives every item an equal chance of being recommended. The opposite is true: a popular item with many ratings overlaps with more users and so gets recommended even more. This is called "popularity bias", and it is related to (though not the same as) the "filter bubble", where personalisation keeps narrowing what a user sees. New releases and niche gems have few ratings and struggle to enter recommendations, so if left alone the bias toward "best-sellers selling even more" grows. In practice you must work to ensure diversity, for example by down-weighting popular items or deliberately mixing in surprising ones.

The next pitfall is assuming that "a high similarity means an accurate prediction". As you can see by raising the sparsity in this tool, the similarity of two users who share only one or two common ratings may look high purely by chance. With a single common item, cosine similarity is always 1, because two one-element vectors of positive ratings always point the same way. In practice you add safeguards such as "only admit neighbours with at least a certain number of common ratings" or "discount the similarity of pairs with few common items (significance weighting)". Always check how many common ratings support a similarity, not just the similarity value itself.
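Both safeguards can be sketched in a few lines. One common form of significance weighting scales the similarity by `min(n, threshold) / threshold`, where `n` is the number of co-rated items; the threshold of 5 used here is an arbitrary illustrative value, and the function names are hypothetical.

```python
def significance_weighted(sim, n_common, threshold=5):
    """Shrink a similarity that is supported by fewer than `threshold` co-rated items."""
    return sim * min(n_common, threshold) / threshold

def admissible_neighbours(candidates, min_common=2):
    """Keep only (user, sim, n_common) triples with enough co-rated items."""
    return [c for c in candidates if c[2] >= min_common]


# (user id, raw similarity, number of co-rated items): made-up illustration data
candidates = [("A", 1.0, 1), ("B", 0.8, 10), ("C", 0.9, 3)]

for user, sim, n in admissible_neighbours(candidates):
    print(user, significance_weighted(sim, n))
```

User A, whose perfect similarity rests on a single shared item, is dropped by the filter; user C is kept but discounted, while user B keeps the full similarity.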

Finally, beware the overconfidence that "collaborative filtering alone completes a recommender system". Collaborative filtering is weak against the cold-start problem and cannot handle new users, new items or niche items nobody has rated. When the ratings matrix grows huge, the number of user pairs grows quadratically, so computing every pairwise similarity becomes expensive and scalability becomes an issue. A real recommender becomes practical only with an overall design that combines content-based recommendation, matrix factorisation and deep-learning models, and even includes defences against rating manipulation (shilling attacks). Treat this tool as a way to build the intuition of "plain collaborative filtering" that everything else starts from.

How to Use

  1. Set the number of users (e.g., 500–5000) and items (e.g., 100–1000) to define the size of your user-item ratings matrix
  2. Adjust k (nearest neighbours, typically 5–50) to control how many similar users influence each prediction
  3. Configure matrix sparsity (20–95%) to simulate real-world rating coverage gaps where most user-item pairs are unknown
  4. Run the simulation to compare predicted ratings with the actual held-out ratings; the absolute error and top neighbour similarity scores are displayed

Worked Example

Consider an e-commerce dataset with 2000 users, 300 products, k=15 neighbours, and 85% sparsity (only 15% of ratings observed). Ratings in this example are assumed to be on a 1–10 scale, unlike the simulator's 1–5 scale. To predict User 42's rating of a laptop, the algorithm selects the 15 most similar users who have rated the laptop (here all with cosine similarity above 0.75 to User 42), combines their laptop ratings (7.2, 6.8, 8.1, etc.) in a similarity-weighted average, and outputs a predicted rating of 7.4. The held-out actual rating is 7.6, giving an absolute prediction error of 0.2. The matrix sparsity readout confirms 85% missing values, which is still denser than the 99%-plus sparsity of real streaming or marketplace catalogues.

Practical Notes

  1. Higher sparsity (>90%) degrades prediction accuracy; increase k to 30–50 neighbours for stability when data is sparse
  2. For music streaming (Spotify-scale, ~100M users), use distributed, approximate k-NN search rather than exact all-pairs similarity; on dense matrices (<30% sparsity), watch for overfitting to a few very similar users
  3. Monitor top neighbour similarity scores; values <0.5 indicate a weak collaborative signal and suggest a content-based or hybrid approach
  4. Cold-start users with <5 ratings should bypass collaborative filtering until a minimum interaction threshold is met
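The cold-start rule in the last note could be wired in as a simple guard. In this sketch a user below the threshold gets the item's mean rating instead of a collaborative prediction; that fallback choice, the function names, and the `cf_predict` callback are all assumptions, since the notes do not say what the bypass should return.

```python
def item_mean(ratings, i):
    """Mean of the known ratings for item i, or None if nobody has rated it."""
    col = [row[i] for row in ratings if row[i] is not None]
    return sum(col) / len(col) if col else None

def predict_with_fallback(ratings, u, i, cf_predict, min_ratings=5):
    """Use collaborative filtering only for users with enough rating history.

    cf_predict: any function (ratings, u, i) -> predicted rating (hypothetical hook).
    """
    n_known = sum(r is not None for r in ratings[u])
    if n_known < min_ratings:
        # Cold-start user: bypass collaborative filtering (assumed fallback).
        return item_mean(ratings, i)
    return cf_predict(ratings, u, i)


ratings = [
    [5, None, None],  # user 0 has a single rating
    [4, 2, None],
    [2, 4, 3],
]
stub_cf = lambda r, u, i: 3.5  # stand-in for a real collaborative predictor
print(predict_with_fallback(ratings, 0, 1, stub_cf, min_ratings=2))
print(predict_with_fallback(ratings, 2, 1, stub_cf, min_ratings=2))
```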