intelligence_engine:product_recommendation_by_lightfm

Differences

This shows you the differences between two versions of the page.

intelligence_engine:product_recommendation_by_lightfm [2019/06/25 11:06]
182.72.26.6
intelligence_engine:product_recommendation_by_lightfm [2019/07/02 10:04] (current)
182.72.26.6
Line 1: Line 1:
 ===== Introduction =====

-Product recommendations are usually implemented with collaborative filtering, which recommends items by finding items similar to a given item and users similar to the user receiving the recommendations. It is a very useful technique and is at the core of many large companies such as Amazon and Netflix. Its major weakness is that it does not handle the cold-start problem well (e.g., a user with no enrollments), so to handle this case we need content-based filtering, which suggests items based on the user profile. Ideally, a hybrid recommendation system that implements both collaborative and content-based filtering is required. LightFM is a library that implements hybrid recommendation systems.
+Product recommendations are usually implemented with collaborative filtering, which recommends items by finding items similar to a given item and users similar to the user receiving the recommendations. It is a very useful technique and is at the core of many large companies such as Amazon and Netflix. Its major weakness is that it does not handle the cold-start problem well (e.g., a user with no enrollments), so to handle this we need the help of content-based filtering, which suggests items based on the user profile. Ideally, a hybrid recommendation system that implements both collaborative and content-based filtering is required. **LightFM** is a library that implements hybrid recommendation systems.

-===== LIGHTFM Advantages =====
+===== Advantages =====

-LightFM is a Python library that implements a number of popular recommendation algorithms for both implicit and explicit feedback, including efficient implementations of the BPR and WARP ranking losses. It also makes it possible to incorporate both item and user metadata into traditional matrix factorization algorithms. It has the following advantages over other libraries:
+LightFM is a Python library that implements a number of popular recommendation algorithms for both implicit and explicit feedback, including efficient implementations of the BPR and WARP ranking losses. It has the following advantages over other libraries:
   * It provides recommendations for both implicit and explicit data, whereas many other libraries support only explicit data.
   * It provides a way to track metrics such as model accuracy and precision, which helps in understanding recommendation quality.
-  * It can make recommendations by considering user and item feature data.
+  * It can make recommendations by considering user feature data (job_role, city, org_id).

 ===== Implementation =====
Line 15: Line 15:
 LightFM implements popular recommendation algorithms for both implicit and explicit feedback data. It incorporates both item and user metadata into traditional matrix factorization algorithms. It represents each user and item as the sum of the latent representations of their features, which allows recommendations to generalise to new items (via item features) and to new users (via user features). LightFM is easy to use and is well documented at the link below.

-[[http://lyst.github.io/lightfm/docs/home.html.]]
+[[http://lyst.github.io/lightfm/docs/home.html]]
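
As a rough illustration of the "sum of latent feature representations" idea described in this section, the following dependency-free sketch builds a user vector and an item vector by adding up feature embeddings and scores the pair with a dot product plus optional bias terms. It is not LightFM's own code: the feature names, the embedding values, and the helpers `represent` and `score` are all hypothetical, and a trained model would learn the embeddings rather than hard-code them.

```python
# Hypothetical 2-dimensional feature embeddings; a trained model would learn these.
USER_FEATURE_EMBEDDINGS = {
    "job_role:engineer": [0.5, 0.1],
    "city:Pune": [0.2, -0.3],
    "org_id:42": [-0.1, 0.4],
}
ITEM_FEATURE_EMBEDDINGS = {
    "course:101": [0.3, 0.2],
    "topic:safety": [0.1, 0.6],
}


def represent(features, embeddings):
    """Sum the embeddings of the given features, element by element."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for name in features:
        for i, value in enumerate(embeddings[name]):
            total[i] += value
    return total


def score(user_vec, item_vec, user_bias=0.0, item_bias=0.0):
    """Dot product of the two representations plus bias terms."""
    return sum(u * v for u, v in zip(user_vec, item_vec)) + user_bias + item_bias


# A brand-new user with no enrollments still gets a vector from profile features.
new_user = represent(["job_role:engineer", "city:Pune", "org_id:42"], USER_FEATURE_EMBEDDINGS)
course = represent(["course:101", "topic:safety"], ITEM_FEATURE_EMBEDDINGS)
print(new_user, course, score(new_user, course))
```

Because the user vector depends only on profile features, a user without any interaction history can still be scored against every course, which is how the cold-start case is covered.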
  
 A LightFM implementation can be divided into the following steps:
-  * prepare interactions, user and item feature matrices.
+  * Prepare interactions, user and item feature matrices.
   * Train the model with the available data.
-  * predict results using the trained model.
+  * Predict results using the trained model.
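
The first of these steps can be pictured without the library. The sketch below is a minimal, library-free illustration that maps raw IDs to contiguous indices and fills a binary interactions matrix from enrollment pairs. The helper names `build_index` and `build_interactions` and the sample IDs are hypothetical; in practice LightFM's `Dataset` class does this mapping and returns sparse matrices rather than nested lists.

```python
def build_index(ids):
    """Map each distinct raw ID to a contiguous integer index, in first-seen order."""
    index = {}
    for raw_id in ids:
        index.setdefault(raw_id, len(index))
    return index


def build_interactions(pairs, user_index, item_index):
    """Return a dense users x items matrix with 1 where an enrollment exists."""
    matrix = [[0] * len(item_index) for _ in user_index]
    for user_id, item_id in pairs:
        matrix[user_index[user_id]][item_index[item_id]] = 1
    return matrix


# Hypothetical rows standing in for the user, course and enrollment tables.
users = ["u1", "u2", "u3"]
courses = ["c10", "c20"]
enrollments = [("u1", "c10"), ("u3", "c20"), ("u3", "c10")]

user_index = build_index(users)
item_index = build_index(courses)
interactions = build_interactions(enrollments, user_index, item_index)
print(interactions)  # rows follow user_index, columns follow item_index
```

User `u2` ends up with an all-zero row, which is exactly the cold-start situation that user features are meant to address.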
  
 1) Preparing data matrices:
Line 59: Line 59:
       - EP OEMs Channel final test
       - EP Panel Builders Channel final test
-**Sample code:**
-from cassandra.cluster import Cluster
+
+
+===== Sample Code =====
+     from cassandra.cluster import Cluster
      from cassandra.auth import PlainTextAuthProvider
      from lightfm import LightFM
Line 67: Line 69:
      from lightfm.data import Dataset
      import numpy as np
-   
+     
      auth_provider = PlainTextAuthProvider(username='cassandra', password='cassandra')
      cluster = Cluster(['107.170.83.67'], auth_provider=auth_provider)
Line 76: Line 78:
      item_rows = session.execute("select course_id,title from e1_course")
      enrollment_rows = session.execute("select user_id,course_id from enrollment_e1")
-     dataset.fit(users=(x.user_id for x in user_rows), items=(x.course_id for x in  item_rows),     user_features=([x.city,x.job_role,x.org_id] for x in user_rows))
+     user_features = ([x.city,x.job_role,x.org_id] for x in user_rows)
+     dataset.fit(users=(x.user_id for x in user_rows), items=(x.course_id for x in  item_rows),user_features=user_features)
+     interaction_data = ([x.user_id,x.course_id] for x in enrollment_rows)
+     (interactions,weights) = dataset.build_interactions(data=interaction_data)
      num_users, num_items = dataset.interactions_shape()
-     (interactions,weights) = dataset.build_interactions(data=([x.user_id,x.course_id] for x in enrollment_rows))
-      user_features = dataset.build_user_features(((user.user_id, [user.job_role,user.org_id,user.city]) for user in user_rows),False)
-      model = LightFM(no_components=30, k=1, n=20, loss='warp', learning_schedule='adagrad')
-      model.fit_partial(interactions, user_features=user_features, epochs=30)
-      results = model.predict(dataset.mapping()[0][user_id], np.arange(num_items), user_features=user_features, num_threads=1)
-      results = np.argsort(-results)
-      def similar_items(internal_idx, item_feats_mtx, model, N=10):
-            item_representations = model.get_item_representations()[1]
-            scores = item_representations.dot(item_representations[internal_idx, :])
-            item_norms = np.linalg.norm(item_representations, axis=1)
-            scores /= item_norms
-            best = np.argpartition(scores, -N)[-N:]
-            return sorted(zip(best, scores[best] / item_norms[internal_idx]),key=lambda x: -x[1])
-      sim_items = similar_items(dataset.mapping()[2][item_id],item_features,model)
+     user_feat = ((user.user_id, [user.job_role,user.org_id,user.city]) for user in user_rows)
+     user_features = dataset.build_user_features(user_feat,False)
+     model = LightFM(no_components=30, k=1, n=20, loss='warp', learning_schedule='adagrad')
+     model.fit_partial(interactions, user_features=user_features, epochs=30)
+     results = model.predict(dataset.mapping()[0][user_id], np.arange(num_items), user_features=user_features, num_threads=1)
+     results = np.argsort(-results)
+     def similar_items(internal_idx, item_feats_mtx, model, N=10):
+          item_representations = model.get_item_representations()[1]
+          scores = item_representations.dot(item_representations[internal_idx, :])
+          item_norms = np.linalg.norm(item_representations, axis=1)
+          scores /= item_norms
+          best = np.argpartition(scores, -N)[-N:]
+          return sorted(zip(best, scores[best] / item_norms[internal_idx]),key=lambda x: -x[1])
+     sim_items = similar_items(dataset.mapping()[2][item_id],item_features,model)
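
The `similar_items` helper in the sample code ranks items by cosine similarity between item embeddings. The sketch below shows that ranking on its own, in plain Python so it runs without NumPy or a trained model. The function name `cosine_similar_items` and the toy embeddings are hypothetical; with a real model the embeddings would come from `model.get_item_representations()[1]`, as in the sample code.

```python
import math


def cosine_similar_items(embeddings, target_idx, n=10):
    """Return up to n (index, cosine similarity) pairs, most similar first."""
    target = embeddings[target_idx]
    target_norm = math.sqrt(sum(x * x for x in target))
    scored = []
    for idx, vec in enumerate(embeddings):
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0.0 or target_norm == 0.0:
            similarity = 0.0  # avoid dividing by zero for all-zero vectors
        else:
            dot = sum(a * b for a, b in zip(vec, target))
            similarity = dot / (norm * target_norm)
        scored.append((idx, similarity))
    scored.sort(key=lambda pair: -pair[1])
    return scored[:n]


# Hypothetical item embeddings: items 0 and 2 point in the same direction.
item_embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
top = cosine_similar_items(item_embeddings, target_idx=0, n=3)
print(top)
```

As in the sample code, the query item itself appears in the result with similarity 1.0, so callers that want only other items should drop it from the list.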
intelligence_engine/product_recommendation_by_lightfm.1561460817.txt.gz · Last modified: 2019/06/25 11:06 by 182.72.26.6