Creating a recommendation engine based on NLP and contextual word embeddings

Manas Ranjan Kar (~manasRK)



Description:

How can we create a recommendation engine that is based both on user browsing history and product reviews? Can I create recommendations purely based on the 'intent' and 'context' of the search? How do I use natural language processing techniques to create valid recommendations?

This talk will showcase how a recommendation engine can be built from user browsing history and user-generated reviews using a state-of-the-art technique, word2vec. We will create something that not only matches the existing recommender systems deployed by websites but goes one step further, incorporating context to generate valid and innovative recommendations. The beauty of such a framework is that it not only supports online learning but is also sensitive to minor changes in user tone and behavior.

The secret sauce is how we account for 'context' and build it into our systems. The talk will answer this question and showcase the effectiveness of such a recommender system.

MOTIVATIONS & PRACTICAL APPLICATIONS:

Current recommender systems tend to misfire when user history is not known or when new products are introduced into the mix. Missing user ratings add more complexity and hinder relevant recommendations. Also, current websites, especially in the domains of food and travel, don't allow a contextual search such as "best chicken tikkas in South Delhi". The results depend on keywords appearing in the title or description, but rarely on the reviews and user browsing history.

We have built two models using word2vec:

META MODEL: Created with the user browsing history. This contains more than 9.4 million product histories. The attempt is to mimic and improve upon existing collaborative filtering systems.
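
The idea of treating a browsing session like a sentence can be sketched as follows. This is only an illustration under stated assumptions: the sessions and product IDs are invented, and plain co-occurrence counting stands in for word2vec, which would instead learn dense vectors from the same (target, context) pairs.

```python
from collections import Counter

def skipgram_pairs(session, window=2):
    """Yield (target, context) product pairs from one browsing session."""
    for i, target in enumerate(session):
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield target, session[j]

# Hypothetical browsing sessions: each is an ordered list of product IDs.
sessions = [
    ["tent", "sleeping_bag", "camp_stove"],
    ["tent", "sleeping_bag", "headlamp"],
    ["novel", "bookmark"],
]

# Count how often each pair of products shares a context window.
pair_counts = Counter(
    pair for session in sessions for pair in skipgram_pairs(session)
)

def co_viewed(product, topn=3):
    """Rank products by how often they appear near `product` in a session."""
    scores = Counter(
        {ctx: n for (tgt, ctx), n in pair_counts.items() if tgt == product}
    )
    return [item for item, _ in scores.most_common(topn)]

print(co_viewed("tent"))
```

With a real word2vec implementation, the same session lists would be passed in as the training corpus and the nearest neighbours of a product vector would play the role of `co_viewed`.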

USER REVIEW MODEL: Currently, websites don’t allow us to search on “context”. This model takes in reviews and attempts to create a framework for a “contextual search engine”.
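
One way such a contextual search could work is to embed both the query and each review as an average of word vectors and rank by cosine similarity. The sketch below is an assumption about the approach, not the talk's actual code: the three-dimensional vectors, restaurant names and review texts are all invented, and in practice the vectors would come from a word2vec model trained on the reviews.

```python
import math

# Toy word vectors, invented for illustration only.
toy_vectors = {
    "chicken": [0.9, 0.1, 0.0],
    "tikka":   [0.8, 0.2, 0.0],
    "spicy":   [0.7, 0.3, 0.1],
    "pizza":   [0.1, 0.9, 0.0],
    "cheesy":  [0.2, 0.8, 0.1],
    "delhi":   [0.0, 0.1, 0.9],
}

def embed(text):
    """Average the vectors of known words; return None if none are known."""
    vecs = [toy_vectors[w] for w in text.lower().split() if w in toy_vectors]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical restaurants mapped to their (very short) review text.
reviews = {
    "Tandoor House": "spicy chicken tikka",
    "Slice Corner": "cheesy pizza",
}

def search(query):
    """Return restaurant names ordered by similarity of reviews to the query."""
    q = embed(query)
    if q is None:
        return []
    scored = []
    for name, text in reviews.items():
        vec = embed(text)
        if vec is not None:
            scored.append((cosine(q, vec), name))
    return [name for _, name in sorted(scored, reverse=True)]

print(search("best chicken tikka in delhi"))
```

Because the match is made in vector space, a review can rank highly even when it shares few exact keywords with the query, which is the behaviour a keyword search on titles and descriptions lacks.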

This is our attempt to propose and demonstrate a framework that’s more rounded and preserves context while generating recommendations. The current architecture looks something like this:

[Figure: Word2vec recommender architecture]

The top results have high affinity with one another, and also appear on Amazon’s website itself in the “also bought/also viewed” section. The precision at 3 results (P@3) is currently 58%. However, P@15 is at 100%. We are currently improving the system and working on pre-processing techniques.
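
For readers unfamiliar with the metric, precision at k can be computed as below. This is a sketch assuming the standard definition (the share of the top k recommendations that are relevant); the proposal does not state how relevance was judged, and the item names here are made up.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that appear in `relevant`."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Hypothetical ranked recommendations and ground-truth relevant items.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c"}

print(precision_at_k(recommended, relevant, 3))
```

Here two of the top three items are relevant, so the call evaluates to 2/3.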

Prerequisites:

Participants must be well versed in Python and have a basic understanding of natural language processing. Code and the required documentation will be provided after the session.

Content URLs:

Code: https://github.com/manasRK/word2vec-recommender

The code is currently messy and will be cleaned up.

Slides link: https://docs.google.com/presentation/d/1D4kdRbpHIZJ6YJc0huCRjiipNImC3rZxKSUdP_uub2U/edit?usp=sharing

Speaker Info:

Manas is currently leading the text analytics practice at Juxt Smart Mandate, a data science company. He likes helping clients make sense of their data and build a powerful case for business change using analytics in their respective companies.

He has architected multiple commercial NLP solutions in the areas of healthcare, food & beverages, finance and retail. He is deeply involved in functionally architecting large-scale business process automation and deriving deep insights from structured & unstructured data using Natural Language Processing & Machine Learning.

To sum up his experience, he has worked on:

  • Applying machine learning to build text analytics solutions
  • Automating business processes for efficiency & productivity
  • Building algorithms for extracting multiple facets from text - gender of author, keywords, sentiment, taxonomies, concepts, entities
  • Combining and augmenting unstructured insights with structured data
  • Building a recommendation engine for automated medical coding services
  • Building models to predict taxonomies for textual content
  • Creating machine learning algorithms for topic detection & sentiment
  • Building competitive intelligence algorithms to monitor events & trends for startups & SMEs

His detailed LinkedIn profile is https://in.linkedin.com/in/manasranjankar .

Manas has contributed to multiple NLP libraries like Gensim and Conceptnet5. He blogs regularly on NLP on multiple forums like Data Science Central, LinkedIn and his blog Unlock Text. He is currently ranked 1035th on Kaggle among more than half a million Kagglers in the world. He loves teaching and mentoring students. He speaks regularly on NLP and analytics at national conferences and gives guest talks at IIM Lucknow and MDI Gurgaon. He has also mentored students from schools like ISB Hyderabad, BITS Pilani and Madras School of Economics.

Akhil Gupta is currently a 4th-year B.Tech student at SRM University and works as a software developer at SRM Search Engine, a government-funded project. He has 2+ years of experience in data science, with major expertise in data mining, text analysis, social media analysis and back-end architectures.

He has also worked in the areas of:

  • Natural language processing
  • Classification algorithms
  • Topic modelling
  • Clustering using probabilistic models
  • Twitter mining

He likes to build software that in some way eases human effort. Some examples are:

  • Content-based semantic image retrieval
  • Language model with features such as autocomplete, entity tagger, spell check, word segmenter, etc.
  • Entity tagger built on a Wikipedia dataset for entity tagging as well as domain identification
  • Topic modelling
  • Restaurant recommendation engine based on food items
  • Adjective and pronoun coreference resolution

His detailed LinkedIn profile is https://in.linkedin.com/in/akhilgupta0910 .

Speaker Links:

Manas Ranjan Kar

  1. LinkedIn : https://in.linkedin.com/in/manasranjankar
  2. Contribution to Gensim (PR #625): https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/scripts/glove2word2vec.py
  3. Blog: http://unlocktext.com/
  4. Related Blog Article: http://unlocktext.com/index.php/2015/12/14/using-glove-vectors-in-gensim/
  5. Context oriented NLP: https://www.linkedin.com/pulse/context-extraction-better-sentiment-analysis-manas-ranjan-kar?trk=prof-post
  6. Analysing product reviews for context cues: http://www.datasciencecentral.com/profiles/blogs/impactful-text-analytics-for-smarter-businesses

Akhil Gupta

  1. LinkedIn : https://in.linkedin.com/in/akhilgupta0910
  2. Github : https://github.com/codeorbit
  3. Contribution to SRMSE : https://github.com/SRMSE
  4. Twitter : https://twitter.com/decoding_life

Section: Data Visualization and Analytics
Type: Talks
Target Audience: Intermediate

This looks interesting!

writetonikhil
