How GO-FOOD built a Query Semantics Engine to help you find food faster
Ishita Mathur (~ishita)
Context: The Search Problem
GOJEK is a SuperApp: 19+ apps within an umbrella app. One of these is GO-FOOD, the first food delivery service in Indonesia and the largest food delivery service in Southeast Asia. There are over 300 thousand restaurants on the platform with a total of over 16 million dishes between them.
Over two-thirds of those who order food online using GO-FOOD do so through text search. Search engines are so essential to our everyday digital experience that we don’t think twice when using them anymore. A search engine performs two primary tasks: retrieving documents and ranking them in order of relevance. While improving that ranking is an extremely important part of improving the search experience, actually understanding the query is what helps give searchers exactly what they’re looking for.
Query Understanding: Why?
This is where Query Understanding comes into the picture: it’s about interpreting the search query before any results are retrieved and ranked. The semantic neighbours of the query itself become the focus of the search process: after all, if I don’t understand what you’re trying to ask for, how will I give you what you want?
In this talk, you will learn about our journey from a purely ElasticSearch-based stack, which returned only exact text matches and/or fuzzy matches, to an enhanced search experience with a Query Semantics Engine, built in Python using open-source tools & libraries, that correctly identifies the intent behind the search query and returns more relevant results. You will learn how we are taking advantage of word embeddings for a holistic Query Understanding approach designed to make the customer’s experience as smooth as possible.
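To make the contrast between fuzzy text matching and embedding-based matching concrete, here is a minimal sketch using only the Python standard library. The vocabulary and the 3-dimensional vectors are hypothetical, hand-picked values for illustration; they are not GO-FOOD data, and a real system would learn much larger vectors with a model such as Word2Vec. `difflib.SequenceMatcher` stands in for fuzzy matching here and is not how ElasticSearch computes fuzziness.

```python
import math
from difflib import SequenceMatcher

# Hypothetical toy embeddings (assumption: hand-picked, not learned).
# Similar foods are given similar vectors on purpose.
TOY_EMBEDDINGS = {
    "burger": [0.9, 0.1, 0.0],
    "cheeseburger": [0.85, 0.2, 0.05],
    "fries": [0.7, 0.3, 0.1],
    "martabak": [0.1, 0.9, 0.2],
    "burjo": [0.05, 0.3, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def semantic_neighbours(query, embeddings, top_n=2):
    """Rank vocabulary terms by embedding similarity to the query term."""
    query_vec = embeddings[query]
    scored = [
        (term, cosine_similarity(query_vec, vec))
        for term, vec in embeddings.items()
        if term != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def fuzzy_neighbours(query, vocabulary, top_n=2):
    """Rank vocabulary terms by character-level similarity to the query."""
    scored = [
        (term, SequenceMatcher(None, query, term).ratio())
        for term in vocabulary
        if term != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    print("semantic:", semantic_neighbours("burger", TOY_EMBEDDINGS))
    print("fuzzy:   ", fuzzy_neighbours("burger", TOY_EMBEDDINGS))
```

With these toy values, the character-level ranking surfaces "burjo" because it shares the prefix "bur", while the embedding-based ranking surfaces "fries" because its vector lies close to that of "burger". That difference is the kind of gap a semantics layer is meant to close.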
The primary objective of the talk is for you to learn why deriving query semantics is essential to building a great search engine, and how you can go about building a Query Semantics Engine. I will talk about how to:
- Take advantage of word embeddings for building an intelligent search engine
- Choose between the different algorithms (such as Doc2Vec, Word2Vec) and implementations (such as gensim, StarSpace)
- Deal with data challenges
- Choose the right metric when evaluating performance of a Search Engine
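As one illustration of what a search evaluation metric can look like, the sketch below computes Mean Reciprocal Rank (MRR), a standard ranking metric. This is an assumption chosen for illustration: the abstract does not say which metric GO-FOOD settled on, and the restaurant IDs and sessions here are invented.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/position of the first relevant result, or 0.0 if none appears."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(sessions):
    """Average the reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not sessions:
        return 0.0
    total = sum(reciprocal_rank(ranked, relevant) for ranked, relevant in sessions)
    return total / len(sessions)


# Hypothetical search sessions: the results shown, and the ones the user wanted.
EXAMPLE_SESSIONS = [
    (["r1", "r2", "r3"], {"r2"}),  # first relevant result at position 2
    (["r4", "r5"], {"r4"}),        # first relevant result at position 1
    (["r6"], {"r9"}),              # no relevant result returned
]

if __name__ == "__main__":
    print(mean_reciprocal_rank(EXAMPLE_SESSIONS))
```

MRR rewards putting one good result near the top, which may suit a flow where the user picks a single restaurant; a metric such as NDCG would be a more natural fit when several results carry graded relevance.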
This talk follows a technical case study format: I will go over the techniques we used to build each component of the engine, the data and algorithmic challenges we faced, and how we solved each problem we came across.
A basic familiarity with the following would be useful for this talk:
- Word embedding algorithms such as word2vec
- Draft slides: http://bit.ly/query-semantics-engine-slides-v4
- Short video about talk: https://youtu.be/D9DsAvMwgXE
Ishita has been working as a Data Scientist since 2016 with product-based startups, understanding business concerns in various domains and formulating them as technical problems that can be solved using data and ML. Her current work at GO-JEK involves end-to-end development of ML projects, working as part of a product team to define, prototype and implement data science models within the product. She has also published a book, “Applied Supervised Learning with Python”, with Packt.
Ishita completed her Master’s degree in High Performance Computing with Data Science at the University of Edinburgh, UK, and her Bachelor’s degree with Honours in Physics at St. Stephen’s College, Delhi.
- Medium profile (blogs): https://firstname.lastname@example.org
- LinkedIn profile: https://linkedin.com/in/imathur/
- Author profile on Goodreads: https://goodreads.com/imathur
- Github profile: https://github.com/imathur