Named Entity Resolution for beer SKUs using Python
ANINDO CHAKRABORTY (~anindo78)
CPG (consumer packaged goods) companies today extract data from their customers' point-of-sale (POS) systems or obtain it from external providers. Even so, most CPG companies have little visibility into what consumers are actually buying. Consumers of beer buy from retail stores, and those retail stores are the end customers of CPG companies. From a commercial strategy perspective, it is very important to understand a retailer's sales and what consumers are buying across time, seasons, events, brands and, lastly, styles.

The main problem in using different sources of consumer POS sales data is that a SKU is rarely identified consistently. This is a named entity problem: an entity such as "Budweiser 330 ML 6 pack" may be set up as "BudwISER 6 pk" in a particular store. The challenge is to resolve short noun phrases, namely brand and SKU names from customer POS systems, whereas NLP usually deals with full language and sentences. This talk is about using NER methods to resolve this issue and putting the solution into production at high accuracy, so that data scientists work on correct data rather than spending time manually resolving SKU names.
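To make the SKU-matching problem concrete, here is a minimal sketch that maps a messy POS name to a canonical SKU. It is not the method presented in the talk: it swaps in plain fuzzy string matching from Python's standard-library `difflib` as a simple stand-in for the NER approach. The catalogue entries, the abbreviation map, the function names `normalize` and `resolve`, and the 0.6 threshold are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical canonical catalogue; real SKU master data would replace this.
CATALOG = [
    "Budweiser 330 ML 6 pack",
    "Budweiser 500 ML can",
    "Corona Extra 355 ML 6 pack",
    "Stella Artois 330 ML 4 pack",
]

# Assumed store-level abbreviations, expanded before comparison.
ABBREVIATIONS = {"pk": "pack", "btl": "bottle", "cn": "can"}


def normalize(name):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def resolve(raw_name, catalog=CATALOG, threshold=0.6):
    """Return (best canonical name or None, similarity score).

    The threshold is an illustrative guess, not a tuned value.
    """
    query = normalize(raw_name)
    best_name, best_score = None, 0.0
    for candidate in catalog:
        score = SequenceMatcher(None, query, normalize(candidate)).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    if best_score < threshold:
        return None, best_score  # too dissimilar: leave for manual review
    return best_name, best_score


if __name__ == "__main__":
    name, score = resolve("BudwISER 6 pk")
    print(name, round(score, 2))
```

With this toy catalogue, "BudwISER 6 pk" resolves to "Budweiser 330 ML 6 pack" despite the misspelling and the missing volume. Names that fall below the threshold return `None`, so they can be routed to manual review instead of being matched incorrectly.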
- Basic understanding of NLP libraries
- Basic understanding of what the NER problem is
- Some understanding of CPG / Retail helps
Anindo is a Director of Data Science at ABInBev, the world's largest beer company. His current role is to ensure that solutions are developed by applying ML to the company's sales initiatives. His background is in quantitative economics, and his career has followed a path of solving problems using statistics and econometric models. He has been in leadership roles for a long time but has stayed hands-on, and he is fairly new to Python, having used it for the last 3 years. He also hosted one PyData session in Bangalore at the ABInBev offices this year. He considers himself a newbie but aspires to contribute more to open source and attend more PyCon conferences.