Gensim is a machine learning package for natural language understanding. For example, it can tell you the main topics of a web page. It includes the word2vec and doc2vec machine learning algorithms.

During the coding sprint we plan to rework our tutorials. They are listed on our GitHub page at https://github.com/RaRe-Technologies/gensim/blob/develop/tutorials.md. Come to the sprint and improve them or create new ones!


Open to beginners.

No machine learning experience necessary.

Some Python knowledge required. You need to know what a for loop is, but you don't need to know what zip* does.

__Environment setup:__

Python 3

pip3 install cython gensim scikit-learn pandas matplotlib nltk pyemd jupyter
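
To confirm the install worked before the sprint, a small check like the one below may help. It is only a sketch: the import names in `REQUIRED_MODULES` are assumptions about how the packages from the install command are imported (for example, `Cython` for cython and `jupyter_core` for jupyter), and `find_missing` is a hypothetical helper, not part of gensim.

```python
import importlib.util
import sys

# Import names assumed for the packages in the install command above.
# Adjust the list if your installation exposes different module names.
REQUIRED_MODULES = [
    "Cython",
    "gensim",
    "sklearn",
    "pandas",
    "matplotlib",
    "nltk",
    "pyemd",
    "jupyter_core",
]


def find_missing(module_names):
    """Return the module names that cannot be located on this interpreter."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    if sys.version_info.major < 3:
        print("Python 3 is required, found", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Save it under any file name and run it with python3. Any module it reports as missing can be installed again with pip3.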

The tutorials that need improvement are in https://github.com/RaRe-Technologies/gensim/blob/develop/tutorials.md

Lev Konstantinovskiy is a maintainer of Gensim.

He is an expert in natural language processing and a Python and Java developer. Lev has extensive experience working with financial institutions and is RaRe Technologies' manager of open source communities, including gensim, an open source machine learning toolkit for understanding human language.

Section: Machine Learning
Type: Dev Sprint
Target Audience: Beginner
Go gensim :D

Bhargav Srinivasa (~bhargavvader)

Would be a wonderful experience for anyone interested in getting a hands-on experience in machine learning and natural language processing!

Devashish Deshpande (~dsquareindia)

Hello Lev, Please add the detailed steps to setup the developer environment for this project. Many Thanks!

UltimateCoder (~ultimatecoder)

@UltimateCoder. Thanks for the tip. Added env setup to the pre-reqs section.

Lev Konstantinovskiy (~tmylk)

Is there a Docker container with everything at the exact versions you recommend, for easier installation?

Jeff Rush (~jeff)

