Using genetic algorithms to improve text classification
Achieving higher performance in text classification through a combination of:
- Feature engineering
- Feature selection
- Optimizing the feature selection and ensemble learning process through a combination of search and genetic algorithms
Most importantly, this talk discusses feature selection, which is often overlooked in text classification: the different methods of doing feature selection, and which of them work better than others.
Feature selection not only improves model performance but also reduces vector and model size. The talk also covers how to combine feature selection with different types of features and with ensembling of results.
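To make the point about vector size concrete, here is a minimal sketch of a filter-style feature selection step. It is an illustration only, not the method presented in the talk or the API of any library listed here: the toy corpus, the scoring rule (difference in per-class document frequency), and the helper names `doc_frequency`, `select_top_k`, and `vectorize` are all assumptions made for this example.

```python
from collections import Counter

# Toy labelled corpus (hypothetical data, for illustration only).
docs = [
    ("cheap pills buy now", 1),
    ("buy cheap watches now", 1),
    ("meeting agenda for monday", 0),
    ("please review the agenda", 0),
]

def doc_frequency(docs, label):
    """Count, per term, how many documents of the given class contain it."""
    counts = Counter()
    for text, y in docs:
        if y == label:
            counts.update(set(text.split()))
    return counts

def select_top_k(docs, k):
    """Keep the k terms whose document frequency differs most between classes."""
    pos, neg = doc_frequency(docs, 1), doc_frequency(docs, 0)
    n_pos = sum(1 for _, y in docs if y == 1)
    n_neg = len(docs) - n_pos
    vocab = sorted(set(pos) | set(neg))
    score = {t: abs(pos[t] / n_pos - neg[t] / n_neg) for t in vocab}
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(vocab, key=lambda t: (-score[t], t))
    return vocab, ranked[:k]

def vectorize(text, features):
    """Binary bag-of-words vector restricted to the given feature list."""
    tokens = set(text.split())
    return [1 if f in tokens else 0 for f in features]

vocab, selected = select_top_k(docs, k=4)
full_vec = vectorize(docs[0][0], vocab)      # one entry per vocabulary term
small_vec = vectorize(docs[0][0], selected)  # one entry per selected term
print(len(full_vec), len(small_vec), selected)
```

In this toy setup the document vector shrinks from the full vocabulary to the four selected terms. A real pipeline would more likely use an established criterion such as chi-square or mutual information in place of the simple frequency difference assumed here.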
0-5 mins: Introduction.
5-15 mins: Current methods.
15-25 mins: Main agenda (my methodology for text classification through feature engineering, feature selection, and ensembling, and for optimizing the process with a genetic algorithm and other search algorithms)
25-30 mins: Closing remarks and questions.
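As a rough idea of what the genetic-algorithm part of the agenda involves, the sketch below searches over binary feature masks using tournament selection, single-point crossover, bit-flip mutation, and elitism. It is a generic textbook-style sketch under stated assumptions, not the speaker's methodology: the function `evolve`, its parameters, and the stand-in `toy_fitness` are hypothetical. In practice the fitness function would be something like the cross-validated score of a classifier trained on the selected features.

```python
import random

def evolve(fitness, n_features, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Search for a binary feature mask (1 = keep feature) maximizing `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best

# Hypothetical stand-in fitness: reward three "useful" features and
# penalize every extra feature kept. Assumed for illustration only.
USEFUL = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(mask[i] for i in USEFUL)
    extras = sum(mask) - hits
    return hits - 0.5 * extras

best_mask = evolve(toy_fitness, n_features=10)
print(best_mask, toy_fitness(best_mask))
```

Because the best mask of each generation is copied into the next one unchanged, the best fitness never decreases from one generation to the next. The penalty on extra features is one simple way to push the search toward smaller feature sets.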
- An interest in NLP and text classification, especially how to improve text classification model performance.
- Basic understanding of different text vector representations
- Basic understanding of ensemble learning.
Data Scientist with 10+ years of experience. Interests include NLP, signal processing, and mathematical optimization. Author of 5 Python libraries used in machine learning, geospatial data analysis, and signal processing.
Links to Python libraries and previous talks
TextFeatureSelection PyPI Python library for feature selection for text classification
SNgramExtractor PyPI Python library for syntactic n-gram feature extraction
Example code to create Flask API for Keras deep learning NLP model
Example code to extract noun and adjective pairs using context-free grammar
BaselineRemoval PyPI Python library for baseline removal from spectral data
Geospatial data analysis
Pandas2Shp PyPI Python library for creating shp files for geospatial analysis from longitude and latitude
EvolutionaryFS PyPI Python library for feature selection using evolutionary algorithms
Previous meetup talk slides
Dependency Parsing in NLP