Biomedical text mining with Python: Some experiments

Authors Jaganadh G, Dr. Carlos Rodriguez Penagos
Level Intermediate
Topic None of the above
Tags BIONLP,NLP,Python,Text Mining
Summary

Biomedical text mining refers to text mining applied to biomedical literature. It is one of the more recent research areas in Natural Language Processing, bioinformatics and computational linguistics. The proposed talk will focus on how Python and Natural Language Processing techniques can be used for biomedical text processing.

Outline

The talk will focus on the following aspects:

1) The bioreader module in the nltk_contrib package (nltk.org). Bioreader allows the creation of a biomedical corpus from keyword queries or PMID lists (references to PubMed articles). It also parses the PubMed and MEDLINE XML formats. It was originally written for Information Extraction purposes and differs from the other corpus modules in NLTK in that it creates the repositories on the fly and stores semantic and bibliographical metadata with each record. It is intended for teaching BioNLP and for basic text mining. The module was written by Carlos Rodriguez Penagos and contributed to the nltk_contrib package. At present, Jaganadh G is rewriting the package to include more text mining facilities.

2) How to retrieve gene information from biomedical literature available in the NCBI NLP service
3) Some experiments with MEDLINE data and Python
4) An overview of biomedical modules for Python
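As a rough illustration of the workflow in item 1 (fetching records by PMID and reading bibliographic metadata out of PubMed XML), the sketch below uses only the Python standard library. It is not bioreader's API: the names `efetch_url`, `parse_pubmed_xml` and `SAMPLE_XML` are hypothetical, and the sample record is a synthetic placeholder rather than a real citation. It assumes the public NCBI E-utilities `efetch` endpoint and the usual `PubmedArticle` / `MedlineCitation` element layout.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint (an assumption of this sketch, not part of bioreader).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Synthetic placeholder record that mimics the PubMed XML layout.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>00000001</PMID>
      <Article>
        <ArticleTitle>Placeholder title about a gene</ArticleTitle>
        <Abstract>
          <AbstractText>Placeholder abstract text.</AbstractText>
        </Abstract>
      </Article>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Genes</DescriptorName></MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def efetch_url(pmids):
    """Build an efetch URL that requests the given PMIDs as PubMed XML."""
    query = {
        "db": "pubmed",
        "id": ",".join(str(p) for p in pmids),
        "retmode": "xml",
    }
    return EFETCH + "?" + urlencode(query)


def _text(node):
    """Return all text inside a node (including nested markup), or '' if missing."""
    return "".join(node.itertext()).strip() if node is not None else ""


def parse_pubmed_xml(xml_text):
    """Return one dict per PubmedArticle: PMID, title, abstract and MeSH terms."""
    records = []
    root = ET.fromstring(xml_text)
    for article in root.iter("PubmedArticle"):
        citation = article.find("MedlineCitation")
        if citation is None:
            continue  # skip records without a citation block
        abstract = " ".join(
            _text(node)
            for node in citation.iterfind("Article/Abstract/AbstractText")
        )
        mesh = [
            _text(node)
            for node in citation.iterfind(
                "MeshHeadingList/MeshHeading/DescriptorName"
            )
        ]
        records.append({
            "pmid": citation.findtext("PMID", default=""),
            "title": _text(citation.find("Article/ArticleTitle")),
            "abstract": abstract,
            "mesh": mesh,
        })
    return records


if __name__ == "__main__":
    print(efetch_url([1, 2]))
    for record in parse_pubmed_xml(SAMPLE_XML):
        print(record["pmid"], record["title"], record["mesh"])
```

In practice the XML text would come from downloading the URL built by `efetch_url`; the parsing step is kept separate here so it can be tried offline on a saved file.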

Notes

Source code for the programs will be available in the following repository: http://bitbucket.org/jagan/bioreader

Profile of the authors

Dr. Carlos Rodriguez Penagos is a senior researcher in the Language and Voice area of Barcelona Media (GLICOM). His major areas of expertise are Information Extraction, Text Mining, Natural Language Processing, Knowledge Engineering and Computer-Aided Translation. He has taught and coordinated various international research projects at the National Autonomous University of Mexico (UNAM), the Universitat Pompeu Fabra (UPF) and the National Cancer Research Center (CNIO), where he mined biomedical literature. He has been awarded various research grants by the National Science and Technology Council (Mexico) and the Generalitat de Catalunya Government (Spain).

Jaganadh G is a consultant in Natural Language Processing at 365MEDIA, Coimbatore. His major areas of interest are Information Extraction, Natural Language Processing and Machine Aided Translation research. He has been working in the field of Computational Linguistics and Natural Language Processing for the past 5 years.

Files
File        Size     Uploaded            Comment
bmtmpn.odp  17.9 KB  September 24, 2010  Modified and latest file
bmtmp.odp   16.4 KB  September 19, 2010  Initial presentation; has to be altered before the conference
