Python for Multi-Omics Data Analytics
MANIMARAN NC (~manimaran)
DNA, the genetic material in most biological systems, is found inside the cell and, in concert with other cell components, directs which proteins are produced and in what quantity. Proteins are the building blocks of the cell. Different components are produced at different stages of cellular processes. For example, transcription produces nascent mRNAs, and these mRNAs, together with other RNAs, are translated into protein. Quantifying the cell components at each of these stages is very important for characterizing the state of the cell. Collectively, the heterogeneous datasets produced this way are called Multi-Omics data (Genomics: DNA sequence data; Transcriptomics: gene expression data; Proteomics: protein data; Metabolomics: metabolite data; etc.).
Generally, statistical analyses are performed, or machine learning models are built, on the datasets from each omics layer separately to identify feature biomarkers. The feature biomarkers from each layer can then be integrated to explain a biological phenomenon. However, results from this approach can be biased towards certain datasets. In addition, these individual analyses do not take into account the interactions between the omics layers.
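The layer-by-layer approach described above can be sketched roughly as follows. This is only an illustration on synthetic data, not the method used in the talk: the layer names, the matrix sizes, the planted signals, and the helper functions `welch_t` and `top_features` are all assumptions made for the example, and a Welch t-statistic stands in for whatever per-layer statistical test or model one might actually use.

```python
import numpy as np

def welch_t(layer, labels):
    """Welch t-statistic for every feature (column), comparing group 1 with group 0."""
    a, b = layer[labels == 0], layer[labels == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (b.mean(axis=0) - a.mean(axis=0)) / se

def top_features(layer, labels, k=3):
    """Indices of the k features with the largest absolute t-statistic."""
    t = welch_t(layer, labels)
    return np.argsort(-np.abs(t))[:k]

# Synthetic example: 40 samples in two groups, rows are samples, columns are features.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
layers = {
    "transcriptomics": rng.normal(size=(40, 50)),  # hypothetical gene expression matrix
    "proteomics": rng.normal(size=(40, 30)),       # hypothetical protein abundance matrix
}
# Plant one group difference per layer so there is something to find.
layers["transcriptomics"][labels == 1, 4] += 3.0
layers["proteomics"][labels == 1, 7] += 3.0

# Each layer is analysed on its own; the hit lists are only combined afterwards.
per_layer_hits = {name: top_features(x, labels) for name, x in layers.items()}
print(per_layer_hits)
```

Note that nothing in this sketch looks at how a transcript and a protein vary together across samples, which is the limitation the paragraph above points out.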
What is needed are multivariate analyses that find feature biomarkers that can classify the groups effectively while also maximizing the correlation between datasets from different omics layers. Functions written in Python for these kinds of analyses will be very useful for discovering biomarkers.
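As a minimal sketch of the "maximize the relationship between layers" part of this idea, the example below takes the leading singular vectors of the cross-covariance matrix between two layers, which gives the pair of feature weightings whose sample scores have maximal covariance (the first component of a PLS-style analysis). It is an assumption-laden illustration, not the talk's implementation: the data are synthetic, the function name `first_pls_component` is hypothetical, and the group-classification part of the problem is left out.

```python
import numpy as np

def first_pls_component(X, Y):
    """Unit weight vectors (u, v) maximizing the covariance between X @ u and Y @ v.

    X and Y are (samples x features) matrices measured on the same samples.
    """
    Xc = X - X.mean(axis=0)  # centre each feature
    Yc = Y - Y.mean(axis=0)
    # The leading singular vectors of the cross-covariance matrix solve the problem.
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    return u, v, Xc @ u, Yc @ v

# Synthetic example: one shared latent signal drives two features in each layer.
rng = np.random.default_rng(1)
n = 60
latent = rng.normal(size=n)
X = rng.normal(size=(n, 40))  # hypothetical layer 1, e.g. transcript levels
Y = rng.normal(size=(n, 25))  # hypothetical layer 2, e.g. protein levels
X[:, [2, 5]] += 2.0 * latent[:, None]
Y[:, [1, 9]] += 2.0 * latent[:, None]

u, v, x_scores, y_scores = first_pls_component(X, Y)
top_x = np.argsort(-np.abs(u))[:2]  # features carrying the largest weights in X
top_y = np.argsort(-np.abs(v))[:2]  # features carrying the largest weights in Y
score_corr = np.corrcoef(x_scores, y_scores)[0, 1]
print(sorted(top_x), sorted(top_y), round(score_corr, 2))
```

Features with large absolute weights are candidate cross-layer biomarkers. A supervised variant would additionally use the group labels when choosing the weights.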
Basic knowledge of Python and its packages; basics of molecular biology (DNA, RNA, microRNA, protein, transcription, translation, etc.).
- Primarily codes in Python.
- Believes in data-driven research.
- Uses machine learning algorithms to solve data science problems in biology.