Deep learning powered Genomic Research -- Keras package
Usha Rengaraju (~usha75)
The Python package used is Keras.
This proposal was accepted at ODSC India 2019, and we are also featured speakers in the ODSC warm-up webinar.
Disease occurs when there is a slip in the finely orchestrated dance between physiology, environment and genes. Treatment with chemicals (natural, synthetic or a combination) cured some diseases, but others persisted and were passed down the generations. The molecular basis of disease therefore became a prime focus of study for understanding and analyzing root causes. Cancer in particular showed that the origin, detection, prognosis, treatment and cure of a disease is not a simple process, and that treatment has to be decided case by case (no one size fits all). The advent of next-generation sequencing, high-throughput analysis and enhanced computing power has raised new hopes that neural networks can address this conundrum of complicated genetic elements (the structure and function of the various genes in our systems). Doing so requires extracting genomic material, sequencing it (with automated systems) and analyzing it to map the strings of As, Ts, Gs and Cs, which yields a genomic dataset. These datasets are too large for traditional applied statistical techniques, and the important signals are often very small compared with the blaring technical noise, so far more sophisticated analysis techniques are required. Artificial intelligence and deep learning give us the power to draw clinically useful information from the genetic datasets obtained by sequencing. The precision of these analyses has become vital and is the way forward for detecting disease and predisposition to it; it empowers medical authorities to make fair, situation-appropriate decisions about patient treatment strategies. This kind of genomic profiling, prediction and disease management is useful for tailoring FDA-approved treatment strategies to molecular disease drivers and the patient’s molecular makeup. The present scenario also encourages designing, developing and testing medicines based on existing genetic insights and models.
Deep learning models are helping to analyze and interpret tiny genetic variations (like SNPs, Single Nucleotide Polymorphisms), which helps unravel crucial cellular processes like metabolism and DNA wear and tear. These models can also identify risk signatures of diseases like cancer from various body fluids. They have immense potential to revolutionize the healthcare ecosystem. Clinical data collection is currently not streamlined and is done in a haphazard manner. Making this data uniform, easy to retrieve and combinable with genetic information would increase its value, improve its interpretation and support decisive patient treatment modalities and better outcomes. There is a huge inflow of medical data from emerging wearable technologies; integrating it with other health data, along with the ability to quickly carry out complex analyses on rich genomic databases over cloud technologies, would revitalize the disease-fighting capability of humans. A last, still emerging area of application is direct-to-consumer genomics (as shown by the success of 23andMe). This road map promises an end-to-end system to face disease in all its forms. Medical research and its applications, like gene therapies, gene editing technologies such as CRISPR, molecular diagnostics and precision medicine, could be revolutionized by tailoring high-throughput computing methods and applying them to enhanced genomic datasets.
Outline/Structure of the Poster
Basics of Genetics
A. Types of Nucleic Acid (DNA & RNA)
B. Structure of Nucleic Acid
C. What is a gene?
D. Basic gene regulation
Basics of genetic mutation (10 minutes)
A. What is genetic mutation?
B. Types of mutations
C. Why mutation occurs
D. Origin of cancer and its progression
Basics of genetic testing
A. Sequencing technique
B. Next generation sequencing
Problems to be discussed:
A. Problem 1: Detecting actively coding regions (the discovery of transcription-factor binding sites in DNA)
B. Problem 2: Deep convolutional neural networks for accurate somatic mutation detection
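To give a flavor of Problem 1, here is a minimal sketch of how a DNA sequence could be one-hot encoded and passed to a small 1D convolutional network in Keras. It is not the model that will be presented on the poster. The window length, the layer sizes and the names `one_hot_encode` and `build_model` are assumptions chosen only for illustration, and the model is left untrained.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

BASES = "ACGT"
SEQ_LEN = 50  # hypothetical fixed window length, chosen for illustration


def one_hot_encode(seq):
    """Encode a DNA string as a (len(seq), 4) array.

    Bases outside A, C, G, T (for example N) are encoded as all zeros.
    """
    encoded = np.zeros((len(seq), len(BASES)), dtype="float32")
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            encoded[i, j] = 1.0
    return encoded


def build_model(seq_len=SEQ_LEN):
    """Small 1D CNN: each convolution filter acts like a learnable motif detector."""
    model = keras.Sequential([
        keras.Input(shape=(seq_len, 4)),
        # 16 filters, each scanning 8 consecutive bases (sizes are assumptions)
        layers.Conv1D(filters=16, kernel_size=8, activation="relu"),
        # keep the strongest match of each filter anywhere in the window
        layers.GlobalMaxPooling1D(),
        # probability that the window contains a binding site
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


model = build_model()

# Two toy sequences of length SEQ_LEN; real inputs would come from labeled genomic windows.
toy_batch = np.stack([
    one_hot_encode("ACGT" * 12 + "AC"),
    one_hot_encode("N" * SEQ_LEN),
])
probs = model.predict(toy_batch, verbose=0)  # untrained, so the values are arbitrary
```

Global max pooling is used here because a binding site may sit anywhere in the window, so only the best match per filter matters. A real model would be trained on labeled sequences before its predictions mean anything.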
Learning Outcome
Participants will:
Gain insight into human genomics and healthcare.
Develop an intuitive understanding of sequence models.
Acquire exciting knowledge about an emerging area of research.
Target Audience
Anyone who is interested in healthcare and emerging trends in human genetics, Data Scientists, Data Analysts, Machine Learning Engineers, Life Sciences / Genomics Researchers, Deep Learning Engineers
Basic Understanding of Deep Learning
Usha Rengaraju:
I am a polymath and unicorn data scientist with strong foundations in Economics, Finance, Business Foundations, Business Analytics and Psychology. I specialize in Probabilistic Graphical Models, Machine Learning and Deep Learning. I have completed the Financial Engineering and Risk Management program from Columbia University with top honors, a MicroMasters in Marketing Analytics from UC Berkeley and the Statistical Analysis in Life Sciences specialization from Harvard. I am chapter lead/co-organizer of the Women in Machine Learning and Data Science Bengaluru Chapter and a core organizing team member at WIDS Bengaluru. I have around 6 years of technical experience working in various companies like Infosys, Temenos, NeoEYED and Mysuru Consulting Group. I am part of a dedicated group of experts and enthusiasts who explore Coursera courses before they open to the public, an ambassador at AIMed (an initiative which brings together physicians and AI experts), a part-time data science instructor, a mentor at GLAD (gladmentorship.com), a mentor at JobsForHer and a volunteer at Statistics without Borders. I developed the course curriculum for Probabilistic Graphical Models @ Upgrad, which is taught by Professor Srinivasa Raghavan from IIIT Bangalore.
Dr. Jyothirmayee Rao:
Seven years of work experience with biotechnology giant Novozymes A/S as Senior Technology Innovation Specialist. Eight years of basic biotech research and nine years of patent R&D. I have worked in central government research institutes like CDFD, CCMB (Hyderabad), NCCS (Pune) and CSIR-URDIP (Pune).
Biotechnologist with research experience in microbial genetics and human genomics, and a trained patent professional working in the area of innovation and patinformatics (patent R&D) for adaptable intellectual property research propositions. For the past two years, I have been trying to incorporate data science techniques such as machine learning and deep learning into my analysis.
Speaker at World Machine Learning Summit 2018: https://1point21gws.com/machinelearning/bangalore/schedule.html#day1
Speaker at Google Cloud Next 18:
Session on Neuroeconomics -Neuroscience of Decision Making
Workshop at ODSC -Introduction to Bayesian Networks :
Speaker at BRUG (Time series Analysis):
Speaker at BangPypers: ("Introduction to Probabilistic Graphical Models")
Speaker at PyLadies : https://twitter.com/aartee_ty/status/1066228144003264512
Workshop in Stock Price Prediction using Probabilistic Graphical Models https://www.meetup.com/Byte-Academy-Bangalore/events/256804439/
Webinar in Stock Price Prediction using Probabilistic Graphical Models https://www.meetup.com/Byte-Academy-Bangalore/events/fznzmqyzcbfb/
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005807
https://www.nature.com/articles/s41551-017-0178-6
https://www.broadinstitute.org/videos/machine-learning-based-crispr-guide-design
https://www.microsoft.com/en-us/research/project/crispr/