Required Qualifications and Skills
- Ph.D. or Master's degree in Computer Science, Data Science, Statistics, Bioinformatics, or a related field.
- 10+ years’ experience and technical expertise in applied bioinformatics, computational biology, data science or biostatistics.
- Robust working knowledge and application of data analysis and modeling, data wrangling and data visualization.
- Firm grasp of modern statistical methods and machine learning techniques, and their application to the analysis of large-scale, high-throughput datasets.
- Proficiency with R/Bioconductor, Python or equivalents, and databases (SQL, NoSQL).
- First-hand experience in multi-parametric data mining, analysis and visualization in any biomedical application.
- Exposure to multi-parametric data mining for disease stratification/endotyping, target identification, and biomarker analysis.
- Experience and understanding of how bioinformatics and data science can best be applied to speed up drug discovery.
- Basic understanding of biological concepts and familiarity with the drug development process.
- Knowledge of bioinformatics tools and databases for analyzing genomics and proteomics data.
- Ability to manage projects with minimal supervision, using creative and analytical thinking.
- Ability to drive highly collaborative work across the organization and outside the company.
- Excellent oral and written communication.
Desirable Additional Experience
Experience in one or more of the following areas is highly desirable, but not essential.
- Deeper knowledge, training, or experience in the biomedical field.
- A minimum of 1 year of research experience (academia or industry).
- Demonstrated experience in deep learning and generative AI model-based approaches, such as bioinformatics foundation models (BFMs).
- Experience in genomics, transcriptomics, Next Generation Sequencing (NGS) analysis, single-cell RNA-seq, flow cytometry, or IHC-based data processing.
- Experience working in one or more of the following disciplines: synthetic biology, comparative genomics, population genetics, probabilistic modeling, and quantitative modeling of biological systems.
- Experience with one or more of the following: Snakemake, Nextflow, Airflow, CWL, relational databases (SQL), GraphQL, distributed computing (AWS/Google Cloud), Docker, software version control (Git).
- Experience managing a data analytics and computational operating model that encompasses the processes and technologies for executing scalable data management solutions across various data types.