We are seeking a motivated and enthusiastic data engineer to join our Informatics team. As a lead member of the team, you will architect and develop integrated data solutions to support Decibel’s growing research and development pipeline. We are a highly collaborative group of expert informaticians, data scientists, biologists, chemists, and other researchers in a dynamic startup culture. Our platform uses a modern, cloud-based technology stack.
- Architect and develop end-to-end solutions to ingest, transform, analyze, and distribute core data assets.
- Develop and productionize genomic/genetic data pipelines in close collaboration with our genomics and computational biology scientists.
- Use AWS as the primary technology stack, leveraging higher-order services such as EMR/Spark, Glue, Athena, and Batch/ECS/Fargate.
- Architect data repositories that scale, integrate across domains, and incorporate appropriate metadata.
- Develop integrated data query interfaces and visualization techniques to answer key scientific questions.
- Develop machine learning and analytic pipelines.
- Bachelor's degree or higher in a quantitative/technical field (e.g. Computer Science, Statistics, Engineering).
- 7+ years’ industry experience developing data solutions in a biopharma environment.
- In-depth knowledge and experience in the development and support of scientific data systems.
- Proven track record of delivering integrated solutions to support the analysis of large-scale genomic and genetic data sets.
- Database design and modelling experience, including relational databases and one or more NoSQL databases (document, key-value, graph).
- Experience developing genomics pipelines, preferably with a standard framework (e.g. Nextflow, Snakemake, Luigi).
- Experience architecting solutions on AWS (S3, EC2, RDS, Lambda, Batch/Docker, EMR/Spark, Glue, Athena/Redshift).
- Proven ability to rapidly learn and apply new technologies to drive innovation.
- Knowledge of practices across the DevOps lifecycle, including agile methodologies, source code management, build processes, automated testing, and operations.
- Experience providing technical leadership and mentoring other engineers in data engineering best practices.