Data Curator/Data Engineer – Computational Pathology, Oncology

Do you have a passion for data? Would you like to apply your expertise to accelerate, improve and help automate our data flow, and to put FAIR, ready-to-use data at our scientists’ fingertips, in a company that follows the science and turns ideas into life-changing medicines? Then AstraZeneca might be the one for you!


AstraZeneca is a global, science-led, patient-focused biopharmaceutical company that focuses on the discovery, development and commercialisation of prescription medicines for some of the world’s most serious diseases. But we’re more than one of the world’s leading pharmaceutical companies. At AstraZeneca we’re dedicated to being a Great Place to Work, where you are empowered to push the boundaries of science and unleash your entrepreneurial spirit. There’s no better place to make a difference to medicine, patients and society.


Welcome to Computational Pathology Munich, one of over 400 sites here at AstraZeneca. Providing a collaborative environment where everyone feels comfortable and able to be themselves is at the core of AstraZeneca’s priorities, and it’s important to us that you bring your full self to work every day. To help you maintain your best self, here’s a sneak peek into some of the things this site provides for you: after-work events, Lunch & Learns, a bright and spacious environment, a sustainable office working environment, networking events, family and childcare support and, of course, the Alps around the corner for hiking, biking and skiing.


Be part of fulfilling our ambition to be world leaders in Oncology. We are already the fastest growing team within AstraZeneca and across the industry, and there are countless new indications and targets in our game-changing pipeline. We deliver this value through launch excellence, commercial effectiveness and maximising the lifecycle. By leveraging our commercialised portfolio we are confident we can change the practice of medicine and redefine cancer treatment.

We’re brave disruptors – entrepreneurial, courageous and pioneering in our approach. Here you have the opportunity to step up, take personal accountability and lead changes in our ever-evolving environment.

With pace and drive comes trust that we will get it done. Embrace the freedom to create and expand your horizons. Always backed and supported by pioneering leaders, this is the place to build a world-class career that’s meaningful and rewarding. Here we’re on a journey to becoming digitally enabled, to discover new ways of offering better solutions to our patients. Join the team with a vision to use data as a tool to build a better, deeper, more personal understanding of the people we’re helping, and ultimately to deliver better outcomes for them through dynamic omnichannel content, personal relationships and experience.

What you’ll do

As a Data Curator/Data Engineer you will help drive our mission to streamline and automate the TM data flow, breaking up data silos and establishing a FAIR data environment while supporting on-time data provisioning and efficient, high-quality data consumption. You will be responsible for (semantically) cleaning up and integrating large datasets, and will use your understanding of current gaps within the data flow to improve and automate our data pipelines. Understanding how our scientists address key scientific questions, data collection and analysis, and how these processes can be formalized to improve data (re)use, data quality and decision-making, will be critical to success in this role.

Main Duties and Responsibilities:

  • Data capture, (semantic) cleaning and integration as a preparation step for bioinformatician/data scientist activities, while protecting privacy and adhering to compliance directives

  • Develop data standards, vocabularies and dictionaries as a baseline for a coherent data flow, together with our data governance team and scientific stakeholders

  • Support/drive development of ETL pipelines to automate our data flow through the system landscape

  • Work with bioinformaticians, data scientists and data management teams to develop best practices in data wrangling/curation and storage

Essential for the role

  • Master’s-level degree in bioinformatics, data science or a related field (e.g. physics, mathematics), with a focus on life science/pharma R&D

  • Expertise in data curation/management and integration, with a focus on genomics, transcriptomics and proteomics data

  • Good programming skills (experience in Python preferred, alternatively R)

  • Understanding of the FAIR data principles and how to translate and implement requirements into a typical data flow

  • Experience working within cross-functional teams, including business data owners/stewards and technical product development staff

Desirable for the role

  • Experience with common biomedical vocabularies (Gene Ontology, NCI Thesaurus, ISA-Tab, MeSH, Human Disease Ontology, BAO, EFO, Human Phenotype Ontology, others), ontology repositories (NCBO BioPortal, EBI OLS) and common reference databases (e.g. UniProt, Ensembl, ChEMBL, Entrez Gene)

  • Experience with various data sources from different scientific domains, both structured and unstructured (e.g. HDFS, SQL, NoSQL)

  • Experience working across multiple scientific compute environments to create data workflows and pipelines (e.g. HPC, cloud, Unix/Linux systems)

  • Experience with various types of databases/datastores and semantic technologies, such as Oracle, MySQL, Elasticsearch, MongoDB and RDF/triple stores

Why AstraZeneca?

At AstraZeneca we’re dedicated to being a Great Place to Work, where you are empowered to push the boundaries of science and unleash your entrepreneurial spirit. There’s no better place to make a difference to medicine, patients and society. We offer an inclusive culture that champions diversity and collaboration and is always committed to lifelong learning, growth and development. We’re on an exciting journey to pioneer the future of healthcare.

So, what’s next?

  • Are you already imagining yourself joining our team? Good, because we can’t wait to hear from you.

  • Are you ready to bring new ideas and fresh thinking to the table? Brilliant! We have one seat available and we hope it’s yours.

Close date: 15/7/2021

Competitive salary and benefits

Where can I find out more?

Our social media:

Follow AstraZeneca on LinkedIn

Follow AstraZeneca on Facebook

Follow AstraZeneca on Instagram


Privacy Policy / BBSTEM Limited | Registered in England and Wales | Company No: 11127036