Careers at Children's Cancer Institute
Research Data Engineer / Senior Research Data Engineer

Randwick - Sydney, Australia  
Employment Type:
Full-Time Fixed Term  
Computational Biology (CB)
$120,000 + Super + NFP Salary Packaging Benefits

  • Innovative, collaborative and team-orientated environment
  • Modern, world-class research
  • Convenient Randwick location, across the road from the UNSW Light Rail Stop
  • VERY Flexible working options

The Children’s Cancer Institute’s sole mission is to put an end to the devastating impact of childhood cancer. To achieve this, CCI, in conjunction with the Kids Cancer Centre, leads the ZERO Childhood Cancer National Personalised Medicine Program (ZERO), Australia’s first and most comprehensive personalised medicine program for children and adolescents with cancer. ZERO is a unique, multidisciplinary program that brings together cutting-edge science, the latest technology, and the brightest minds in research and clinical care, and it is on a path to change the model of care for children with cancer today.

The ZERO program is generating vast amounts of genomic and clinical data from children. This data is difficult to manage and share with the research community, which hampers research into this deadly disease. To overcome this challenge, the Computational Biology group at CCI have partnered with the Australian BioCommons to develop the Human Genomes Platform project, a national collaborative approach to genomic data sharing.

We seek an experienced Data Engineer to enable genomic data sharing at scale. The right applicant will join a passionate team focused on delivering personalised medicine to kids with cancer in a rewarding environment. This project is managed in a highly collaborative way, with research partners from the University of Melbourne, Garvan Institute and QIMR Berghofer and infrastructure partners from the Australian Access Federation and the National Computational Infrastructure. Furthermore, this project will be delivered in partnership with Australian Genomics, ensuring a pathway to national uptake at other sites.


Responsibilities will include, but are not limited to:

  • Developing a genomic data sharing portal, extending our prototype built using the Gen3 platform
  • Implementing ETL processes to enable near real-time data loading into the portal
  • Developing systems to standardise, harmonise and share clinical data
  • Collaborating nationally to design and develop a system that allows virtual cohorts from several federated systems to be cross-queried, identified and analysed using genomic analysis pipelines
  • Developing and adopting standards for authentication and authorisation (AuthN/AuthZ), so that approved researchers can access data, potentially using CILogon, GA4GH’s Passport, or NIH’s RAS standards
  • Developing semi-automated procedures for data access request approval, in partnership with AAF, potentially using DUOS or REMS standards
  • Developing streamlined methods for exporting data to genomic data sharing platforms, including the European Genome-Phenome Archive (EGA)
  • Establishing efficient long-term data storage strategies that leverage CCI’s Hybrid Cloud infrastructure and emerging genomic data compression technologies
  • Mentoring junior bioinformatics engineers to achieve the above goals
  • The creation of a data sharing portal, populated with matched genomic and clinical data, in near-real-time, to support researchers to find novel insights into childhood cancer
  • Streamlined web-based processes for tracking and authorising data access to specific datasets
  • Standards-compliant systems for user authentication and authorisation
  • Streamlined process for submitting data to EGA
  • System developed to create virtual cohorts from multiple federated systems
  • Contribution to national documentation, training materials and presentations about this research


Qualifications, experience and skills required:

  • Tertiary qualifications in Computer Science, Engineering, IT, Bioinformatics, Statistics, or equivalent, with 3+ years' work experience in data engineering or related fields (8+ years for the senior role)
  • Fluency in at least 2 modern data-centric programming languages, such as Python or R, as well as SQL
  • Experienced in at least one major Linux platform
  • Experienced in data pipelines and modelling
  • Familiar with cloud technologies such as AWS, including S3, IAM and EC2
  • Excellent problem-solving and communication skills
  • Proven ability to work in a team


This is an excellent opportunity to work in an inspiring workplace. You'll be rewarded with a friendly, professional and flexible work environment, comprehensive on-campus facilities, a competitive salary, salary packaging options, access to a leading EAP program and regular social activities. Join a group of dedicated people in a performance-driven environment to achieve success and discover what it's like to look forward to coming to work every day and make a real difference.

A detailed job description and additional information about Children's Cancer Institute can be found on our website. We embrace diversity and encourage applications from people from diverse backgrounds and cultures.

To apply, please click the 'APPLY' link and forward both your resume AND cover letter (mandatory) clearly addressing the qualifications, experience and skills required.

Note: Applications will be reviewed prior to the closing date, which is dependent on the status of the recruitment process. Only successfully shortlisted candidates will be contacted.