Engineer At The Childhood Cancer Data Lab
Childhood Cancer Data Lab Overview
Alex’s Lemonade Stand Foundation is one of the leading funders of pediatric cancer research in the US and Canada. Since its inception in 2005, ALSF has funded more than 1,000 projects at nearly 150 institutions in both countries.
The Childhood Cancer Data Lab, an initiative of Alex’s Lemonade Stand Foundation, was founded in August 2017 with the mission of empowering pediatric cancer experts poised for the next big discovery with the knowledge, data, and tools to reach it. The Data Lab is a team of software developers, data scientists, designers, and community managers who are driven to build software systems, analytical workflows, and training programs in service of this mission. Members of the Data Lab simultaneously contribute to childhood cancer research and to the open science and open source software communities.
Position Overview, Duties and Responsibilities
Engineers at the Data Lab write code to develop Data Lab services and interfaces that meet the pressing needs of researchers studying childhood cancer. These developments will serve a community of dedicated scientists and clinicians, including those who receive grants from ALSF. Engineers work with computational biologists, designers, and the Director to guide conception, development, and maintenance of a variety of products and services ranging from primarily user-facing interfaces to computationally intensive processing of large collections of biomedical data. As part of our small team, you will have the opportunity to make an outsized impact on our products and services, as well as the community they serve. Working at the Data Lab also provides a unique opportunity to interact with the pediatric cancer research community and its supporters at ALSF.
As part of the Data Lab engineering team, you will help tackle a variety of technical challenges and turn them into reliable software implementations. You will have an opportunity to learn and develop your skills and expertise while building tools and applications that directly benefit childhood cancer researchers. You will participate in setting annual and sprint goals and in planning how to achieve them.
● Collaborate with team members to understand issues and determine and implement solutions
● Develop software solutions for diverse problems and deploy them to compute infrastructure
● Write clean, maintainable source code
● Participate in code review to maintain and raise code quality
● Participate in sprint goal setting and planning
All employees of the Foundation undertake other duties as needed and special projects as assigned. All positions at ALSF require occasional non-traditional work hours including evenings and weekends.
● Thorough understanding of REST APIs
● 2 or more years of experience with AWS and configuration management tools
● Experience with Docker or deploying Docker containers
● Eagerness to learn new technologies and frameworks
● Experience building and maintaining web applications with a large codebase
● Experience with React, Django REST Framework, Elasticsearch, and Terraform
● Cloud-based DevOps experience
● Experience with scaling PostgreSQL databases
● Experience with Agile development
● Existing contributions to open source projects
If you think any of the above fits your existing experience or you believe you can transition your skills from one technology to one listed, please apply!
The Data Lab is headquartered in the greater Philadelphia area. The team has switched to fully remote work during the pandemic, with many members being permanently or partially remote. When it is deemed safe to return to in-person work, this position can remain remote with quarterly trips to the Philadelphia area or return to the Alex’s Lemonade Stand Foundation office as desired.
If interested, please submit your resume, materials related to the technical assignment below, and a cover letter describing your interest and why you are the right fit for the position to [email protected]
Given the following description, please submit a UML diagram and answers to the questions listed below the description.
The engineering team creates RESTful API and web client interfaces for accessing datasets created by pediatric cancer researchers. A dataset is composed of individual samples from the same experiment or scientific study. Datasets are often provided as a ZIP file that contains a CSV file for each processed sample along with other files that contain metadata describing the samples and the experiment as a whole.
Each dataset should have its own UUID and will have a few standard attributes describing the area of research the dataset was generated from, as well as a timestamp for when the dataset was published. Each dataset will consist of an arbitrarily long collection of samples. Any sample may belong to more than a single dataset. Every sample will have one uniquely identifying field. Different samples from the same dataset are not guaranteed to have identical metadata keys. Each project and sample will be downloadable as a ZIP directly from AWS S3 from the URL that the API has provided after the user has agreed to the Terms of Service.
Note: The names of attributes are unimportant, but please provide some example attributes where specified in the description.
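To make the description concrete, the entities and their relationships could be sketched in code roughly as follows. This is only an illustration under stated assumptions, not a model answer: the names `Sample`, `Dataset`, `accession_code`, `research_area`, `published_at`, `metadata_keys`, and `datasets_matching` are all hypothetical, and an in-memory model like this sidesteps the storage, download, and Terms of Service concerns mentioned in the description.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Sample:
    # Hypothetical name for the one uniquely identifying field.
    accession_code: str
    # Free-form metadata: keys may differ from sample to sample.
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class Dataset:
    research_area: str      # example of a standard descriptive attribute
    published_at: datetime  # timestamp for when the dataset was published
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    # The same Sample object may appear in several datasets (many-to-many).
    samples: list[Sample] = field(default_factory=list)


def metadata_keys(datasets: list[Dataset]) -> list[str]:
    """Collect every metadata key used by any sample, sorted for stable output."""
    return sorted({key for ds in datasets for s in ds.samples for key in s.metadata})


def datasets_matching(datasets: list[Dataset], key: str, value: str) -> list[Dataset]:
    """Return datasets containing at least one sample whose metadata[key] == value."""
    return [
        ds for ds in datasets
        if any(s.metadata.get(key) == value for s in ds.samples)
    ]


# Made-up example data: one sample shared by two datasets.
shared = Sample("S1", {"tissue": "bone", "age": "7"})
other = Sample("S2", {"tissue": "brain"})
published = datetime(2020, 1, 1, tzinfo=timezone.utc)
first = Dataset("osteosarcoma", published, samples=[shared])
second = Dataset("pan-cancer", published, samples=[shared, other])
```

A real design would likely back this with database tables and a join table for the sample-to-dataset relationship; the sketch only shows the shape of the data.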
Please include answers to the following questions along with your diagram submission:
● How would an API based on this design be able to list experiments queried against sample metadata?
● How would you be able to supply a client with a list of metadata keys?
● Asking questions is part of being an Engineer at the Childhood Cancer Data Lab. What questions do you have about the assignment?