Job ID: 102055
Job date: 2017-02-09
End Date:
Company: University of Chicago
Country:
Role: Technician
Job Description:
The Lead Software Architect is responsible for the design and development of new systems, features, and tools for our large-scale data commons and clouds. You will be responsible for designing, building, scaling, optimizing, and maintaining a multi-petabyte system architecture. The role works with cloud computing infrastructure, primarily based on OpenStack, to design software applications that meet business and technical requirements, and works on Linux-based systems in Python, with some C/C++, Go, and various web technologies.
- Leads design of new systems, features, and tools.
- Solves complex problems and identifies opportunities for technical improvement and performance optimization.
- Performs hands-on prototyping and development.
- Reviews and tests code to ensure appropriate standards are met.
- Handles production and user support escalations.
- Serves as a liaison with internal and external collaborators on multiple research projects.
- Documents project development and programming code.
- Drives innovation across the software engineering team and mentors early-career team members.
- Maintains broad technical knowledge of existing and emerging technologies, including public cloud offerings from Amazon Web Services, Microsoft Azure, and Google Cloud.
Coding covers the full stack, including systems orchestration, API development, algorithms and data structures, and user interfaces. Projects span management, sharing, and provenance of large data sets; automation, metrics, and scheduling for cloud computing; large-scale pipelining of next-generation sequence analysis; transfer programs/protocols for high-speed networks; and resource visualization. Performs other duties as assigned.
This at-will position is wholly or partially funded by contractual grant funding, which is renewed under provisions set by the grantor of the contract. Employment will be contingent upon the continued receipt of these grant funds and satisfactory job performance.
Education: Bachelor's degree in computer science, mathematics, statistics, engineering, or a related field; OR four (4) years programming experience required. Advanced degree in a related field preferred.
Additional Info:
The Center for Data Intensive Science (CDIS) is a research center that focuses on data science and its applications to problems in biology, medicine, and health care. We develop technology to manage, analyze, and share large biomedical datasets and apply this technology to make discoveries in biology, medicine, and health care. Broadly speaking, we are focused on the emerging field of translational data science. We work with the research community through partnerships and consortia that share our vision. CDIS is hosting several large projects at the University of Chicago:
- The NCI Genomic Data Commons (GDC) is a unified knowledge base that promotes sharing of genomic and clinical data between researchers and facilitates precision medicine in oncology. The NCI GDC breaks down barriers by bringing cancer genomics datasets and associated clinical data into one location that any researcher may access, and by "harmonizing" the data so that datasets generated with different protocols can be studied side by side. By making these data available using modern computing and network technology, the GDC makes it possible for any researcher to ask new and fundamental questions about cancer.
- The Bionimbus Protected Data Cloud (PDC) is the first open-source cloud-based computational platform that allows researchers authorized by NIH to compute over human genomic data in a secure and compliant fashion. Bionimbus and related cloud-based infrastructure are used by researchers working on cancer, diabetes, and neuropsychiatric disorders.
- The Biomedical Commons Cloud (BCC) is cloud-based infrastructure that we are developing for a consortium of medical research centers and commercial partners. It provides secure, compliant cloud services for managing and analyzing genomic data, electronic medical records (EMR), medical images, and other PHI data, and gives researchers resources so they can more easily make discoveries from large, complex, controlled-access datasets.
- The Open Science Data Cloud (OSDC) provides the scientific community with resources for storing, sharing, and analyzing terabyte- and petabyte-scale scientific datasets. The OSDC is a data science ecosystem in which researchers can house and share their own scientific data, access complementary public datasets, build and share customized virtual machines with whatever tools are necessary to analyze their data, and perform the analysis to answer their research questions.
Both the BCC and OSDC are collaborations with the not-for-profit Open Cloud Consortium.
Required Job Seeker Documents:
- Resume
- Cover Letter
- Other Reference Contact Information
**For the Other required document, we would like you to attach a Code Sample**
Experience:
- Minimum two (2) years of relevant software engineering, architecture, and design experience required.
- Experience programming in Python, Go, and C/C++ required.
- High performance/cloud computing experience required.
- Unix/Linux programming or system administration experience required.
- Git version control experience required.
- Experience with large-scale distributed systems preferred.
- Experience with graph and relational databases preferred.
- Experience with data engineering preferred.
- Experience leading a team in an agile environment preferred.
- Experience with genomics preferred.
Competencies:
- Ability to prioritize and manage workload to meet critical project milestones and deadlines required.
- Attention to detail required.
- Ability and willingness to acquire new programming languages, statistical and computational methods, and background in research area required.
- Ability to work in a collaborative team environment required.
Requirements:
Skills: