Data Engineer

Job ID: 5226
Job date: 2017-01-25
End Date:

Company : University of California 

Country :

Role : Technician 


[Click Here to Access the Original Job Post]

Job Description:
The Data Engineer is responsible for the development and deployment of UCSF’s Research Computing shared capability, which includes computing nodes, Very Large biomedical and health data, shared and best-of-breed tools, metadata and catalog. The role includes data and tool management, curation, engineering and quality maintenance.

The scope described below will be shared with other engineer(s), existing or to be brought on. The exact task list will be customized to individual talents.

The Data Engineer functions as a specialized bioinformatics systems development professional with an understanding of computational algorithms to solve a wide range of problems. Project scope is broad and may extend anywhere in the university. In more specific terms:

• Create the pipelines and configurations on a Linux-based distributed file system for Very Large health data, hosted on premises as well as in the public cloud. This will be clinical, phenotypic and population-level data of several categories: structured, semi-structured and unstructured. Unstructured data includes text, images, and “messy” alphanumeric data.

• Manage the quality, interfaces and catalog of this data and related tools, for use across the research enterprise of UCSF. Adapt the interfaces, where possible, to meet evolving needs and compatibility requirements.

• Qualify or build scalable general-purpose computational and inferential software tools to work with the data. Examples are tools for de-identification (including in text), “tall and wide” tabular statistical/mining tools, search, automated data cleaning and natural language processing of text.

• Stay up to date with the latest offerings from both hardware and cloud service vendors, including new architectures such as GPUs.

• Contribute to improving our adopted medical taxonomies and semantic networks.

• Contribute to key architectural meetings within the university and also with partners.

• Comply with the university’s policies with respect to privacy and security.

Required Qualifications:

• Bachelor’s degree in computational/programming sciences or a related area, and/or equivalent experience/training.

• Knowledge of computational and bioinformatics methods, applications programming, web development and data structures. Python/C++/Java.

• Demonstrated experience applying these skills to health/medical data at Very Large scale (Linux).

• Strong understanding of relational databases, distributed computing and storage (e.g. Hadoop), web interfaces and operating systems.

• Time management skills; ability to solve problems and meet deadlines.

• Ability to communicate technical information in a clear and concise manner.

Note: Fingerprinting and background check required.

Preferred Qualifications:

• Master’s degree in computational/programming sciences or a related area, and/or equivalent experience/training.

• Expert knowledge of bioinformatics programming design, modification and implementation.

• Some knowledge of modern biology and applicable field of research.

• Interpersonal skills to work with both technical and non-technical personnel inside and outside the organization.

• Good knowledge of web, application and data security concepts and methods.

• Knowledge of and demonstrated experience in Data Sciences, including data mining, machine learning, advanced statistics, data curation, data management and data quality.


Requirements :

Skills :

Areas :


Additional Info:
INSTITUTE FOR COMPUTATIONAL HEALTH SCIENCES :

The Institute for Computational Health Sciences (ICHS) is a critical component of a global UCSF initiative in Precision Medicine, which seeks to aggregate and integrate vast, disparate datasets to advance understanding of biological processes, determine mechanisms of disease, and inform diagnosis and treatment of patients. Beginning with a base of excellent computational faculty dispersed among our four top-ranked professional schools (Dentistry, Medicine, Nursing and Pharmacy) and Graduate Division, superb research programs and outstanding Medical Center, ICHS will establish a central convening center, hire additional faculty, and build programs for research and education. ICHS will develop and enhance UCSF’s computational approaches and strategies in basic, translational, clinical and population-based biomedical research, working with partners in industry and academia where appropriate. It will be a campus hub for computer scientists and for researchers who employ computation as a primary tool in their biomedical research.

We are currently in the process of building up our computing capability, including talent. This role is the first full-time role of several planned.

ABOUT UCSF :

The University of California, San Francisco (UCSF) is a leading university dedicated to promoting health worldwide through advanced biomedical research, graduate-level education in the life sciences and health professions, and excellence in patient care. It is the only campus in the 10-campus UC system dedicated exclusively to the health sciences.

The University of California, San Francisco is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, protected veteran or disabled status, or genetic information.
