Advanced bioinformatics tools and pipelines for the next generation of microbiome analysis (CASE)

Job ID:
Job date: 2017-04-24
End Date:

Company : EASTBIO 

Country :

Role : Student 


[Click Here to Access the Original Job Post]

Job Description:
In this studentship, we will:

1-Train the student in current best practices for the analysis of WGS metagenomics data.

2-Benchmark current tools for use in a range of environments, and for answering a range of different questions, using real, pseudo-real and simulated datasets.

3-Test the hypothesis that no single tool is “best” for all scenarios, and that in most cases a compendium approach is needed. If our hypothesis is shown to be true, we will develop summary and visualisation tools that enable researchers to use the outputs of multiple software tools whilst analysing metagenomics datasets.

4-Develop a plug-in based software pipeline, based on a software infrastructure such as NextFlowIO, which enables researchers to run a range of sophisticated tools on a set of metagenomics data, including QC; community structure and function; assembly and annotation; and visualisation. Key to this will be the benchmarking in objective (2) and the results of objective (3).

5-Benchmark, test and develop pipelines for the integration of long read data with metagenomics data.
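
As an illustration of the compendium approach in objective (3), the sketch below shows one possible way to summarise taxonomic profiles from several tools side by side. It is only a minimal sketch under stated assumptions: the function name `summarise_profiles`, the input layout (one dictionary of relative abundances per tool) and the toy numbers are all hypothetical and do not correspond to the output format of any real classifier.

```python
from statistics import mean


def summarise_profiles(profiles):
    """Summarise per-taxon relative abundances reported by several tools.

    `profiles` is a hypothetical layout: {tool_name: {taxon: relative_abundance}}.
    A taxon that a tool does not report is treated as abundance 0.0.
    """
    # Collect every taxon reported by at least one tool.
    taxa = sorted({taxon for profile in profiles.values() for taxon in profile})
    summary = {}
    for taxon in taxa:
        values = [profile.get(taxon, 0.0) for profile in profiles.values()]
        summary[taxon] = {
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
            # Number of tools that detected the taxon at all.
            "n_tools": sum(value > 0 for value in values),
        }
    return summary


# Toy, made-up numbers purely for illustration.
example_profiles = {
    "tool_a": {"Escherichia": 0.5, "Bacteroides": 0.5},
    "tool_b": {"Escherichia": 0.25, "Bacteroides": 0.75},
    "tool_c": {"Bacteroides": 1.0},
}
example_summary = summarise_profiles(example_profiles)
```

A table like `example_summary` makes disagreement between tools visible (for example, a taxon reported by only two of three tools), which is the kind of information a compendium-style visualisation could build on.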

We are perfectly placed to train the student in this new, high-impact, high-profile research area. The outputs of the research will be important to a range of questions in the biomedical sciences, and will result in high-impact publications. Fios are perfectly placed to ensure that the training and resulting research outputs are relevant to industry and can make an impact outside of academia.

Eligibility:

All candidates should have or expect to have a minimum of an appropriate upper 2nd class degree. To qualify for full funding students must be UK or EU citizens who have been resident in the UK for 3 years prior to commencement.


Areas :


Additional Info:
The last decade has seen increased interest in the study of microbial communities, and a greater recognition of their importance across a wide range of ecosystems. The microbiome has been linked with health issues as diverse as cancer, immune disorders, obesity, diabetes, and cardiovascular disease. Similarly, the gut microbiome of livestock is critically important for food security, influencing feed conversion ratio, animal health, and the impact of farm animals on the environment. Many microbiomes contain novel proteins useful in medicine and biotechnology, as well as providing a reservoir for both novel antibiotics and novel antibiotic resistance genes.

Whole-genome shotgun (WGS) metagenomics enables scientists to assay the genomes of a broad range of organisms within a particular ecosystem. Such experiments routinely produce hundreds of gigabases of sequencing data, which can be used for a number of purposes: (i) to assay the species of bacteria, archaea, viruses, fungi and protists present in the microbiome and to predict community structure; (ii) to assay the enzymes and pathways present in the microbiome and therefore to predict function; and (iii) to assemble (fragments of) the genomes of organisms in the microbiome and look for novel proteins of interest.

For predicting community structure there is a bewildering array of bioinformatics tools, such as Kraken, MetaPhlAn2 and Centrifuge. These tend to be incredibly fast; however, their accuracy is influenced hugely by the reference database used, and we have shown that many of these tools fail to reflect biological reality in certain situations. In many cases the tools have not been properly benchmarked outside of the limited environments used during their development.

By contrast, there is a paucity of bioinformatics tools for predicting function; one of the most common is HUMAnN2. This tool looks at the abundance of UniRef clusters within a metagenome and uses those to predict pathway abundance and completeness. However, the first step in the pipeline is to predict community structure, and this is again heavily dependent on the database used. We have shown that HUMAnN2, too, is useful only for certain environments and has not been properly benchmarked.
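
The idea of turning gene-family (UniRef cluster) abundances into pathway abundance and completeness can be illustrated with a toy calculation. The sketch below is a deliberate simplification and is not HUMAnN2's actual algorithm: it assumes a pathway is just a list of required gene families, defines completeness as the fraction of those families detected, and takes the least abundant family as the pathway abundance. The function name `pathway_summary`, the family identifiers and the numbers are all hypothetical.

```python
def pathway_summary(family_abundance, pathways):
    """Toy pathway abundance/completeness from gene-family abundances.

    family_abundance: {family_id: abundance}, missing families count as 0.0.
    pathways: {pathway_name: [family_id, ...]} (hypothetical definitions).
    """
    result = {}
    for name, members in pathways.items():
        if not members:
            # A pathway with no defined members cannot be scored.
            continue
        values = [family_abundance.get(member, 0.0) for member in members]
        detected = sum(value > 0 for value in values)
        result[name] = {
            # Fraction of required families seen in the metagenome.
            "completeness": detected / len(members),
            # Simplifying assumption: the rarest family limits the pathway.
            "abundance": min(values),
        }
    return result


# Made-up identifiers and values, for illustration only.
toy_abundance = {"fam_A": 4.0, "fam_B": 2.0}
toy_pathways = {
    "pathway_1": ["fam_A", "fam_B"],
    "pathway_2": ["fam_A", "fam_C"],
}
toy_result = pathway_summary(toy_abundance, toy_pathways)
```

Even in this toy form, the result depends entirely on which families are in the reference definitions: `pathway_2` looks half complete only because `fam_C` was not detected, which mirrors the database dependence described above.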

Metagenomic assembly is a problem that has garnered a lot of attention, but with no obvious single or best solution. Raw metagenomic assemblers exist, such as IDBA-UD, MetaVelvet and MegaHit, as do binning algorithms such as CONCOCT. However, the former produce fragments only slightly longer than read length, and the latter appear to work only in certain environments. Finally, very few metagenomics software tools have been adapted to cope with high-error long reads from technologies such as the PacBio Sequel and Oxford Nanopore’s MinION. Both produce 5-8 Gbase of sequence data, at approximately 85-90% accuracy, with reads in excess of 8 kbases. The integration of these data with short-read, high-volume Illumina data is mostly unexplored.

Completed application forms along with your curriculum vitae should be sent to our PGR student team at RDSVS.PGR.Admin@ed.ac.uk

Reference Request Form – please fill in your name and send the form to two academic referees. Your referees should send the completed forms to our PGR student team at RDSVS.PGR.Admin@ed.ac.uk

Closing Date: 1st May 2017
