Assembling a genome from short reads is quite difficult. But it is tractable, and you can get an awful lot of information from the draft assemblies that result. This little walkthrough of the basic technique, originally written for the TSL EMBO Practical in 2012, should give a good idea of the work that goes into it. You should be able to follow along in Galaxy on the TSL site.

Guide to Assembly

Effector mining

So, you’ve got a genome sequence. But this is plant-microbe interactions, so where the devil are those blasted effectors? Which is the one that makes my organism pathogenic?

Hopefully this little introduction to the technique will give you an idea of the sort of genomic analyses you can use to find the effectors in a fresh, unsullied genome assembly.

Mining Sequence Data for Effectors

SNP Finding

Single-nucleotide polymorphisms (SNPs) are single-base differences at homologous sites between individuals or strains of a species. They may change how a gene functions or is expressed, for example by introducing a premature stop codon, altering the protein fold or changing the level of expression. SNPs may be linked to a gene for a given trait, for example resistance to a pathogen.
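To make the premature stop codon case concrete, here is a minimal Python sketch that classifies a single-base change in a coding sequence. It is only an illustration: the function name `classify_snp` and its arguments are made up for this example, it assumes the sequence is given in frame on the coding strand, and it uses the standard genetic code. Real SNP annotation tools also deal with introns, strands and alternative transcripts.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(cds, pos, alt):
    """Classify a single-base substitution in an in-frame coding sequence.

    cds -- coding sequence (hypothetical input, assumed to start at the
           first base of a codon, coding strand, no introns)
    pos -- 0-based position of the SNP within cds
    alt -- the alternative base at that position
    """
    cds = cds.upper()
    alt = alt.upper()
    if not 0 <= pos < len(cds):
        raise ValueError("position is outside the coding sequence")

    # Find the codon that contains the SNP and build the altered codon.
    start = pos - pos % 3
    ref_codon = cds[start:start + 3]
    if len(ref_codon) != 3:
        raise ValueError("SNP falls in an incomplete codon")
    offset = pos % 3
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid, protein unchanged
    if alt_aa == "*":
        return "nonsense"     # premature stop codon introduced
    if ref_aa == "*":
        return "stop_lost"    # original stop codon removed
    return "missense"         # different amino acid


# Toy coding sequence: ATG TGG CAA TAA (Met, Trp, Gln, stop)
toy_cds = "ATGTGGCAATAA"
print(classify_snp(toy_cds, 5, "A"))  # TGG -> TGA
print(classify_snp(toy_cds, 8, "G"))  # CAA -> CAG
print(classify_snp(toy_cds, 6, "G"))  # CAA -> GAA
```

In the toy sequence, the first change turns a tryptophan codon into a stop codon, the second leaves glutamine unchanged, and the third swaps glutamine for glutamate.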

RNA-seq is a method used to sequence the transcribed genes of a given organism. The RNA of an organism is sequenced and the reads are then mapped to a reference genome. The number of reads that map to a given gene is a measure of its expression: the more reads that map, the more highly the gene is expressed.
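The counting idea can be sketched in a few lines of Python. This is a toy illustration, not what any particular mapper or Galaxy tool does: the gene names, coordinates and read positions are invented, each read is reduced to the single position where its mapping starts, and genes are assumed to sit on one sequence with half-open (start, end) coordinates. Real analyses also normalise the counts, for instance by gene length and total number of reads.

```python
def count_reads_per_gene(genes, read_starts):
    """Count how many mapped reads start inside each gene.

    genes       -- dict of gene name -> (start, end), half-open interval
                   (hypothetical coordinates on a single reference sequence)
    read_starts -- list of positions where mapped reads begin
    """
    counts = {name: 0 for name in genes}
    for pos in read_starts:
        for name, (start, end) in genes.items():
            if start <= pos < end:
                counts[name] += 1
    return counts


# Invented example data: two genes and six mapped reads.
toy_genes = {"geneA": (100, 200), "geneB": (500, 900)}
toy_reads = [105, 150, 199, 200, 510, 950]

toy_counts = count_reads_per_gene(toy_genes, toy_reads)
for name, n in toy_counts.items():
    print(name, n)
```

Here three reads fall inside geneA and one inside geneB, so on raw counts alone geneA looks more highly expressed. The reads at 200 and 950 fall outside both genes and are not counted.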

