Maker – gene calling with aplomb

Whatever a plomb is.

Gene calling is a funny thing. Different programs take very different approaches. Time and time again in bioinformatics we come up against a situation where we have to find a badly defined object hiding in some messed-up dataset. Molecular biology is wonderful in that its central object, the gene, is particularly badly defined. What joy comes from trying to find genes in draft assemblies, then, when every program has a different, specialised approach (or heuristic). So which program can we believe if they all give different answers? Maker, a program from the GMOD project, attempts to answer this question. Maker is a pipeline that can merge output from several programs to give a consensus answer, and it allows you to move annotations over from one version of an assembly to another. It’s a one-step annotation pipeline, and it’s available on the TSL cluster. Give it a go.

bio-svgenes – publisher proof visualisation of genomic regions

Genome browsers are wonderful things, but (perhaps surprisingly) few of them generate truly publication-quality plots of the data. Most need you to screen-grab or export PNG files that suffer from compression or other display problems. This is a problem noticed by the publication-happy staff of TSL, and one that the bioinformatics team has been working to address. During her seminar today Diane gave the first public showing of the in-house bio-svgenes package. bio-svgenes is a Ruby package that takes in pre-prepped data, such as gene models and coverage counts, and spits out true Scalable Vector Graphics images, fully compatible with Illustrator and Inkscape, that you can send off for publication at any resolution.
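To make the "gene models in, SVG out" idea concrete, here is a minimal Python sketch. It is not bio-svgenes and does not use its API. The function name genes_to_svg, the tuple layout and the colours are all invented for illustration. It only shows why vector output scales to any resolution: every gene becomes a rectangle described by coordinates rather than pixels.

```python
from xml.sax.saxutils import escape


def genes_to_svg(genes, region_start, region_end, width=800, track_height=20):
    """Render (name, start, end, strand) tuples as rectangles in an SVG string.

    Hypothetical helper for illustration only; not part of bio-svgenes.
    """
    span = region_end - region_start
    scale = width / span  # drawing units per base
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{track_height * 3}">'
    ]
    for name, start, end, strand in genes:
        x = (start - region_start) * scale
        w = (end - start) * scale
        # Assumed colour scheme: forward strand blue, reverse strand red.
        colour = "steelblue" if strand == "+" else "firebrick"
        parts.append(
            f'<rect x="{x:.1f}" y="{track_height}" width="{w:.1f}" '
            f'height="{track_height}" fill="{colour}">'
            f"<title>{escape(name)}</title></rect>"
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up gene models on a 5 kb region.
genes = [("geneA", 1000, 2500, "+"), ("geneB", 3000, 4200, "-")]
svg = genes_to_svg(genes, region_start=0, region_end=5000)
```

Writing the svg string to a file with an .svg extension gives something Inkscape or Illustrator should open directly.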

bio-svgenes is relatively easy to use. It hasn’t got a GUI; at a minimum you need to specify options in a simple config file and run a command-line script, though for you code-monkeys there is an API, so you can generate all the images you need scriptaculously. So if you want nice renderings of BLAST data, or have some other weird and wonderful use in mind, then this is a great package for you.

Here’s an example of the output, ironically in PNG format because not all browsers can render SVG properly, but it’s indicative of what you can achieve very quickly.

example output

Don’t worry if you don’t like the colours; it’s completely flexible and you can set the visual aspects very easily.

For now bio-svgenes is available on request from me; once the documentation is up to scratch I’ll make it available as a regular gem. If you have no idea what on earth this means and want to give it a go, feel free to drop by and ask.

Circos: Nifty Diagrams – Configuration Hell

Diane’s very interesting seminar today contained a wonderful, Nature/Science-style, headline-grabbing data-vis of many different data types relating to her PST tribes. This great data-ful vis was created by Diane in the Circos software.

Circos is a pretty neat piece of software designed specifically to create circular diagrams that have a very high ‘data-to-ink’ ratio. It has gained a lot of ground due to its immense flexibility, and you can see it in lots of publications (including some authored by TSL members, like this recent one).

Like most truly powerful software, Circos is complex and hard to use. It doesn’t help that it is provided as a Perl script. Creating images with Circos requires a fair amount of data pre-prep and configuration tweaking, but the results are well worth it. If you’d like to give Circos a go for rendering the results of your projects you can get it here. If you’re having trouble installing or using it, come and see us down in the bioinformatics office and we’ll see what we can do.
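As a taste of the pre-prep involved, here is a small Python sketch that turns a table of sequence lengths into karyotype lines, the file Circos uses to define the segments around the circle. The line layout (chr - ID LABEL START END COLOUR) is my understanding of the Circos karyotype format, so check it against the Circos documentation before relying on it. The function name, contig names and lengths are made up for illustration.

```python
def karyotype_lines(seq_lengths, colour="grey"):
    """Build Circos-style karyotype lines from a {sequence_id: length} mapping.

    Assumed line layout: chr - ID LABEL START END COLOUR
    """
    lines = []
    for label, (seq_id, length) in enumerate(seq_lengths.items(), start=1):
        # Each sequence becomes one segment running from 0 to its length.
        lines.append(f"chr - {seq_id} {label} 0 {length} {colour}")
    return lines


# Invented contig lengths, purely for illustration.
contigs = {"contig_1": 150000, "contig_2": 98000, "contig_3": 42000}
lines = karyotype_lines(contigs)
karyotype_text = "\n".join(lines) + "\n"
```

Saving karyotype_text to a file and pointing the karyotype setting in your Circos configuration at it would be the next step; the rest of the configuration is where the real tweaking begins.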

Who would win in a fight between…

CLC Bio Assembly Cell and Velvet?

It’s a good question.

Sequence assembly, taking short reads and re-assembling them into long useful contigs, is the cornerstone of many of our genomics projects. The recent successes of the CLC Bio Assembly Cell software have meant that it’s getting talked up over some of our older methods. At this link (TSL Site only), you’ll see a head-to-head of Velvet and CLC Bio. Who wins? Remember, there are no big red buttons and no perfect pieces of software. The result, as ever, is mixed.
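If you want to run a quick comparison of your own, one widely used summary of an assembly is N50: the contig length at which contigs of that size or longer hold at least half of the assembled bases. The Python sketch below computes it. The linked head-to-head may well use other measures, and the assembler names and contig lengths here are invented, not results from Velvet or CLC Bio.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that takes us to half the total defines N50.
        if running >= half_total:
            return length
    return 0


# Invented contig lengths for two hypothetical assemblies.
assemblies = {
    "assembler_a": [10, 20, 30, 40, 50],
    "assembler_b": [5, 5, 5, 60, 75],
}
summary = {name: n50(lengths) for name, lengths in assemblies.items()}
```

N50 on its own says nothing about whether the contigs are correct, which is one reason these comparisons tend to come out mixed.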