Category Archives: Statistics and Programming

statistics courses at SCENE – message from Ludovic

Dear all,

please find below an announcement for the next 10 statistics courses at SCENE. I attended the course on multivariate analyses of population genetics data myself and it was really worthwhile (with a great environment: a biological station on the shore of Loch Lomond).


1) Advancing in statistical modelling using R (ADVR)
Date: 2 May 2016 – 6 May 2016
Instructors: Dr. Luc Bussière, Dr. Tom Houslay

2) Time Series Models for Ecologists and Climatologists (TSME)
Date: 10 May 2016 – 13 May 2016
Instructors: Dr. Andrew Parnell, Dr. Doug McNeall

3) Introduction to Python for Biologists (IPYB)
Date: 23 May 2016 – 27 May 2016
Instructor: Dr. Martin Jones

4) Advances in Spatial Analysis of Multivariate Ecological Data: Theory and Practice (MVSP)
Date: 11 July 2016 – 15 July 2016
Instructors: Prof. Pierre Legendre, Dr. Olivier Gauthier

5) Advances in DNA taxonomy (DNTX)
Date: 8 August 2016 – 11 August 2016
Instructors: Dr. Diego Fontaneto, Prof. Ziheng Yang

6) Introduction to LINUX workflow for Biologists (ILWB)
Date: 15 August 2016 – 19 August 2016
Instructor: Dr. Martin Jones

7) Exploratory methods for genetic data analysis (GDAR)
Date: 15 August 2016 – 20 August 2016
Instructor: Dr. Thibaut Jombart

8) Introduction to Bayesian hierarchical modelling using R (IBHM)
Date: 23 August 2016 – 26 August 2016
Instructor: Dr. Andrew Parnell

9) Model-based multivariate analysis of abundance data (MBMV)
Date: 3 October 2016 – 7 October 2016
Instructor: Prof. David Warton

10) Applied Bayesian modelling for Ecologists and Epidemiologists (ABME)
Date: 24 October 2016 – 29 October 2016
Instructors: Dr. Matt Denwood, Prof. Jason Matthiopoulos

for more details visit or email

Oliver Hooker
PR statistics

128 Brunswick Street
G1 1TF

+44 (0) 7966500340

2 year postdoc on Drosophila epigenetics

Klaus Reinhardt (Dresden) has a two-year postdoc position on genetics and epigenetics of Drosophila. Details here: bioinformatics_Pop_gen_tudaz.

Workshop in Marine Ecology, Evolution, and Genomics

Hello All,

Just a note that there is a workshop in Vigo this fall, organized in part by APS alumnus (and friend) Juan Galindo. Although it is focused on marine organisms, the topics tackled should apply well across organisms (particularly the data handling and analysis bits). Victor will be running a hands-on tutorial as part of the workshop. More info can be found here:

cheers, Patrik

generate log-uniform distribution with R

Hi All,

does anyone know how to generate a log-uniform distribution with R?
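
One possible approach, offered as a minimal sketch rather than a definitive answer: a variable is log-uniform on [low, high] when its logarithm is uniform on [log(low), log(high)], so you can draw uniform values on the log scale and exponentiate them. In R this should come down to exp(runif(n, log(low), log(high))). The Python version below spells out the same steps; the function name rlogunif and the bounds in the example are hypothetical, chosen only for illustration.

```python
import math
import random


def rlogunif(n, low, high, rng=None):
    """Draw n values whose logarithm is uniform on [log(low), log(high)].

    rlogunif is a hypothetical helper name, not part of any library.
    """
    if not (0 < low < high):
        raise ValueError("need 0 < low < high")
    rng = rng or random.Random()
    log_low, log_high = math.log(low), math.log(high)
    # Sample uniformly on the log scale, then back-transform.
    return [math.exp(rng.uniform(log_low, log_high)) for _ in range(n)]


# Illustrative bounds only: six orders of magnitude, seeded for repeatability.
samples = rlogunif(10000, 0.001, 1000.0, rng=random.Random(1))
```

With bounds like these, each order of magnitude receives roughly the same share of the draws, which is usually the reason for wanting a log-uniform distribution in the first place.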



Pruning relatives from a large dataset

Hi all,

Sanad and I have a problem and we wonder whether anyone has encountered something similar (and knows a solution). In his spiny mouse dataset we have used microsats to measure relatedness between all typed individuals. Most downstream population genetic analyses (e.g. testing for departures from HWE, performing analyses in STRUCTURE etc.) assume that individuals are unrelated. Violating this assumption can cause real problems – see for example the recent paper in MER from Jianlang Wang’s group on what this does to STRUCTURE analyses. In the spiny mice we have quite a lot of pairs (>500) with r > 0.25. Therefore, we wish to prune individuals from the dataset such that nobody has an r >= 0.25 to anything else. This sounds straightforward, but in practice is quite tricky because there are so many pairwise combinations, and if you remove one individual from a dyad at random you may end up throwing away too much data.

Therefore, our question is this:
does anyone know of an efficient way (i.e. a program) of removing the fewest possible individuals while ensuring no dyads have an r above a given threshold (we chose 0.25 fairly arbitrarily)?

Many thanks
Jon & Sanad
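
One way to frame the pruning question, as a hedged sketch rather than a tested tool: treat each individual as a node and each dyad with r at or above the threshold as an edge. Removing the fewest individuals so that no edge remains is then the minimum vertex cover problem, which is NP-hard in general, so the greedy heuristic below (repeatedly drop the individual involved in the most remaining over-threshold dyads) is not guaranteed to find the smallest possible set. The function name prune_relatives, the (id_a, id_b, r) input format and the example values are all hypothetical.

```python
from collections import defaultdict


def prune_relatives(pairs, threshold=0.25):
    """Greedy heuristic: return the set of individuals to remove.

    pairs is an iterable of (id_a, id_b, r) tuples (an assumed format).
    Not guaranteed to be the minimum possible removal set.
    """
    # Build the graph of dyads at or above the threshold.
    neighbours = defaultdict(set)
    for a, b, r in pairs:
        if a != b and r >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    removed = set()
    while True:
        # Individual with the most remaining relatives; ties broken by ID.
        worst = max(neighbours,
                    key=lambda i: (len(neighbours[i]), str(i)),
                    default=None)
        if worst is None or not neighbours[worst]:
            break  # no over-threshold dyads are left
        for other in neighbours[worst]:
            neighbours[other].discard(worst)
        del neighbours[worst]
        removed.add(worst)
    return removed


# Made-up example: A is related to B, C and D, so dropping A alone
# resolves three dyads, whereas a random choice could drop B, C and D.
example_pairs = [
    ("A", "B", 0.50), ("A", "C", 0.30), ("A", "D", 0.26),
    ("B", "C", 0.10), ("E", "F", 0.50),
]
to_remove = prune_relatives(example_pairs)
kept_pairs = [(a, b, r) for a, b, r in example_pairs
              if a not in to_remove and b not in to_remove]
```

For a dataset where the greedy answer looks wasteful, the same graph could be handed to an exact or integer-programming solver instead; the graph construction step would stay the same.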

GenotypeChecker software

Those of us working on pedigree and genotype datasets may find this useful.

Apparently datasets with 1000s of markers can be analysed, although it may be a bit clunky.


Msatcommander software

Hi all,

This program looks useful, especially for those of you interested in obtaining microsatellites from 454 data or large datasets (e.g. ESTs from Genbank).
It searches fasta files for microsats, and then uses Primer3 to design primers. We can already do this using Sputnik and Primer3, but Msatcommander does it all in one program, without the need to edit the format of input/output files.

It looks like the source code and executables for most platforms are available.
Wiki for it here:


Assignment tests

Hi everyone,

I have recently had one of my submitted manuscripts back with reviewers’ comments. One of their suggestions is that I do some assignment tests on my genotype data to coarsely reconstruct the consequences of translocations in my study population.

I have to admit, I don’t really know where to start with this!  If anyone has any thoughts or suggestions I would be very keen to hear them.

Many thanks,

R package for 454 sequence data

Hi all
I just noticed this paper in the latest issue of Bioinformatics:
It describes an R package for analysing 454 sequence data. Could be handy.


Endnote X4

The lab has bought Endnote X4; the CD-ROM is in the lab’s software folder.

It has some nice features compared to our previous X1 version:

  • finds PDFs for references already in your library
  • finds metadata for PDFs that you import into your library
  • searches the contents of attached PDFs
  • doesn’t go crazy when you have track-changes on in Word
  • imports years after 2009 (a very annoying bug in X1)

posted by Hannah