In late 2012, I got the idea that there were a few points to make about homology that were either missing from the literature or present only in relatively obscure places. I thought it was time to pull these ideas together and see if there was something to be said about how homology was being treated in the literature and perhaps how we might think about it in its totality.
I talked to some people in my research group, notably Leanne Haggerty, about the possibility of writing a paper on the subject. We took a look at the gene similarity data coming out of her PhD project and decided that it made sense.
We have been looking at gene similarity networks since 2010, when I spent two months in Eric Bapteste and Philippe Lopez's lab in Paris thinking about these networks and what they might show us.
The main problem was that homologs were really being treated as a "thing", a "real thing". Like a species. Indivisible.
Perhaps derived from the philosophical idea of essentialism.
That last sentence was excised from the actual paper by my co-authors as it seemed inflammatory.
Most people would say that they don't think of homologs in that way, but if you read the literature, you can see that this is the way in which they are being treated.
So, in late 2012, I began drafting this manuscript and invited some lovely people to co-author it. Most accepted the task.
Some others I had coffee with told me that this was a pretty silly idea and that we knew all we needed to know about homology. Some warned me that the area was a complete mess and I would be best leaving it alone.
I discovered later that indeed the area was a complete mess and I would have to steer clear of some of the work of developmental biologists and anatomists if I was to keep my sanity.
Therefore, in the end, we only wrote about homology as it relates to molecular data. Pierre-Alain Jachiet, Leanne Haggerty and David Fitzpatrick contributed cold, hard data and analyses, but even with such great data, it was a heck of a job to write the actual manuscript.
The first draft was a bolshie effort on my part, written in a slightly haranguing style, and fortunately my co-authors generally hated it. I have rarely gotten comments back from people where the total length of the comments exceeded the total length of the manuscript by approximately one order of magnitude.
Approximately half the original text is on the cutting-room floor.
Over a period of 14 months we made several attempts at including what we felt was important and deleting parts that were either inflammatory or plain wrong.
In the end, the manuscript is one that I am very pleased with.
Given the number of emails I have gotten about it, it seems that several of you are also pleased with it.
I hope it prompts people with a good knowledge of programming, clustering, fuzzy clustering and networks to have another go at understanding homology.
Perhaps we might consider adding to our store of knowledge of molecular evolution by including transitive relationships between proteins (relationships that are not direct homology relationships, but relationships of non-homologs to one another through composite genes/proteins). Anyway, the paper is published this month in M.B.E., where it is completely Open Access, and they were lovely enough to put one of our networks on the front cover.
Haggerty, L.S., Jachiet, P.A., Hanage, W.P., Fitzpatrick, D., Lopez, P., O’Connell, M.J., Pisani, D., Wilkinson, M., Bapteste, E., and McInerney, J.O. (2014) A pluralistic account of homology: adapting the models to the data. Molecular Biology and Evolution 31 (3): 501-516.