Computational inference of selection underlying the evolution of the novel coronavirus, SARS-CoV-2
Date
2020-04-01
Author
Cagliani, Rachele
Forni, Diego
Clerici, Mario
Sironi, Manuela
Abstract
The novel coronavirus (SARS-CoV-2) that recently emerged in China is thought to have a bat origin, as its closest known relative (BatCoV RaTG13) was described in horseshoe bats. We analyzed the selective events that accompanied the divergence of SARS-CoV-2 from BatCoV RaTG13. To this end, we applied a population genetics-phylogenetics approach, which leverages within-population variation and divergence from an outgroup. Results indicated that most sites in the viral ORFs evolved under strong to moderate purifying selection. The most constrained sequences corresponded to some non-structural proteins (nsps) and to the M protein. Conversely, nsp1 and the accessory ORFs, particularly ORF8, had a non-negligible proportion of codons evolving under very weak purifying selection or close to selective neutrality. Overall, limited evidence of positive selection was detected. The six bona fide positively selected sites were located in the N protein, in ORF8, and in nsp1. A signal of positive selection was also detected in the receptor-binding motif (RBM) of the spike protein, but it most likely resulted from a recombination event that involved the BatCoV RaTG13 sequence. In line with previous data, we suggest that the common ancestor of SARS-CoV-2 and BatCoV RaTG13 encoded/encodes an RBM similar to that observed in SARS-CoV-2 itself and in some pangolin viruses. It is presently unknown whether the common ancestor still exists and which animals it infects. Our data, however, indicate that the divergence of SARS-CoV-2 from BatCoV RaTG13 was accompanied by limited episodes of positive selection, suggesting that the common ancestor of the two viruses was poised for human infection.