Comparative Biology of I-Type Lectins

Distribution and evolution of I-type lectins
I-type lectins are a group of glycan-recognition proteins that contain an immunoglobulin-like fold as the carbohydrate recognition domain. This class of lectins includes some proteins expressed in the nervous system, such as the neural cell adhesion molecule (N-CAM) and myelin protein zero (P0), and those expressed on hematopoietic cells, such as CD83 and Siglecs.

Proteins possessing one or more immunoglobulin-like folds, i.e., immunoglobulin superfamily (IgSF) proteins, are widespread in animals (particularly vertebrates), and many of them are involved in cell-to-cell interactions and biological defense against pathogens. As members of the IgSF, I-type lectins are expected to have a similar species distribution and similar biological roles.

Most of the I-type lectins known to date are of vertebrate origin, especially from mammals. This is probably because the IgSF has expanded greatly during vertebrate evolution, and because mammals have been used extensively in studies of cell-to-cell interactions and biological defense against pathogens. An exception is hemolin, an I-type lectin found in insect hemolymph that is proposed to play a role in antibacterial defense.

Although I-type lectins are classified as a family, they do not necessarily belong to a single cohesive branch of the IgSF (see Panel A below). In other words, it is more likely that several members of the IgSF independently acquired the ability to recognize carbohydrates during the course of evolution. The I-type lectin family is therefore more inclusive than other lectin families (e.g., L-type lectins, R-type lectins and galectins), each of which presumably has a common ancestor with glycan recognition activity and whose descendants (the current members) are mostly lectins as well. There may be many more undiscovered I-type lectins.

Distribution and evolution of Siglecs
More than half the I-type lectins known so far belong to the Siglec family (i.e., the Siglec family is a major sub-family of I-type lectins). The definition of the Siglec family is similar to that of other lectin families in that all Siglecs have presumably descended from a common ancestor and most of the current family members recognize glycans.

Myelin-associated glycoprotein (MAG = Siglec-4), a member of the Siglec family, is conserved throughout vertebrate evolution (i.e., from fish to mammals). The degree of conservation of the other Siglecs varies, but none of them is conserved throughout vertebrate evolution (see Panel B below). A MAG-like sequence, however, could not be identified in the near-complete genome sequence of sea squirts, which are tunicates, close relatives of vertebrates. It is therefore likely that the distribution of Siglecs is limited to vertebrates.

There are two possible explanations for the high degree of conservation of MAG: (1) its expression is limited to the nervous system, where access by pathogens, which are known to drive the molecular evolution of cell-surface molecules, is limited; (2) it interacts not only with glycans but also with protein ligands, and is thus under strict functional constraints that restrict its pace of evolution. In contrast, most other Siglecs are expressed on cells involved in immunity, where contact with pathogens is inevitable, possibly explaining why they are poorly conserved.

This trend of poor conservation, or rather, rapid evolution, is most prominent in the CD33-related Siglecs. CD33-related Siglecs are encoded in a gene cluster, and the number of functional genes in this cluster differs even among different species of primates. This is due to dynamic, macroscopic rearrangements of the genomic region, involving gene duplication(s), gene conversion(s) and gene deletion(s). In addition, CD33-related Siglecs are undergoing accelerated evolution at the microscopic level as well, which is most evident in the first immunoglobulin-like domain, the domain involved in glycan recognition.

Varki and Angata have recently proposed that this accelerated evolution is due to an evolutionary chain of “Red Queen Effects”, in which “evolutionary competition between host sialic acids and pathogens that utilize sialic acids” and “evolutionary competition between host Siglecs and pathogens that utilize Siglecs” are coupled by the “co-evolution of host sialic acids and Siglecs” (as pictured in Panel C). Available data are consistent with this theory, and other testable hypotheses emerge from it.
Takashi Angata (Research Center for Glycoscience, National Institute of Advanced Industrial Science and Technology (AIST))
References
(1) Angata T, Brinkman-Van der Linden E: I-type lectins. Biochim. Biophys. Acta, 1572, 294-316, 2002
(2) Lehmann F, Gathje H, Kelm S, Dietz F: Evolution of sialic acid-binding proteins: molecular cloning and expression of fish siglec-4. Glycobiology, 14, 959-968, 2004
(3) Angata T, Margulies EH, Green ED, Varki A: Large-scale sequencing of the CD33-related Siglec gene cluster in five mammalian species reveals rapid evolution by multiple mechanisms. Proc. Natl. Acad. Sci. U.S.A., 101, 13251-13256, 2004
(4) Varki A, Angata T: Glycobiology, 15, 2005, in press.
LE- B04
Siglecs (Takashi Angata)
Jun. 30, 2005