Shin-ichi Nakakita
Department of Basic Life Science at the Faculty of Medicine, Kagawa University, Japan. Ph.D., Science.
After completing his doctoral program at the Graduate School of Science, Osaka University in 1998, he was appointed as a research assistant in the Department of Organic Biochemistry (Laboratory of Prof. Sumihiro Hase) at the School of Science, Osaka University. There, he engaged in research on glycan structural analysis and fluorescent labeling methods for glycans. In 2004, he became a Visiting Associate Professor and Kagawa Prefecture Endowed Chair in the Department of Functional Glycomics, the Life Science Research Center at Kagawa University. In 2008, he became an Associate Professor at the Department of Glycan Function Analysis, Center for Integrated Life Science Research, Kagawa University. In 2020, he became an Associate Professor in the Department of Integrated Life Sciences, Kagawa University School of Medicine, a position he holds to this day.
Jun Hirabayashi
Tokai National Higher Education and Research System, Nagoya University, Japan. Ph.D., Science.
After graduating from Tohoku University (Master of Science), he started his professional career at Teikyo University, where he investigated animal lectins under the supervision of Prof. Kenichi Kasai. He moved to the National Institute of Advanced Industrial Science and Technology (AIST, Tsukuba) in 2002, and while a deputy director in the Research Center for Medical Glycoscience (2006) and a prime senior researcher in the Research Center for Stem Cell Engineering (2012), he was involved in a series of national glycan engineering projects. Now, he is a professor and project manager (Human Glycome Atlas Project) in the Institute for Glyco-core Research, Tokai National Higher Education and Research System, Nagoya University, as well as vice president of the Japanese Consortium for Glycoscience and Glycotechnology. He is also a visiting professor of Kagawa University (2003) and Yokohama City University (2019), as well as a committee member of Kagawa Prefecture Rare Sugar Co-creation Promotion Council (2024, Chairperson of the Complex Carbohydrates and Glycans Working Group).
Disaccharides, the simplest oligosaccharides, can theoretically form up to 3,056 unique structures from various D- and L-aldohexose combinations of two monosaccharide units. While most of these have not been studied, advances in rare sugar synthesis now allow their experimental investigation (Nakakita SI, Hirabayashi J, BBA Advances, 7:100143, 2025). This opens new possibilities for discovering disaccharides with unique properties and potential uses as sweeteners, pharmaceuticals, and biodegradable materials. Compared to more complex oligosaccharides, disaccharides are easier to study systematically. We propose “disaccharide glycomics” as a systematic framework to explore this newly accessible chemical space, opening the door to discoveries that are both scientifically and practically significant.
Representing the simplest oligosaccharides and composed of two monosaccharides, disaccharides serve as fundamental recognition units in numerous biological processes. Research on disaccharides has primarily focused on a limited range of monosaccharides, i.e., D-glucose (D-Glc), D-mannose (D-Man), and D-galactose (D-Gal), largely due to their natural abundance. Their derivatives—including N-acetyl-D-glucosamine (D-GlcNAc), N-acetyl-D-galactosamine (D-GalNAc), L-fucose (L-Fuc), D-xylose (D-Xyl), and D-glucuronic acid (D-GlcA)—are also commonly found in natural glycans. Importantly, all these monosaccharides and their derivatives are principally biosynthesized from either D-Glc or D-Man1,2. Furthermore, sialic acids—nine-carbon acidic monosaccharides (2-keto-deoxynonulosonic acids) characteristic of higher animal cells—are synthesized from N-acetyl-D-mannosamine (C6) and pyruvate (C3) in vivo, via a mechanism analogous to the aldol condensation reaction.
In theory, however, numerous additional sugar isomers exist at the monosaccharide level (i.e., diastereomers): they include D-allose (D-All), D-altrose (D-Alt), D-talose (D-Tal), D-gulose (D-Gul), and D-idose (D-Ido) (see Fig. 1). These non-natural sugars are typically referred to as “rare sugars,” defined as “monosaccharides and their derivatives being rarely found in nature”3,4. All monosaccharides belonging to the L-series (both aldoses and ketoses) are basically classified as rare sugars. In this regard, monosaccharides are fundamentally different from components of other biomolecules such as nucleic acids and proteins: glycan components largely exist as a systematic series of isomers, while the others (nucleotides and amino acids) do not. Consequently, much of the diastereomeric potential within sugar chains remains uninvestigated. In recent years, however, progress in enzymology and fermentation science has enabled the conversion of common monosaccharides including ketoses, such as D-fructose (D-Fru), into a variety of rare sugars, including those in the L-series5-7.

If any type of aldohexose were able to form glycosidic bonds to produce disaccharides, the range of possible combinations would be immense. Nakakita and Hirabayashi recently reported that the "glycome size," meaning the total number of unique disaccharides possible when two unmodified aldohexoses are linked, is 3,056 (see Note 1)8. In this context, unmodified aldohexoses refer to simple six-carbon sugars that have not been chemically altered, and "chemical space" describes the range of possible molecular structures that can be formed (see Note 2). Sugar chains exhibit tremendous diversity in how monosaccharides are linked; thus, the ways of linking just three monosaccharides can exceed 100,000 combinations9. Given this vast number of possible oligosaccharides, it is not surprising that much of their chemical space remains unexplored, even at the disaccharide level8. As a result, synthesizing artificial disaccharides could lead to discoveries of properties not found in naturally occurring sugars. These unique properties could have significant applications in various fields such as pharmaceuticals, biotechnology, or materials science, opening new avenues for research and innovation (for an overview, see Fig. 2).
In contrast, trisaccharides present an overwhelming level of diversity—so vast that handling or analyzing all possible structures is practically impossible. This is due to what is known as a “combinatorial explosion,” meaning that as the number of sugar units increases, the possible combinations grow exponentially, making comprehensive analysis extremely challenging. In this context, disaccharide glycomics provides a practical solution to this issue of combinatorial explosion—sometimes also referred to as a “singularity,” or a point at which the complexity becomes unmanageable—by focusing on the more accessible structures of disaccharides. To address this, we will utilize both chemical and enzymatic synthetic strategies to generate a broad spectrum of disaccharides, many of which are entirely new compounds. By employing these methods, we can systematically generate and analyze a wide range of disaccharides (Fig. 2).

Comprehensive analysis will then be conducted on these molecules to assess both their physical properties and their “bio-compatible functions”—that is, their compatibility and potential roles in biological systems, such as their ability to interact with proteins or participate in cellular processes, e.g., by mimicking a known biomaterial. The resulting data will deepen our understanding of the fundamental properties of oligosaccharides and even larger glycans, as well as their conjugates. This foundational knowledge opens the door to diverse applications. For example, novel disaccharides could be used to create targeted drug delivery systems or innovative biomaterials for tissue engineering, clearly illustrating the practical impact and potential of this research.
Note 1) If the same two monosaccharides are linked, their α1-1β and β1-1α forms are identical; thus, the total number of disaccharides across all aldohexose combinations is calculated as 16 (non-reducing terminal aldohexoses) × 12 (linkages; α1-2/3/4/6 and β1-2/3/4/6 for reducing disaccharides, and α1-1α/α1-1β/β1-1α/β1-1β for non-reducing disaccharides) × 16 (reducing terminal aldohexoses) - 16 = 3,056.
Note 2) Chemical space contains over 10⁶⁰ potential molecules10, making known compounds a tiny fraction of the molecules filling this space. To navigate this vast universe, researchers map structures using descriptors as coordinates, including molecular weight, lipophilicity (log P), polar surface area, and hydrogen bonding capacity11. This multidimensional approach enables the quantitative identification of novel drug candidates.
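The arithmetic in Note 1 can be checked by explicit enumeration. The following is a minimal Python sketch, not the authors' own procedure: it assumes the 16 aldohexoses are the D- and L-forms of the eight sugars abbreviated in the text, and the names `MONOSACCHARIDES`, `LINKAGES`, and `enumerate_disaccharides` are illustrative. It applies only the correction stated in Note 1, removing the β1-1α form when both residues are the same.

```python
from itertools import product

# Eight aldohexose abbreviations used in the text; each has a D- and an L-form.
ALDOHEXOSES = ["All", "Alt", "Glc", "Man", "Gul", "Ido", "Gal", "Tal"]
MONOSACCHARIDES = [f"{series}-{name}" for series in ("D", "L") for name in ALDOHEXOSES]

# Eight reducing linkages (α/β × positions 2, 3, 4, 6) plus four non-reducing 1-1 linkages.
REDUCING_LINKAGES = [f"{anomer}1-{pos}" for anomer in ("α", "β") for pos in (2, 3, 4, 6)]
NON_REDUCING_LINKAGES = ["α1-1α", "α1-1β", "β1-1α", "β1-1β"]
LINKAGES = REDUCING_LINKAGES + NON_REDUCING_LINKAGES


def enumerate_disaccharides():
    """Return the set of disaccharide names counted in Note 1."""
    structures = set()
    for left, linkage, right in product(MONOSACCHARIDES, LINKAGES, MONOSACCHARIDES):
        # Note 1: for two identical residues, β1-1α duplicates α1-1β, so skip it.
        if left == right and linkage == "β1-1α":
            continue
        structures.add(f"{left} {linkage} {right}")
    return structures


disaccharides = enumerate_disaccharides()
print(len(MONOSACCHARIDES), len(LINKAGES), len(disaccharides))
```

Holding the result as a set of names rather than a bare number makes it straightforward to filter the space later, for example by linkage type or by reducing-terminal residue.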
Glycans are often referred to as the “third code of life”, and the information they encode is incredibly rich, far surpassing that encoded by linear molecules like nucleic acids (DNA, RNA) and proteins. Unlike nucleic acids and proteins, glycans can form highly branched structures, resulting in an exponentially greater diversity of possible configurations and information-carrying capacity. Disaccharides, from the perspective of glycomics, can be considered the smallest informational unit, or ‘bit’. Indeed, disaccharides serve as the “smallest meaningful blocks” that living organisms utilize for molecular recognition and signaling.
Lectins are ubiquitous proteins that bind specifically to carbohydrates and are vital for cell recognition. While they are often classified by monosaccharide specificity12,13, most lectins recognize penultimate as well as outermost saccharides. For instance, Ulex europaeus lectin identifies the blood group H antigen (Fucα1-2Gal), which is important for blood type compatibility, and jacalin binds the T-antigen (Galβ1-3GalNAc). In this context, Albert M. Wu advocated for the classification of glycan epitopes at the disaccharide level, proposing a systematic nomenclature and leveraging plant lectin specificity to investigate these disaccharide epitopes14. Similarly, animal lectins such as galectins identify a fundamental disaccharide unit, N-acetyllactosamine (LacNAc), which is a common motif in glycans composed of galactose and N-acetylglucosamine. Further research resulted in a consensus rule: any disaccharide conforming to the “Galβ(syn)-gauche” formula (a specific structural arrangement of galactose residues recognized by galectins) has the potential to serve as a galectin ligand2,15. Such examples of disaccharide epitopes for various lectins are listed in Table 1.
Thus, biological meaning emerges at the disaccharide level, where monosaccharides function merely as “letters,” while disaccharides act as “words.” Just as words formed from letters convey meaning in language, disaccharides assembled from monosaccharides create functional units that carry biological information. By systematically characterizing newly discovered disaccharides, this project will establish foundational knowledge that enables meaningful comparisons and functional predictions in glycoscience. In this way, the proposed research will offer opportunities for discovering new physical properties and biological functions, making further exploration in this field especially worthwhile.
As described above, this disaccharide glycomics project consists of three major phases: 1) synthesis and separation, 2) discovery, and 3) utility application (Fig. 2). Each phase presents substantial technical challenges. In particular, key methodological issues related to enzymatic synthesis and high-resolution structural analysis are actively being addressed and are summarized in the Column, “The Current State of the Art in Enzymatic Synthesis and Structural Analysis of Non-natural Disaccharides”.
In this section, we outline the specific content and technical key points of each phase, and propose a practical research framework for implementing disaccharide glycomics.
● Synthesis
Disaccharides can be synthesized either chemically or enzymatically (Scheme 1). Although chemical methods are robust, generating over 3,000 novel disaccharides chemically is impractical. Chemically synthesized disaccharides are nonetheless likely to be particularly useful as structurally defined standards and reference materials, both for establishing analytical methods and for physical property analysis in the later discovery phases. On the enzymatic side, a few earlier studies have attempted synthesis using enzymes16,17. However, the enzymatic approach depends on the luck of finding 'special' enzymes (e.g., glycosyltransferases, glycosidases, phosphorylases) that are rare even in nature and that also act on rare sugars, making it difficult to establish a systematic and efficient method for the synthesis of disaccharides.
● Separation
The disaccharides obtained should be purified to homogeneity so that their structures can be identified and their physicochemical properties investigated. For this, various kinds of liquid chromatography will be attempted, e.g., size-exclusion, ion-exchange, reversed-phase, and hydrophilic interaction chromatography, applied either to the intact forms or to forms labeled with an appropriate reagent. Effective separation technologies are crucial, especially because rare sugars are less stable than natural ones when they are placed at the reducing terminal. A preceding study showed that the reactivity of glucose disaccharides was largely dependent on the linkage type (α or β, and 1-2, 1-3, 1-4, or 1-6), although the reason remains to be elucidated18.
● Structural analysis
Accurate characterization of disaccharides requires precise determination of monosaccharide composition, anomeric configuration (α or β), and glycosidic linkage position. Among available analytical techniques, nuclear magnetic resonance (NMR) spectroscopy plays a central role by providing definitive structural information. In parallel, fluorescence labeling combined with high-performance liquid chromatography (HPLC) enables efficient separation of isomeric disaccharides, while mass spectrometry (MS) techniques, including MS/MS or MSn analyses and ion-mobility separation19, may yield novel knowledge about disaccharide properties. In addition, crystallographic analysis offers detailed insights into the intrinsic structural features of disaccharides20. The complementary use of these methodologies enables systematic and reliable disaccharide analysis, overcoming the limitations of conventional approaches and advancing glycomics research.
● Physical property analysis
Many synthesized disaccharides are new, with limited data on their physical properties, creating challenges for industrial use. We will systematically compare basic properties such as melting and boiling points, solubility, volatility, hygroscopicity, optical rotation, crystallinity, and organoleptic traits like sweetness. This profiling will uncover broader governing principles by helping to identify correlations between chemical structure and physicochemical behavior.
● Exploring bio-compatible functions
By comparing new disaccharides with lactose (Galβ1-4Glc), maltose (Glcα1-4Glc), and cellobiose (Glcβ1-4Glc), we will assess their recognition and metabolism by microorganisms, as well as their impact on intestinal bacteria and immune responses21. Each compound’s core properties—such as melting point, viscosity, optical activity, solubility, hygroscopicity, reductivity, amine reactivity, thermodynamic stability, crystallization, and sweetness—will be evaluated for biological relevance. Our analysis will focus on microbial processing and physiological effects, linking the above sub-section, physical property analysis, to potential functional or commercial uses. Notably, GalNAc-modified siRNA has recently shown strong therapeutic performance22.
● Large-scale production of valued disaccharides (bridge to industry)
In conjunction with the development of new functions for novel disaccharides, we will develop microorganisms (enzymes) capable of producing valued disaccharides from common, cheap, naturally abundant materials, such as D-Glc and sucrose. Specifically, we envision glycosidases that recognize, hydrolyze, and transfer the novel disaccharides, as well as related sugar-nucleotide transporters, glycosyltransferases, glycosidases, and lectins.
Recent breakthroughs in enzyme engineering and fermentation technology allow us to produce rare sugars3-8. This discussion is closely related to the concept of 'expanded glycomics,' a proposed field of study that aims to examine not only current biological systems but also the entire range of sugar molecules beyond them23. For the first time, researchers can now systematically assess the properties and potential uses of all 3,056 possible disaccharides, a dramatic expansion of focus from that of previous work. While most of these disaccharides are artificially synthesized, our approach, which includes every possible combination of aldohexoses, defines a new field called “synthetic glycomics.” Synthetic glycomics is significant because it enables the systematic exploration and creation of artificial sugars, opening doors to innovations in drug development, advanced biomaterials, and diagnostic tools. By designing and studying sugars not found in nature, scientists can tailor molecules for specific medical or industrial purposes, potentially leading to breakthroughs in areas such as targeted therapies, improved biocompatible materials, or precise biosensors. This approach distinguishes our research from traditional glycomics, which focuses mainly on naturally occurring sugars.
One of the most important ideas here is the “chemical space” of glycans, the vast universe of possible sugar molecules, which is far larger than that of proteins or nucleic acids9. This immense diversity results from the large number of hydroxyl (–OH) groups and the many ways they can connect (including position, configuration, and anomeric form) within each monosaccharide. This variety stems from the general formula for carbohydrates, (C·H2O)n, i.e., a structure potentially rich in hydroxyl groups as well as chiral carbons1,2. The chemical space for disaccharides is already explosive: for n kinds of monosaccharides, the theoretical number of unique disaccharides is n × n × (possible bond positions) × (α/β) × (stereochemistry), resulting in thousands to tens of thousands of potential structures. To put this into perspective, there are 1,600 possible combinations of two amino acids (40 × 40) and just 16 combinations of two nucleotides (4 × 4), revealing that glycan diversity is much greater. Nakakita and Hirabayashi8 calculated a total of 3,056 disaccharides when considering only aldohexoses. (Again, aldohexoses are monosaccharides with six carbon atoms and an aldehyde group.) This count excludes pentoses, ketoses, and further derivatives, such as N-acetyl and O-sulfated forms, which would increase the total even further.
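To make the comparison concrete, the short Python sketch below puts the three two-unit spaces side by side. It is illustrative only: the function names are hypothetical, the building-block counts (40 and 4) and the aldohexose figures (16 sugars, 12 linkages, one duplicate removed per identical pair) are taken from the text and Note 1, and the dimer formula assumes a single linkage type for amino acids and nucleotides.

```python
def ordered_pairs(n_units: int) -> int:
    """Two-unit sequences from n building blocks joined by a single linkage type."""
    return n_units * n_units


def disaccharide_count(n_sugars: int, n_linkages: int) -> int:
    """n x linkages x n, minus one duplicate 1-1 form for each pair of identical sugars."""
    return n_sugars * n_linkages * n_sugars - n_sugars


# Building-block counts as given in the text (assumed inputs, not derived here).
two_unit_space = {
    "nucleotide pairs (4 x 4)": ordered_pairs(4),
    "amino acid pairs (40 x 40)": ordered_pairs(40),
    "aldohexose disaccharides (16 x 12 x 16 - 16)": disaccharide_count(16, 12),
}

for label, size in two_unit_space.items():
    print(f"{label}: {size:,}")
```

Because the linkage count enters as a parameter, the same function can be reused to gauge how the total grows once pentoses, ketoses, or modified sugars are added, provided the appropriate number of linkages per residue is supplied.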
Disaccharides thus represent the first "critical point" in the chemical space of glycans. As described, even with trisaccharides (chains of three sugars), the diversity explodes to between 100,000 and over 1,000,000 possible types. In fact, it was calculated that the structural complexity for just six 'common' monosaccharides linked together could reach as high as 1.05 × 10¹² possible combinations24. Because of this, disaccharides are seen as the best starting point for understanding artificial sugars and advancing the concept of synthetic glycomics. By beginning with these smaller, more manageable structures, researchers can systematically unlock the potential of synthetic sugars, paving the way for new discoveries across science and technology.
The disaccharide glycomics theme aims to advance saccharide transformation from monosaccharides to disaccharides. Challenges include developing synthesis and separation methods and identifying new disaccharide functions, both of which are essential for securing research funding. Moreover, the usefulness of new products remains unclear until they are tested. Despite these challenges, sugars, being more complex and less well understood than proteins or nucleic acids, offer many unexplored possibilities. Studying their origins and biological roles, especially through the “Expanded Central Dogma” in Japan’s Human Glycome Atlas project25, is crucial. Disaccharides are a key entry point for further glycomics research.
Glycans, often likened to trees due to their branching, are best understood through disaccharides—the smallest “meaningful block” with sufficient informational content and structural diversity. Monosaccharides offer limited data, while larger structures complicate analysis, making disaccharides optimal as the "compressible unit" for information theory. Experimentally, disaccharides are ideal for synthesis, separation, mass spectrometry, library creation, and biological assays; they mark a threshold in chemical space where diversity spikes, and they are the smallest units recognized by organisms. Their unique properties position disaccharides as a focal point in glycoscience, suggesting opportunities for mathematical modeling and graph-based analysis of their networks.
Beyond structural diversity, another key advantage of disaccharides lies in their properties as compact and highly hydrophilic molecular modules. Although their molecular weight is limited to approximately 400 Da or less, disaccharides incorporate multiple chiral centers together with densely arranged hydroxyl groups. This unique combination enables them to function as efficient linkers or hub molecules, capable of connecting multiple functional components and thereby conferring properties that are not accessible with conventional molecular architectures. Notably, disaccharides are well suited to emerging mid-sized molecular modalities, such as glycol-modified oligonucleotides and peptide therapeutics, in which stereochemical diversity, aqueous solubility, and precise conformational control are essential design requirements. In this context, disaccharides can be regarded as a privileged scaffold with substantial potential for future sugar-integrated, mid-sized molecule drug design (Fig. 3).

Enabled by recent advances in rare sugar synthesis, disaccharide glycomics is an emerging field in glycoscience that systematically investigates the structures, physicochemical properties, and biological functions of disaccharides. The discipline integrates structural analysis, functional evaluation, and microbial studies, with broad applications to food science, medicine, and pharmaceutical research. Closely linked to glycomics and glycoinformatics, precise disaccharide analysis enhances our understanding of glycan biology and supports the development of novel functional materials. More than twenty-five years after its advent26, the field of glycomics continues to evolve and expand.
We express our sincere gratitude to Dr. Kenichi Kasai for his valuable insights and guidance in the writing of this review. We also thank Dr. Morten Thaysen-Andersen (Macquarie University, Nagoya University), Dr. Yann Guerardel (CNRS, Gifu University), Dr. Mamoru Mizuno (Noguchi Institute), Dr. Jun Iwaki (Tokyo Chemical Industry, Co., Ltd.), Dr. Yoshihiro Takatsu (Seikagaku Corporation), Dr. Takashi Nishikaze (Shimadzu Corporation), and Dr. Masaaki Tokuda (Kagawa University) for their helpful comments and warm support for the concept of disaccharide glycomics.
Rare sugar-containing disaccharides can be synthesized using glycosyltransferases (GTases), glycosidases operating in transglycosylation mode, or phosphorylases. These enzymatic approaches enable regio- and stereoselective construction of non-natural disaccharides that are difficult to synthesize chemically, thereby supporting advances in synthetic glycomics.
i. Glycosyltransferases (GTases)
GTases provide excellent control over anomeric configuration and linkage position, and many bacterial GTases can be expressed recombinantly in Escherichia coli1-5. However, their application is limited by the high cost of sugar nucleotide donors and strict substrate specificity.
ii. Transglycosylation by Glycosidases
Transglycosylation using p-nitrophenyl monosaccharides (pNP sugars) as donors is a versatile strategy that can theoretically generate multiple disaccharide isomers in a single reaction6. Unexpected formation of non-reducing trisaccharides has also been reported7. Although a phylogenetic search for novel glycosidases with transglycosylation activity has been attempted8, major challenges in executing this strategy include the scarcity of enzymes active toward rare sugars, the limited availability of rare sugar-pNP donors, and the lack of standardized purification methods. Efficient and mild separation techniques are particularly important due to the instability of rare sugars.
iii. Phosphorylase-based Synthesis
Phosphorylases catalyze reversible phosphorolysis and condensation reactions and are well suited for large-scale production, as demonstrated for lacto-N-biose9-11. However, most known phosphorylases are glucose-specific, restricting their use to glucosylation at the non-reducing terminus of rare sugar acceptors.
Accurate disaccharide identification requires determination of monosaccharide composition, anomeric configuration (α or β), and linkage position. Fluorescent labeling combined with HPLC enables efficient separation of isomeric disaccharides, while chemical methods such as compositional analysis via acid hydrolysis12, methylation analysis13,14, Smith degradation15,16, and partial acetolysis17,18 provide robust linkage information. Enzymatic digestion can support anomeric assignment when necessary.
In mass spectrometry-based methods (MS/MS or MSn), it is necessary to systematically analyze the fragmentation patterns of unlabeled and appropriately labeled disaccharides in detail. Additionally, using new mass separation modes, such as ion mobility methods, may allow for the separation and identification of disaccharide isomers. Furthermore, NMR spectroscopy provides comprehensive and non-destructive structural information in solution, and X-ray crystallography reveals intermolecular interactions at the atomic level.