Machine Learning Techniques and Analytics for Cloud Security. Group of authors
very frequently with co-infections and reassortment of swine and human viruses. Cell-surface oligosaccharide receptors in the swine trachea present NeuAcα2,6Gal linkages, which are preferred by human viruses. In the human body, glycans with α2,6-linked sialic acid (SA2,6Gal) are more abundant in the upper airways than in the lower airways [7–10]. According to the “central dogma” of biology, double-stranded DNA is transcribed into RNA, and this RNA carries the instructions for making proteins; that is, information flows from DNA to RNA to protein. Glycosylation and phosphorylation are post-translational modifications that are essential for the accurate regulation required by living systems. Every cell is wrapped in a dense coat of glycans that forms the molecular interface between the cell and its environment. Glycosylation is an enzymatic process that attaches carbohydrates to proteins through nitrogen and oxygen links. The whole process takes place in the endoplasmic reticulum of the cell, which is equipped with different types of glycosyltransferases, sialyltransferases, and glycosidases [9–16]. The immune system is essential for maintaining a healthy, well-balanced organism and is tightly regulated by a series of stimulatory and inhibitory pathways. If this regulation breaks down, autoimmunity can arise from the loss of immune tolerance, and it can also result in abnormal costimulatory signals. The proper development and function of the immune system depend on the expression of glycan structures, on glycan-binding proteins, and on the interactions between them. The innate immune system is responsible for identifying the molecular “patterns” found on microbes. Microbes are bound by pattern recognition receptors, which are proteins that identify molecules on pathogens; these include the C-type (calcium-dependent) lectins, such as Dectin-1 and DC-SIGN, and mannose-binding lectin.
The effects of the glycans found on the sensors of the innate immune system can be classified as “direct” or “indirect” [11–15], and both play an important role in influencing microbe-host interactions and T- and B-cell recognition. Glycan-binding proteins (GBPs) are very important within the immune system; they are mainly the lectins and the sialic acid-binding immunoglobulin-like lectins (Siglecs). The lectin family includes many pattern recognition receptors, such as DC-SIGN and Dectin-1, as well as the selectins: L-selectin, E-selectin, and P-selectin. Leukocyte function depends on complex interactions between lectins and cell-surface glycans, chiefly sialyl-LewisX and 6′-sulfo-sialyl-LewisX. L-selectin is expressed by leukocytes and binds glycoprotein counter-receptors on endothelial cells to direct naive T cells. In contrast, E-selectin and P-selectin are both expressed on endothelial cells as a result of inflammation. Interactions between the selectins and their glycan ligands facilitate the adherence of leukocytes along the endothelium and allow the cells to migrate into tissues in response to chemokines bound to glycosaminoglycans [12–18]. In this way, glycan-selectin interactions regulate leukocyte function by restricting it to the appropriate anatomical site. Another group of glycan-binding molecules is the Siglecs, whose function is quite distinct from that of the C-type lectins and the galectins. Siglecs are cell-surface receptors of higher vertebrates that recognize sialic acids. They also have cytoplasmic tails that contain one or more immunoreceptor tyrosine-based inhibitory motif (ITIM) sequences. Glycans also have various indirect effects on lymphocyte function [12–19]. For example, reducing the complexity of the N-glycans on T-cell receptors results in increased T-cell receptor clustering and signaling at lower antigen density.
Galectin is not directly engaged in the T-cell receptor signaling process. The Mgat5 enzyme plays an important role by contributing to N-glycan complexity, which has been linked to increased autoimmune disorders in H1N1 disease and to raised susceptibility to experimental autoimmune encephalitis. Similarly, a decrease in N-glycan complexity on cell-surface glycoproteins changes signaling via lectins and cytokine receptors [17–22].
In 2015, a framework was proposed for studying the genetics of the new strain. It identified the strain's nearest relatives in swine using cluster analysis approaches such as PCA and the k-means clustering algorithm, and the results were consistent with a reassortment of Eurasian and North American swine viruses [5, 20]. Glycoproteins are key elements of human pathogenic viruses and play important roles in infection and immunity. The influenza A virus carries two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA), which dominate the virion exterior and elicit antibodies. Glycans are one of the major components of the outermost layer of viruses. Interactions between viral pathogens and their hosts are affected by glycan patterns and glycan-binding receptors. Because of their extensive branching, carbohydrates are complex biomolecules, and various glycoproteins are involved in the recognition of human viral pathogens. Glycans on infectious agents can be either virus-encoded or host-derived, and they are usually associated with strong humoral immune responses within the human body. HA and NA, the envelope glycoproteins of the influenza virus, are both responsible for establishing this connection. HA interacts with the terminal sialyl residues of oligosaccharides, which ensures the binding of the virion to the cell surface. NA is needed to remove sialyl residues from the oligosaccharides of cell and virus components; it is a receptor-destroying enzyme that prevents aggregation of virus particles [7, 25].
In this paper, our goal is to identify differentially expressed glycans. Clustering algorithms are applied to H1N1-infected and non-infected human datasets. We then compare the infected dataset with the non-infected dataset to identify the differentially expressed glycans.
2.2 Proposed Methodology
Input: Let the dataset D consist of “n” glycans, each with “m” parameter values such as RFU (relative fluorescence units), STDEV (standard deviation), and SEM (standard error of the mean). Each glycan is a vector, and the glycans are denoted g1, g2, g3, …, gi, …, gn. The dataset D has two states: normal (denoted DN) and diseased, or H1N1-infected (denoted DI).
Output: The set of differentially expressed glycans, G’
Step-1: Apply a clustering algorithm “C” to the normal state (DN) and to the diseased, or H1N1-infected, state (DI).
Step-2: Result for normal state =
Step-3: Find the identical, or matched, clusters between the normal state and the infected state.
Step-4: Compare the clusters and identify the set G of differentially expressed glycans, that is, the glycans that have changed significantly.
Step-5: For multiple glycan datasets D1, D2, …, Dt, the resultant glycan set is G’ = G1 ∩ G2 ∩ … ∩ Gt, where Gi is the differentially expressed glycan set obtained in Step 4 for dataset Di.
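Steps 3 to 5 can be illustrated with a short Python sketch. It is not the authors' implementation: the text does not specify how clusters are matched or what counts as a significant change, so the sketch assumes that a normal cluster is matched to the infected cluster with the largest Jaccard overlap, and that a glycan is differentially expressed when it is absent from that matched cluster. The function names (`clusters_from_labels`, `differential_glycans`, `consensus`) are hypothetical, and the cluster labels are assumed to come from any clustering algorithm “C” applied to the same glycans in both states.

```python
from functools import reduce


def clusters_from_labels(labels):
    """Group glycan IDs by cluster label: {glycan_id: label} -> {label: set of IDs}."""
    clusters = {}
    for glycan, label in labels.items():
        clusters.setdefault(label, set()).add(glycan)
    return clusters


def jaccard(a, b):
    """Jaccard overlap of two sets (assumed matching criterion, not from the text)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def differential_glycans(normal_labels, infected_labels):
    """Steps 3-4: match clusters, then collect glycans that left their matched cluster."""
    normal = clusters_from_labels(normal_labels)
    infected = clusters_from_labels(infected_labels)
    changed = set()
    for members in normal.values():
        # Step 3: the matched infected cluster is the one with the largest overlap.
        best = max(infected.values(), key=lambda c: jaccard(members, c))
        # Step 4: glycans missing from the matched cluster changed membership.
        changed |= members - best
    return changed


def consensus(glycan_sets):
    """Step 5: G' = G1 ∩ G2 ∩ ... ∩ Gt over the per-dataset sets."""
    return reduce(set.intersection, glycan_sets) if glycan_sets else set()


# Toy labels for four glycans in the normal (DN) and infected (DI) states.
dn = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
di = {"g1": 0, "g2": 1, "g3": 1, "g4": 1}
g_set = differential_glycans(dn, di)
```

In this toy example only g2 moves to a different cluster between the two states, so it is the only glycan placed in G. A stricter or statistical definition of "changed significantly" would replace the set difference in Step 4.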
The entire methodology is depicted in Figure 2.1. In this paper, three clustering algorithms are used:
The first algorithm applied is k-means clustering, which was proposed by J. B. MacQueen. The idea behind this algorithm is to identify k centroids, one for each cluster or group.
(1) First, choose k points to serve as the initial cluster centroids.
(2) Second, assign each object to the cluster that has the closest centroid.
(3) Third, once all objects are assigned, recalculate the positions of the k centroids. Steps (2) and (3) are repeated until the centroids no longer move; this produces a separation of the objects into clusters, from which the metric to be minimized can be calculated [23].
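The three steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the chapter: it assumes Euclidean distance, random selection of the initial centroids from the data, and the within-cluster sum of squared distances as the metric to be minimized. The names `assign` and `kmeans` and the parameters `max_iter` and `seed` are hypothetical.

```python
import math
import random


def assign(points, centroids):
    """Step (2): label each point with the index of its closest centroid."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, max_iter=100, seed=0):
    """k-means on a list of equal-length numeric tuples (Euclidean distance assumed)."""
    rng = random.Random(seed)
    # Step (1): choose k of the points as the initial centroids.
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        # Step (3): recompute each centroid as the mean of its assigned points.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid.
                updated.append(centroids[j])
        # Stop once the centroids no longer move.
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    # Within-cluster sum of squared distances (the assumed metric to be minimized).
    sse = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, sse


# Two well-separated groups of toy points.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centroids, sse = kmeans(data, k=2)
```

For glycan data, each tuple would hold one glycan's parameter values (for example RFU, STDEV, and SEM), and the returned labels would give the cluster of each glycan.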