Data Analytics in Bioinformatics. Group of authors

Among the soft clustering approaches, FCM is the most popular.

       2.3.8.1 FCM (Fuzzy C-Means)

      This algorithm is mostly applied in microarray data analysis, because a microarray covers tens of thousands of genes that have to be analysed concurrently. It uses a membership function from which a membership matrix is built for the dataset. The matrix is updated each time the similarity of the data points to the clusters is re-evaluated. The degree of membership is given by the weights of the matrix [25], which specify how similar a data point is to the mean of a cluster. The membership values range from 0 to 1.
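
      The update cycle described above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard fuzzy c-means iteration, not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random initialisation, the stopping tolerance, and the toy two-group data are all assumptions made for the example.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Illustrative fuzzy c-means; returns (centers, U).

    U has shape (n_samples, n_clusters); U[i, k] is the degree to which
    sample i belongs to cluster k. Each row lies in [0, 1] and sums to 1.
    """
    rng = np.random.default_rng(seed)

    def weighted_centers(U):
        # Cluster means, weighted by memberships raised to the fuzzifier m.
        W = U ** m
        return (W.T @ X) / W.sum(axis=0)[:, None]

    # Assumed initialisation: random memberships, normalised per sample.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        centers = weighted_centers(U)
        # Euclidean distance from every sample to every cluster mean.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: closer means receive larger weights.
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return weighted_centers(U), U


# Toy data (hypothetical): two well-separated groups of 20 "expression profiles".
data_rng = np.random.default_rng(1)
X = np.vstack([
    data_rng.normal(0.0, 0.5, size=(20, 2)),
    data_rng.normal(5.0, 0.5, size=(20, 2)),
])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # optional hard assignment from the soft memberships
```

      Each row of `U` is one row of the membership matrix. Taking the largest entry per row, as in `labels`, turns the soft clustering into a hard one when a single cluster per gene is required.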

       References

      1. Simeone, O., A Very Brief Introduction to Machine Learning With Applications to Communication Systems. IEEE Trans. Cognit. Commun. Networking, 4, 4, 648–664, 2018.

      2. Dixit, P. and Prajapati, G.I., Machine Learning in Bioinformatics: A Novel Approach for DNA Sequencing. 2015 Fifth International Conference on Advanced Computing & Communication Technologies, Haryana, pp. 41–47, 2015.

      3. https://en.wikipedia.org/wiki/Unsupervised_learning.

      4. Jain, A.K., Data clustering: 50 years beyond k-means. Pattern Recognit. Lett., 31, 8, 651–666, 2010, https://doi.org/10.1016/j.patrec.2009.09.011.

      5. Oyelade, J. et al., Data Clustering: Algorithms and Its Applications. 2019 19th International Conference on Computational Science and Its Applications (ICCSA), Saint Petersburg, Russia, pp. 71–81, 2019.

      6. Larrañaga, P., Calvo, B., Santana, R., Bielza, C., Galdiano, J., Inza, I., Lozano, J.A., Armañanzas, R., Santafé, G., Pérez, A., Robles, V., Machine learning in bioinformatics. Briefings Bioinf., 7, 1, 86–112, March 2006, https://doi.org/10.1093/bib/bbk007.

      7. National Research Council (US) Committee on Intellectual Property Rights in Genomic and Protein Research and Innovation, Merrill, S.A. and Mazza, A.M. (Eds.), Reaping the Benefits of Genomic and Proteomic Research: Intellectual Property Rights, Innovation, and Public Health, National Academies Press (US), Washington (DC), 2006, 2, Genomics, Proteomics, and the Changing Research Environment, Available from: https://www.ncbi.nlm.nih.gov/books/NBK19861/.

      8. Boundless.com. License: CC BY-SA: Attribution-ShareAlike.

      9. Oyelade, J., Isewon, I., Oladipupo, F. et al., Clustering Algorithms: Their Application to Gene Expression Data. Bioinform. Biol. Insights, 10, 237–253, 2016.

      10. Kerr, G., Ruskin, H.J., Crane, M., Doolan, P., Techniques for clustering gene expression data. Comput. Biol. Med., 38, 3, 283–293, 2008.

      11. Jain, A.K., Murty, M.N., Flynn, P.J., Data clustering: A review. ACM Comput. Surv., 31, 3, 264–323, 1999.

      12. ©Nature Education, CC-BY-NC-SA.

      14. Chandrasekhar, T., Thangavel, K., Elayaraja, E., Effective clustering algorithms for gene expression data. Int. J. Comput. Appl., 32, 4, 25–29, 2011.

      15. Khan, S.S. and Ahmad, A., Cluster Center Initialization Algorithm for K-Means Clustering. 25, 11, 1293–1302, 2004.

      16. Handhayani, T. and Hiryanto, L., Intelligent Kernel K-Means for Clustering Gene Expression. Procedia Comput. Sci., 59, 171–177, 2015.

      17. Kaufman, L. and Rousseeuw, P.J., Finding Groups in Data: An Introduction to Cluster Analysis, vol. 344, John Wiley & Sons, New York, 1990.

      18. Sokal, R.R. and Michener, C.D., A statistical method for evaluating systematic relationships. Univ. Kansas Sci. Bull., 28, 1409–1438, 1958.

      19. Domany, E., Superparamagnetic clustering of data: The definitive solution of an ill-posed problem. Physica A Stat. Mech. Appl., 263, 1, 158–169, 1999.

      20. Guha, S., Rastogi, R., Shim, K., CURE: an efficient clustering algorithm for large databases, in: ACM SIGMOD Record, vol. 27, pp. 73–84, ACM, New York, NY, USA, 1998.

      21. Karypis, G., Han, E.H., Kumar, V., Chameleon: Hierarchical clustering using dynamic modeling. Computer (Long Beach Calif.), 32, 8, 68–75, 1999.

      22. Zhang, T., Ramakrishnan, R., Livny, M., BIRCH: an efficient data clustering method for very large databases, in: ACM SIGMOD Record, vol. 25, pp. 103–114, ACM, New York, NY, USA, 1996.

      23. Grun, B., Model Based Clustering, arXiv:1807.01987v1 [stat.ME], 5 Jul 2018.

      24. Kohonen, T., The self-organizing map. Proc. IEEE, 78, 9, 1464–1480, 1990.

      25. Grid-Based Clustering Algorithms, Data Clustering: Theory, Algorithms, and Applications, 209–217, https://doi.org/10.1137/1.9780898718348.ch12.

      26. Sander, J., Density-Based Clustering, in: Encyclopedia of Machine Learning, C. Sammut and G.I. Webb (Eds.), Springer, Boston, MA, 2011, https://doi.org/10.1007/978-0-387-30164-8.

      27. Ester, M., Kriegel, H.-P., Sander, J., Xu, X., A density-based algorithm for discovering clusters in large spatial databases with noise, in: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD’96), AAAI Press, pp. 226–231, 1996.

      28. Sharan, R., Elkon, R., Shamir, R., Cluster Analysis and Its Applications to Gene Expression Data, in: Bioinformatics and Genome Analysis, Ernst Schering Research Foundation Workshop, vol. 38, Springer, Berlin, Heidelberg, 2002, https://doi.org/10.1007/978-3-662-04747-7_5.

      29. Colaprico, A. et al., TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res., 44, 8, e71, 2015.

      30. Silva, T.C. et al., TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages. F1000Research, 5, 2016, https://f1000research.com/articles/5-1542/v2.

      32. Brazma, A. and Vilo, J., Gene expression data analysis. FEBS Lett., 480, 1, 17–24,

