Machine learning in practice – from PyTorch model to Kubeflow in the cloud for BigData. Eugeny Shtoltc


One direction in which researchers are working to improve such networks is determining the principle by which the network decides which previous information to take into account later, for how long, and to what extent. Networks that adopt specialized mechanisms for storing information are called LSTM (Long Short-Term Memory).

Not all combinations are successful; some only solve narrow problems. As complexity increases, a smaller percentage of the possible architectures prove successful and earn their own names.

In general, there are networks that are fundamentally different in structure and principles:

* feedforward networks

* convolutional neural networks

* recurrent neural networks

* autoencoders (classic, sparse, variational, denoising)

* deep belief networks

* generative adversarial networks – two networks opposing each other: a generator and a discriminator

* neural Turing machines – a neural network with a block of memory

* Kohonen neural networks – for unsupervised learning

* various architectures of cyclic neural networks: Hopfield network, Markov chain, Boltzmann machine

Let us consider the most commonly used ones in more detail, namely feedforward, convolutional and recurrent networks:

Feedforward networks:

* two inputs and one output – Perceptron (P)

* two inputs, two fully connected hidden neurons and one output – Feed Forward (FF) or Radial Basis Function (RBF) network

* three inputs, two layers of four fully connected neurons and two outputs – Deep Feed Forward (DFF)

* deep neural networks

* extreme learning machine (ELM) – a network with random connections (related to the echo state network)
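As an illustration of the two-input, two-neuron, one-output Feed Forward (FF) shape from the list above, here is a minimal sketch in plain Python. The weights are hand-picked and hypothetical (a real network would learn them), and the sigmoid activation is an assumption, since the text does not name one.

```python
import math


def sigmoid(x):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Fully connected layer: every neuron sees every input."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def feed_forward(inputs, layers):
    """Pass the signal through the layers in one direction, input to output."""
    signal = inputs
    for weights, biases in layers:
        signal = dense(signal, weights, biases)
    return signal


# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output.
ff_layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                   # output layer
]

output = feed_forward([1.0, 0.0], ff_layers)
print(output)
```

A perceptron is the same `dense` call with a single neuron, and a Deep Feed Forward network is the same loop with more entries in the list of layers.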

Convolutional neural networks:

* traditional convolutional neural networks (CNN) – image classification

* deconvolutional neural networks (DN) – image generation from a given type

* deep convolutional inverse graphics networks (DCIGN) – a combination of convolutional and deconvolutional neural networks for transforming or combining images
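The operation that gives convolutional networks their name can be sketched in plain Python. This is a minimal illustration that assumes stride 1 and no padding. The image and the kernel are hypothetical and hand-written, whereas a CNN learns its kernels during training.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x4 grayscale image with a vertical edge in the middle.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written kernel that responds to a dark-to-bright step from left to right.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, edge_kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The non-zero column in the result marks where the edge is, which is the kind of local feature later layers of a CNN combine to classify an image.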

Recurrent neural networks:

* recurrent neural networks (RNN) – networks with memory in their neurons, used to analyze data in which order matters, such as text, sound and video

* Long Short-Term Memory (LSTM) networks – a development of recurrent neural networks in which neurons can separate data worth keeping in long-lived memory from data worth forgetting, and delete the latter from memory

* deep residual networks (ResNet) – networks with skip connections between layers (similar in operation to LSTM)

* gated recurrent units (GRU)
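The idea of deciding what to remember and what to forget can be sketched as one step of an LSTM cell. This is a simplified illustration with scalar state and hypothetical, hand-picked parameters. Real implementations work on vectors and learn the parameters, and the `params` layout used here is an assumption made for readability.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps a gate name to (input weight, recurrent weight, bias)."""

    def gate(name, activation):
        w_x, w_h, b = p[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old memory to keep
    i = gate("input", sigmoid)         # how much of the new candidate to store
    g = gate("candidate", math.tanh)   # new information proposed for memory
    o = gate("output", sigmoid)        # how much of the memory to expose
    c = f * c_prev + i * g             # updated long-lived cell state
    h = o * math.tanh(c)               # short-term output passed to the next step
    return h, c


# Hypothetical parameters: a large forget bias makes the cell hold on to its memory.
params = {
    "forget": (0.0, 0.0, 2.0),
    "input": (1.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 2.0),
}

h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0]:
    h, c = lstm_step(x, h, c, params)
    print(f"x={x:.1f} h={h:.3f} c={c:.3f}")
```

With these values the first input is written into the cell state and then fades only slowly over the following zero inputs. A GRU follows the same idea with fewer gates.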

Frameworks for writing networks.

Until 2015, scikit-learn led by a wide margin, with Caffe catching up, but TensorFlow became the leader as soon as it was released. Over time its lead only grew, reaching two to three times by 2020, when it counted more than 140 thousand projects on GitHub while its closest competitor had just over 45 thousand. In 2020 it was followed, in descending order, by Keras, scikit-learn, PyTorch (Facebook), Caffe, MXNet, XGBoost, Fastai, Microsoft CNTK (Cognitive Toolkit), DarkNet and some other lesser-known libraries. The most popular are the PyTorch and TensorFlow libraries. PyTorch is good for prototyping, learning and trying out new models. TensorFlow is popular in production environments, and its low-level nature is addressed by Keras.

* Facebook PyTorch is a good option for learning and prototyping thanks to its high-level API and support for various environments; its dynamic graph can be an advantage when learning. Used by Twitter and Salesforce.

* Google TensorFlow – originally had a static computation graph; a dynamic graph is now supported as well. Used in Gmail, Google Translate, Uber, Airbnb and Dropbox. To attract users to the Google cloud, Google provides a hardware processor for it, the Google TPU (Tensor Processing Unit).

* Keras is a high-level add-on providing more abstraction over TensorFlow, Theano or CNTK. A good option for learning. For example, it lets you omit the dimensions of layers and calculates them itself, allowing the developer to focus on the layer architecture. Usually used on top of TensorFlow. Code written with it is also supported by Microsoft CNTK.
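The remark about Keras calculating layer dimensions itself can be illustrated with a small sketch. The `Dense` and `Sequential` classes below are hypothetical stand-ins written for this example. They borrow the names of Keras concepts but are not the Keras API. The point is only that each layer takes its input size from the layer before it.

```python
class Dense:
    """Hypothetical layer that is told only its number of output units."""

    def __init__(self, units):
        self.units = units
        self.weight_shape = None  # filled in when the model is built

    def build(self, input_dim):
        # The input dimension comes from the previous layer, not from the user.
        self.weight_shape = (input_dim, self.units)
        return self.units  # becomes the input dimension of the next layer


class Sequential:
    """Hypothetical container that chains layers and infers their shapes."""

    def __init__(self, layers):
        self.layers = layers

    def build(self, input_dim):
        dim = input_dim
        for layer in self.layers:
            dim = layer.build(dim)
        return dim


# Three inputs, two layers of four neurons, two outputs.
model = Sequential([Dense(4), Dense(4), Dense(2)])
output_dim = model.build(input_dim=3)  # only the data dimension is given by hand
shapes = [layer.weight_shape for layer in model.layers]
print(shapes, output_dim)  # [(3, 4), (4, 4), (4, 2)] 2
```

Changing the width of one layer then requires editing a single number, because the shapes of the neighbouring weight matrices are recomputed on the next build.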

There are also more specialized frameworks:

* Apache MXNet (Amazon) and its high-level add-on Gluon. MXNet is a framework with an emphasis on scaling; it supports integration with Hadoop and Cassandra. Supported languages: C++, Python, R, Julia, JavaScript, Scala, Go and Perl.

* Microsoft CNTK has integrations with Python, R and C#, since most of its code is written in C++. The fact that its whole core is written in C++ does not mean that CNTK trains the model in C++ while TensorFlow trains it in Python (which is slow), because TensorFlow builds a graph that is then executed in C++ as well. CNTK also differs from Google TensorFlow in that it was originally designed to run on Azure clusters with multiple GPUs, but the situation has since leveled out and TensorFlow supports clusters too.

* Caffe2 is a framework for mobile environments.

* Sonnet – a DeepMind add-on on top of TensorFlow for training very deep neural networks.

* DL4J (Deep Learning for Java) is a framework with an emphasis on Java Enterprise Edition. It has strong support for BigData tools in Java: Hadoop and Spark.
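The point made in the CNTK item, that a framework can be driven from Python while the graph it builds is executed by a native engine, can be sketched as two separate steps. Everything below is a hypothetical toy. `Node`, `constant`, `placeholder` and `run` are names invented for this example, and `run` is an ordinary Python function standing in for a compiled engine.

```python
class Node:
    """Hypothetical graph node: an operation name plus its input nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value  # used only by constants and placeholders

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def constant(value):
    return Node("const", value=value)


def placeholder(name):
    return Node("placeholder", value=name)


def run(node, feed):
    """Stand-in for the native engine: walks the finished graph and computes it."""
    if node.op == "const":
        return node.value
    if node.op == "placeholder":
        return feed[node.value]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Step 1: describe the computation in Python; nothing is calculated yet.
x = placeholder("x")
y = x * constant(3.0) + constant(1.0)

# Step 2: hand the finished graph to the engine, possibly many times.
results = [run(y, {"x": v}) for v in (0.0, 1.0, 2.0)]
print(results)  # [1.0, 4.0, 7.0]
```

In this arrangement the speed of the language used to describe the graph matters little, because the heavy work happens in step 2.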

When it comes to how quickly new pre-trained models become available, the situation is different and, so far, PyTorch leads. Support for environments, in particular public clouds, is best for the frameworks promoted by the vendors of those clouds: TensorFlow is best supported in Google Cloud, MXNet in AWS, CNTK in Microsoft Azure, DL4J on Android and Core ML on iOS. As for languages, almost all frameworks support Python; TensorFlow, in particular, also supports JavaScript, C++, Java, Go, C# and Julia.

Many frameworks support visualization in TensorBoard. It is a rich web interface for multi-level visualization of the state and progress of training and for debugging it. To connect, specify the path to the model with "tensorboard --logdir=$PATH_MODEL" and open localhost:6006. You work with the interface by navigating the graph of logical blocks and opening the blocks of interest, then repeating the process inside them.

For experiments, we need a programming language and a library. The language is often a simple one with a low entry threshold, such as Python. Other general-purpose languages, such as JavaScript, or specialized languages, such as R, can also be used. I will use Python. To avoid installing the language and libraries, we will use the free service colab.research.google.com/notebooks/intro.ipynb, which provides Jupyter Notebook. A notebook lets you not only write code with comments in console form, but also format it as a document. You can try the notebook features in the introductory notebook https://colab.research.google.com/notebooks/welcome.ipynb: formatting text in Markdown with formulas in TeX, running Python scripts, and displaying their results as text and as graphs using the common Python libraries NumPy and matplotlib.pyplot. Colab itself provides a Tesla K80 graphics card for free for up to 12 hours at a time (per session). It supports a variety of deep learning frameworks, including Keras, TensorFlow and PyTorch. The price of a GPU instance in Google Cloud:

* Tesla T4: 1 GPU, 16 GB GDDR6, $0.35/hour

* Tesla P4: 1 GPU, 8 GB GDDR5, $0.60/hour

* Tesla V100: 1 GPU, 16 GB HBM2, $2.48/hour
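To put these hourly prices in perspective, the short sketch below estimates the cost of one training run on each card. The 12-hour duration is an assumption chosen to match the length of a free Colab session, and `session_cost` is a helper written for this example.

```python
# Hourly prices copied from the list above (USD per GPU per hour).
PRICE_PER_HOUR = {
    "Tesla T4": 0.35,
    "Tesla P4": 0.60,
    "Tesla V100": 2.48,
}


def session_cost(gpu, hours, gpus=1):
    """Cost in USD of renting `gpus` cards of one type for `hours` hours."""
    return round(PRICE_PER_HOUR[gpu] * hours * gpus, 2)


# Assumption: a 12-hour run, the same length as one free Colab session.
costs = {gpu: session_cost(gpu, hours=12) for gpu in PRICE_PER_HOUR}
print(costs)  # {'Tesla T4': 4.2, 'Tesla P4': 7.2, 'Tesla V100': 29.76}
```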
