Monday, September 21, 2020

Neural networks

A lot of water has flowed under the bridge since I went through neural networks at university. I finally got around to the old, long-postponed lecture from the small ShAD.

Interesting: computer science vs computer engineering

2015 was the point where computers started recognizing images better than people (it is genuinely hard for a person to distinguish, say, an ordinary husky from a Siberian Husky).

Classical neural networks are ones where each neuron is connected to every neuron of the previous layer, and everything there is relatively simple (in our labs we trained them to recognize pixel characters). But if you use them for image processing, there are simply not enough resources, so Convolutional Neural Networks are used instead: a type of network that uses only small, limited weight matrices (filters), which are "slid" over the entire layer being processed.
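To make the resource difference concrete, here is a minimal sketch (my own illustration, not from the lecture) comparing the parameter counts of a fully connected layer and a convolutional layer on a small image; the layer sizes are arbitrary assumptions, and PyTorch is just a convenient way to write it down.

```python
import torch
import torch.nn as nn

# A 64x64 grayscale image.
image = torch.randn(1, 1, 64, 64)

# Fully connected: every output neuron sees every input pixel.
dense = nn.Linear(64 * 64, 256)

# Convolutional: one small 3x3 weight matrix is slid over the whole image.
conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)

def count_params(layer):
    return sum(p.numel() for p in layer.parameters())

print("dense layer parameters:", count_params(dense))  # 1,048,832
print("conv layer parameters:", count_params(conv))    # 160

# Both layers can process the same image.
dense_out = dense(image.flatten(start_dim=1))
conv_out = conv(image)
```

The small 3x3 filter reuses the same 160 weights at every position of the image, which is why convolutional layers scale to real photos while fully connected ones do not.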

For example, you can train a neural network to detect eyes and then quickly find them in a photograph (the slide describing the work of L. A. Gatys on image stylization seemed quite amusing to me; the lecture is from 2015, let me remind you).

To solve text-related problems (translation, text generation), Recurrent Neural Networks are used: networks with feedback connections, for cases where the order of a sequence matters. With these you can already write bots for, say, technical support. Or my old idée fixe: translating any text into the style of Dostoevsky. When computers start writing programs for themselves, that is also recurrent neural networks.
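As a rough picture of what "a network with feedback" means, here is a minimal character-level sketch (my own, with made-up sizes): the hidden state carries information from earlier characters to later ones, which is exactly the feedback loop.

```python
import torch
import torch.nn as nn

# Minimal character-level recurrent model: the hidden state is the
# "feedback" that carries information from earlier characters forward.
class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden_size=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, hidden=None):
        x = self.embed(tokens)              # (batch, seq, hidden)
        out, hidden = self.rnn(x, hidden)   # hidden state = feedback loop
        return self.head(out), hidden       # next-character logits

# Toy usage: predict the next character at each position of a sequence.
vocab_size = 100
model = CharRNN(vocab_size)
batch = torch.randint(0, vocab_size, (2, 16))   # 2 sequences of 16 chars
logits, _ = model(batch)
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),     # predictions
    batch[:, 1:].reshape(-1),                   # next characters as targets
)
loss.backward()
```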

But the magic begins when we start combining the two. Convolutional + recurrent networks give us, for example, the ability to turn video into text (I have always wondered how that is done). There is a famous video where a developer walks around Amsterdam and the network describes everything he sees. [It is funny that the video was made with post-processing, i.e. not in real time. Since the iPhone 7 Plus is already somewhat more powerful than my Air, we can assume that applications for people with visual impairments will appear soon: you point your phone at the street and it tells you in real time what is happening there.]
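A hand-wavy sketch of how "video to text" can be wired, assuming the common encoder-decoder layout rather than whatever exact model was shown in the lecture: a convolutional encoder summarizes a frame, and a recurrent decoder unrolls that summary into words. All names and sizes here are my own placeholders.

```python
import torch
import torch.nn as nn

# Sketch of "CNN encoder + RNN decoder" captioning: the convolutional
# part summarizes the frame, the recurrent part turns that summary
# into a sentence. Sizes are arbitrary assumptions for illustration.
class CaptionNet(nn.Module):
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(            # tiny CNN image encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, hidden),
        )
        self.embed = nn.Embedding(vocab_size, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, image, caption_tokens):
        # The image summary becomes the initial hidden state of the RNN.
        h0 = self.encoder(image).unsqueeze(0)    # (1, batch, hidden)
        words = self.embed(caption_tokens)       # (batch, seq, hidden)
        out, _ = self.decoder(words, h0)
        return self.head(out)                    # word logits per step

model = CaptionNet(vocab_size=5000)
frame = torch.randn(1, 3, 128, 128)              # one video frame
caption = torch.randint(0, 5000, (1, 10))        # 10 caption tokens
logits = model(frame, caption)                   # (1, 10, 5000)
```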

The other way around, recurrent first, then convolutional, and you get image generation from text (for example, helping a picture editor select images for an article).
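And the reverse direction as a sketch, again with made-up sizes: a recurrent encoder reads the description, and a (de)convolutional generator turns its final state into a small image. Real text-to-image systems are far more elaborate; this only shows the ordering mentioned above.

```python
import torch
import torch.nn as nn

# Recurrent encoder first, convolutional generator second.
class TextToImage(nn.Module):
    def __init__(self, vocab_size, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.text_encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.generator = nn.Sequential(
            nn.ConvTranspose2d(hidden, 128, 4), nn.ReLU(),                    # 1x1 -> 4x4
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),   # 8x8
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid(),  # 16x16 RGB
        )

    def forward(self, tokens):
        _, h = self.text_encoder(self.embed(tokens))   # final hidden state
        z = h[-1].unsqueeze(-1).unsqueeze(-1)          # (batch, hidden, 1, 1)
        return self.generator(z)                       # (batch, 3, 16, 16)

model = TextToImage(vocab_size=5000)
tokens = torch.randint(0, 5000, (1, 12))               # a 12-word description
image = model(tokens)                                  # tiny generated image
```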

Using convolutional networks + reinforcement learning, we can train a network to play games (for example, the same training mechanics work for almost all Atari games). [Note: checkers, for instance, was essentially solved by brute force; a computer can bring any game to a draw. Chess is already a bit harder.]
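The core trick behind the Atari result (DQN) can be sketched like this, assuming the usual setup of stacked preprocessed frames and an epsilon-greedy policy; the screen size and number of actions are placeholders of mine, not values from the lecture.

```python
import random
import torch
import torch.nn as nn

# Minimal sketch of the DQN idea: a convolutional network maps raw game
# pixels to an estimated value for each possible action, and we pick the
# action with the highest value (with occasional random exploration).
class QNetwork(nn.Module):
    def __init__(self, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 8, stride=4), nn.ReLU(),   # 4 stacked frames in
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 9 * 9, 256), nn.ReLU(),
            nn.Linear(256, n_actions),                  # one Q-value per action
        )

    def forward(self, frames):
        return self.net(frames)

def choose_action(q_net, frames, epsilon=0.1):
    # Epsilon-greedy policy: mostly exploit, sometimes explore.
    if random.random() < epsilon:
        return random.randrange(q_net.net[-1].out_features)
    with torch.no_grad():
        return q_net(frames).argmax(dim=1).item()

q_net = QNetwork(n_actions=6)
frames = torch.randn(1, 4, 84, 84)       # a stack of 4 preprocessed frames
action = choose_action(q_net, frames)
```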

P.S. At WWDC 2016 Apple presented two things: Basic Neural Network Subroutines (BNNS) and Convolutional Neural Networks (CNN).

It is described here in more or less human language. In other words, you no longer need to understand any of this: you just take it and use it. For example, to detect a face in a photo there is a dedicated API for exactly that.
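As an analogy outside the Apple ecosystem (this is OpenCV in Python, not the API from the note above), the "just take it and use it" experience looks roughly like this; photo.jpg is a placeholder file name.

```python
import cv2

# A ready-made face detector, no training needed: OpenCV's bundled
# Haar cascade. Not the Apple API mentioned in the post.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
image = cv2.imread("photo.jpg")                       # placeholder file
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(f"found {len(faces)} face(s):", faces)
```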
