Quickly learn something new, based on past experience

Commonly, NNs are trained to solve one specific task (e.g. playing Go in AlphaGo), requiring a lot of data (millions of self-play games). Ultimately, we'd like a NN that can quickly learn a new task (e.g. chess) from just a few examples by drawing on experience with other tasks, as people do.
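The article does not name a specific algorithm; as one generic illustration of "learn fast from a few examples", here is a minimal first-order MAML-style sketch. It meta-learns an initialization `w0` for toy one-parameter models `y = w * x` (tasks differ by slope), so that a single gradient step on five examples of a new task already beats the same step taken from scratch. All names and hyperparameters are our own choices for the toy setup.

```python
import numpy as np

# First-order MAML-style sketch (our illustration, not the article's method):
# meta-learn an initialization w0 so that one gradient step on a few
# examples of a NEW task already gives a good fit.
rng = np.random.default_rng(0)
INNER_LR, OUTER_LR = 0.3, 0.05

def sample_task():
    return rng.uniform(0.5, 2.5)            # each task is a slope a in y = a*x

def sample_batch(a, n=5):
    x = rng.uniform(-1.0, 1.0, n)
    return x, a * x

def adapt(w, x, y):
    grad = np.mean(2.0 * (w * x - y) * x)   # d/dw of mean squared error
    return w - INNER_LR * grad              # one quick adaptation step

w0 = 0.0
for _ in range(2000):
    a = sample_task()
    xs, ys = sample_batch(a)                # support set: adapt on it
    w = adapt(w0, xs, ys)
    xq, yq = sample_batch(a)                # query set: evaluate after adapting
    outer_grad = np.mean(2.0 * (w * xq - yq) * xq)
    w0 -= OUTER_LR * outer_grad             # first-order MAML outer update

# On a brand-new task, one adaptation step from the meta-learned w0 lands
# much closer to the true slope than the same step from w = 0.
a_new = 2.0
xs, ys = sample_batch(a_new)
print(abs(adapt(w0, xs, ys) - a_new) < abs(adapt(0.0, xs, ys) - a_new))
```

The outer loop here ignores second derivatives (the "first-order" simplification), which keeps the sketch short while preserving the core idea: optimize the initialization for post-adaptation performance across tasks.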


New second-order gradient method beats SGD and Adam on large models.

The paper brings a second-order optimization method to training speeds comparable with its first-order counterparts (SGD, Adam). The second-order method appears superior in final model performance, as tested on CIFAR and ImageNet with ResNet and VGG-f architectures, and required no hyperparameter tuning.
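The paper's specific algorithm is not detailed here; as a generic sketch of why curvature information helps, consider Newton's method on an ill-conditioned quadratic. A first-order method must use a step size limited by the steepest direction and so crawls along the flat one, while a second-order step rescales the gradient by the inverse Hessian and jumps straight to the minimum.

```python
import numpy as np

# Generic illustration of second-order vs. first-order optimization
# (not the paper's method): minimize f(w) = 0.5 * w^T H w where the
# curvature differs 100x between the two directions.
H = np.diag([1.0, 100.0])
grad = lambda w: H @ w                  # gradient of the quadratic

# First-order: learning rate must stay below 2/100 for stability,
# so progress along the low-curvature direction is slow.
w_gd = np.array([1.0, 1.0])
for _ in range(100):
    w_gd = w_gd - 0.009 * grad(w_gd)

# Second-order (Newton): rescale the gradient by H^{-1}.
w_newton = np.array([1.0, 1.0])
w_newton = w_newton - np.linalg.solve(H, grad(w_newton))

print(np.linalg.norm(w_newton))         # 0.0: exact minimum in one step
print(np.linalg.norm(w_gd))             # still far from the minimum
```

The catch, and the reason first-order methods dominate deep learning, is that forming or inverting the Hessian is prohibitively expensive at scale; making the per-step cost competitive is exactly the contribution the summary describes.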
