March 18, 2020

[D] Is mutation of input images necessary to prevent over-fitting in modern image classification/recognition methods (using DeepFace)?

Hello, I'm new to this field of study; please forgive any misnomers. Goal: use DeepFace to match input images to a reference image, i.e. implement reverse […]
March 18, 2020

[R] Efficient Content-Based Sparse Attention with Routing Transformers. Great performance on Wikitext-103, enwik8, and image generation task on ImageNet-64, while reducing overall complexity of attention from O(n² d) to O(n√n d)

submitted by /u/hardmaru
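To get a feel for the complexity claim in the title, here is a rough back-of-the-envelope sketch. It only compares the leading terms n² d and n√n d and ignores constant factors; the sequence length and head dimension are illustrative assumptions, not figures from the paper.

```python
import math

def full_attention_cost(n, d):
    """Leading-order cost of dense attention: n^2 * d."""
    return n * n * d

def routing_attention_cost(n, d):
    """Leading-order cost quoted in the title: n * sqrt(n) * d."""
    return n * math.sqrt(n) * d

# Hypothetical sizes chosen for illustration only.
n, d = 4096, 64

full = full_attention_cost(n, d)
routed = routing_attention_cost(n, d)

# The ratio of the two leading terms is sqrt(n), independent of d.
ratio = full / routed
print(f"dense: {full:.3e}, routed: {routed:.3e}, ratio: {ratio:.1f}")
```

Under these assumptions the gap between the two terms grows as √n, so the saving matters more as sequences get longer.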
March 19, 2020

[R] 128 layer Transformer on your laptop: ReZero examples

This repo contains some verbose examples and analysis in which Residual connections with Zero init (x = x + alpha * F(x), init: alpha = 0) […]
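The update rule quoted above, x = x + alpha * F(x) with alpha initialised to 0, can be illustrated with a minimal pure-Python sketch. This is not the repo's code: `residual_branch` is a hypothetical stand-in for F (a real model would use an attention or feed-forward block), and the weights and alpha values are made up for illustration.

```python
import math

def residual_branch(x, weight):
    """Hypothetical stand-in for F(x): elementwise tanh of a scaled input."""
    return [math.tanh(weight * v) for v in x]

def rezero_layer(x, alpha, weight):
    """One ReZero update: x + alpha * F(x)."""
    fx = residual_branch(x, weight)
    return [xi + alpha * fi for xi, fi in zip(x, fx)]

def rezero_stack(x, alphas, weights):
    """Apply the layers in sequence, one alpha per layer."""
    for alpha, weight in zip(alphas, weights):
        x = rezero_layer(x, alpha, weight)
    return x

depth = 128
weights = [0.5 + 0.01 * i for i in range(depth)]  # arbitrary illustrative values
x0 = [0.3, -1.2, 2.0]

# With alpha = 0 in every layer, the whole stack is the identity map.
at_init = rezero_stack(x0, [0.0] * depth, weights)

# Once alpha moves away from 0 (here a made-up 0.1), the branches contribute.
after_update = rezero_stack(x0, [0.1] * depth, weights)

print(at_init == x0)
print(after_update != x0)
```

Because every layer starts as the identity, the input passes through all 128 layers unchanged at initialisation, regardless of what the residual branches compute.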
March 19, 2020

[D] Learn Hugging Face Transformers & BERT with PyTorch in 5 Minutes

Since the release of Google's Bidirectional Encoder Representations from Transformers (BERT), it has been one of the most essential default settings for the majority of state-of-the-art […]