Building on pre-training approaches such as BERT that are commonly used for language, a new article explores large-scale pre-training for building computer vision models and for transfer learning. […]
https://youtu.be/IIebBjbBevs BERT is a giant model. Turns out you can prune away many of its components and it still works. This paper analyzes BERT pruning in […]
With all the discussion around “is TF better than PyTorch” and many people saying that PyTorch is being used in research, I'm wondering how companies are […]