March 3, 2020

[R] Pay Less Attention with Lightweight and Dynamic Convolutions

submitted by /u/tsauri
March 3, 2020

How to do parameter estimation/MLE with MCMC methods? [D]

With variational methods, one could use the variational EM algorithm or optimize the prior parameters and variational parameters jointly; however, for MCMC methods like […]
March 4, 2020

[D] Does anyone know how exactly Google incorporated Bert into their search engines?

From https://blog.google/products/search/search-language-understanding-bert it seems that they may be mean-pooling the query? And if so, what else is going on? Are they also […]
March 4, 2020

[R] When Two-Layer ReLU Networks Provably Fail With High Probability

Title: Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent (First author here.) We prove that for certain types of datasets, wide two-layer (Leaky)ReLU networks trained […]