[D] How to read deep learning papers?

Sorry if this has already been discussed, but I've been reading some deep learning papers and it seems like a lot of the architecture choices are wishy-washy decisions that we just have to “accept” for some reason. I know that explaining a deep network is almost impossible, but that makes it really difficult to decide whether a given network outperforms another, especially on artificial data sets, because:

  • performance can vary due to the way the data is synthesized
  • preprocessing also affects performance
  • the impact of a particular architectural choice isn't quantified (e.g. why are you using ResNet-50 instead of ResNet-34, or vice versa? Why are you using a CNN there, or why does it have this layer, etc.)
  • some of the justifications seem valid only on paper and aren't “proper” explanations (for example, choices of cost function beyond the general “use X for Y type of problem” seem rather difficult to justify)

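To make the third point concrete: even if you reran a paper's experiment yourself, a single number per architecture tells you little, because run-to-run noise (initialization, data order) can be as large as the claimed gain. Below is a toy, stdlib-only sketch of that idea; the accuracies and noise level are made up purely for illustration, and `run_experiment` is a hypothetical stand-in for an actual training run:

```python
import random
import statistics

def run_experiment(true_acc, noise_std, seed):
    """Hypothetical stand-in for one training run: the architecture's
    'true' accuracy plus Gaussian run-to-run noise."""
    rng = random.Random(seed)
    return true_acc + rng.gauss(0, noise_std)

# Two imaginary architectures whose true accuracies differ by 0.5 points,
# but with ~1.0 point of seed-to-seed noise.
accs_a = [run_experiment(92.0, 1.0, s) for s in range(5)]
accs_b = [run_experiment(92.5, 1.0, s + 100) for s in range(5)]

print(f"A: {statistics.mean(accs_a):.2f} ± {statistics.stdev(accs_a):.2f}")
print(f"B: {statistics.mean(accs_b):.2f} ± {statistics.stdev(accs_b):.2f}")
```

When the standard deviations overlap the gap between the means, a paper reporting one run per architecture can't really support a "B beats A" claim, which is exactly why single-number comparisons deserve skepticism.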
In short, everything seems like a matter of throwing more layers at the problem until you reach some solution you can claim is “better” than others'. And since nothing is standardized, even for a particular type of problem, the results and their reproducibility vary widely.

How do you manage this? How do you avoid taking it all with a tablespoon of salt? I'm rather new to this, which is probably why I'm unable to see past it.

submitted by /u/GimmickNG