Yet Another Batch Normalisation (BN) Analysis. What makes BN tick? Do we need a BN layer after every convolution? Can BN layers learn by themselves?
GitHub: repo
Paper: YABA
This exploration was carried out as part of the "Machine Learning Practical" course at the University of Edinburgh.