Among the many unsolved puzzles in the theory of Deep Neural Networks (DNNs), generalisability is arguably one of the most perplexing. In this work, we investigate the sharpness/flatness of local minima of the error function and its relationship to the generalisability of DNNs. Defining the sharpness of a local minimum as the largest eigenvalue of the Hessian of the error function, we identify four factors that influence sharpness and three factors that control the generalisability of DNNs. Our findings are verified empirically on a simple regression problem.
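To make the central definition concrete, the following is a minimal sketch of how the sharpness measure could be computed in practice, assuming PyTorch is available; the toy one-hidden-layer regression model, its dimensions, and the helper names (`unpack`, `loss`, `theta`) are hypothetical illustrations, not the paper's actual experimental setup. The full Hessian is built explicitly, which is feasible only because the toy model has a few dozen parameters.

```python
import torch

# Hypothetical toy setup: a one-hidden-layer regression net whose
# parameters are packed into a single flat vector `theta`.
torch.manual_seed(0)
X = torch.linspace(-1, 1, 50).unsqueeze(1)
y = torch.sin(3 * X)

n_hidden = 8
shapes = [(n_hidden, 1), (n_hidden,), (1, n_hidden), (1,)]
n_params = sum(torch.Size(s).numel() for s in shapes)

def unpack(theta):
    """Slice the flat parameter vector back into weight/bias tensors."""
    params, i = [], 0
    for s in shapes:
        n = torch.Size(s).numel()
        params.append(theta[i:i + n].view(s))
        i += n
    return params

def loss(theta):
    """Mean-squared regression error as a function of the flat parameters."""
    W1, b1, W2, b2 = unpack(theta)
    h = torch.tanh(X @ W1.t() + b1)
    return ((h @ W2.t() + b2 - y) ** 2).mean()

theta = torch.randn(n_params)

# Hessian of the training error at the current parameters.
H = torch.autograd.functional.hessian(loss, theta)

# Sharpness as defined above: the largest eigenvalue of the Hessian.
sharpness = torch.linalg.eigvalsh(H).max()
print(f"sharpness (largest Hessian eigenvalue): {sharpness:.4f}")
```

For realistically sized networks the explicit Hessian is intractable, so in practice the largest eigenvalue would typically be estimated with matrix-free methods such as power iteration on Hessian-vector products.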