A Comparative Performance of Different Convolutional Neural Network Activation Functions on Image Classification
DOI: https://doi.org/10.31436/ijpcc.v10i2.490

Keywords: Activation Functions, Convolutional Neural Network, Image Classification

Abstract
Activation functions are crucial in optimising Convolutional Neural Networks (CNNs) for image classification. While CNNs excel at capturing spatial hierarchies in images, the choice of activation function substantially affects their effectiveness. Traditional functions such as ReLU and Sigmoid have drawbacks, including the "dying ReLU" problem and vanishing gradients, which can inhibit learning and degrade performance. This study comprehensively analyses various activation functions across different CNN architectures to determine their impact on performance. The findings suggest that Swish and Leaky ReLU outperform the other functions evaluated, with Swish particularly promising in deeper networks such as ResNet. This underscores the importance of activation function selection in improving CNN performance and suggests that investigating alternative functions can lead to more accurate and efficient models for image classification tasks.
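The functions compared in the abstract can be sketched as follows. This is a minimal illustrative implementation in NumPy (not code from the study itself); it highlights why negative inputs are problematic for ReLU, since its output and gradient there are exactly zero, while Leaky ReLU and Swish retain a nonzero response.

```python
import numpy as np

def sigmoid(x):
    """Classic squashing function; gradients vanish for large |x|."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """ReLU: zero output (and zero gradient) for x < 0 -> "dying ReLU" risk."""
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: small negative slope alpha keeps gradients alive for x < 0."""
    return np.where(x > 0.0, x, alpha * x)

def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x) -- smooth and non-monotonic near zero."""
    return x * sigmoid(beta * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("ReLU:      ", relu(x))        # all negative inputs collapse to 0
print("Leaky ReLU:", leaky_relu(x))  # negative inputs scaled by alpha
print("Swish:     ", swish(x))       # small negative values pass through
```

The printed rows make the contrast concrete: ReLU discards all information in the negative inputs, whereas Leaky ReLU and Swish preserve a small signal there, which is one intuition for their stronger results in the study.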