Title: | Exploring the impact of post-training rounding in regression models (English) |
Author: | Kalina, Jan |
Language: | English |
Journal: | Applications of Mathematics |
ISSN: | 0862-7940 (print) |
ISSN: | 1572-9109 (online) |
Volume: | 69 |
Issue: | 2 |
Year: | 2024 |
Pages: | 257-271 |
Summary lang: | English |
. | |
Category: | math |
. | |
Summary: | Post-training rounding, also known as quantization, of estimated parameters is a widely adopted technique for reducing the energy consumption and latency of machine learning models. This theoretical work examines the effect of rounding the estimated parameters of key regression methods of statistics and machine learning. The proposed approach models rounding as a perturbation of the parameters by an additive error with values in a specified interval. The method is presented for linear regression and is then extended, in a unified way, to radial basis function networks, multilayer perceptrons, regularization networks, and logistic regression. (English) |
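. | |
The additive-error view described in the summary has a simple concrete instance: rounding each estimated coefficient to d decimal places perturbs it by at most 0.5 * 10^(-d), and in linear regression this perturbation propagates linearly to the fitted values. The following Python sketch is a hypothetical illustration of that idea (the simulated data, variable names, and tolerance are assumptions, not taken from the paper):

    # Hypothetical sketch: post-training rounding of least squares
    # coefficients viewed as an additive, interval-bounded perturbation.
    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 200, 3
    X = rng.normal(size=(n, p))              # simulated design matrix (assumption)
    beta = np.array([1.5, -2.0, 0.7])        # true parameters (assumption)
    y = X @ beta + 0.1 * rng.normal(size=n)  # noisy responses

    b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares estimate

    d = 1                                 # round to d decimal places
    b_rounded = np.round(b_hat, d)        # post-training rounding (quantization)
    delta = 0.5 * 10.0 ** (-d)            # per-parameter rounding error bound
    assert np.all(np.abs(b_rounded - b_hat) <= delta)

    # Row-wise, the shift in fitted values obeys |x_i^T (b_rounded - b_hat)|
    # <= delta * ||x_i||_1, i.e. the perturbation stays on a known interval.
    shift = X @ (b_rounded - b_hat)
    bound = delta * np.abs(X).sum(axis=1)
    assert np.all(np.abs(shift) <= bound + 1e-12)
. | |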
Keyword: | supervised learning |
Keyword: | trained model |
Keyword: | perturbations |
Keyword: | effect of rounding |
Keyword: | low-precision arithmetic |
MSC: | 62H12 |
MSC: | 62M45 |
MSC: | 68Q87 |
DOI: | 10.21136/AM.2024.0090-23 |
. | |
Date available: | 2024-04-04T12:11:27Z |
Last updated: | 2024-04-04 |
Stable URL: | http://hdl.handle.net/10338.dmlcz/152315 |
. | |
Reference: | [1] Agresti, A.: Foundations of Linear and Generalized Linear Models. Wiley Series in Probability and Statistics. John Wiley & Sons, Hoboken (2015). Zbl 1309.62001, MR 3308143 |
Reference: | [2] Blokdyk, G.: Artificial Neural Network: A Complete Guide. 5STARCooks, Toronto (2021). |
Reference: | [3] Carroll, R. J., Ruppert, D., Stefanski, L. A., Crainiceanu, C. M.: Measurement Error in Nonlinear Models: A Modern Perspective. Monographs on Statistics and Applied Probability 105. Chapman & Hall/CRC, Boca Raton (2006). Zbl 1119.62063, MR 2243417, 10.1201/9781420010138 |
Reference: | [4] Croci, M., Fasi, M., Higham, N. J., Mary, T., Mikaitis, M.: Stochastic rounding: Implementation, error analysis and applications. R. Soc. Open Sci. 9 (2022), Article ID 211631, 25 pages. 10.1098/rsos.211631 |
Reference: | [5] Egrioglu, E., Bas, E., Karahasan, O.: Winsorized dendritic neuron model artificial neural network and a robust training algorithm with Tukey's biweight loss function based on particle swarm optimization. Granul. Comput. 8 (2023), 491-501. 10.1007/s41066-022-00345-y |
Reference: | [6] Fasi, M., Higham, N. J., Mikaitis, M., Pranesh, S.: Numerical behavior of NVIDIA tensor cores. PeerJ Comput. Sci. 7 (2021), Article ID e330, 19 pages. 10.7717/peerj-cs.330 |
Reference: | [7] Gao, F., Li, B., Chen, L., Shang, Z., Wei, X., He, C.: A softmax classifier for high-precision classification of ultrasonic similar signals. Ultrasonics 112 (2021), Article ID 106344, 8 pages. 10.1016/j.ultras.2020.106344 |
Reference: | [8] Greene, W. H.: Econometric Analysis. Pearson Education, Harlow (2018). |
Reference: | [9] Hastie, T., Tibshirani, R., Wainwright, M.: Statistical Learning with Sparsity: The Lasso and Generalizations. Monographs on Statistics and Applied Probability 143. CRC Press, Boca Raton (2015). Zbl 1319.68003, MR 3616141, 10.1201/b18401 |
Reference: | [10] Hoefler, T., Alistarh, D., Ben-Nun, T., Dryden, N., Peste, A.: Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. J. Mach. Learn. Res. 22 (2021), Article ID 241, 124 pages. Zbl 07626756, MR 4329820 |
Reference: | [11] Kalina, J., Tichavský, J.: On robust estimation of error variance in (highly) robust regression. Measurement Sci. Rev. 20 (2020), 6-14. 10.2478/msr-2020-0002 |
Reference: | [12] Kalina, J., Vidnerová, P., Soukup, L.: Modern approaches to statistical estimation of measurements in the location model and regression. Handbook of Metrology and Applications. Springer, Singapore (2023), 2355-2376. 10.1007/978-981-99-2074-7_125 |
Reference: | [13] Louizos, C., Reisser, M., Blankevoort, T., Gavves, E., Welling, M.: Relaxed quantization for discretized neural networks. Available at https://arxiv.org/abs/1810.01875 (2018), 14 pages. 10.48550/arXiv.1810.01875 |
Reference: | [14] Maddox, W. J., Potapczynski, A., Wilson, A. G.: Low-precision arithmetic for fast Gaussian processes. Proc. Mach. Learn. Res. 180 (2022), 1306-1316. |
Reference: | [15] Nagel, M., Fournarakis, M., Amjad, R. A., Bondarenko, Y., Baalen, M. van, Blankevoort, T.: A white paper on neural network quantization. Available at https://arxiv.org/abs/2106.08295 (2021), 27 pages. 10.48550/arXiv.2106.08295 |
Reference: | [16] Park, J.-H., Kim, K.-M., Lee, S.: Quantized sparse training: A unified trainable framework for joint pruning and quantization in DNNs. ACM Trans. Embedded Comput. Syst. 21 (2022), Article ID 60, 22 pages. 10.1145/3524066 |
Reference: | [17] Pillonetto, G.: System identification using kernel-based regularization: New insights on stability and consistency issues. Automatica 93 (2018), 321-332. Zbl 1400.93316, MR 3810919, 10.1016/j.automatica.2018.03.065 |
Reference: | [18] Riazoshams, H., Midi, H., Ghilagaber, G.: Robust Nonlinear Regression with Applications Using R. John Wiley & Sons, Hoboken (2019). Zbl 1407.62022, MR 3839600, 10.1002/9781119010463 |
Reference: | [19] Saleh, A. K. M. E., Picek, J., Kalina, J.: R-estimation of the parameters of a multiple regression model with measurement errors. Metrika 75 (2012), 311-328. Zbl 1239.62081, MR 2909549, 10.1007/s00184-010-0328-2 |
Reference: | [20] Seghouane, A.-K., Shokouhi, N.: Adaptive learning for robust radial basis function networks. IEEE Trans. Cybern. 51 (2021), 2847-2856. 10.1109/TCYB.2019.2951811 |
Reference: | [21] Shultz, K. S., Whitney, D., Zickar, M. J.: Measurement Theory in Action: Case Studies and Exercises. Routledge, New York (2020). 10.4324/9781315869834 |
Reference: | [22] Šíma, J., Vidnerová, P., Mrázek, V.: Energy complexity model for convolutional neural networks. Artificial Neural Networks and Machine Learning -- ICANN 2023. Lecture Notes in Computer Science 14263. Springer, Cham (2023), 186-198. 10.1007/978-3-031-44204-9_16 |
Reference: | [23] Smucler, E., Yohai, V. J.: Robust and sparse estimators for linear regression models. Comput. Stat. Data Anal. 111 (2017), 116-130. Zbl 1464.62164, MR 3630222, 10.1016/j.csda.2017.02.002 |
Reference: | [24] Sze, V., Chen, Y.-H., Yang, T.-J., Emer, J. S.: Efficient processing of deep neural networks: A tutorial and survey. Proc. IEEE 105 (2017), 2295-2329. MR 3784727, 10.1109/JPROC.2017.2761740 |
Reference: | [25] Víšek, J. Á.: Consistency of the least weighted squares under heteroscedasticity. Kybernetika 47 (2011), 179-206. Zbl 1220.62064, MR 2828572 |
Reference: | [26] Wang, N., Choi, J., Brand, D., Chen, C.-Y., Gopalakrishnan, K.: Training deep neural networks with 8-bit floating point numbers. NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems. Curran Associates, New York (2018), 7686-7695. 10.5555/3327757.3327866 |
Reference: | [27] Yan, W. Q.: Computational Methods for Deep Learning: Theory, Algorithms, and Implementations. Texts in Computer Science. Springer, Singapore (2023). Zbl 7783714, MR 4660076, 10.1007/978-981-99-4823-9 |
Reference: | [28] Yu, J., Anitescu, M.: Multidimensional sum-up rounding for integer programming in optimal experimental design. Math. Program. 185 (2021), 37-76. Zbl 1458.62158, MR 4201708, 10.1007/s10107-019-01421-z |
Reference: | [29] Zhang, R., Wilson, A. G., De Sa, C.: Low-precision stochastic gradient Langevin dynamics. Proc. Mach. Learn. Res. 162 (2022), 26624-26644. |
. | |
Fulltext not available (moving wall 24 months)