Analysis of Industrial Sensor Data Using Statistical and Regression Methods


  • Katalin Ferencz Óbuda University
  • József Domokos Sapientia Hungarian University of Transylvania
  • Levente Kovács Óbuda University



IoT, IIoT, regression models, algorithms, prediction, outlier detection, Apache Spark


Modern industry depends on fast and effective data processing and evaluation. Companies therefore need to devote considerable attention and resources to real-time analysis of the large data sets they collect, so that outliers and fake data can be identified in time and predictive analysis can help avoid unforeseen expenses. Such analysis requires a range of algorithms, each chosen to match a specific objective. In this paper, we demonstrate how Apache Spark's unified engine can be used for statistical analysis of time series data in order to speed up industrial data analysis. We also implement and examine linear and random forest regression models for the demonstrated use case.
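As a rough illustration of two of the techniques named above, the following sketch flags outliers in a short series of sensor readings with a z-score rule and fits an ordinary least-squares line. It is plain Python rather than Apache Spark, and it is not the implementation used in the paper: the function names `zscore_outliers` and `fit_line`, the threshold values, and the sample readings are all assumptions made for the example.

```python
from statistics import mean, stdev


def zscore_outliers(values, threshold=3.0):
    """Return the indices of readings whose absolute z-score exceeds threshold.

    The threshold is a hypothetical default; in small samples the largest
    possible z-score is bounded, so a lower value may be needed.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all readings identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up temperature readings with one obvious spike at index 6.
readings = [20.0, 20.1, 19.9, 20.2, 19.8, 20.0, 35.0, 20.1]
flagged = zscore_outliers(readings, threshold=2.0)

# Made-up points lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

With the sample data, only the spike is flagged and the fitted line recovers the slope and intercept used to generate the points. On a real sensor stream the same statistics would be computed in a distributed fashion, which is the role Apache Spark plays in the paper.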






How to Cite

K. Ferencz, J. Domokos, and L. Kovács, “Analysis of Industrial Sensor Data Using Statistical and Regression Methods”, Syst. Theor. Control Comput. J., vol. 3, no. 1, pp. 36–44, Jun. 2023, doi: 10.52846/stccj.2023.3.1.48.