Design, Implementation, and Validation of a University Data Analysis Laboratory Based on Mathematical Modeling and Applied Statistics Techniques

Development and evaluation of advanced statistical methodologies and tools for solving complex problems in the academic and professional field.

Authors

  • Jefferson Agustín Macías Bravo, Universidad Técnica de Manabí
  • Wilson Fabián Chávez Rodríguez, Universidad Técnica de Manabí
  • Yandri Francinet Guerrero Alcívar, Universidad Técnica de Manabí

DOI:

https://doi.org/10.37117/s.v28i1.1272

Keywords:

Data analysis, Mathematical modeling, Higher education, Open-source software, University laboratory, Applied mathematics, Educational infrastructure, Project-based learning.

Abstract

This paper proposes the design of a University Laboratory for Data Analysis aimed at strengthening training in mathematical modeling and applied statistics within Applied Mathematics programs. The model relies on a local infrastructure of eight student workstations and a master node, interconnected via a managed switch. It employs open-source software (Python, R, Hadoop, Spark, and Hive) together with Power BI, which is proprietary but available in a free desktop edition, to simulate a distributed computing environment without reliance on cloud services or paid commercial licenses. The design integrates the complete data analysis cycle, from data ingestion and cleaning to visualization and interpretation, in line with the educational needs of data science training in resource-constrained higher education settings. Although the laboratory has not yet been implemented or empirically evaluated, its architecture adheres to principles of accessibility, reproducibility, and progressive scalability, offering a viable technical-pedagogical framework for future deployment in universities. The approach seeks to bridge the gap between mathematical theory and analytical practice, fostering the technical and cognitive competencies that contemporary quantitative analysis requires.
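
As a rough illustration of the analysis cycle the abstract describes (ingestion, cleaning, modeling, visualization), the following is a minimal single-machine sketch in Python using only the standard library. It is not the authors' implementation: the sample data, the column names `hours_studied` and `exam_score`, and the helper functions are all hypothetical, and the same steps would presumably run on Spark or Hadoop in the proposed laboratory.

```python
import csv
import io
import statistics

# Hypothetical sample data; the paper does not supply a dataset.
RAW_CSV = """hours_studied,exam_score
2,55
4,63
,70
6,74
8,not_recorded
10,91
"""

def ingest(text):
    """Ingestion: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Cleaning: keep only rows where both fields parse as numbers."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((float(row["hours_studied"]), float(row["exam_score"])))
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
    return cleaned

def fit_line(points):
    """Modeling: ordinary least squares fit of y = intercept + slope * x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def text_bars(points, width=40):
    """Visualization: one text bar per observation, scaled to the maximum y."""
    top = max(y for _, y in points)
    return [f"{x:5.1f} | " + "#" * round(width * y / top) for x, y in points]

rows = ingest(RAW_CSV)
points = clean(rows)
intercept, slope = fit_line(points)
print(f"kept {len(points)} of {len(rows)} rows")
print(f"score ~ {intercept:.2f} + {slope:.2f} * hours")
print("\n".join(text_bars(points)))
```

Interpretation, the last stage of the cycle, is left to the reader: here the fitted slope estimates how many points the score changes per additional hour in this made-up sample.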

Published

2026-06-30

How to Cite

Macías Bravo, J. A., Chávez Rodríguez, W. F., & Guerrero Alcívar, Y. F. (2026). Design, Implementation, and Validation of a University Data Analysis Laboratory Based on Mathematical Modeling and Applied Statistics Techniques: Development and evaluation of advanced statistical methodologies and tools for solving complex problems in the academic and professional field. Sinapsis, 28(1), 9. https://doi.org/10.37117/s.v28i1.1272

Section

Information and Communication Technologies