Examining the potential and pitfalls of AI in problem solving
Yasin Memiş 1 *
1 Niğde Provincial Directorate of National Education, Türkiye
* Corresponding Author

Abstract

The integration of artificial intelligence (AI) into mathematical problem-solving has shown significant potential to enhance student learning and performance. However, while AI tools offer numerous benefits, they are prone to occasional conceptual and arithmetic errors that can mislead users and obscure understanding. This research examines such errors to improve the role of AI in solving mathematical problems. Specifically, the study assesses how three AI tools (ChatGPT-4, Gemini, and CoPilot) handle questions that target proportional reasoning errors commonly made by students. ChatGPT-4 achieved the highest accuracy among the tested tools, correctly answering 10 out of 14 questions. ChatGPT-4 also provided more detailed explanations, with a higher word count than the other tools. However, all tools replicated certain errors commonly made by students on specific reasoning questions. In conclusion, while AI tools hold promise for enhancing mathematics education, they still have limitations. Improving AI’s contextual understanding and problem-solving adaptability could lead to the development of more robust educational tools.

License

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.