
Why AI (GPT, DeepSeek) isn’t flawless at CSAT math!

We learn math to develop mathematical thinking: the process of analyzing and solving problems logically and systematically. It is not only a way to solve math problems but also a method for tackling complex everyday problems and reasoning logically.

GPT and DeepSeek scored 60 and 72 points, respectively, on last year’s CSAT math exam. According to the Remarks column of the grade table below, these scores place GPT in grade 5 and DeepSeek in grade 4, and with such grades, Korean high school students cannot enter universities in Seoul.

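| Grade | Percentile rank (%) | Lowest score for the grade | Remarks |
|-------|---------------------|----------------------------|----------|
| 1 | 100 ~ 96 | 92 | |
| 2 | 95 ~ 89 | 83 | |
| 3 | 88 ~ 77 | 73 | |
| 4 | 76 ~ 60 | 59 | DeepSeek |
| 5 | 59 ~ 40 | 41 | GPT |
| 6 | 39 ~ 23 | 27 | |
| 7 | 22 ~ 11 | 22 | |
| 8 | 10 ~ 4 | 18 | |
| 9 | 3 ~ 0 | 0 | |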

Why does an excellent AI get a failing score in math? There are several factors.

AI (GPT, DeepSeek) demonstrates high accuracy in solving basic and intermediate-level math problems (2-3 point questions in the Korean CSAT). However, it often struggles with more complex 4-point problems. A key reason for this discrepancy is its tendency to overlook problem constraints. This report explores the causes of these errors and suggests solutions to improve accuracy.


Why AI (GPT, DeepSeek) Overlooks Constraints in 4-Point Math Problems

1) Lack of Deep Constraint Interpretation

AI (GPT, DeepSeek) processes constraints as auxiliary information rather than critical conditions for problem-solving.

Unlike humans, it does not naturally ask, "Why is this condition important?"

Example: A function's domain is limited, but AI (GPT, DeepSeek) may proceed with general methods without considering this limitation.
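As a minimal sketch of what respecting a limited domain looks like, the Python snippet below uses an illustrative equation that is not taken from the CSAT: log2(x) + log2(x - 2) = 3. It reduces to x^2 - 2x - 8 = 0, but the logarithms are only defined for x > 2. The helper names `solve_quadratic` and `solve_with_domain` are hypothetical.

```python
import math


def solve_quadratic(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 (a != 0), ascending."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    # A set removes the duplicate when the discriminant is zero.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})


def solve_with_domain(a, b, c, in_domain):
    """Keep only the roots that satisfy the domain restriction."""
    return [x for x in solve_quadratic(a, b, c) if in_domain(x)]


# General method only: both roots of x^2 - 2x - 8 = 0 are reported.
candidates = solve_quadratic(1, -2, -8)

# With the constraint x > 2 (required by log2(x - 2)), one root survives.
valid = solve_with_domain(1, -2, -8, lambda x: x > 2)

print(candidates)  # [-2.0, 4.0]
print(valid)       # [4.0]
```

The general method alone reports x = -2 as a solution, even though it makes both logarithms undefined. Applying the domain check is what removes it.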


2) Logical Oversights in Problem-Solving

AI (GPT, DeepSeek) often fails to check the conditions at each step.

It assumes previously read constraints are inherently applied rather than explicitly verifying them.

Example: In a probability problem, AI (GPT, DeepSeek) might apply a general formula without verifying the independence of events.
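The independence check can be sketched in a few lines of Python. The example is an illustrative single die roll, not a CSAT problem, and the function names are hypothetical. It compares the product formula P(A)P(B) with the probability obtained by counting the intersection directly.

```python
from fractions import Fraction


def prob(event, space):
    """Probability of an event in a finite, equally likely sample space."""
    return Fraction(len(event & space), len(space))


def are_independent(a, b, space):
    """A and B are independent exactly when P(A and B) == P(A) * P(B)."""
    return prob(a & b, space) == prob(a, space) * prob(b, space)


space = set(range(1, 7))   # one roll of a fair die
even = {2, 4, 6}
at_most_4 = {1, 2, 3, 4}
at_most_3 = {1, 2, 3}

# Independent pair: the product formula is valid.
print(are_independent(even, at_most_4, space))    # True

# Dependent pair: the product formula gives the wrong answer.
print(are_independent(even, at_most_3, space))    # False
print(prob(even, space) * prob(at_most_3, space)) # 1/4 (wrong)
print(prob(even & at_most_3, space))              # 1/6 (correct)
```

Verifying independence before multiplying is the explicit check that the general formula skips.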


3) Unnecessary Complexity in Solutions

Instead of using constraints to simplify problems, AI (GPT, DeepSeek) sometimes takes a generic and unnecessarily complicated approach.
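To illustrate how a constraint can shorten a solution, consider a hypothetical problem (not from the CSAT): in an arithmetic sequence, a_3 + a_7 = 20; find the sum of the first 9 terms. A generic approach tries to determine a_1 and d separately, which the single condition does not allow. Using the constraint directly, a_1 + a_9 = a_3 + a_7, so S_9 = 9 * 20 / 2. The Python sketch below, with hypothetical function names, confirms that the shortcut agrees with a term-by-term sum for several choices of d.

```python
def s9_term_by_term(a1, d):
    """Sum the first 9 terms of an arithmetic sequence one by one."""
    return sum(a1 + k * d for k in range(9))


def s9_from_constraint(a3_plus_a7):
    """Use a_1 + a_9 = a_3 + a_7, so S_9 = 9 * (a_3 + a_7) / 2."""
    return 9 * a3_plus_a7 / 2


# a_3 + a_7 = 2*a_1 + 8*d = 20, so a_1 = 10 - 4*d for any common difference d.
for d in (-3, 0, 1, 5):
    a1 = 10 - 4 * d
    assert s9_term_by_term(a1, d) == s9_from_constraint(20)

print(s9_from_constraint(20))  # 90.0
```

The answer does not depend on d at all, which is why solving for a_1 and d individually is unnecessary work.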


4) Incorrect Selection of Applicable Methods

AI (GPT, DeepSeek) may apply inappropriate methods because it does not intuitively recognize the best approach based on constraints.


By systematically verifying conditions and iterating on solutions while restating the constraints in each follow-up prompt, users can significantly improve the reliability of AI (GPT, DeepSeek) in solving math problems.
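One way to make this verification routine concrete is a small constraint checklist, sketched below in Python. Everything here is an assumption for illustration: the function names, the sample problem, and the wording of the follow-up message are hypothetical, and no real AI API is called. The idea is to test a candidate answer against every stated condition and to restate any violated condition in the next prompt.

```python
def violated_constraints(answer, constraints):
    """Return the names of all constraints that the candidate answer fails."""
    return [name for name, check in constraints.items() if not check(answer)]


def followup_prompt(answer, violations):
    """Build a follow-up message that restates every violated constraint."""
    if not violations:
        return f"The answer {answer} satisfies every listed condition."
    listed = "; ".join(violations)
    return (
        f"Your answer {answer} violates these conditions: {listed}. "
        "Solve again and check each condition at every step."
    )


# Hypothetical problem conditions: n is an odd natural number with n <= 10.
constraints = {
    "n is a natural number": lambda n: isinstance(n, int) and n >= 1,
    "n <= 10": lambda n: n <= 10,
    "n is odd": lambda n: n % 2 == 1,
}

print(violated_constraints(12, constraints))  # ['n <= 10', 'n is odd']
print(violated_constraints(7, constraints))   # []
print(followup_prompt(12, violated_constraints(12, constraints)))
```

Writing each condition as a separate named check keeps a constraint from being silently dropped between steps, and the returned names can be pasted straight into the next prompt.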


