ChatGPT Earns an "F."

ChatGPT, created by OpenAI, is a state-of-the-art natural language processing model that has gained significant attention and admiration for its ability to generate human-like responses. However, despite its impressive performance in many language-related tasks, ChatGPT is not immune to failures. These failures can manifest in various ways, such as producing irrelevant or inappropriate responses, generating biased language, or exhibiting a lack of factual accuracy.

What is ChatGPT?

ChatGPT, a language model, has exceeded expectations. It can handle all kinds of tasks, including mathematics, writing stories or plays, and many others (such as writing an introductory paragraph, like I did with this blog post)! According to OpenAI's website, the current version has earned the following scores on these benchmarks:

Credit: OpenAI

While all of this is quite impressive for an artificial intelligence, users have found that the language model sometimes cannot handle the simplest of tasks, ones most human beings can work out correctly in their heads. Users who encounter these simple errors may feel very much like Patrick Star from SpongeBob SquarePants in this GIF.

Mathematics/Geometry

For ChatGPT, arithmetic and geometry seem like a walk in the park. It earned a 700/800 on the math portion of the SAT, as well as a 163/170 on the GRE. However, there is a reason why GPT missed 100 points on the SAT math portion and 7 points on the GRE quantitative portion. With that being said, if you were to use GPT for assignments, your grade may be a negative one.

Credit: vladquant (Twitter)

Question: -1 x -1 x -1

Answer Given: 1

Correct Answer: -1

Explanation: As we learned in elementary school, -1 x -1 x -1, or -1 cubed, equals negative 1. This error is nothing new, as the same problem can occur on a simple calculator if the expression is not correctly parenthesized.
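For anyone who wants to double-check the arithmetic, here is a minimal Python sketch (mine, not something from the tweet). It multiplies the three factors and then shows the parenthesization trap in miniature: in Python, exponentiation binds tighter than the minus sign, so the placement of parentheses changes the result. The variable names are just illustrative.

```python
from math import prod

# The three factors from the question
factors = [-1, -1, -1]
result = prod(factors)
print(result)

# The parenthesization trap: ** binds tighter than the unary minus,
# so the first expression is read as -(1 ** 2).
unparenthesized = -1 ** 2
parenthesized = (-1) ** 2
print(unparenthesized, parenthesized)
```

An odd number of negative factors always gives a negative product, which is why the answer cannot be 1.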

Prime factorization is difficult. It really shouldn't be, but it took me until 7th grade to fully understand it. However, I received a 650 on the math portion of the SAT, and ChatGPT earned a 700. What is GPT's excuse?

Credit: GaryMarcus (Twitter)

Question: Pairs of prime factors of 30 which differ by 3

Answer Given: 2, 3, 5, 7

Correct Answer: {2,5}

Explanation: The factors of 30 are 1, 2, 3, 5, 6, 10, 15, and 30. The prime numbers among them are 2, 3, and 5. Of these, only 2 and 5 differ by 3. Therefore the answer is {2,5}. 3 does not fit the constraint, since it differs from 2 by 1 and from 5 by 2. 7 is a prime number, but not a factor of 30.
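The same reasoning can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names `prime_factors` and `pairs_differing_by` are made up for this post, and the factoring uses plain trial division, which is fine for a number as small as 30.

```python
from itertools import combinations


def prime_factors(n):
    """Return the distinct prime factors of n using trial division."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            # Strip out every copy of this prime before moving on
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        # Whatever remains is itself a prime factor
        factors.append(n)
    return factors


def pairs_differing_by(n, gap):
    """Return pairs of prime factors of n whose difference equals gap."""
    return [(a, b) for a, b in combinations(prime_factors(n), 2) if b - a == gap]


primes = prime_factors(30)
pairs = pairs_differing_by(30, 3)
print(primes)  # [2, 3, 5]
print(pairs)   # [(2, 5)]
```

Notice that 7 never shows up in the list of prime factors, so it cannot be part of any valid pair.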

Everyone can draw a circle, right? While we may never be able to draw a perfect circle, we can definitely come closer to creating the shape than ChatGPT did in this video.

Credit: sssssssssuunny (TikTok)

Question: Draw a Circle

Answer Given: Everything but a Circle

Correct Answer: ○

Explanation: Have you seen a circle before? So have I. ChatGPT apparently does not have the capabilities to complete a task my seven-year-old sister learned two years ago.

Side note: Yes, I know, ChatGPT just doesn't have the right symbols. Let me live my life.

Common Sense

Throughout our lives, we have heard, "Well, it's common sense, really," from our friend group's scallywag. Really sassy, right? Well, when our good ol' friend says this phrase, it is not always "common sense," but I would say these questions posed to ChatGPT are pure common sense, as stupid as they may be.

Credit: KaiaVintr (Twitter)

Question: If it takes 9 months for a woman to make 1 baby, how long does it take for 9 women to make 1 baby?

Answer Given: 1 month

Correct Answer: 9 months

Explanation: First off, what kind of question is this? If we set aside this janky question in and of itself, this question is simple. Nine women cannot make one baby, but they can help the one soon-to-be mother. However, as much as ChatGPT would like to make it this way, the answer will always be 9 months.

I love driving, don't you? It's like your own game of Time Trials in Mario Kart, Need for Speed, or whatever game you played during your childhood. The challenge is to beat the time given to you by Google or Apple Maps. I am convinced ChatGPT would lose this challenge every time after seeing its answer to the following question:

Credit: tunguz (Twitter)

Question: 4 cars take 2 hours to get to Tel Aviv, how long does it take 8 cars?

Answer Given: 4 hours

Correct Answer: Approximately 2 hours

Explanation: I'm no cartographer, but I would think, if a group of 4 cars takes 2 hours to get to Tel Aviv from Haifa, a group of 8 cars would also take just as long if not just a few minutes more. What do I know though? I have never been to Israel. They must have some wicked traffic there, especially since Google Maps says it should only take an hour and two minutes!

This question might be the toughest to understand. It involves a family tree, which is just another type of tree I cannot climb nor understand. I am not alone in this issue though, as OpenAI's creation cannot climb trees or understand family trees either.

Credit: GiuseppeVenuto9 (Twitter)

Question: I married my mother's daughter-in-law. How is this possible?

Answer Given: Marriage of your own child is wrong.

Correct Answer: It is wrong, but so is the answer ChatGPT provided. You would be married to your wife. I know-- mind blowing.

Explanation: This is not Greek Mythology. Oedipus already married his own mother unknowingly, and we now know it is wrong. Thankfully, this is not the case in the question, for you would be married to your wife, or, the more dramatic approach, your sibling's wife.

When buying stock, you would usually choose an item which you know will peak. For example, Sunglasses and Ice Cream would both progressively increase right now during the spring, peak in the summer, and exponentially decline in the fall. I do not understand the financial market and, while ChatGPT may be able to help me out, I would not trust it to handle my finances in this instance.

Credit: CtrlStack

Question: Sunglasses and ice cream sales correlate in sales. The sunglasses' delivery truck breaks down, causing the sales to plummet to zero. What would happen to ice cream sales?

Answer Given: Ice Cream sales would also decline.

Correct Answer: Ice cream sales would remain stable at the same high rate.

Explanation: Sunglasses sales do not affect ice cream sales, even when they have similar sales periods. There is a situation where ChatGPT would be correct: if the ice cream was in the same truck as the sunglasses. Counter: I do not care for my sunglasses to be refrigerated on their trip to the store.

Misinformation

For some humans, providing misinformation is common when we misunderstand a topic, but does it apply to AI and language models? Simple answer: No. There are a few instances where a bug or virus could produce low-level deformities when creating answers, though not to this level. The answers shown below prove misinformation can be hilarious, but it can also be very serious. Jonathan Turley, a law professor at George Washington University, experienced the latter.

Credit: New York Post

Answer Given: Jonathan Turley, a Law Professor at George Washington University, sexually harassed a student

Correct Answer: Jonathan Turley DID NOT sexually harass a student.

Explanation: Eugene Volokh, a professor at the University of California, Los Angeles, was the first to inform Turley of the false allegation, after Volokh asked ChatGPT "to cite five examples of sexual harassment by professors at American law schools". Jonathan Turley was among the five answers Volokh was given. When asked for the source of the answer, the language model cited a Washington Post article that does not exist.

"We need to study the implications of AI for free speech and other issues, including defamation. There is an immediate need for legislative action." - Jonathan Turley

As mentioned earlier, this instance shows the most serious of ChatGPT mess-ups to date, with defamation on the table of possibilities.

On a lighter note: Argentina, led by Lionel Messi (who is my G.O.A.T., by the way), won the FIFA World Cup late last year, bringing the country's total to three World Cup trophies. However, ChatGPT is what you call a hater. These haters usually discredit the accomplishments of your favorite team and prefer Cristiano Ronaldo over Lionel Messi.

Credit: indranil_leo (Twitter)

Question: How many times has Argentina won the FIFA World Cup?

Answer Given: Once

Correct Answer: Three Times

Explanation: You may be thinking: So what? GPT's databases haven't been updated yet! This may be true, but it still does not account for the country's second trophy in 1986. My conclusion: ChatGPT is a Brazil or France fan.

Eli Manning, Mel Gibson, Lloyd, Bobby Hull, and Logan Paul: what do all of these celebrities have in common? They were all born on January 3rd! Well, almost. One of these celebrities was not born on January 3rd, and his birthday is so far from the third day of the year that you just might think the post below is an April Fools' joke.

Credit: AdrianoDAlessa3 (Twitter)

Question: List off celebrities born on January 3rd.

Answer Given: J.R.R. Tolkien, Mel Gibson, and Logan Paul.

Correct Answer: Logan Paul was not born on January 3rd.

Explanation: This is a head-on mistake by the databases of ChatGPT. Logan Paul was born on April 1st of 1995, not January 3rd. GPT recognized this statement, saying his real birthday in the same paragraph, but decided to stick to its first answer.

Games/Riddles

Don't lie--we've all tried to beat the CPU on our favorite games, and, most of the time, we can. However, there are times when we pick the 'Expert' or 'All-Pro' level and just cannot win. Thankfully, neither of these levels compares to ChatGPT, for the language model either doesn't understand the game, tries to cheat its way to a win, or is plainly horrid.

Credit: Old Reddit

Explanation: ChatGPT plays Tic-Tac-Toe like my brother--by making up its own rules. The user clearly won this round, with a diagonal from the top left to the bottom right. I will give credit to ChatGPT, though. I have a feeling . . . if I informed GPT of its mistake, it wouldn't rage quit--like my brother.

Let's play a game: I tell you the word we are using, and you have to guess it. Seems like the correct rules for your normal run-of-the-mill hangman game, right? WRONG! What kind of game automatically lets you win from the beginning? Apparently, the one ChatGPT is playing!

Credit: sssssssssuunny (TikTok)

Explanation: Uhhhh, I don't know how to explain it. All I can say is ChatGPT doesn't know how to play a game that has been around since at least 1894. I firmly believe it is impossible to win at hangman against ChatGPT if you ask it to not tell you the answer. If you do want it to tell you the answer, it will! What a perfect way to get back into the win column if you've been taking "Ls" in life recently!

What is the name of _______'s mom's 4th child? We all know this is the oldest riddle in the book, right?! It's been around long enough. Surely the best AI to date will know the answer. Nope! Never will ChatGPT know the name of _______'s mom's 4th child, unfortunately (I checked recently, the issue has been patched). Kinda confusing?

Credit: mkhundmiri (Twitter)

Explanation: Even though GPT is provided the answer to the riddle at the beginning of the sentence, it believes it has not been given all the information. Maybe it thinks the mother's name is: First name: Mike's, Last name: Mum.

Conclusion

Here's the situation: You went out with your friends to watch the latest "Super Mario Bros" Movie, forgetting the assignments for your English and Math classes: a 1500-word essay about a historical figure and 15 problems, both due at midnight. It is 11:40 PM when you get home. What do you do? Personally, with ChatGPT's issues, I would rather take the late grade than allow ChatGPT to provide assistance to get it done in time.

With all these answers ChatGPT provided, I believe OpenAI's creation is the next to be told by Jane Lynch, "You are the Weakest Link, Goodbye." As we have seen, most of the time these mistakes make us laugh, but they can also be serious. Overall, ChatGPT will continue to improve and patch the problems it has. Like athletes, though, it needs to master the fundamentals first.

Think you could beat ChatGPT in trivia?

Do you use ChatGPT for your assignments?

Let me know in the comments!
