Our Cofounder’s research: ChatGPT passes Theory of Mind Test!
Reprinted from Gyfted’s media blog.
NOTE: check out Gyfted’s Theory of Mind test yourself (doesn’t require registration).
Stanford’s Michal Kosinski’s exciting ChatGPT Theory of Mind research
On Feb 20, 2023, we announced that research on ChatGPT and Theory of Mind by Gyfted’s co-founder, Dr. Michal Kosinski, was republished in New Scientist, IFLScience, Popular Mechanics and ZDNet.
OpenAI’s ChatGPT performs well on Theory of Mind tests
TLDR: OpenAI’s ChatGPT performs well at tasks designed to test cognition in 9-year-old children. The most recent update to GPT-3 (ChatGPT/davinci-003) seems to be able to impute unobservable mental states (such as beliefs and desires) to others, an ability that in humans we would call Theory of Mind. You can find the published research here: “Theory of Mind May Have Spontaneously Emerged in Large Language Models”.
3/4 We conducted a range of sanity checks, including this sentence-by-sentence check of GPT-3’s understanding of the unexpected-contents task:
https://twitter.com/michalkosinski/status/1623882469807362051
— Michal Kosinski (@michalkosinski) February 10, 2023
ChatGPT passes human-level Theory of Mind and other tests!
Quick update to the research on the ability of OpenAI’s ChatGPT-4 to pass standardized tests and an empathy (Theory of Mind) assessment.
In the December research, ChatGPT-3.5 could pass such tasks at a 9-year-old’s level; you can read more about it here.
Now, with GPT-4, ChatGPT can pass the Faux-pas Recognition Test (Adult version), a validated test of Theory of Mind, at an adult human level. See below:
#GPT4 performs at the human level on the Faux-pas Recognition Test (Adult version), a well-validated test of Theory-of-Mind. Check out the example task below (We wrote custom tasks to make sure that it didn’t see them in its training set.) pic.twitter.com/NWwRvxy4vx
— Michal Kosinski (@michalkosinski) March 15, 2023
We wonder how it will fare in image-recognition form, reading emotions from people’s eyes, as in this emotion recognition assessment we have at Gyfted. Either way, this is all really intriguing.
ChatGPT passes standardized tests well
Moreover, ChatGPT-4 can pass many standardized tests, and this is an amazing capability of this system: https://openai.com/research/gpt-4
It looks like an arms race is coming between assessment companies and candidates, given such results on educational entry exams.
One might wonder how technical programming assessment companies like HackerRank, Codility, CoderPad and CodeSignal will fare, or general testing companies like TestGorilla or SHL. Incentives will matter significantly in this process.
Perhaps companies (the customers of test providers) will stop giving skills and task assessments to candidates? Candidates feel like hamsters running on recruiting-process treadmills just to cater to the anxieties and biases of hiring managers and recruiters, who are mostly worried about mis-hires and about avoiding accountability for recruitment decisions, especially in enterprises. Candidates have every incentive to avoid tests, which they see as obstacles. That’s why it is not about testing per se, but about uncovering valid signals and discovering hidden talent that fits you in hiring, which is what we’re about at Gyfted.
Robert, CEO & Co-founder @ Gyfted