Exposing the limits of AI through Gamification
CommonsenseQA 2.0 is a yes/no question-answering challenge set that was collected using a game called “Teach-Your-AI”.
At a high level, a player authors a yes/no question, is shown the AI's answer, and then marks whether the AI was correct. The player's goal is to earn points, which serve as a flexible vehicle for steering the player's behaviour. First, points are given for beating the AI, that is, for authoring questions that the AI answers incorrectly. This incentivizes players to ask difficult questions, conditioned on their understanding of the AI's capabilities. Second, the player gets points for using particular phrases in the question. This gives the game designer control to skew the distribution of questions towards topics or other phenomena they are interested in. Last, questions are validated by humans, and points are deducted for questions that do not pass validation. This pushes players to author questions whose answers people broadly agree on.
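The three incentives described above can be sketched as a single scoring function. This is only an illustration of how they might combine, not the game's actual implementation: the point values, the `Submission` type, and the `score` function are all hypothetical, since the description here does not state the real amounts or rules.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical point values; the actual amounts used in the game are not given here.
BEAT_AI_POINTS = 5
PHRASE_POINTS = 1
FAILED_VALIDATION_PENALTY = 3


@dataclass
class Submission:
    question: str
    ai_was_correct: bool      # as marked by the player
    passed_validation: bool   # outcome of human validation


def score(sub: Submission, bonus_phrases: Sequence[str]) -> int:
    """Combine the three incentives into one point total (illustrative only)."""
    points = 0
    # 1. Reward beating the AI: the AI answered the question incorrectly.
    if not sub.ai_was_correct:
        points += BEAT_AI_POINTS
    # 2. Reward each designer-chosen phrase that appears in the question.
    text = sub.question.lower()
    points += PHRASE_POINTS * sum(1 for p in bonus_phrases if p.lower() in text)
    # 3. Deduct points when human validators reject the question.
    if not sub.passed_validation:
        points -= FAILED_VALIDATION_PENALTY
    return points
```

Under these assumed values, a validated question that fools the AI and contains one bonus phrase scores 6 points, while a question that the AI answers correctly and that fails validation scores -3.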
For more details, see our NeurIPS-21 benchmark submission, “CommonsenseQA 2.0: Exposing the Limits of AI through Gamification”.
This dataset was created by a team of NLP researchers at the Allen Institute for AI.