# TED Community » Patrick Daly

• TEDCred score: 0.00 TEDCred reflects your contribution to the TED community.

• #### A comment on Conversation: AI: is SENTIENCE derived from pain & pleasure?

Jan 5 2013: Having just completed UCBerkeleyX's 188.1x course, "Artificial Intelligence," I can tell you that pain and pleasure are exactly how you build a Q-learning algorithm. After the 6-week class, I wrote code that could teach itself to play Pacman, and play it extraordinarily well!

The way it did this was through a pain/pleasure learning method. The AI has a feature function: it takes a Pacman game state and analyzes it, returning certain measurements, such as the distance to the next pellet, how many pellets are left, whether or not you eat a pellet, whether a ghost is about to eat you, the game score, etc.
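To make the idea of a feature function concrete, here is a minimal sketch in Python. It is not the course's code: the `ToyState` class, the feature names, and the use of Manhattan distance (which ignores walls) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

Position = Tuple[int, int]


@dataclass(frozen=True)
class ToyState:
    """Hypothetical, simplified stand-in for a Pacman game state."""
    pacman: Position
    pellets: FrozenSet[Position]
    ghosts: Tuple[Position, ...]
    score: int = 0


def manhattan(a: Position, b: Position) -> int:
    """Grid distance that ignores walls (an assumption of this sketch)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def extract_features(state: ToyState) -> Dict[str, float]:
    """Turn a game state into named measurements."""
    nearest = min((manhattan(state.pacman, p) for p in state.pellets), default=0)
    ghosts_adjacent = sum(manhattan(state.pacman, g) == 1 for g in state.ghosts)
    return {
        "distance_to_nearest_pellet": float(nearest),
        "pellets_left": float(len(state.pellets)),
        "eats_pellet": 1.0 if state.pacman in state.pellets else 0.0,
        "ghosts_one_step_away": float(ghosts_adjacent),
        "score": float(state.score),
    }


example = ToyState(
    pacman=(1, 1),
    pellets=frozenset({(1, 3), (4, 4)}),
    ghosts=((1, 2), (5, 5)),
    score=20,
)
features = extract_features(example)
```

Returning a dictionary keyed by feature name keeps each measurement paired with the coefficient learned for it.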

The first time Pacman goes out onto the field of battle, as it were, he has zero idea what any of these things mean. He keeps track of a set of coefficients, one for each feature. When exploring his choices (up, down, left, or right), Pacman scores each one by taking the features of the game state after the move is made and using the coefficients he has learned to come up with a composite score for that move; he then takes the best-scoring move.
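The scoring step can be sketched as a weighted sum of features, with the highest-scoring move winning. This is an illustration only: the function names, the feature names, and the example coefficients are hypothetical, and the features for each move are supplied by hand rather than computed from a real game state.

```python
from typing import Dict

Features = Dict[str, float]


def composite_score(weights: Features, features: Features) -> float:
    """Sum of coefficient * feature value over every feature."""
    return sum(weights[name] * value for name, value in features.items())


def best_move(weights: Features, features_by_move: Dict[str, Features]) -> str:
    """Pick the move whose resulting state has the highest composite score."""
    return max(
        features_by_move,
        key=lambda move: composite_score(weights, features_by_move[move]),
    )


# Hypothetical coefficients after some learning has already happened.
weights = {"eats_pellet": 2.0, "ghosts_one_step_away": -50.0}

# Features of the state reached by each move (made-up values).
features_by_move = {
    "up": {"eats_pellet": 1.0, "ghosts_one_step_away": 1.0},     # 2 - 50 = -48
    "down": {"eats_pellet": 0.0, "ghosts_one_step_away": 0.0},   # 0
    "left": {"eats_pellet": 1.0, "ghosts_one_step_away": 0.0},   # 2
    "right": {"eats_pellet": 0.0, "ghosts_one_step_away": 0.0},  # 0
}

chosen = best_move(weights, features_by_move)  # "left"
```

Moving up would also eat a pellet, but the ghost penalty outweighs that reward, so the pellet on the left wins.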

At first, as I said, he has no idea what is going on. Coefficients of 1, and he is off to the races. He comes across a pellet; after eating it, he is still alive and his score goes up. Hmm, that must mean eating pellets is a good thing, so he increases the coefficient for "eats pellet" a bit. From then on, if the square Pacman is about to move onto gives a reward through eating a pellet, Pacman will know that it is a good thing.

Next, he runs into the first ghost he sees, and dies. He learns from his death, though: the coefficient for "Number of ghosts 1 square away" goes negative. This marks it as a bad thing, and in the future any square with a ghost about to step onto it will have a much lower total score. He shouldn't have to run into a second ghost; the negative coefficient is enough to push the total score out of "viable move" range, so he will dodge any future ghosts.
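
The pellet and ghost lessons above correspond to the standard weight update used in approximate Q-learning: each coefficient is nudged by the learning rate times the prediction error times the feature's value. The sketch below assumes that rule; the reward values, learning rate, discount, and feature names are made up for illustration and are not taken from the course project.

```python
from typing import Dict

Features = Dict[str, float]


def update_weights(
    weights: Features,
    features: Features,
    reward: float,
    next_best_q: float,
    alpha: float = 0.1,  # learning rate (assumed value)
    gamma: float = 0.9,  # discount factor (assumed value)
) -> Features:
    """Return new weights after one approximate Q-learning update.

    difference = (reward + gamma * next_best_q) - current estimate
    w_i <- w_i + alpha * difference * f_i
    """
    current_q = sum(weights[name] * value for name, value in features.items())
    difference = (reward + gamma * next_best_q) - current_q
    return {
        name: w + alpha * difference * features.get(name, 0.0)
        for name, w in weights.items()
    }


# Every coefficient starts at 1, as described in the comment.
start = {"eats_pellet": 1.0, "ghosts_one_step_away": 1.0}

# Pleasure: Pacman eats a pellet and gets a small positive reward.
after_pellet = update_weights(
    start,
    features={"eats_pellet": 1.0, "ghosts_one_step_away": 0.0},
    reward=10.0,
    next_best_q=0.0,
)

# Pain: Pacman steps next to a ghost and dies (terminal, so next_best_q is 0).
after_death = update_weights(
    after_pellet,
    features={"eats_pellet": 0.0, "ghosts_one_step_away": 1.0},
    reward=-500.0,
    next_best_q=0.0,
)
```

With these made-up numbers, the "eats pellet" coefficient rises above 1 after the first update, and a single death is enough to drive the ghost coefficient well below zero while leaving the pellet coefficient untouched.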
• #### A reply on Talk: Peter Norvig: The 100,000-student classroom

Jan 5 2013: Having participated in a few MOOCs myself, I have found that instructor feedback on the forums is VERY low, but I have seen few problems arise from this. The only real problem is the very noticeable lack of instructor presence on the forums.

I was a Community Teaching Assistant in UCBerkeleyX's 188.1x, "Artificial Intelligence." The CTAs, who were just students who volunteered to answer questions on the forums, provided the front line of interaction between students and teachers.

The teachers do not have the time to answer hundreds of questions every single day. Giving the CTAs a direct connection to the teachers, should they need it, and allowing the CTAs to teach the students gave the CTAs the opportunity to learn how to teach the material while they learned it themselves. It turned out to be a fantastic change in the way things are run, one that provides great opportunities to all involved: teachers get more time to work on other things, and students get the opportunity to stand out from the crowd and learn to lead in this field.

Norvig's quote, "Peers can be the best teachers, because they're the ones that remember what it's like to not understand," is incredibly true. The community-organized forums are an amazing innovation that works wonders.
