E83: Joscha Bach, Research Scientist at the Harvard Program for Evolutionary Dynamics – Interview

December 20, 2016


This interview on computational metapsychology is with Joscha Bach, PhD. Joscha is a research scientist at the MIT Media Lab.

Joscha has a deep AI background. He has computer science degrees from the University of Waikato and Humboldt University of Berlin. He also received his PhD in Cognitive Science from Universität Osnabrück. Since 2000, he has been an AI researcher, lecturer, and co-founder at various institutions and companies.

Joscha researches and explores the mind and how it works, and how to relate that to computing, including computational metapsychology. It’s fascinating. Joscha is pushing the boundaries of what AI can do in the future. I was lucky enough to be able to interview him.

Here are some other things we talk about:

-What eventually do you want your Minecraft agent to be able to do?
-What two areas of AI are you interested in?
-How do you train an AI agent to have emotions?
-When did you first get interested in computers, AI?


David Kruse: Hey everyone. Welcome to another episode of Flyover Labs. Today we are lucky enough to have Joscha Bach with us. Joscha is a research scientist at the MIT Media Lab and the Harvard Program for Evolutionary Dynamics. He researches and explores the mind and how it works and how to relate that to computing, including computational meta-psychology. So it’s fascinating. I don’t know if I understand quite all of it, so that’s why we have Joscha here to explain more, but I’m very curious to learn what he is up to. So Joscha, thanks for coming on the show today.

Joscha Bach: You’re very welcome.

David Kruse: And before we start with what you are working on now and your current research, can you give us a little bit of your background?

Joscha Bach: Well, I’m most interested in how minds work. So I went into academia and started to study computer science and philosophy and then switched into a Cognitive Science Department, and I think that the best bet that we have in order to understand how minds do what they do is to treat them as information processing systems. So I think that AI is probably our best bet at understanding minds, and this is mostly the reason why the field was started 60 years ago by people like Marvin Minsky and John McCarthy. Part of AI is about building systems that are better at processing data, that learn better, that control robots or technical systems. But AI has always also been a cognitive science, or part of it. It tries to understand minds by building testable models, basically computer models that reflect the degree to which we understand thinking, and in not being able to do certain things, we see the shortcomings of our theories.

David Kruse: And how did you get interested in starting the research and working with the mind? You know, what prompted you? Was it a class you took or a project you worked on?

Joscha Bach: I think it started when I was sitting in front of my Commodore 64, so roughly in 1983, and I typed stuff in and at some point I realized that there’s nothing I cannot put into this system. Everything that I can imagine I can bring to life inside of the computer. Of course the Commodore 64 had very limited memory and it was very slow, but this is not the biggest problem. Of course you can have computers with more memory that are faster, and I knew that at some point computers would be much, much faster and have much, much more memory. The real question is what do you understand? If you understand the causal structure of a system, you can express it as a computer program and it’s going to do whatever you want. So you can build robots in a computer, you can build arbitrary machines in a computer. It’s like Lego, but it’s unlimited. And so I thought, what do I want to put inside of the system? And the obvious answer was I would want to put minds into it; I want to put something in it that understands us, that we can talk to, that helps us understand who we are. Yeah, so in some sense this is how my fascination with artificial intelligence started, and I guess it’s fairly typical for many people of our generation.

David Kruse: Yes, yes.

Joscha Bach: Who grew up with computers.

David Kruse: Yes, I’ve heard that from many different people, from very early on, they began with their fascination. So when was the first time that you worked on a project you know around AI and trying to model the mind?

Joscha Bach: Hard to say. So I think that I started my seminars on mind building and cognitive architecture in the late 1990s, and this was the first time when I basically started to really recruit other students and we got together about this. In 2003 we had the first edition of MicroPsi ready, which was inspired by the theory of the German psychologist Dietrich Dörner. Basically he had this theory on how motivation works and how it interfaces with memory, and how to build agents based on this and let them run around freely in a simulation environment. And we thought, let’s take these ideas and turn them into a solid project that we test and work with and run simulations in. And yeah, in 2003 we had the first version of that ready, and now we are in some sense working on the third edition of it.

David Kruse: Got you, okay. And can you tell us a little bit about your current research. Where your interests lie right now?

Joscha Bach: There are mostly two things that I’m interested in right now. One is the structure of the motivational system, because we are not just systems that are directed at goals, we are goal-finding systems. And there is a structure in us that makes us interested in certain things and makes us do certain things, and this is the structure of motivation. Motivation is not affected, its resistor. It’s something that makes us not give in to inertia. It’s something that makes us get up in the morning and take a shower and put food in our body and so on. We don’t do this out of committed resistant. We do it because the motivational system forces us with pleasure and pain, right, and the differences in these systems between people manifest as differences in personality. If you are, for instance, a person who gets more reward for being competent than for being liked by another person, so if you have a stronger need for competence than for affiliation, then you are probably not a very agreeable person. And vice versa, if you have a stronger need for affiliation, for people liking you, than for competence, you might be a much more agreeable person in a conversation. And in this way you can start modeling personality and make sense of differences in personality. So this is one of the topics that I’m currently interested in: how to understand personal differences and how to understand what makes us human in general. The other part is learning itself. Most of the interesting work on learning right now has been under the label of deep learning, where you build hierarchical neural networks that, more or less autonomously or with reinforcement and sometimes with supervision, learn almost arbitrary domains. But there is still a very big gap between our deep learning systems and human cognition, and I try to understand what is missing. So, one of the things that seems to be missing is compositionality. For instance, our thoughts can be put together into new thoughts, and our mental representations can be put into different kinds of simulations that play out in our mind. Current neural networks are not that compositional yet. So how can we build them into structures that can be taken apart and recomposed in a different context? This is one of the things I’m currently interested in.

David Kruse: Interesting. And so with your first interest around modeling personality, are you interested in exploring, you know, how to model personality? Are you actually modeling it computationally? And if you are, how do you do that, or can you describe the structure and your thought process behind doing that?

Joscha Bach: So currently we have a simple simulation in which we have agents that run around and interact with each other, and by changing the parameters of the motivational system, they will interact with each other in different ways. We can look at these ways and then compare them, for instance, to humans being in a very similar environment. So one paradigm to do this is to build a game world, and in this game world you have your computer agents play that game and have human players play that game, then modify the behavior or the parameters of your computer agents and see if they can mimic the behavior of different personality types.
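As a rough illustration of the idea of changing motivational parameters to get different behavior, here is a minimal Python sketch. It is not MicroPsi code and not the simulation described in the interview: the `Needs` class, the `ACTIONS` payoff table, and all numbers are hypothetical, chosen only to show how two need weights (competence and affiliation) could bias an agent toward different choices.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical motivational parameters of one agent."""
    competence: float   # how strongly the agent is rewarded for being competent
    affiliation: float  # how strongly the agent is rewarded for being liked


# Hypothetical payoff table: how much each action satisfies each need.
ACTIONS = {
    "win_argument": {"competence": 1.0, "affiliation": -0.5},
    "agree_politely": {"competence": 0.0, "affiliation": 1.0},
}


def expected_reward(needs: Needs, action: str) -> float:
    """Weight the payoff of an action by the agent's need strengths."""
    payoff = ACTIONS[action]
    return (needs.competence * payoff["competence"]
            + needs.affiliation * payoff["affiliation"])


def choose_action(needs: Needs) -> str:
    """Pick the action with the highest weighted reward."""
    return max(ACTIONS, key=lambda action: expected_reward(needs, action))


# Two agents that differ only in their motivational parameters.
competitive = Needs(competence=0.9, affiliation=0.2)
agreeable = Needs(competence=0.2, affiliation=0.9)

print(choose_action(competitive))
print(choose_action(agreeable))
```

In this toy setup the only thing that differs between the two agents is the pair of weights, which is the sense in which a parameter change can stand in for a personality difference.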

David Kruse: Oh! Interesting.

Joscha Bach: That’s one of the paradigms that we are currently exploring. I don’t know how far we will get with this, but we’ll see what we can do with it.

David Kruse: That’s a smart way to model it. I was wondering how you were exploring that. Can you give an example that maybe you have seen in a video game of an AI mimicking a human kind of emotion or behavior, or have you seen something like that?

Joscha Bach: So what we did in the past was work on a game in which you collect resources in an island world. On that island you have to, for instance, look for food and water and clean wraps and so on, and you have to avoid damage. People play this very same game, and some people tend to play it very safe: for instance, they find a place that they can rely on, where they know the environment, and then stay glued to this place all the time. Then there are people who are very explorative and very risk-taking, and so on. And we can model these things by changing parameters in the agents. This was work that was largely done inside Dörner’s group, and what we did was run a simulation over the parameters of the emotional system and of the motivational system, to see what kind of parameter settings we get if we change the environment in a certain way.

David Kruse: Interesting.

Joscha Bach: So for instance which kinds of environment will lead to more cooperative agents or more aggressive agents and so on?

David Kruse: Right. If it’s harder to find food and water, in theory they would become more aggressive.

Joscha Bach: Yeah, it could be, but it could also be that you benefit more by forming alliances to defend your resources against other agents. But for doing this you need to have a benefit from alliances, so we needed to build agents that could attach to each other and so on, and protect each other and also tell each other apart. So as soon as you start doing this, you need agents that have more and more cognitive capabilities.

David Kruse: Interesting. And how do you know where to start from an algorithm perspective? That sounds like a pretty complex problem to model.

Joscha Bach: In some sense these agents are very simplistic, because for these simulations we used agents that have dramatically simplified perception, decision making, reasoning and so on. It’s mostly scaffolding, because we are interested in the motivational system. If we built agents where we aimed to get perception right from scratch, we would be mostly focused on that. So one thing that we have been working on was to use Minecraft as a paradigm. It’s a very popular game that gives you a Lego-like blocks world, and then we have agents that learn to perceive objects in Minecraft and reason about these objects.

David Kruse: Interesting.

Joscha Bach: And for this we have been using algorithms from deep learning, such as autoencoders, that can build perception hierarchies. This is a very hot topic of research right now; a number of teams are doing similar things.

David Kruse: Yeah, do you have any examples from Minecraft, something that you’ve learnt, or have you been working with Minecraft for long?

Joscha Bach: So what we have the agents learn so far is to distinguish different situations in Minecraft. The agent would recognize which context it was in, and what actions are available in a given context is learnt by reinforcement learning. What I would like to do is get an agent that is able to form complex plans and then make complex action sequences based on what a person would like, or what he doesn’t like.

David Kruse: Wow! That would be amazing. How far…

Joscha Bach: Yeah.

David Kruse: So you are saying that…

Joscha Bach: I don’t know yet. It also depends, because we are currently putting new people together to work on these things, and so it depends on where we will put the focus next.

David Kruse: Interesting. So you are saying, I mean, ideally you would have your agent go into Minecraft and create a world based on what our needs are, or whatever needs it has.

Joscha Bach: Yeah.

David Kruse: Wow! That seems like a very tough problem to solve. How far away do you think you are from creating an agent that could do something like that?

Joscha Bach: I think that we have agents that can do very simple things in Minecraft, and to be able to do things at a human level in Minecraft, I think this is quite some way off.

David Kruse: Got you. It makes sense, okay. All right, so let’s talk about your other interest, around deep learning. Can you tell us a little bit more about some of the projects you’re working on in order to explore how to improve the deep learning aspect?

Joscha Bach: I think that with respect to deep learning we are not the best team to do that. Right now there are larger teams that exist, for instance the Google DeepMind team and OpenAI and a few others, and I don’t think that we should put our resources into trying to compete with these teams. So instead we largely use existing algorithms and try to do things that these algorithms are not doing yet. One of the things that we are currently exploring is how to basically take a network that is made of little modules, of little sub-networks, and to combine them into a larger network.
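The idea of building a larger network out of small modules can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions, not the approach used by Bach’s team: the helpers `make_linear`, `relu`, and `compose` are hypothetical names, the weights are hand-picked rather than learned, and a real system would use a deep learning library.

```python
def make_linear(weights, bias):
    """Return a small module computing weights @ x + bias on plain lists."""
    def layer(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return layer


def relu(x):
    """Element-wise rectifier, used here as a module of its own."""
    return [max(0.0, v) for v in x]


def compose(*modules):
    """Chain modules into one larger network that can be taken apart again."""
    def network(x):
        for module in modules:
            x = module(x)
        return x
    return network


# Module A: two units that respond to x0 - x1 and x1 - x0 respectively.
difference = compose(make_linear([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]), relu)

# Module B: adds up its inputs and applies an offset.
readout = make_linear([[1.0, 1.0]], [0.5])

# The larger network is just the composition of the two smaller ones.
network = compose(difference, readout)

print(network([3.0, 1.0]))
print(network([1.0, 3.0]))
```

Because `difference` and `readout` remain separate objects, each can be inspected or reused in another composition, which is the property the interview refers to as compositionality.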

David Kruse: Interesting. And what do you hope that will accomplish?

Joscha Bach: We hope that we can build networks that give us a structure that we can better understand. So for instance, if you have a network that reasons about a certain domain or gives you a certain result about a certain domain, you can then look at the structure of this network and know why it got to this result, what is actually modeled, and how this model works. Then you can later on do meta operations on the network. Ideally you want to get to networks that can learn how to build new networks themselves, basically hyper networks.

David Kruse: That’s smart. And I mean that’s often a problem with the deep neural networks, is that the people don’t know exactly why it’s working or what worked exactly and so do you want to bring, try to bring a little more transparency?

Joscha Bach: Yes, this is one of the goals, and in some sense deep networks try to find the best structure. They do this in people and they do this in technical systems, and often these structures can be mapped onto structures that we already know about the world. Sometimes we can help these networks along by taking structure that we already think is in the domain and training the networks explicitly to use this structure. But it might also be interesting to look at ways to force the network to converge on certain structures, or to build regularization mechanisms into the network that bias the network to form certain structures instead of others, ones that are more compositional and that are easier to reason about.

David Kruse: Interesting. And as you said you know ideally you would have neural networks that could develop and train new neural networks right.

Joscha Bach: Yes.

David Kruse: Can you give an example of what that would look like?

Joscha Bach: At the moment most neural networks work like this: basically you as an engineer set up the structure of the network, and then you get the network to adjust to a given domain by adjusting the weights of the neurons. I think you want to have a self-organizing network that basically identifies on its own what the best structure is for a given problem and changes its own architecture so it can adapt to the structure of the problem, and that also does transfer learning; that is, the knowledge that is picked up in one domain is applied in another.

David Kruse: Yeah, and that would allow a neural network to fully optimize itself, you would think, pretty quickly, well, faster than we can do it by changing the parameters ourselves. That would be pretty powerful. How close are you to creating something like that, or where are you in that development?

Joscha Bach: I think that in some sense we are at the beginning of these things. One of the problems that I see is that in our field, machine learning, we often do not take the developments in psychology and in other computer science fields very much into account, and vice versa. There are a lot of people who work in the product domain or who work in psychology, and they have ideas about what they are doing that are very detailed, and we are unaware of those in the domain of AI and machine learning. I think what we need to do is get these people together to talk, and get these different types of theories to match. I think that on the psychological side, people need to understand computation and the ways that functional components can give rise to operations, and people in AI need to understand more about how minds work and achieve the things that they do.

David Kruse: Interesting. And yes, I watched your video from the convention on computational meta-psychology, which sounds like a pretty new field. What type of knowledge do you think psychologists could bring to computation that could help these neural networks? Do you have some ideas, or have you worked with psychologists?

Joscha Bach: I think I have to put this meta-psychology talk into context. I gave this at a hacker conference, if you remember, and this is, I think, the largest meeting of hackers in the world. It was super exciting to be there, and I basically tried to give a talk that I would have liked to listen to 20 years ago, when I started my studies in the field and had lots and lots of questions. Many of the ideas that I presented in this talk are very speculative, so it’s not like this is a body of work that we can take and put to the test right now. It’s largely an attempt to make sense of the domain and identify possible ways to reason about how we form beliefs, how we form opinions, how we interact with each other, how we make a model of the world, and how our minds relate to the universe around them.

David Kruse: Interesting, okay.

Joscha Bach: Sometimes through the same work, its lots of fields and it is computer science or machine learning.

David Kruse: Which is important in its own right to kind of figure out, to think about the future and how this will all be put together in many different ways. All right and so how – so you are at MIT, so how do you get…

Joscha Bach: Actually, my MIT funding ended this last month. So right now I’m affiliated with the Harvard Program for Evolutionary Dynamics, yeah.

David Kruse: That was my question. I was curious how you are funded, because you are doing such interesting work that is, you know, farther off, which is important work. But okay, so you are at Harvard now.

Joscha Bach: Yeah, it was really exciting to be at the MIT Media Lab, and I really enjoyed working there and teaching. They have amazing students and they are very curious. They come from various different domains, and what I enjoyed most was the opportunity to get a bunch of students together, many of them doing their PhD in a related field, and ask them, ‘What do you think it is that we don’t understand yet about the mind? What are the areas that we are not working on yet and that we should be working on 10 years from now?’

David Kruse: Do you remember any of the answers?

Joscha Bach: Yeah, so this is what we have been discussing. One of the things was basically the structure of groups of neurons: how do groups of neurons interact in small structures in the mind? The interaction between motivation, learning, and mental representation is another thing that seems to be very significant, in a process that we call differential attention. Our machine learning systems pay pretty much the same attention to anything in the data. They just try to find structure in the data, and people don’t; people look at some stuff in much more detail than other stuff. So for instance, we look at faces a lot. Whenever we are in a crowd and a person comes up, we look at their face and try to make sense of that face. So there seems to be a reward system that makes us super interested in faces, and this system is online before most of the mental representation and visual learning is online. So even babies start looking at faces at a time when they don’t have much visual cognition yet. So there seem to be some simple innate mechanisms that direct our eyes to faces at a very young age. And those people who don’t have this seem to develop prosopagnosia. This is the inability to tell faces apart. These people are often very smart and so on. I had a colleague in a previous department who could not tell his students apart by looking at them; instead he had to look at their hair color and the way they dress and their voices and so on. And he was also able to read the emotional expression in their faces.

David Kruse: Wow! Interesting.

Joscha Bach: And I think it’s not because he had a defect in his cortex, because he was super smart, obviously. It’s probably that in his brain, the part that pays attention to faces was not working. So he didn’t have the extra attention for faces that most of us have. And I think that we probably don’t have this system just for faces. We have it for social cognition and for several other domains that we are not completely aware of, and that form our world model into something that is typical for humans.

David Kruse: Interesting.

Joscha Bach: Another topic that we don’t understand yet is load distribution. Our brain needs to distribute its resources across the different tasks in the world. It’s not like we can do everything. We can only pick a few things in the world that we model at very high detail. How can we distribute the cognitive load over the different areas of the brain? How can we make sure that most of the neurons are doing similar work, or doing the same amount of work, doing something valuable?

David Kruse: So in either of those areas how do you – I mean would those be areas of interest for research and if so, how would you research. I think that seems maybe hard to get the data.

Joscha Bach: The good thing is that you can work from an engineering perspective. So you might have a problem, and then you need to define the problem properly as something that you can work on. For instance, one question might be: how can you build a neural architecture in which you have a fixed number of neurons and they all contribute to the reward in an optimal way, so that you can maximize the global reward by maximizing some local function? What could that function be? How can you communicate it in the network? This could be a starting question, and then you need to divide it into sub-questions that you can actually work on. Then you can come up with different designs and compare the performance of the different designs on a given benchmark task.

David Kruse: That’s smart, okay. Well, unfortunately I think we are almost done with this podcast. This is fascinating stuff that I could talk about for a long time, because I have a lot to learn about, I guess you could call them, the softer issues. That’s why what you are working on is so interesting. Deep learning, you know, that’s out there, and there are lots of papers and resources on how it works, but what you are working on is more the soft issues, which is kind of the hard part, and that’s what I think we will need in order to create an AI brain. And so I’m curious…

Joscha Bach: To me one of the most interesting questions is what does AI teach us about how we work? What are the things that we can take over into other domains? There are a lot of those things. So I do think that using the AI metaphor as an insight, we can understand some parts of our minds better.

David Kruse: Interesting.

Joscha Bach: And of course then there is also the practical aspect, right. Now we are developing technologies and the field is making rapid progress. Every couple of months we have a bunch of groundbreaking papers and insights, and there is a backlog of things that can be turned into applications. There are so many things that we can do now technically that haven’t been turned into products yet. So the next days are going to be very exciting.

David Kruse: Do you have an example of one such technology or paper that came out that you think could be turned into a product that will…

Joscha Bach: If you look at how the current wave of deep learning started, it was a paper on training a network with 20 million frames from YouTube. This was done by Andrew Ng at Google and Stanford, and this was four years ago. When they did this, they came up with a network that was trained with unsupervised learning. After three days it was able to recognize images in the ImageNet database with an accuracy of, I think, 17%. And back then this was much, much better than anything that existed before that was handcrafted or programmed in any other computer vision paradigm. And last year our computer systems got so good that they outperformed humans in recognizing these images. Our phones are not that good at recognizing images yet. They are starting to be pretty good at recognizing language and speech, and in a few years their speech recognition is probably going to be better than people’s speech recognition. And we will have image recognition in cameras or phones that approaches human performance a few years from now. And then there comes the overall work surfacing. Maybe a decade from now, our phones will have a good idea of where they are and what’s happening around them. Imagine you are a small child and you are walking next to your parent. You always know where your parent is and who your parent is, right, so your parent doesn’t need to authenticate itself. You also know roughly what’s happening around you, you know the context, you know that your parent is currently with you in the subway, is using this subway, and wants to go from A to B. Imagine that your phone is constantly observing its environment, has this situational awareness, and can give you contextual help with what you are doing and keep track of what you are doing, and also if you will lose your phone, it immediately rings to them. So there are a few things that we don’t really imagine right now and that will give rise to very exciting applications in the very near future.

David Kruse: Well, I love that vision of the future. That sounds – that’s another whole podcast. I’m curious how you would model, especially the – I like the idea of walking next to the parent and then all the things that can come up because of all the issues.

Joscha Bach: Yeah, so our phones won’t be as smart as a human in the next two years, but maybe their vision will be as good as that of a dog, for instance, and they will get an idea of their environment.

David Kruse: Interesting. All right well, I think that’s a great way to end the podcast and Joscha, I definitely appreciate your time and your thoughts and I love the research that you are doing and I’ll have to keep a close eye on it as you continue to push forward.

Joscha Bach: Thanks for having me.

David Kruse: Definitely. And thanks everyone for listening to another episode of Flyover Labs. As always I appreciate it. And we’ll see you next time. Thanks everyone. Thanks Joscha. Bye.

Joscha Bach: Bye.