Archive FM

Data Skeptic

[MINI] Partially Observable State Spaces

Duration: 12m
Broadcast on: 23 Jan 2015

When dealing with dynamic systems that are potentially undergoing constant change, it's helpful to describe what "state" they are in. In many applications the manner in which the state changes from one to another is not completely predictable, so there is uncertainty over how the system transitions from state to state. Further, in many applications, one cannot directly observe the true state, and we describe such situations as partially observable state spaces. This episode explores what this means and why it is important in the context of chess, poker, and the mood of Yoshi the lilac-crowned Amazon parrot.

(upbeat music) - The Data Skeptic Podcast is a weekly show featuring conversations about skepticism, critical thinking and data science. (upbeat music) - Well, welcome to another mini episode of the Data Skeptic Podcast. I'm here as always with my wife and co-host Linda. - Hello. - How you doing, Linda? - I'm good. - How's the new job been treating you? - The job is busy. - Mm-hmm, I've noticed. So I really appreciate you staying up late and doing this recording session. - Yep, I'm here. - All right, so let's get down to business then. Today, I sent you a link. Did you look at it? - Nope. - Okay, we're gonna talk about state spaces. Do you have any idea what that might be? - No. - Okay, well, the way I like to introduce it to people who are not familiar with it is to talk about different games. So I assume you're familiar with chess, right? - Yes. - And how would you describe chess just from a high level? - Chess is a board game, which is a checkered board game. And then they have a variety of different, I guess, mini statues and they play different roles and they all have different strengths and weaknesses. So you just have to learn how to use them as a little army. - Very true. And is there any secret information either player has? - I mean, they don't know what the other person's gonna do. That's secret. - That's true, yeah, I don't know your strategy, but the point I was trying to make is all the pieces are on the board. There are no invisible pieces or like hidden moves or secret pathways in the game. And there's no dice in the game. And everyone knows where all the pieces can go. So there's no argument about whether a move is legal or not, you can look it up in a rule book. - I assume. - As long as you know whose turn it is and where all the pieces are on the board, you don't actually need to know where the board was before. That tells you the total current state of the game. 
And what the possible next states can be are defined by the available pieces, whose turn it is and how they can move them. So it's a very, even though each player has a choice, it's very deterministic. It's what we call fully observable. And we call it fully observable because you can just look at the board and know 100% sure the current state of the game. Now let's talk instead about, oh, Texas hold 'em. - Okay. - So in Texas hold 'em, and I'm not a poker guy at all, but they put like some cards on the table and everyone has also cards in their hand. So you have like common cards in the middle, right? Anyone can use like the two or three, I think it actually goes up to five that eventually gets shown there. So we all know what's on the table that's turned face up. And we know what's in our own hands, but we don't know what's in the other player's hands, right? - I assume I don't remember how to play this year. - She cheated, but yeah. So that's what we call private information. And it's not fully observable. It's only partially observable 'cause I can see my cards, I can see the board, but I should not be able to see your cards. And the same is true of you. You see your cards, the cards on the table, but not mine. But that doesn't make you completely ignorant, does it? - I mean, I guess not, I don't really play poker, nor do I win at poker, so I don't really know. - Well, let's say it's this. Let's say there are three cards revealed on the table, and it's the queen of diamonds, the queen of hearts, and the queen of spades. And in your hand, you have the queen of clubs. What do you know about my hand? - Well, it's probably less likely that you have a queen. - Right, it's for sure I don't have a queen 'cause there are three on the table and you're holding the fourth. So you know that I don't have a queen for sure. You also know that there's uncertainty for me 'cause I don't know you have the queen. 
So I'm thinking, you know, maybe she does, maybe she doesn't, and I can't be sure. That's a partially observable game in which I have some certainties, but then I have probabilities over the unknowns. So I haven't really very well defined "state" yet. Let's go back to chess for a second. The state is the position of every piece on the board and whose turn it is. In poker, it's what are the cards we can see? What are the hidden cards? Whether they're hidden like it's dealt downwards or it's hidden in one of our hands and whose turn it is to make a bet or whatever. That describes everything you need to know about the current state of the game. So more or less, it's all of the variables and values that you would need to know to fully specify the situation or game or condition you're looking at. So let's talk about the old noise maker in the corner there, Yoshi, our pet bird. We haven't talked about it in a number of episodes. Might you say she is described by various different types of states? What would some of her states be? - She's sleepy, anxious, hungry, excited, scared, unsure, curious. - What about the dance she does when we put the vacuum on? - I think that's a combination of excited and cautious. - Okay. So each of those states are very unique and they're kind of like one to one with moods because she's an animal. Now, can you be sure which state she's in? - No, I'm not 100% sure. I mean, but I would say if it's really obvious, you could be like 90% sure. - Yeah. And do you have any idea about how she changes from one state to another? - Well, the situation changes. - Yeah, exactly. So let's say she was in a hungry state. What state might come next? - Well, probably sleepy after she ate. - So she's in a hungry state and then you're assuming we gave her food. - Yeah. - In which case, she would transition into an eating state and then transition into a sleepy state. - Yeah. - Although if we didn't give her food, what would happen? 
- Oh, I think she would continue being anxious and then eventually she would just get tired. But maybe it would take longer and change with different probabilities. Yeah. So you can kind of map out all of the states and the likelihood. - Whoops. - Oh, escape. Was that the fear state? Oh, good shirt. So you know, she just made a beeline for the ground after we hit the stand. That would be the, what do they say, fight or flight? I imagine she's in the flight stage. - Yeah, she doesn't fight. - So the action that caused that was the bumping of the stand which transitioned her into the flight stage and then what happened? - Then I picked her up. I was afraid she was gonna poop. - And now she's in the calm stage again, maybe sleepy. So we have this model of our bird that we can talk about. She has these various states and different things can cause her to change into other states. Although it's always a little bit probabilistic. We never know for sure. That's what we also call a stochastic system. And when you have a dynamic thing, like, oh, she's a dynamic bird, she can be in many states, you'd like to have a model or a way to describe where is she now given the information you have? And as you get new information to understand how that state may change because she's an ever-changing being, right? Let's say you and I were gonna try and make an algorithm for taking care of a bird. What are some of the things we need to consider? - You have to consider her sleep and waking cycle, her eating schedule. Generally, she just, my bird, Yoshi, she wants to spend time around people. So consider that. - And each of those states has a value to them, right? The state where she's happy and has spent a lot of time and been sociable. That's a preferred state versus the state where she's cranky and hungry and pooping on the floor. That's probably a not preferred state for you and I, right? - Okay. 
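The bird model described in this part of the conversation (states, actions that trigger transitions, uncertainty about what comes next, and some states being preferred over others) can be written down as a small stochastic transition model. What follows is only a minimal sketch and is not from the episode: the state names come from the conversation, but the actions "feed" and "wait", every probability, and every value score are made-up assumptions for illustration.

```python
import random

# Hypothetical transition probabilities for Yoshi's mood, keyed by
# (current state, action). All numbers are invented for illustration.
TRANSITIONS = {
    ("hungry", "feed"): {"eating": 0.9, "anxious": 0.1},
    ("hungry", "wait"): {"anxious": 0.7, "hungry": 0.2, "sleepy": 0.1},
    ("eating", "wait"): {"sleepy": 0.8, "excited": 0.2},
    ("anxious", "feed"): {"eating": 0.8, "anxious": 0.2},
    ("anxious", "wait"): {"anxious": 0.6, "sleepy": 0.4},
}

# Hypothetical scores for how much the owners prefer each state.
VALUE = {
    "hungry": -1.0,
    "anxious": -2.0,
    "eating": 1.0,
    "sleepy": 0.5,
    "excited": 1.0,
}


def step(state, action, rng=random):
    """Sample the next state from the distribution for (state, action)."""
    dist = TRANSITIONS[(state, action)]
    states = list(dist)
    weights = [dist[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def expected_value(state, action):
    """Average value of the next state, weighted by its probability."""
    dist = TRANSITIONS[(state, action)]
    return sum(p * VALUE[s] for s, p in dist.items())


def best_action(state, actions=("feed", "wait")):
    """Pick the available action with the highest expected next-state value."""
    available = [a for a in actions if (state, a) in TRANSITIONS]
    return max(available, key=lambda a: expected_value(state, a))
```

With these invented numbers, `best_action("hungry")` picks "feed", because feeding most likely leads to the eating state while waiting most likely leads to the anxious one. Different numbers would give different answers, which is the sense in which the model is stochastic rather than a fixed script.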
- So it might be helpful to be modeling our belief about where she is and take action to move her into the more preferred states, right? - Yeah. - Okay, so this is why data scientists will often think of state models in cases where you don't have a simple process that you can model with just one random variable. You have a dynamic system or, in this case, a dynamic lilac-crowned Amazon who will transition through a variety of states at all times. You're gonna need a way to try and describe what state they're in and it helps to have that description of their current state to decide what sort of actions and interventions you might wanna take. So let's do one more quick one and talk about how a person might apply this in the business world. - Okay. - Well, you work on some e-commerce sites from time to time. What are some reasons a person might come to a company's website? Why would I go to any company's website? - Well, you could be an existing customer and they sent you an email so you were interested in the sale or whatever offer. You could have seen an ad and it interested you, so you clicked on it. - Right. - It could have been other things like a blog post or social media, like a tweet or a Facebook post that showed up in your feed and you thought was interesting. - So these are all forms of customers. I would also add in maybe an investor would come to the site to decide if it's a company they wanna invest in or a job seeker might come to see if there are open positions. - Yeah. - So there are lots of different reasons to come to the site. When a person first arrives and let's assume there's no referrer code so you don't know where they came from, you don't yet know why they're there. Let's just stick for the moment to your examples of customers. If you were gonna give like a 50% off coupon, would it be better to give that to an existing customer who's just returned or a new customer? - It might depend, I don't know. It really depends on what you sell. 
- That's true, no, I very much agree. It might've sounded like I wanted you to say, give it to the new customer 'cause the existing customer's already on board to buy something, which can be true in some situations, but it will vary from industry to industry and business to business. Maybe you could watch your web traffic and where people navigate to and what they're doing to try and develop what state they're in. Are they a person just kicking tires or are they an eager buyer? And that could help you determine what sort of content you wanna serve to them and that's more of an industry application one might apply. So most of all, I would say thinking about state spaces is helpful for modeling and for sequential planning, but it's also kind of like a philosophical thing. I think it's useful to think of non-deterministic states and not knowing where you are and wondering about how you can track the way states change. Have I imparted that philosophical point of view onto you, Linda? - I'm not really a philosophical person, so I would say no, but good to know. - Well, nonetheless, I'm glad you stuck around. Did you learn anything at all? - I learned that you in data science call it states. - Yeah, so the state is the perfect representation of where it is, which is not always fully observable. Sometimes you can't be sure. And if you can't be sure, you have to have a probability distribution over the state you're in. And state space is the set of all possible states you might be in. And your belief is your probability distribution over that state space. Does that make sense? - You know, here and there, sure. - And do you remember when we talked about Bayesian updating a long, long time ago? - Mmm, very little. - Do you remember pomegranates versus latinans? - I thought it was oranges. - No, it was, okay, it's versus latinans. - They're both citrus, are you sure? - Yes, we can go back, it's recorded. - And I'll have to say, I don't believe you. 
- The same principles can be applied here. You have an existing belief over your states, and as you get new information, that can help you update your beliefs as to what the current state of something has become. Just like Yoshi, states can change in dynamic systems. Thanks again for joining me, Linda. - Thank you. - And good night. - Good night. (upbeat music)
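The closing idea, that a belief is a probability distribution over the state space and that new information updates it by Bayesian updating, can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions and is not from the episode: the three moods are taken from the conversation, while the observations ("squawking", "quiet", "dancing"), the uniform prior, and all likelihood numbers are invented for illustration.

```python
def normalize(dist):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(dist.values())
    if total == 0:
        raise ValueError("observation is impossible under every state")
    return {state: weight / total for state, weight in dist.items()}


def update_belief(belief, likelihood, observation):
    """Bayes' rule: posterior(s) is proportional to P(observation | s) * prior(s)."""
    unnormalized = {
        state: likelihood[state].get(observation, 0.0) * prior
        for state, prior in belief.items()
    }
    return normalize(unnormalized)


# Hypothetical prior: with no information, each mood is equally likely.
PRIOR = {"hungry": 1 / 3, "sleepy": 1 / 3, "excited": 1 / 3}

# Hypothetical P(observation | state); each row sums to 1.
LIKELIHOOD = {
    "hungry": {"squawking": 0.6, "quiet": 0.1, "dancing": 0.3},
    "sleepy": {"squawking": 0.1, "quiet": 0.8, "dancing": 0.1},
    "excited": {"squawking": 0.3, "quiet": 0.1, "dancing": 0.6},
}

# Seeing the bird dance shifts the belief toward "excited".
posterior = update_belief(PRIOR, LIKELIHOOD, "dancing")
```

Under these made-up numbers the belief in "excited" rises from one third to 0.6 after a single observation, and feeding `posterior` back in as the next prior lets further observations refine it. The true state stays hidden throughout; only the distribution over it changes.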