Commute times and BBQ invites help frame a discussion about the statistical concept of confidence intervals.
Data Skeptic
[MINI] Confidence Intervals
[music] The Data Skeptic Podcast is a weekly show featuring conversations about skepticism, critical thinking and data science. Welcome back to yet another mini episode of the Data Skeptic Podcast. I'm here as always with my wife and co-host Linda. I am Linda. So Linda, if I said we were going to talk about confidence intervals, would that sound perhaps like a motivational program to you? No. That was a good joke, by the way. I don't get it. Confidence intervals, like you want to build confidence. It's a motivational thing. An interval? Oh, like you mean like interval training when you work out. Right. Because you're good enough, smart enough and doggone it, people like you. No. So what would you think if I said confidence interval? I don't know what it is. Well, given our failed attempt to record this yesterday. No, no, no, no. You don't remember what it was. No. Okay. Well, let's jump right in then. A confidence interval is a nice statistical tool for determining the range in which the true value of something you've been observing might lie. So let's maybe talk about a trick coin. Imagine if you were walking down the street and you found a coin and you picked it up and kept it. Although let's assume you found it in front of like a magic shop. So it's highly suspect as to whether or not this might be a coin that is fair, meaning it will come up 50% heads, 50% tails. How would you approach determining if this was a fair coin that you might want to use in a game of chance? Well, if you have all day, you could stand there and flip it. And then mark how many heads and tails there are. So how many times would you flip it before you were satisfied? At least 100. Okay, that's a good number. The coin flip is kind of an analogy, right? Like it's a simple statistical model, but it applies to other things. It's what we call a Bernoulli process. For example, you might put an offer on a website to sell a product, or you might send an email with a coupon or something like that.
And you'd like to know in advance how many people will bite on that offer, right? How many people will say, "Yeah, I'll buy one and get one free." So you don't necessarily want to send it to your full mailing list until you're confident you have the best offer. So there can be a cost associated with flipping it. Now in the case of the coin, if you want to flip it 100 times, it's pretty much just your time. That's a cost. But in business or in a project of some kind, there are other costs. So if you want to minimize the number of coin flips you have to do before you are confident that you have a fair coin, or not even necessarily that you have a fair coin, but that you know the propensity of the coin, how low can you go? I don't know, maybe 50. What if you only had five tosses? Five? Yeah, that's pretty low, right? I guess you could flip five and if they all turn out the same side, like they all turn up heads. Five heads in a row? Then you could calculate the odds of that happening. If the odds were low, then you would probably say this coin is not a 50/50 coin. And what propensity would you assign to the coin if you got five heads in a row? Would you just assume it's going to give you 100% heads? No, I probably wouldn't assume that. That's why I would be flipping it because I don't assume anything. It's a test, right? Yep, what you would probably want to do is apply something called a confidence interval, which is a way of saying that although you observed the propensity for heads to be in your case 100%, you believe the true value of the coin is somewhere between maybe 90 and 100%. And that's something you would have to calculate. And there are good methods for doing that. So I'm actually in this episode going to set aside how one calculates a confidence interval. There are a couple of ways, things like the Wilson score and the Jeffreys prior.
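As a rough illustration of the five-flip example, the sketch below computes the chance that a fair coin lands heads five times in a row, and then a Wilson score interval for the observed five-out-of-five result. The function names and the 95% level (z = 1.96) are assumptions chosen for illustration; the episode does not specify them.

```python
import math


def prob_all_heads(n_flips, p_heads=0.5):
    """Probability that n_flips independent flips all come up heads."""
    return p_heads ** n_flips


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence (an assumed default).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p_hat + z ** 2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z ** 2 / (4 * trials ** 2))
    ) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    print(f"P(5 heads in a row | fair coin) = {prob_all_heads(5):.5f}")
    for n in (5, 35, 100):
        low, high = wilson_interval(n, n)
        print(f"{n}/{n} heads: Wilson interval = ({low:.3f}, {high:.3f})")
```

With only five flips the interval comes out quite wide; under this method it takes several dozen all-heads flips before the lower end approaches 90%.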
And the math is a little complicated because some of the derivations require something called Chebyshev's inequality, which is a really cool thing in statistics, but it's hard to describe on an audio podcast. So what I'm hoping you and I can do is kind of get to the bottom of what a confidence interval intends to be, not necessarily how you calculate it, because anyone can go look that up. The basic idea is that you want to determine a range within which the quote-unquote true value, or ground truth value, lies given what you've observed. A nice analogy I was thinking about is driving to work. You have kind of a commute, right? Yes, I drive 45 minutes to work one way. And is it always 45 minutes? No, sometimes it's 35, sometimes it's 55, but the average is 40 to 45. Is it ever six hours? Nope. How about 60 minutes? It could be 60 minutes. You could phrase your arrival time as a confidence interval, because another piece to confidence intervals is that you calculate them with some level of accuracy. So if I said give me the range of your arrival time and be at least 50% accurate, you can give me a pretty tight range, right? If you can be wrong half the time, you can say, oh, it's between 44 and 46 minutes. But if I ask you, hey, give me a range and make it 99% accurate, how big would you need your interval to be to be satisfied? You can be that accurate, roughly speaking. 30 minute range. 30 minute range to be 99% accurate? Yep. What contributes to the variability of your commute? I live in LA, Los Angeles, so we have a huge population. I forgot how many millions, but it's literally driving through a small country. So Mondays, it's not too busy because a lot of people work from home, and Fridays are the lightest days, and then also depends if I leave at 7 a.m. or if I leave at 8 a.m. so it depends what time I leave, and it depends if there's a holiday preceding that, because if it's a long weekend, people might take off early.
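The trade-off between the required accuracy and the width of the range can be sketched numerically. The example below assumes, purely for illustration, that commute times follow a normal distribution with a mean of 45 minutes and a standard deviation of 6 minutes; neither the model nor those numbers come from the episode. Strictly speaking it describes where a single commute is likely to fall under that assumed model, which is the analogy used in the conversation.

```python
from statistics import NormalDist


def central_interval(mean, std_dev, level):
    """Central range that contains `level` of a normal distribution.

    mean and std_dev are hypothetical commute parameters in minutes.
    """
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    dist = NormalDist(mu=mean, sigma=std_dev)
    tail = (1 - level) / 2  # probability left out on each side
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


if __name__ == "__main__":
    for level in (0.50, 0.90, 0.99):
        low, high = central_interval(45, 6, level)
        print(
            f"{level:.0%} range: {low:.1f} to {high:.1f} minutes "
            f"(width {high - low:.1f})"
        )
```

Raising the required level from 50% to 99% widens the range considerably, and a larger assumed standard deviation would widen every range in proportion.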
So then on the day they take off early, there's more traffic. It really depends. Yeah. So sometimes you can give your confidence interval, or any statistic for that matter, given some piece of information. So you could say, this is my value, given that it's Friday. Because you can always pretty much know what day of the week it is. That's not a hard thing to figure out. But there are other uncontrolled variables, like if there's an accident, or if there's road work, or how bad some exit is backed up. Those are unforeseen circumstances that contribute to the variability. And they will affect the confidence interval you'll generate, because your observations show more variability. So even though you have an observed mean value, the amount of variation in the data you've collected tells you whether the range within which the true value lives is wide or tight. So let's say we were going to throw a barbecue, and we were going to invite 30 people. How many people do you think would actually attend, be able to make it? Ten. So one out of three people will come. Yep. How confident are you that precisely 10 people will come? Not very confident. Give me a number, a percentage. 20%. 20% that exactly 10 people will come. How about how confident are you that between 9 and 11 people will come? I mean, actually, I'm going to revise my previous number, and say I'm only 5% confident. 5%. And then the same for 9 to 11, 5%. 5%. Oh, really? What about between 5 and 15? Yeah, 20%. What about between 0 and 30? 100%. All right, so you can see how there's an intuitive trade-off here. The more precise you need the answer to be, the less confident you can be. Yep. And how did you get to that 1/3 number? What did I say was 1/3? You said if we invite 30 people, 10 will come. Well, 50% rate of return seems pretty high, so I just imagine less than that. And a quarter, 25% return rate just sounds too low.
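The barbecue question can be sketched with a binomial model. The example below assumes, as a simplification that the episode does not state, that each of the 30 invited guests decides independently and attends with probability 1/3. Under that assumption it computes how likely the attendance is to fall in each of the ranges discussed.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_between(low, high, n, p):
    """Probability that the number of successes is in [low, high]."""
    return sum(binomial_pmf(k, n, p) for k in range(low, high + 1))


if __name__ == "__main__":
    n_invited = 30      # guests invited, from the conversation
    p_attend = 1 / 3    # assumed attendance rate (10 out of 30)
    for low, high in [(10, 10), (9, 11), (5, 15), (0, 30)]:
        prob = prob_between(low, high, n_invited, p_attend)
        print(f"P({low} <= guests <= {high}) = {prob:.3f}")
```

The probabilities rise as the range widens and reach 1 for the full range of 0 to 30, which is the same trade-off between precision and confidence described in the conversation.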
Would you also say it's based on your experience that we've thrown a couple of barbecues? Well, I am a project manager. My entire life deals with sending out things for people to do, and seeing if I get a response. Honestly, I believe if I did not follow up with people, like 80% of those requests would not get answered. Potentially even more, maybe even 95%, because how many times do you answer someone that you've never met before? No, we weren't going to invite 30 random people. We're going to invite 30 of our friends. Well, that's fine. If it was random, I would have said 0. Yeah, yeah. Well, there's always a weirdo, right? Yeah. So it's on your side. So might you say that you've always kind of had some intuitive equivalent of confidence intervals in your life? Yeah. And would you like me to show you the precise way you can calculate exact confidence intervals? You can walk me through it. All right, fine. We'll do that after this. Because another dilemma you have when calculating confidence intervals is that getting the exact confidence interval, that is to say the one that's mathematically precise, can be very tricky, in that it's a complex or difficult calculation to execute. So sometimes you make little assumptions and approximations and you come up with a not quite precise confidence interval, but one that's precise enough that you're happy with it. So as I mentioned at the start, I'm not going to get into the methods, but you can go to Wikipedia and look all those up. The best one to know is the binomial confidence interval, at least in my life. I seem to apply that one the most. But I hope everyone takes away intuitively what a confidence interval is. And the next time you hear someone throw out a random statistic like, we know X percentage of people are going to do something, it might not hurt to ask them what their confidence interval is. Because I believe all observational statistics should always be expressed in this terminology.
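To make the contrast between an exact and an approximate binomial confidence interval concrete, the sketch below compares two common choices for 10 attendees out of 30 invitations. The exact interval is the Clopper-Pearson interval, found here by bisection on the binomial CDF; the approximation is the normal (Wald) interval. The choice of these two methods, the function names, and the 95% level are assumptions for illustration; the episode does not say which methods it has in mind.

```python
import math
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iterations=60):
    """Bisection on [0, 1] for f(p) = target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2
    )
    return lower, upper


def wald_interval(k, n, alpha=0.05):
    """Normal-approximation interval: p_hat +/- z * standard error."""
    p_hat = k / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


if __name__ == "__main__":
    print("exact :", exact_interval(10, 30))
    print("approx:", wald_interval(10, 30))
```

The approximation is a one-line formula, while the exact interval needs a numerical search, which is the kind of extra effort the conversation alludes to.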
Thank you for joining me once again, Linda. Thanks for listening to the Data Skeptic Podcast. Show notes and more information are available at www.dataskeptic.com. You can follow the show on Twitter @dataskeptic. If you enjoy the program, please leave us a review on iTunes or Stitcher. A review is the greatest way to show your support.