Data Skeptic

[MINI] Cornbread and Overdispersion

Duration:
15m
Broadcast on:
24 Apr 2015

For our 50th episode we indulge a bit by cooking Linhda's previously mentioned "healthy" cornbread. This leads to a discussion of the statistical topic of overdispersion, in which the variance of some distribution is larger than what one's underlying model will account for.

(upbeat music) - The Data Skeptic Podcast is a weekly show featuring conversations about skepticism, critical thinking, and data science. - So welcome to the momentous 50th episode of the Data Skeptic Podcast. Thanks for joining me as always, Linhda. - I'm Linhda. - We're here in our kitchen today 'cause we're gonna do something a little bit different and special for the 50th episode. - Well, Kyle wants me to bake cornbread. - Not just any cornbread, but the legendary, oft-talked-about healthy cornbread mentioned in at least two other episodes, right? - Yeah, well, you know, I don't think that many people know this, but corn is a whole grain. - Oh, yeah? - As long as you make sure you have all the parts of the grain, which is the endo, the whatever, whatever parts. - How do you identify if you're getting the proper corn grain then? - Well, sometimes it'll say on the label 100% whole grain, or 100% whole corn. - So we gotta look out for that 'cause that's what we're using for our recipe. - Yes, because I want it to be healthy. - You want the full thing, just like you like wheat over white flour. - Yes, but I am probably going to mix in white flour 'cause it makes it a little fluffy and doesn't make it as dense. - All right, and why are we cooking this today? Why couldn't you just give me the recipe? - 'Cause I don't remember. - So you eyeball a lot of it. - I just look at the basic corn recipe and I just modify it every time. Well, usually the same way I modify it every time, but I just can't remember. - So hopefully we're gonna inject some data science along the way here and maybe some skepticism, but we're gonna observe this work in progress of this amazing dish and share the recipe with the literally, what's smaller than a dozen, the literally many people who have asked for it on Twitter. So earlier you had me use the oil spritzer to lay down some oil onto this nine-inch pie pan. A nominal amount of oil, less than one teaspoon, I would say.
And all the steps and ingredients will be listed online, and we would love for listeners to maybe make this on their own, even do slight perturbations of the recipe, adding more of this and less of that, and ask your friends to rate it, and maybe we can make an interesting data project out of this. Okay, Linhda, so let's step to the side here and go over the ingredients. What do we got? - Okay, we're gonna put one cup of cornmeal, three-fourths cup of, I like to put whole wheat, one cup of white flour. The white flour makes it fluffy. - Uh-huh, that's all our dry ingredients. Oh no, salt also. - Half a teaspoon of salt, and then four tablespoons of baking powder, that helps it rise. - Yeah, double acting. In fact, I don't think you can get single acting anymore. Last but not least, we've been discussing the sugar. Do we put it or no? - Well, it depends. If you like a sweet cornbread, you could put a half cup of sugar, and if you don't, just leave it out and put chives. - Regular sugar? - Yeah, any sugar you want, or honey, whatever you want. - We're going chives, right? - Yeah. - All right, we're gonna cut those up in a minute. Now what about the wet bowl, what are we putting in? - So in a separate bowl, we're gonna put a third of a cup of oil, two cups of eggs, I'm sorry, two eggs, seven tablespoons of oil, one cup buttermilk. - Uh-huh, and if they can't get buttermilk, is there a substitute? - You could just use any milk. - Any milk? Almond milk? - Yeah. - All right, and are we gonna use that buttermilk powder stuff or do you get the real stuff? - I just use a powder. - Okay, cool. - And then I like, if you're gonna put the chives, then just mix it in the wet one. - All right, cool, let's get back to it. - I'm gonna pour one cup. - One cup cornmeal. Do you think I should provide the weight of that, or is just one cup of the whole grain good enough? - I mean, Americans measure it by cups. In Europe, they do it by weight 'cause it's more accurate, actually. - Oh, interesting.
- I don't know why. That's just, when I look up European food blogs, they're a lot, like, by weight, and like, apparently everyone does it by scale or volume. - Uh-huh. - Volume, I guess, maybe it's volume, maybe that's-- - I wonder which is more precise. I actually have a lot-- - Volume is. - Volume is? Well, I will test that out 'cause I actually have a lot to say about measurement, and hopefully within the next year everyone's gonna learn a bit about that. So if this is your first episode of the Data Skeptic Podcast, thank you for indulging us in this. I promise the other episodes are generally a little bit more down to the point, I guess you'd say. Would you say, Linhda, for many episodes? - Well, depending on what you think is to the point. (laughs) - I guess so. You know, we can commemorate almost a year and 50 episodes by indulging in our day-to-day activities a little bit, I think. Although, why 50? You know, it's so arbitrary. It's not even a good round number. 64 is a better round number to someone like me and most data people. If only I had thought ahead and been planning like a cool 64th episode and been really working on it for a long time here and there, that would be awesome. But I guess maybe we'll just have a balloon or something when we get to that point. - This is a bowl for the dry ingredients. You need at least two bowls. - Yeah, so you got the chives there from the garden you have up front. - Yep, we are an all-organic household. - We are not an organic household. - Well, I grew those chives, so it was organic. - I'm a pro-science eater, so I like GMO food. Preheating the oven to 375. Got a pinch of salt going in. - I thought more like a teaspoon 'cause I think you could taste more. - But you eyeballed it. Precision Linhda. - Half a teaspoon of salt. - Half, but you eyeballed it. It's just cheating the number to make it precise. - It's up to taste. - No, see. - I already washed my chives, now I'm gonna cut them. - Okay. - How many chives is it gonna be?
- Well, we're gonna measure the chives after they're cut. - So that's about a third of a cup of loosely packed chives weighing 0.50 ounces, according to our scale unit here. - I dice the chives to release the flavors. And I put in extra, extra chives 'cause I feel like you can't taste them. I just love chives, so I just feel like you can't have enough. - I'm in the chives too. - All right, chives are now stirred into the wet mix. - So the wet mix, just to reiterate, was milk, egg. And then I just went ahead and put the chives so they could start releasing their flavor. I don't know if that really works, but in my head it does. And now I'm moving to the last step and I'm going to do this relatively quickly. I am going to combine the wet and dry ingredients. And then I'm going to stir it. You don't want to overstir. I don't know why, but you just don't. And then you'll put it into the pan immediately. Then once it goes into the pan, you put it into the oven. So I highly recommend that you lick the bowl and taste it to make sure it tastes somewhat okay. There's a story where one time we made a cake and we forgot to put the sugar in. Because I licked the bowl and I did not taste any sweetness, I was like, "There's something wrong with this cake." And then I looked and I was like, "Oh, we didn't add sugar." So we had already put it in the pan, and then we had to take it out of the pan, put it in a mixing bowl, mix in the sugar, and then put it back in. 'Cause there's no way we could make a pineapple upside-down cake without putting sugar in. - We weren't even dating yet. - No, we weren't. I highly recommend that you taste it every step, and also one time I read a website that said you should really lick the bowl and taste it every step. Season it and make sure you're on the right track. So after that, I'm a big supporter of licking the bowl. - So now we're stirring it all up, wet and dry ingredients together.
You know, if this data science stuff doesn't work out, Linhda, we could be able to have this like fully art cooking podcast. - Yeah, I don't know about that. - So it's a very goopy, thick consistency. - It definitely looks lumpy. - Lumpy, not quite a solid, but yeah, technically liquid. So that gooey in-between. So Linhda, how precise do you think your recipe is from time to time? Now that you've got the measurements. - Well, I think this is actually pretty close. - Yeah. - I'm gonna do it, yeah. - And you know how we might describe the difference between each batch? By the variance? See, I gotta work this stuff in here somehow. - Oh, okay. - So on average, you have some sort of middle output, but because our measurements are imprecise, or you know, the volume of our cups, how densely packed our flour was, all that sort of stuff, there's a slight variation time to time. The question is, does our measuring device, our taste buds, have a high enough resolution to measure that? (grunting) - Then set the timer for about 15 minutes first. - All right. - I think the key is, you don't want dry cornbread. - Right. - So we put in oil to make it moist, but if you overcook it, you're gonna turn out with dry cornbread. So if anything, undercook. - All right, Linhda, since we got 15 minutes left, it's gonna be the longest mini episode ever. Let's wind this into a topic. I actually got one for us. So we're gonna get into our topic now, which is called overdispersion. Are you at all familiar with this? - Well, it sounds to me like you're overspreading something out. - That's exactly right. As I was just mentioning a minute ago, there's a certain variance from time to time when you make your cornbread, right? It's not always the same. How would you describe it on average? - How I make it? - How it comes out. Is it good on average? - It comes out the same. Pretty much, I mean, you've eaten it. - Yeah, but there's slight variations time to time.
Sometimes it gets a little sweeter, a little less sweet. - Well, that's because I put sugar in it. - Well, but sometimes it's more pretty than other times. Sometimes it's more moist than other times. - Yeah. - So there's a variance to that. But on average, yeah, you make a really good cornbread. So your mean is how you do on average. Your variance is the spread of how it changes from time to time. Now, who do you think makes better cornbread? You or me? - Me. - Why do you say that? - 'Cause I make it all the time. - That's true, practice makes perfect. But do you think I have the ability to make cornbread? - Sure. - Okay, and I would have a lower mean since I don't make it as good as you. What do you think of my variance? Would it be greater or smaller or the same as yours? - I don't know. - All right, let's just go with the same or greater for now. It actually doesn't totally matter. Let's go with the same. Do you think my mean would be close to yours? Like if I made cornbread and you made cornbread, would people obviously know which one was better than the other? - I don't know. I don't think I've ever had your cornbread. - Well, imagine this. Let's say we got into business making cornbread. And at first, there weren't that many orders. So you always produced it, and people got used to the mean and the variance of your production. Now, if I started making it too, the customers wouldn't know that there were two chefs, but they would presumably randomly get either a Linhda cornbread or a Kyle cornbread. - Okay. - So now from the customer's point of view, there's a single distribution. There's the average quality they get and there's its variance, but it's not gonna fit the typical model that they were used to with one chef. See, when you were the only cook, then they have a known mean and known variance. But when you add a separate mean and a separate variance, another producer, the consumer can't measure that. They just get one cornbread, right?
With one label that says Linhda Cornbread LLC or whatever. They wouldn't know; it's not like you put a sticker in there that says who manufactured that particular loaf. So once I came on as the second chef, or if you added other chefs down the road, you would get a phenomenon called overdispersion, because you're measuring something that you're modeling with one distribution, but it's actually the composite of two different generators. The variance will likely be much wider than you would expect. Kind of see where I'm going with that? - No, I'm gonna choose. - All right, let's say people rated your cornbread. And on average, they gave your cornbread a rating of 0.8. And then the variance was, let's say, 0.6. If you go up and down 0.6, that covers a large part of your distribution. And we'll talk about z-scores and standard deviations a little bit more later in a future episode, but the variance describes how wide of a range it is you have around the mean. So if people give you a 0.8 on average, and the variance is up or down 0.6, that means most people kind of fall in that range. Very, very few people are gonna give it a 1.0. Now, what if I made the cornbread? If you're a 0.8, maybe people on average rate me a 0.7. So I'm a 0.1 below you. And my variance might be, let's say, 0.6 as well. So a certain density of people will be within 0.6 of that average value. You with me so far? - Okay. - And very few people will have far away numbers from my average as well. Now, if you combine those two, you add them together, you draw 50 loaves from your distribution, 50 loaves from mine, you'll get a composite of our two means. So about a 0.75 would be the average reported quality of the cornbread, because it's half of my score plus half your score. You see that? - Okay.
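As a side note to the arithmetic above, here is a hedged worked version of the 50/50 composite. The symbols are assumptions introduced for illustration, not notation from the episode: $\mu_L, \mu_K$ are the two chefs' mean ratings and $\sigma_L^2, \sigma_K^2$ their variances, with the episode's "0.6" read as a variance in the formal sense. The variance line follows from the law of total variance for an equal-weight mixture.

```latex
\begin{align*}
\mu_{\text{mix}} &= \tfrac12\,\mu_L + \tfrac12\,\mu_K
                  = \tfrac12(0.8) + \tfrac12(0.7) = 0.75 \\
\sigma^2_{\text{mix}} &= \tfrac12\,\sigma_L^2 + \tfrac12\,\sigma_K^2
                  + \tfrac14\,(\mu_L - \mu_K)^2
                  = 0.6 + \tfrac14(0.1)^2 = 0.6025
\end{align*}
```

Under these assumptions the extra spread comes entirely from the last term, the squared gap between the two means. With the episode's numbers that gap is small relative to the within-chef variance, so the widening is slight; it becomes large only when the chefs' means sit far apart compared with their individual spread.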
- So we're just averaging our two loaves together, and our variances would get really wide, because the variance around my product is up and down around 0.7, and the variance around your product is up and down around 0.8. So it spreads the distribution out real flat. Whereas if you had a single chef who produced with quality 0.75, and a variance of 0.6, it would just be up and down a little bit around that value. But when you have two generators, two chefs, two people producing the good, each with different mean and variance, and you combine them as is the consumer's experience, then you have a much wider variance. And this is a concept called overdispersion. That's hard to see sometimes, but it is worth looking out for, often to tell you if you've taken a good enough snapshot of the whole picture. (bell ringing) - Toothpick test. Came out clean, so a little bit overcooked. What was our cook time? We did 15 minutes and then a turn. - Well, then we 20. - And then added five. Okay, so 20 minutes is a little overcooked. - In our oven. - In our oven. Yeah, that's true. There's a variance from the oven. My goodness, Linhda, all the overdispersion parameters everywhere. - I guess I didn't stir to get a liquid. - All right. Shall we try it in the morning? (upbeat music) - All right, Linhda, it's the next morning. We're having our cornbread. What do you think? - Well, I like it. It's fluffy. - Yeah, it's very good. (laughing) - So thank you so much, Linhda, for indulging me in this adventure. You've been on 25, actually maybe one or two more than that, of these mini episodes. And I really appreciate your contributions. And I thank you for indulging me. - Good thing we could eat this cornbread. - That makes it all worthwhile. - Yeah. - Well, thanks again. (upbeat music)
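The two-chef story can be tried out with a small simulation. This is a minimal sketch under stated assumptions, not something from the episode: ratings are modeled as normally distributed, the function names `bake` and `pooled_vs_single` are hypothetical, and the standard deviation of 0.1 and the "distant" mean of 0.3 are illustrative values chosen so the effect is visible. It compares the variance a customer would see with one chef against the variance of unlabeled loaves pooled from two chefs.

```python
import random
import statistics


def bake(mean, sd, n, rng):
    """Hypothetical chef: n loaf ratings drawn from a normal distribution."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def pooled_vs_single(mean_a, mean_b, sd, n_each, seed=50):
    """Return (variance with one chef, variance of two chefs' loaves pooled)."""
    rng = random.Random(seed)  # fixed seed so reruns give the same numbers
    chef_a = bake(mean_a, sd, n_each, rng)
    chef_b = bake(mean_b, sd, n_each, rng)
    single = statistics.pvariance(chef_a)
    # The customer cannot tell the loaves apart, so they see one pooled sample.
    pooled = statistics.pvariance(chef_a + chef_b)
    return single, pooled


# Assumed scenarios: the episode's 0.8 vs 0.7, and a much larger gap.
for label, mean_b in [("close means (0.8 vs 0.7)", 0.7),
                      ("distant means (0.8 vs 0.3)", 0.3)]:
    single, pooled = pooled_vs_single(0.8, mean_b, sd=0.1, n_each=5000)
    print(f"{label}: one chef {single:.4f}, two chefs pooled {pooled:.4f}")
```

In this sketch the pooled variance exceeds the single-chef variance in both scenarios, but only modestly when the means are close and by a wide margin when they are far apart. A single-distribution model fitted to the pooled loaves would therefore understate the spread most badly in the second case.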