Archive FM

Data Skeptic

Potholes

Duration:
41m
Broadcast on:
25 Mar 2016

Co-host Linh Da was in a biking accident after hitting a pothole. She sustained an injury that required stitches. This is the story of our quest to file a 311 complaint and track it through the City of Los Angeles's open data portal.

My guests this episode are Chelsea Ursaner (LA City Open Data Team), Ben Berkowitz (CEO and founder of SeeClickFix), and Russ Klettke (Editor of pothole.info).

(upbeat music) Data Skeptic features interviews with experts on topics related to data science, all through the eye of scientific skepticism.
- Well, on a bike, when you go into a deep hole, the front tire goes into it. I didn't really feel my front tire or handlebars go into a hole, but I remember a sound where the rubber of my front tire compressed, you know, and I could feel the rim. So it was like the rim of the bike hitting asphalt and kind of a bounce, like boom. And that's the sound I remember. So I knew I hit something. There was something in the road.
- That was very scary.
- And the next thing I knew I was on the ground.
- If you've been listening to Data Skeptic for a while, you've probably heard that my wife and co-host Linh Da commutes to work via bicycle. She's one of the 0.6% of Americans that commute this way, according to the 2013 American Community Survey. For reference, 2.8%, almost five times more Americans, walk. Roughly double that, 5.2%, use public transit, almost double that, 9.4%, carpool, and a whopping 76.4% drove alone. There are many benefits for cyclists who are able to commute in this way. There were 32,000 cyclist deaths in 2013 in the US, according to the National Highway Traffic Safety Administration. There were 321 million Americans in 2013, so 0.01% of Americans were fatally injured on a bike, compared to people killed in car accidents, which was, in a very strange coincidence, also 0.01% according to data on Wikipedia. But I don't know what to make of that stat. It doesn't feel like a fair comparison for any number of reasons. Today's episode, as you might have surmised, is about what happens when a podcast co-host commuting by bicycle to work hits a pothole and wipes out. A morning spent in the ER, stitches are required, and eventually our co-host is healing. In the grand scheme of things, we were lucky.
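The arithmetic behind these comparisons is easy to check by hand. Here is a minimal sketch that takes the figures exactly as quoted in the episode; the variable and function names are illustrative and do not come from any dataset.

```python
# Figures exactly as quoted in the episode; variable names are illustrative.
bike_share_pct = 0.6          # commuters who cycle (2013 ACS, as quoted)
walk_share_pct = 2.8          # commuters who walk (2013 ACS, as quoted)
deaths_quoted = 32_000        # 2013 deaths figure, as quoted
us_population = 321_000_000   # 2013 US population, as quoted


def percent_of(count: float, total: float) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


walk_to_bike_ratio = walk_share_pct / bike_share_pct
death_pct = percent_of(deaths_quoted, us_population)

print(f"Walkers per cycling commuter: {walk_to_bike_ratio:.1f}")
print(f"Quoted deaths as a share of the population: {death_pct:.2f}%")
```

The ratio comes out a little under five, and the quoted deaths figure works out to roughly one hundredth of one percent of the population, which matches the rounding used in the episode.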
We have every reason to believe Linh Da has made a full recovery, and that's just in a few months. No physical therapy, no long-term effects. Basically, many people have been in worse situations. There are lots of bike accidents every year caused by potholes, lots of car accidents too, in fact. Everyone's circumstances are unique and this is the story of our circumstances. Or more specifically, this is the story about the data of our circumstances. How does a physical defect in the world like a pothole get converted into digital information? What can data scientists do with that digital information in hand?
- So I was biking down that street on a street cleaning day, on the side of the street that was being cleaned. So all the cars were parked on the other side. And I guess I had just yelled at a car beforehand 'cause I felt like they had passed me so closely. So I turned around and yelled at them. And then when I turned back, that's when I remember I heard like a metallic thunk on my tire. I'd never heard that sound. And I was like, I don't know what's gonna happen, which obviously there wasn't that much time to think 'cause at that time I was flying through the air. And then we like slammed on the asphalt. The result was that I was pushing myself up with my hands. I looked down at my right knee and I see a huge, well, to me, what appears to be about a quarter-size hole in my knee where I could see different layers of flesh. I mean, I'm not a doctor, but it was like red. And it had torn through my biking pants. And then in the emergency room, we were there for five and a half hours. I feel like all the stuff they did could have been done in an hour, but all these other things were higher priority, or they came in first, or they were more hurt.
- There's a lot of interesting math that actually goes into it. I'm not quite sure if it's a great Data Skeptic episode.
So I won't talk too much about triaging, but maybe we can find something that is very data science-y to look at here.
- Before I got there, did anybody ask you to rate your pain on a scale of one to 10?
- Yeah, they do.
- Do they have a chart to show you that describes each category?
- No, they just said zero to 10. What's your pain? And I was like, a two? I didn't really have pain. It just felt achy, it felt weird.
- How are you a two? You had a gashing wound in your knee.
- Well, the pain, I don't know. I think there's like adrenaline. I think my body took over and was like, just hide the pain. I seriously did not feel much pain.
- Okay, so what's a three for you?
- I don't know. I mean, 10 would have to be like, you think you're gonna die. Like, you're probably screaming in pain, right? So what would be low-level uncomfortable? I think it's like a two or a three, like low level.
- So zero is normal.
- Zero is no pain.
- You're a two, that's good. What was your maximum, do you think, on the scale?
- That was probably my max. I don't really feel it.
- No, you have to be more hurt earlier or after.
- No, the pain doesn't change.
- So it's just like a no?
- You mean after, like when I'm healing?
- Like the maximal pain point of your whole distribution over this experience.
- All right, so I'm reaching for a story here. There's something interesting to be said about the reliability of self-reported data. You can see I was shocked by Linh Da's answer. Mine would have been something different, I imagine. Self-reported data contributes an extra amount of variance to any measurement system. So clearly I'm coming at this with the eyes of a data scientist. Let's hear how Linh Da's profession shapes her view of the situation.
- Well, as a kind of project manager, my first thought was, okay, I'm not gonna die. So I was like, that's good. Secondly, I was like, you know, there's a lot of thoughts when you're hurt.
You're like, should I go home? Should I go to work? How bad is it? So I looked down at my knee and I'm like, you know, there's a hole. So I was like, okay, I've never been this, like, punctured, I guess, or they call it a laceration. I've never been this lacerated before. I was like, I think I have to go to the doctor no matter what. So that means I'm gonna miss work, right? How do we report this? How do we file a complaint?
- Okay, so perhaps that's the data focus of today's podcast. Well, I know LA has a great open data portal, and one of its data sets is the 311 data set. Why don't I file a complaint and watch it flow through the system? Let's see government in action from a data perspective. The first thing we have to do is go out there and find the exact location. Give them a lat-long, maybe.
- Oh, that could be it.
- Could have been this.
- That's a disaster area.
- I don't know.
- I don't remember the pebbles.
- I found grass.
- Yeah, it's really hard to figure out where it was because I don't remember the address that I gave to the paramedics. And the trees are pretty much planted an equal distance apart, and they're all the same tree along the road. And all I know is that I sat on grass and not on gravel, and there's gravel here.
- And it's a stressful event, so it wouldn't be surprising if your memory wasn't perfect. So Linh Da and I wandered around for quite a bit. We really couldn't identify the exact location. We knew the block, maybe that would be good enough.
- Thank you for calling the Bureau of Street Services. Your call is important to us.
- I went ahead and filed a complaint with LA's 311 department. I let them know where the pothole was, what had happened, and I asked that they get someone out there to fix it.
- Okay, I'll make a request for a pothole.
- But I also really didn't know what to expect here.
- We're just trying to go out there and fix it.
- Was someone gonna drive out that day and fix it?
Probably not. How does this all work? Do they wait for a certain number of complaints?
- I was hoping on this call they'd be like, "Oh, well, you have to give us an address." And then I'd be like, "All right, I'll get that." Or maybe they'd be like, "Yeah, all we need is the block." I don't even know what they want. Or maybe, like, do they need to know which side, like northbound or southbound? We don't know.
- She didn't even ask.
- Yeah, she didn't ask anything. So this was a learning experience from--
- I don't even think she cares. She probably wrote it down on scrap paper and threw it away.
- So I went into this expecting failure, you see, because you need to fail to learn.
- Well, what do you wanna learn?
- What it takes to get the city the information they need so they can prioritize.
- You're assuming that they're gonna prioritize, and that they're actually recording this information, and then that they care.
- We're gonna check, 'cause the city has the open data portal. We're gonna see if this record shows up and how long it takes.
- Yeah, it probably won't.
- Well, do you wanna put a bet on it?
- Nope.
- Okay.
- So Linh Da admittedly is probably a little annoyed at me 'cause I dragged her out there for a long time to look for this pothole that we couldn't find. But I also hear her being kind of ambivalent about the efficacy of government. There's this sentiment that, oh, it'll never get fixed, so why try? And I hear this amongst a lot of people. But my view has always been that there's a trade-off here. I mean, we simultaneously don't want our taxes to go up but also want maximum services. There are three parts to this equation. There's how much tax revenue the city has, how efficiently they spend it, and the amount of services they give their citizens. So if the services aren't what you want, either taxes have to go up or we have to improve the efficiency somehow. So maybe that's the data story here.
How can data help the city learn about these things, or more effectively track them, or fix more potholes with the same budget? I went into the LA Open Data Portal looking for my request a few weeks later. It wasn't there. In fact, I found that there were only three pothole issues logged in all of 2015, which just did not sound right to me. So I used one of the options available in the portal. I requested a new data set. I wanted to have all the pothole reports the city had on file.
- Which is how I got your email, and I noticed all this backlog of data set suggestions and requests that no one had really been monitoring.
- Well, okay, I don't know if Linh Da wins this round or I do, 'cause the request didn't show up, but government got back to me. Something's working.
- My name is Chelsea Ursaner. I just started in January of this year and my role is Open Data Coordinator. That includes managing the data on our Open Data Portals.
- I got that reply from Chelsea and she was extremely helpful, not just in coming on the show, but in the journey I'm on here to track this 311 complaint through the system.
- And so in my first couple of days, starting out, I said, who responds to these? And Lilian, the Chief Data Officer, who's my boss, said you monitor these, go ahead and respond to anything. That's how I responded to you.
- I told Chelsea my story about filing a complaint and not being able to find it in the system, and of only finding three records from 2015 reporting potholes.
- That is surprising. So 311's job is really to be a router, to connect people to the right department.
- Where should pothole complaints end up?
- Okay, it's the Department of Public Works, Bureau of Street Services. I've been learning a lot about how the Department of Public Works responds to 311 requests. And I had the same thoughts as you. Like, I've submitted 311 requests before, and you wanna be really exact about where the problem is.
But the way that they actually respond to it is that they'll send the appropriate person out there and then they'll look at the entire area. They're kind of being more proactive. They're not just responding to your request. They're going there and then they're doing their own assessment of the whole street, or whatever it is, the area. Very recently, they completed the project to overhaul the 311 system on the back end so that now there's a central repository for all the data.
- Oh, cool.
- That wasn't the case before. So when I was looking to track down your particular instance, it's really hard to look historically because it was all decentralized systems and it was communicated by email. So that was, like, the Bureau of Street Services. I don't know what went on with how they had the potholes listed. But that won't happen going forward.
- So that's good progress. I think there are still interesting data questions here, though. How much should we invest in the tracking of potholes? I mean, I do think that's valuable, but they could also track the amount of tissues consumed by every government employee. I don't want someone with a clipboard at the end of every row of desks ticking off tissues used. That's too much data. There needs to be a process here. Just because some citizen wants to know the amount of Kleenex used doesn't mean we should track that data set.
- The process right now is that when you make a data set suggestion, it just goes to us on the Open Data team and not actually to the departments who would be able to fulfill the request. So that's kind of where my role comes in as Open Data Coordinator. Then we reach out to departments. It's going to be changing. What I'm passionate about doing, and what we're all looking to fix, is the triage process, so that we can have a more robust suggestion form and then have a process in place with the different departments to get it to the right person.
Kind of similar to 311, actually, in regard to the pothole data that we started talking about. And I've been thinking about reaching out to people like you, the early adopters who are going on our Open Data portals, trying to get more engagement and, like, a two-way interaction of giving data.
- I really like this direction for the Open Data movement. A lot of Open Data is frankly kind of boring. I mean, you know, timetables and actuarial tables and budgetary spending stuff. Yes, I'm sure that's interesting to someone, but I've often found myself a little disappointed in what's been made open so far. But I'm glad to see this progress and bigger things to come. I've put in a couple of requests over the years and this was the first one I heard anything back on. And I was really thrilled that Chelsea got back to me and that there's some motion going on here.
- I was actually going to compliment you. I thought I saw a couple of requests from you. You had one interesting one that was to publish where the most argued citations are. So if people are getting parking citations, when do they actually bring them to court and the citation gets dismissed? Because that would shed light on where there are confusing signs. So I thought that was an interesting one.
- Oh, I'd actually forgotten about that until Chelsea reminded me. And I hope that's not too self-indulgent, leaving that in, but let's check in real quick with Linh Da and then get back to our story.
- How long has it been? Three weeks later, I feel like simple things were hard. It was hard to bend it. It was hard to squat and kneel down. It was hard to go up and down stairs for a little bit. I had to relearn that. And right now I'm trying to learn to stretch it in ways that feel normal. And it would even hurt just sitting normally in a chair with your feet touching the ground. It would feel incredibly uncomfortable 'cause it was like my muscles did not want to go that way.
So I feel like if someone told me it was just fat, that's not true. I feel like my muscle was totally impacted.
- So this project began as a data forensics project. I wanted to file a report, track it through the system, and see how quickly the repair crews responded. But we're left with a vanishing 311 call, not logged in the system, and a surprising absence of 311 calls about potholes. To be clear, I believe this is not a problem of the departmental response. That is, I don't think our report was ignored. I believe this is a data quality and data provenance problem. The city may have responded and even resolved the issue, but the data are incomplete, and they do not tell the whole story. I don't know, so should I stop the episode now? I guess give up, or perhaps I can find another reason to take an interest in potholes.
- They can cost money to motorists, as well as to people who are injured as a result of accidents from potholes. Those costs are really quite significant.
- Okay, that's a good point. Those of you that listened to Data Skeptic episode number 19 on the value of information might realize that we could apply that principle here. If we had good tracking of pothole data, we might be able to predict their general locations and frequency, thus better optimizing the fixed budget allocated for repairs. The decisions we make about how to strategically fill potholes, given the forecast, will probably prevent more accidents than the decisions we would make without the forecast to inform the decision. The value of information is precisely the value of the informed decision minus the value of the uninformed decision minus the cost of acquiring the information. Oh, but I forgot to introduce my insightful commentator.
- My name is Russ Klettke. I'm the editor and writer of pothole.info. Pothole.info is sponsored by a pothole repair asphalt company. They are Easy Street cold asphalt. Their interests are general awareness of and solutions to potholes.
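The value-of-information formula stated in this segment can be written down directly. This is a minimal sketch; the dollar amounts are hypothetical placeholders chosen only to illustrate the subtraction, not estimates for any city.

```python
def net_value_of_information(informed_value: float,
                             uninformed_value: float,
                             acquisition_cost: float) -> float:
    """Value of the informed decision, minus the value of the uninformed
    decision, minus the cost of acquiring the information."""
    return informed_value - uninformed_value - acquisition_cost


# Hypothetical figures, all in dollars of accident damage avoided per year.
value_with_forecast = 500_000.0     # repairs targeted using pothole data
value_without_forecast = 420_000.0  # repairs scheduled without the data
cost_of_tracking = 30_000.0         # collecting and maintaining the data

net_value = net_value_of_information(
    value_with_forecast, value_without_forecast, cost_of_tracking
)
print(f"Net value of the pothole data: ${net_value:,.0f}")
```

A positive result means the tracking pays for itself under these assumptions; a negative one means the data costs more to gather than the better decisions are worth.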
- Do you have any sense of what those costs are? What's the average pothole cost to a consumer, a motorist, or a bicyclist on an annual basis, if not repaired?
- So there's a number that I think is pretty significant with regard to the costs to municipalities and states and counties. And that is a dollar not spent on road maintenance today, and that can be crack sealing or pothole repairs or other things, but a dollar not spent today will cost you $7 in five years. And that's basic physics. There's a crack in the pavement, water gets in there. If there are freeze-thaw conditions, it expands that crack. But even in climates that don't have freezing, anytime water gets below the pavement, it creates some kind of an erosion. And ultimately, that's why the streets of Los Angeles break down, the streets of Honolulu break down. That comes from time, temperature, traffic, freeze-thaw cycles, and things of that nature.
- Pothole.info has a lot of summary statistics, good articles, and some actually hilarious videos you might want to check out.
- We try to keep it informative and to some degree entertaining. There is a little bit of humor around potholes. But, you know, I've got to be honest with you. I ran into some statistics lately that really made me think about this. We've been talking about the risks to individuals in terms of safety and injury. But here's a fact that we got from bankrate.com. It is that 63% of Americans do not have $1,000 on hand to handle emergencies. So if you hit a pothole, and there's all kinds of damage that can come from a pothole, but just think about a catalytic converter. Catalytic converters always cost more than $1,000 to repair. And, well, to replace. And if you don't have that spare money around, you're going to either have to skirt the law and drive without a catalytic converter and risk even bigger penalties, or you're going to have to take everything you have to get that fixed.
That's 63% of the American public who are subject to this financial worry of what a pothole can do. And we all know you can avoid 99% of the potholes. It's that one that you hit that causes the problem. So it's, I think, a financial threat to a lot of people to have bad roads.
- So Russ has built what is to me a fairly strong financial argument about why we need to track and repair potholes.
- You know, something else to think about too is that the majority of our roads were built in the middle part of the 20th century. They were, you know, 1940s, 50s, 60s, 70s. The older they get, the more maintenance they need. So those expenses for road maintenance on a federal, state, and local level are building just because of time. And yet this is the time when those budgets are being curtailed. The other thing is the gas tax being the major source of funding for road repair. Think about this. Cars are using less gas. You know, we pay gas tax on a per-gallon basis. With more fuel-efficient vehicles, which we're all happy about, there's still the same amount of road use, but less going into the coffers from the gas tax. So that's adding to the fiscal challenge of infrastructure funding.
- In your opinion, what's a good citizen to do when they see one? Should we report these incidents? Go put a cone around them? What's the right process?
- If there's an immediate danger and you have a cone available, that's very good of you. There are increasing numbers of ways of reporting them to the local authorities. There's a phone app called SeeClickFix.
- I'm Ben Berkowitz. I am the chief executive officer and founder at SeeClickFix.
- That's something anybody can put on their phone. They can take a picture of it.
- SeeClickFix is a global platform for residents and government officials to communicate about problems in the public space. And then it is also a tool for those government officials to work together to solve those problems.
- So how, as a citizen, do I interact with the platform?
- As a resident, the primary way you would interact with the platform is by using the SeeClickFix mobile application to take a photo of a problem in the public space, then describe the problem and enter it as a service request, and then it gets passed off to City Hall, where they would respond back. Those issues are publicly documented, so anybody can comment on the issue or support the issue or follow along and get informed when the government responds.
- What types of issues do people commonly report?
- So the primary issue that's reported on SeeClickFix is illegal dumping, but after illegal dumping we see tree issues, potholes, blighted properties, streetlights that are out. Road safety is a big one for pedestrians, cyclists, and automobiles, and really any other kind of public infrastructure you could think of.
- So no limits, anything I think my government should know about, I can communicate it.
- As long as it has a location, and it's physical, and it's in the public space or it's impacting the public space, such as a small business or a home, then you should use SeeClickFix to see if you can get it fixed.
- I've just started playing around with the APIs you guys have, which I'm really excited about 'cause they're nice, easy-to-use, RESTful APIs. What can people find there if they want to go and explore that?
- Within the API, if you're thinking about it from a read perspective, if you wanna look at the data and understand maybe how Oakland responds to potholes versus Richmond, Virginia, that would be a query that you can run inside that API. But effectively you can see what service request types residents are reporting in a particular community and how the city is responding. And of course, there's a location, a lat-long, tied to each one of those issues.
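The Oakland-versus-Richmond comparison Ben describes could look something like the following once issue records have been downloaded. This is only a sketch: the records are made-up samples, and the field names (`city`, `request_type`, `created_at`, `closed_at`) are assumptions for illustration rather than the documented shape of the SeeClickFix API response.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical sample records; field names are assumed for illustration.
SAMPLE_ISSUES = [
    {"city": "Oakland", "request_type": "Pothole",
     "created_at": "2016-03-01T09:00:00", "closed_at": "2016-03-04T09:00:00"},
    {"city": "Oakland", "request_type": "Pothole",
     "created_at": "2016-03-02T09:00:00", "closed_at": "2016-03-07T09:00:00"},
    {"city": "Richmond", "request_type": "Pothole",
     "created_at": "2016-03-01T12:00:00", "closed_at": "2016-03-03T12:00:00"},
    {"city": "Richmond", "request_type": "Streetlight",
     "created_at": "2016-03-01T12:00:00", "closed_at": "2016-03-10T12:00:00"},
    {"city": "Richmond", "request_type": "Pothole",
     "created_at": "2016-03-05T12:00:00", "closed_at": None},  # still open
]

SECONDS_PER_DAY = 86_400


def median_days_to_close(issues, request_type):
    """Median days from creation to closure, per city, for one request type.

    Issues that are still open (no closing timestamp) are skipped.
    """
    durations = defaultdict(list)
    for issue in issues:
        if issue["request_type"] != request_type or not issue["closed_at"]:
            continue
        opened = datetime.fromisoformat(issue["created_at"])
        closed = datetime.fromisoformat(issue["closed_at"])
        days = (closed - opened).total_seconds() / SECONDS_PER_DAY
        durations[issue["city"]].append(days)
    return {city: median(days) for city, days in durations.items()}


pothole_response = median_days_to_close(SAMPLE_ISSUES, "Pothole")
print(pothole_response)
```

Skipping open issues keeps the arithmetic simple but flatters slow responders, so a fuller analysis would also report how many requests in each city are still unresolved.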
It might be interesting to see how that data relates to other community health data that you could get from a census or some other open data source.
- It seems like the API would be a great starting point for a hackathon project or something. Have you seen any particularly noteworthy projects come out of third parties consuming your API?
- Let's see, so where to start? So there's a project called Action Path out of MIT Civic Media Lab. Action Path takes the idea that you can leverage push notifications and geo-fencing to tickle people into small acts of volunteerism during their daily routines. As you're walking around, it's an Android app, and it will push a SeeClickFix issue to you if you walk by it, and it'll say, hey, is this issue still here? Or is there something you can do to resolve this issue? And then it'll give you wayfinding to go find it. So that's Action Path. And I believe it's available in the Android marketplace. At one hackathon, someone built a little JavaScript app that's called Bump Finder. And it's up on GitHub, so anybody who wants can take the code. Bump Finder is a, I would say, a little bit hypersensitive bump finder for the road that leverages the gyroscope in your phone to report issues to SeeClickFix. So I hooked it up to my stroller, my kid's stroller, the first weekend, and went down the street and reported bumpy sidewalks. That was a fun one. We've also worked with Kaggle out of the Bay Area on a data competition. It was a predictive analytics competition where they were trying to figure out who could predict, to the closest level of accuracy, the next issue that was going to be reported. And that was interesting. And then there's been a number of, like, visualization competitions as well. So we've seen folks who have made dashboards out of the data. There was a really beautiful one that came out of Raleigh, North Carolina at one point. I think that was in partnership with Staccato.
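The geo-fencing idea behind Action Path, pushing an issue to someone who walks near it, comes down to a distance check between the phone and each reported issue. The sketch below uses the standard haversine formula; the coordinates, the 100-meter radius, and the record fields are assumptions for illustration and are not taken from Action Path's code.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def issues_within(position, issues, radius_m=100.0):
    """Return the issues whose location falls inside the geofence radius."""
    lat, lon = position
    return [issue for issue in issues
            if haversine_m(lat, lon, issue["lat"], issue["lng"]) <= radius_m]


# Hypothetical issues and walker position.
OPEN_ISSUES = [
    {"id": "A", "summary": "Pothole", "lat": 34.0505, "lng": -118.2500},
    {"id": "B", "summary": "Streetlight out", "lat": 34.0600, "lng": -118.2500},
]
walker = (34.0500, -118.2500)

nearby = issues_within(walker, OPEN_ISSUES)
print([issue["id"] for issue in nearby])
```

A real app would run this check when the operating system reports a location change rather than polling, but the underlying distance test is the same.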
And then there was also one that came in like second or third in a competition here at Yale that was built for the city of New Haven. So I think we're just on the cusp of folks really leveraging our APIs. I've heard of hardwired sensors, physical sensors, being built into the SeeClickFix platform. I've heard about testing of water sensors called crickets. They detect when a flood is coming or when a storm drain is clogged. As humans and sensors become part of the fabric of bug reporting in the public space, we're gonna see a lot more use of the API in that fashion, and it's gonna be really fun.
- Could you define, I just started reading up on this project or program that I think you guys initiated, and it's the Open311 protocol or project. What is that?
- Sure, so this ties back to our open API, right? Our API, SeeClickFix V2, is probably the most up-to-date version of what we want to enable developers to leverage our data with. But early on in our existence, ourselves, a few cities, and an organization called OpenPlans wanted to create a standard beyond email for communicating between residents and City Hall, an open standard. And so my co-founder Cam created the draft spec for Open311 and submitted that to the consortium we helped create with Open311. So we've kind of led the way on that spec, and what it does is it defines a standard around service request reporting, making sure that it's transparent and that it is a standard that is portable between different platforms.
- You know, I come from a stats background, and one of the things that's really appealing to me about your app is it's really a measurement tool. It's a way that governments that might not otherwise be able to see problems could start to observe them. But of course, that raises some questions about the sample you get.
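To make the portability point about Open311 concrete, here is a minimal sketch of reading one service request into a typed record. The field names follow my understanding of the Open311 GeoReport v2 service request format and should be treated as assumptions to check against the spec; the sample record itself is invented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ServiceRequest:
    """A subset of one service request record (field names assumed)."""
    service_request_id: str
    status: str
    service_name: str
    requested_datetime: datetime
    lat: Optional[float]
    long: Optional[float]
    address: Optional[str] = None

    @classmethod
    def from_dict(cls, raw: dict) -> "ServiceRequest":
        def to_float(value):
            # Coordinates may arrive as strings or be missing entirely.
            return float(value) if value not in (None, "") else None

        return cls(
            service_request_id=str(raw["service_request_id"]),
            status=raw["status"],
            service_name=raw.get("service_name", ""),
            requested_datetime=datetime.fromisoformat(raw["requested_datetime"]),
            lat=to_float(raw.get("lat")),
            long=to_float(raw.get("long")),
            address=raw.get("address"),
        )


# Invented sample record, for illustration only.
sample = {
    "service_request_id": "638344",
    "status": "open",
    "service_name": "Pothole",
    "requested_datetime": "2016-03-25T09:14:00-07:00",
    "lat": "34.0500",
    "long": "-118.2500",
}

request = ServiceRequest.from_dict(sample)
print(request.service_name, request.status, request.lat, request.long)
```

Because every participating city would publish the same fields, code like this could in principle read requests from different platforms without per-city parsing.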
Do you have a sense of whether or not you get a good random sampling from your users, or are they more concerned with certain issues, like maybe graffiti or my topic of the day, which is potholes?
- No, I think we see a socioeconomically diverse populace, as well as a gender-, race-, and age-diverse populace, using the platform, with a diverse set of interests and aspirations for the public space all being spoken about relatively evenly. Especially as we have seen the mobile application, or rather the smartphone, help residents of the United States, and internationally, of course, even more so internationally, jump the digital divide, we've seen even more even representation of the community on SeeClickFix.
- So in addition to reporting issues, what else do you empower users to do?
- We also empower users to actually resolve issues themselves. While the government officials and transit agencies and utility companies and business improvement districts are the primary resolvers of issues that are documented on the site, we also see residents going out helping neighbors who have requested assistance with snow shoveling, maybe cleaning up the park, or offering commentary on a city process, or sharing information about a city zoning code so that other residents can learn about that. Our thesis is that the most powerful way to think about the requesting of services from City Hall is through an open form of communication enabled by the web. At the intersection of these issues are shared interests, or rather, these issues are the intersection of the shared interests of varying people. And so on SeeClickFix, people are meeting each other for the first time, and it's an important group of people in the community that are meeting each other for the first time, 'cause they're the people that are kind of first to speak up.
And so in the relatively near term on SeeClickFix, we're going to be enabling things like the ability for users to connect directly with each other, one-to-one message each other, reference each other in comments, and really help them to build a new kind of community around some pretty important infrastructure in their communities.
- So I downloaded SeeClickFix and I officially filed the report of the pothole. Although I guess that might have been redundant with the 311 call I already made. Presumably that record gets shipped to the city, so now they have two reports of the same thing.
- That makes a data challenge because they do get duplicate requests, and then when they close them it looks like they closed two requests, or, like, how to count those completed 311 requests. Definitely not a big problem, but it's just kind of an interesting data question.
- Cool, so what does that mean for the future? Will I at some point be able to find myself a master list of every little lat-long of a pothole in the whole city or--
- Yes, well, there will be data from March. What you'll see in the future is a consolidated list of 311 requests on our open data portal, and it'll be updated to show the status so you can go back and check. And you'll also see the same things that the people completing the request see, so you'll have the reason for closure, which is a really big added field that wasn't available before. Before, you were able to see, okay, my request was closed or completed, but what does that mean? Or maybe it wasn't completed, maybe it was closed, but why? So it gives a lot more human explanations.
- Yeah, that's really exciting, and citizen data scientists could be watching that and maybe uncover what are the ambiguous things people aren't providing enough data about that are--
- Absolutely. Yeah, to look for a reason like "couldn't find" or something.
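The duplicate-counting question and the reason-for-closure field discussed above can both be explored with a few lines of code. This is a sketch under stated assumptions: the records and field names (`type`, `lat`, `lng`, `status`, `closure_reason`) are hypothetical, and bucketing coordinates by rounding is a deliberately crude way to decide that two reports describe the same pothole.

```python
from collections import Counter

# Hypothetical reports: the first two describe the same pothole,
# once by phone and once through an app.
REPORTS = [
    {"id": 1, "type": "pothole", "lat": 34.05001, "lng": -118.25002,
     "source": "phone", "status": "closed", "closure_reason": "repaired"},
    {"id": 2, "type": "pothole", "lat": 34.05003, "lng": -118.25001,
     "source": "app", "status": "closed", "closure_reason": "repaired"},
    {"id": 3, "type": "pothole", "lat": 34.06000, "lng": -118.26000,
     "source": "app", "status": "closed", "closure_reason": "could not find"},
]


def location_key(report, decimals=4):
    """Bucket a report by type and rounded coordinates.

    Four decimal places is roughly an 11-meter grid in latitude.
    """
    return (report["type"],
            round(report["lat"], decimals),
            round(report["lng"], decimals))


def group_duplicates(reports):
    """Group reports that fall in the same bucket."""
    groups = {}
    for report in reports:
        groups.setdefault(location_key(report), []).append(report)
    return groups


groups = group_duplicates(REPORTS)
closed_requests = sum(1 for r in REPORTS if r["status"] == "closed")
distinct_problems = len(groups)
reasons = Counter(group[0]["closure_reason"] for group in groups.values())

print(f"{closed_requests} closed requests, {distinct_problems} distinct problems")
print(reasons.most_common())
```

Rounding can split two nearby reports that straddle a grid boundary, so a more careful version would compare actual distances, but even this rough count shows how closed requests can overstate completed repairs.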
- You know, lots of people write in to the show saying that they appreciate it 'cause they're just getting started in data science and this is a really helpful and gentle way to learn things. When I'm asked for advice, my answer is always the same. Find a project, do something, use the skills you're learning. I was looking for a project here myself, but I couldn't do my project. The pothole data just wasn't available. I hear people quoting this expression, I don't know where it came from. They say, you know, the dirty secret of data scientists is that they spend 80% of their time cleaning the data and only 20% of the time doing anything with it. I think this is actually a bit of a misnomer. I think we spend about 50% of our time looking for data that doesn't exist. When a scientist conducts an experiment and they find no actual results, you know, they test a drug that just didn't have any efficacy, generally that boring paper about what didn't work doesn't get published. We call this the file drawer effect. I think data scientists have our own sort of file drawer effect. People pursue interesting problems. They get to a point where they're stuck, like I am in getting pothole data, and there's nothing else to be done. Businesses, governments, and other organizations, we want to help them, but when they fail or refuse to store enough data, the right data, or store it correctly, all the cleaning in the world won't fix that. But on the brighter side, it sounds like LA's new 311 system is going to close that gap a little bit. And with enough data, there's a lot of cool projects for all you citizen data scientists or people just getting started to work on. So we may be on the cusp of something here with LA's 311 data set.
- Yes, and that's why the new 311 system is so exciting, because it's one feed into the open data portal.
- I wanted to revisit with you now that we're, how long has it been since the accident?
- Mmm, let's see now, it happened November 20th.
So December, January, three months. - Three months out, how you feeling? - I feel fine. - Back at work, no disability. - Yep, I'm good, 100%. - All right, so do you think people accused me of making a big deal out of nothing? - Who accused you? - The fact that I did a whole episode on potholes and this stuff, and here you are three months later, it's the legal term, you're made whole or whatever. - I don't know what the legal term is, but I do have a scar. - Oh yeah, a scar, let's talk about the scar. - And the scar itches. - Do you keep a spreadsheet, any data on that I could track? - No. - Okay. - But it itches a lot, and it doesn't look very nice. - All right, without the spreadsheet, there's no data project around the scarring or the itching. - Quite like your wife, not very long ago, it was during this winter, maybe last fall, here in Chicago, I was thrown to the ground by a pothole. I'm a biker, I bike quite a bit, actually I had two bikes, I had a city bike, I'm a triathlete also. So the idea of potholes being dangerous is very real and present to me. I was riding down a residential street, fortunately there were no cars around me. It was nighttime, and all of a sudden I found myself thrown to the pavement. I didn't understand what happened. I looked back and I discovered I had gone through a water-filled pothole. There were leaves floating on that pothole as there were leaves on the surrounding pavement, and so that pothole was indistinguishable from the road around it because it was filled with water and it was dark out. - Yeah, yours is funny, it sounds almost like a scene in a movie in a jungle where someone's built a trap, they're like-- - Yes, yes, it was like that, as a matter of fact, yeah. - So you were mentioning earlier, you're a triathlete, has that incident inhibited you at all from your activities? - You know, I've been a triathlete for, I dare say, 30 years.
This is my 30th year competing on a very amateur level as a triathlete, and I bike with groups of people all the time, we are always cursing. We go through several towns along Chicago's North Shore and we're always cursing how they allocate not enough money to keep those roads in shape. So we sometimes change our routing, and all of the roads are very bad. - So any concluding thoughts, like have you changed your bike riding style? - Oh, well, I definitely look at the street more to prevent potholes. If there's any debris on the street, I'm like, stay away from anything on the street. And then the first day that I biked back to work, when I got to work safely, I truly felt like I wanted to get on the ground and just kiss the ground 'cause I was so happy I was alive. Do you think you'll still be able to bike if we buy a house soon? - Depends where. - Yeah, hopefully, maybe, I know you enjoy it. I'd like to bike again too. I've had to stop the last two jobs. - I wanna thank my commentators for the day. Ben Berkowitz from SeeClickFix. - Well, Ben, this has been great. Thank you so much for your time. Do you wanna throw in a final shout out? Are you guys hiring or anything like that? - Yeah, we are, we're hiring engineers. We are hiring sales folks, and we're hiring in marketing, and the jobs are posted at SeeClickFix.com/jobs. - And I don't know if we touched on it. What city are you guys in? - So we're in New Haven, Connecticut, right in between New York and Boston. - Yep, SeeClickFix also has a podcast you might wanna check out. - Sure, so if they're in the New Haven area, New Haven, Connecticut area, it's 103.5 FM, that's W-N-H-H, and we're on at 11 o'clock on Wednesday morning. If you wanna download the podcast, you can go to SeeClickFix Radio in iTunes or your podcast aggregator of choice. It's also on SoundCloud under SeeClickFix Radio. - I also wanna thank Russ Klettke. - You can find us at pothole.info and check in with us periodically.
We're putting up articles there a couple times a month, at least, if not even more frequently. We wanna increase awareness of not only the causes and costs of potholes, but also the solutions. - And thanks to Chelsea Ursaner, who can tell us where to find out more about the LA data portals. - We have two of them. We have data.lacity.org, which is hosted by Socrata. It's kind of similar to the service that a lot of other cities have. And we also have the GeoHub, which just launched in January, and that houses all the geospatial data. - So I guess the moral of the story is that all the fun stuff, the modeling, the machine learning, the forecasting, it's all garbage in, garbage out if you don't have good data. Data scientists need access to good information. We need to have good provenance in the data. Data scientists need to deeply concern themselves with where their data came from. Does it have any sampling bias? Was it filtered in any way? Is it even available, I guess? So maybe I wasn't able to use data to make the roads safer with respect specifically to Linda's accident, but it sounds like someone else in the future will be able to do that with this new 311 portal. I also hope you guys start thinking about value of information more and the cost of tracking things. 'Cause there's always a cost. Data has to be captured, stored, archived, summarized, and sometimes great care must be taken to set up those decisions for success in the future. Anyway, thanks everyone for tuning in and to all my guests for coming together to tell this pothole story from as close of a data perspective as I could get. I hope you enjoyed this slightly atypical episode of Data Skeptic. I do seem to get positive feedback when I do these sorts of things, but they also take a lot longer to produce. Producing 100 podcasts in a weekly format has been a lot of work for me, and I've enjoyed every minute of it and look forward to bringing you 100 more.
I'd actually love to expand what Data Skeptic is and explore screencasts, video episodes, animated things, interactive tutorials, more lectures and live events, but that stuff isn't easy. At some point, I'm going to have to have help, and I'm going to need to pay the people I contract to help me, so there will certainly come a day when I'll ask for subscribers and one-time donors, and I may consider advertising at some point. But not yet. If you've listened to this show before, you must know a little bit about me and surely you realize I'm playing the numbers on this. I know the listenership size and growth rate. I know about other shows and the average percentage of listeners that will subscribe to podcasts and the average amount they donate every month. If those trends apply to Data Skeptic, while they'd end up being a substantial and generous amount, that expected value couldn't presently cover the part-time salary I'd like to have available to pay an assistant whose job would be to improve the show. At least those numbers don't work out yet. So here's my appeal to you today. Help me get there. Help me double the size of Data Skeptic over the next four months. Tell your friends, colleagues and classmates about the show. Share the show on social media. Talk about it on internet forums. Make a flyer and stick it inside data science-related books at your local bookstore. Secretly take people's phones and subscribe them to the show. Add subliminal messages to your PowerPoint presentations. Whatever it takes. Help me get the word out. Linda, Yoshi and I appreciate it. And hopefully the people you share the show with will appreciate it as well. If we can double the audience, I believe I can take on a part-time staff member who will help improve and expand the program in ways I'm not able to. The best way you can support Data Skeptic is by telling people about Data Skeptic. It's basically the opposite of Fight Club.
You can also leave us a review on iTunes, which we'd appreciate, or take our listener survey at data-skeptic.com/survey. So until next time, in between telling a few people about the show, as always, keep thinking skeptically of and with data. - For more on this episode, visit data-skeptic.com. If you enjoyed the show, please give us a review on iTunes or Stitcher. (upbeat music)