[music] The Data Skeptic Podcast is a weekly show featuring conversations about skepticism, critical thinking and data science. So just a quick note before we start this interview: in order for me to keep the Data Skeptic Podcast coming out on a weekly basis, I sometimes need to record in advance, especially when I know my schedule and other responsibilities will make it hectic. Thus, this interview was recorded a little while ago, and I've been really eager to share it with all my listeners. But my guest does make a couple of references like "last weekend" or "a week ago." So if you follow the chess world, just realize the timing won't quite work out on that. But for everyone else, that does not change the narrative, so please enjoy. Welcome to another episode of the Data Skeptic Podcast. I'm joined this week by Dr. Kenneth Regan from the Department of Computer Science and Engineering at the University at Buffalo. Thanks for joining me. Thank you very much for inviting me to be on the show. So I've been really enjoying, over the past couple of weeks, getting to read your contributions to the Gödel's Lost Letter and P=NP blog, as well as your work in computational complexity. But the topic I wanted to invite you on to talk about this week is one that, as far as I can tell, you're one of the world's leading experts on, which is detecting cheating in chess. Yes, that's right, and I've been very active in the past week, more than I perhaps wanted to be. Excellent. Well, how serendipitous then. Maybe we can start with your background and how you first got interested in chess to begin with. Well, I was a junior player. I learned the moves at five from watching my dad play with my uncle. And then at age 10, I discovered that there were such things as chess tournaments, and my family very graciously took me into New York City and the local chess club to play in them. And I reached just about master strength very quickly. I was a master, I think maybe briefly before age 13, but certainly at age 13, and I was one of a number of my friends, all great people, juniors on the New York-area chess scene and elsewhere, who at one point were, quote, "the youngest master since Bobby Fischer." I was also invited as a panelist on the national TV broadcast of the Fischer-Spassky match in 1972, before my 13th birthday. And then I played tournaments. I got to be US junior champion in 1977, and it was just an absolutely wonderful thing as a kid, but I did not, however, intend to go into chess as a career. Well, what made you go into computer science? I originally wanted to do mathematical physics. At Princeton, I found physics fairly tough going as an undergrad, but I got into mathematics and then discrete mathematics, which is the mathematics of computing. And then when I went to Oxford, my advisor in mathematics, Dominic Welsh, got me interested in computational complexity, and that pushed me towards computer science. So maybe we could start with defining what it means to cheat at chess. I assume we're not talking about, you know, doing something nefarious, like switching pieces around. No, or having a rook in one's pocket. Right. What do you mean when you say detecting cheating? Well, it's a reality of the game now that not only are supercomputers able to beat the world champion, as happened when Deep Blue beat Garry Kasparov in the 1990s; by the middle of the 2000s, even chess programs on handheld devices and laptops were able to defeat humans.
And that has created such a temptation, unfortunately, in some cases, for humans to consult the superior computers during what are supposed to be human-only games. So I might get an unfair advantage by using my smartphone to check a move or something? That's right, and people have been checking their smartphones in bathrooms, especially. Uh-huh, interesting. So how does one go about detecting that? Well, there is a certain pattern in the moves that indicates proximity to the patterns that come about from computer programs. The number one telltale is just the strength and quality, but I look for other factors also that can distinguish computer-like play. So what is it that makes computers different? Presumably, if computers played the same way humans would, you wouldn't be able to detect it. What is it that computers are doing differently? Well, one of the things is that they very rarely make a perceptible blunder. Okay. If you manage to beat a computer, and there have been cases of that, then you have found something very deep in the position that appeals to your human intuition but is not easy to demonstrate via a moderate-depth search. So you found something very hidden. Whereas humans make little blunders all the time; in fact, even when I'm playing, I'll realize, "Oh, I didn't see this idea by my opponent. Luckily, my position still holds together." So computers just never make large-scale overt blunders. There are also some stylistic tendencies. They tend to keep a lot of their options open in the position, whereas I've been able to demonstrate that humans, if they see an attack, will try to play it and force an early crisis in the game. Interesting. So can you say with certainty that a move is given by a computer versus a human? Not any one move, no, although there are some people who will react, "Oh, this is obviously a computer move." However, in aggregates of 100-200 moves, then I can get an overall statistic on how computer-like the play is. So then I'm guessing it's comparing two probability distributions? That's right. Indeed, that is originally how I conceived the whole problem. I was going to model it as a distance metric between probability distributions, one representing the computer, and this is very important because you can't just say deterministically what is the computer's move. That depends on many variables, including the depth of search, the identity of the computer program, even how much hash memory you've given it. There's variation there. And of course, the human being is a distribution, an often fallible distribution, of options. And I was originally going to use a metric called fidelity, which is a distance-between-distributions measure that comes up in computational physics in particular. And that name was also a pun on FIDE, which is the acronym of the World Chess Federation, and on the idea of faith and playing in good faith. But then one of my students, a former student returning on a visit, suggested a different distance measure called the Jensen-Shannon divergence, which would solve a problem with the Kullback-Leibler divergence of being too hair-trigger in its treatment of blunders, but it lacked other technical properties that I needed, which was a great shock. So I wound up doing something highly elementary. All I do is map, given analysis of the positions in a game and parameters denoting the strength and particulars of the human player, each chess move to probabilities, as if I were painting that move on the faces of many-sided dice.
And then I just treat the chess position as a roll of those dice. So I'll say, for this player, there's a 49% chance of selecting the Ng5 move, the Qg4 move is 27%, and so on down the line. Once I do that, I'm able to tap into the very simple high school stats theory of Bernoulli distributions, rolls of dice, flips of coins, and not only project the expected mean number of agreements but also the projected variance from the Bernoulli distribution theory, and thereby project confidence intervals. And since I'm estimating a sample mean, the central limit theorem comes into play, so I'm dealing with a normal distribution and can use a Z-score framework for my statistical projections. Very interesting. Are there certain situations that are more revealing than others? I'm wondering, for example, about a game configuration where there's one move that will keep you alive and pretty much any other move is going to end the game for a particular player. Right, and that is actually the main statistical principle on which the equations in my model that go from the chess positions to the probabilities are based. If there is a clear standout move, then not only will the computer find it, but it is highly likely that a good human player will find it. That actually was the explanation in the incident that began my involvement. Whereas if you have even just ten moves in a game where, at each move, there were four highly reasonable options, to which my model would assign probabilities of, say, between twenty and twenty-five percent each, and yet you consistently select the option that shows as preferred in my computer test (modulo that it, too, is a distribution and there's some variance, but less), then matching the computer on one of four options ten moves in a row is upwards of a million-to-one unlikelihood. Tell me a little bit about the training data you have. I presume you've got a database of human games you can study, and also the moves that a computer would give, to build the different distributions. Where does that data set come from? So there are two modes of chess analysis with a computer. There's multi-PV mode, where the computer gives equal depth of search to all of the reasonable options in a position. And then there's single-PV, which is the regular playing mode of the computer, where it focuses its resources on the move it thinks best and one or two competing moves. A lot of the art of practical computers, when they play against each other in championships, is how quickly they can prune away the suboptimal moves. Whereas for my model, I need to know the values of all the reasonable moves that a player might be considering, and the general landscape or shape of the position, whether indeed, as you just said, it is a forcing position where there's only one move to stay alive, or whether there are lots of options, and all the various grades in between. So now for my training set, I have about 10,000 games of the multi-PV data, and I took those games where each player in the game was at one of the century points on the chess rating system. So chess uses the Elo rating system, which is popular. People use it for other sports, and even Nate Silver's website, 538.com, is using it for National Football League teams. But in chess, the human players basically top out in the 2,800s.
2,600 is sort of a strong grandmaster, 2,400 international master, which is what I am, 2,200 master, and so on in classes of 200: 2,000 is expert, 1,800 class A in the United States, 1,600 class B, and so forth. So for each of the levels above 2,000, I basically chose exhaustively from the databases, now approaching 10 million games for the entire recorded history of chess. From those databases, I take all games where both players were within 10 or so of a century point, run those games, and I get my training data for each rating level. And then I'm therefore able to relate my two main model parameters to strength on that Elo scale. Then, supplementing this, I have a vast amount of single-PV data: basically every top-level game in the entire recorded history of chess, over 200,000 games total, which may not sound like a lot out of 10 million, but a lot of the 10 million games are ones not played in the most standard chess competitions or played by lower-rated players. And I also routinely screen about 80,000 games out of the world's top events every year. This is what has been codified as the screening mode. So I can get a first look at the play and relate it to distributions of tens of thousands of other performances by players at all ranks to see how much of an apparent outlier it is. Now this is exactly the situation that I had this past weekend. There is an anti-cheating group on Facebook, and on Saturday, I noted posts, and whispers, about a surprise winner of a tournament. When I ran the screening test, indeed, the person was on the edge of what I have coded as the red zone, saying, "Look into this a little bit further." He had defeated a grandmaster rated almost 500 points above him in the last round. But when I ran the full test, which is based on the training data using the multi-PV mode, it gave a Z-score even under 1.5, let alone the minimum of two to claim any kind of statistical significance. Meaning the guy had basically rolled with a lot of punches, and indeed he was dead lost against the grandmaster in the last round, but the grandmaster got over-hasty. So I posted feedback saying this on Facebook, and it seemed to have quieted the rumors for the moment. But it's actually one of several test cases for the viability of statistical input that I'm getting. The other thing is I confirmed that the arbiters of the tournament had acted correctly based on the less rigorous information that they saw. They don't have scientific computing resources, but they have good chess smarts and were able to observe a lot of things about the games, the choppiness, the forced situations, the fact that he was lost against the grandmaster, and reached basically the correct and sane conclusions. That's good to hear. I would imagine then that you're never saying with certainty that anyone cheated; you're just talking about the probability that this happened. That's right. To what degree do you want to have confidence before maybe an action should be taken? Right. So I'll tell a little bit about the background and history of this. Up until two-plus years ago, the highest Z-score that I got in cases I was involved in was about 3.5, which corresponds to about 4,300-to-one odds against the null hypothesis, although, very important, what it really means is that's the incidence of such a deviation occurring naturally among non-cheating players.
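To make the statistics he describes concrete, here is a minimal Python sketch, not Regan's actual code, of the dice-and-Bernoulli agreement test and of turning a Z-score like 3.5 into roughly 4,300-to-one odds. The per-move 55% match probability, the move counts, and the helper names are illustrative assumptions.

```python
# A sketch of the dice/Bernoulli model described above: each move is a
# weighted die, the match count is a sum of independent Bernoulli trials,
# and the central limit theorem turns the total into a Z-score.
from math import sqrt
from statistics import NormalDist

def agreement_z_score(match_probs, observed_matches):
    """match_probs[i] = modeled chance the player's i-th move coincides
    with the engine's preferred move; observed_matches = how many did."""
    mu = sum(match_probs)                          # projected mean agreements
    var = sum(p * (1 - p) for p in match_probs)    # Bernoulli variance
    return (observed_matches - mu) / sqrt(var)

def odds_against(z):
    """One-sided odds that a non-cheating player deviates this far by chance."""
    return 1 / (1 - NormalDist().cdf(z))

# Made-up example: 200 moves, each given a 55% chance of matching the
# engine, but the player actually matched on 130 of them.
print(round(agreement_z_score([0.55] * 200, 130), 2))   # ~2.84
print(round(odds_against(3.5)))                         # ~4,300-to-one
print(round(0.25 ** -10))   # ten one-in-four matches in a row: ~1,050,000-to-one
```

The last line is the "upwards of a million to one" figure from a moment earlier: four reasonable options per move, matched ten times in a row.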
And every week, there are more than 1,000 players taking part in tournaments around the globe, tournaments significant enough to be aggregated on a website called The Week in Chess, run by Mark Crowther in England, whose games I can get to test, or from the tournaments that publish their own games on their own websites. So if you have that, then you get 4,000 player performances every month. So if you've got 4,000-to-one odds and you looked at every game for a month, you'd see that happen all the time. So you have to be aware of how your sample was selected, or whether there were any other factors that brought it to your attention, apart from people going over the games with computers and noting a high correspondence. So my rule was that there had to be independent evidence. Now, in one case four years ago, there was independent evidence based on text messages that had been preserved, since a cell phone was used to transmit communications. The content of the messages could not be entered into a civil court case for privacy reasons, but the fact of their being there and the times could be entered in evidence. And that was an independent factor. So then my input was that I had this deviation over 3, and there was another reason that had selected this performance for attention, so the full significance, I said, came into play. Also, I was able to distinguish the games during which there had been transmissions, and there had been lots of transmissions, from the other games in the tournament played by this player. And that was a very clear distinction. The quality distinction was 3,200 versus 2,550. So certainly, apart from the confidence assertion, there certainly was that distinction. But up until January 2013, I did not get any Z-scores above 3.5. And then suddenly, ones above 5, trending closer to 6, tumbled out for the case of a Bulgarian player playing in a tournament in Zadar, Croatia. And that prompted me to publish an open letter and write a blog post raising this as a further level of problem, a really philosophical problem of what role statistics should have in adjudicating this kind of case. The player, Borislav Ivanov, has never been totally caught, although there was a wand beep at his shoe, and at another tournament people briefly observed a wire on his chest. But he was never really apprehended. He was allowed to walk out of the tournaments after being defaulted. Actually, he even came back for his last two rounds, and one of them was undisturbed. I have my own hypothesis, but it still is a mark that nobody knows exactly how he's doing it, and of course I have to add the phrase "if he's doing it." Let alone that when I combine all these tournaments, I get Z-scores over 7, and we're getting into the billions-to-one to trillions-to-one odds range. Yeah, that's a higher standard of evidence than even, I think you were saying when we talked earlier, the Higgs boson standard of 5 sigma. So fairly convincing from a statistical point of view. I mean, I was able to use the 5 sigma standard because my unit, 1,000 player performances, is one week of The Week in Chess. If 4.75 sigma is 1-million-to-1 odds, then I could say that's 20 years of chess tournaments at the level we're likely to be most concerned with. And if you use 5 sigma, that's almost 3 million to 1, which would be 60 years, meaning that you would expect to misclassify someone at 5 sigma, to make an error, once every 60 years. And then the question is, is that an acceptable rate of error?
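As a back-of-the-envelope check on the "20 years" and "60 years" figures, here is a short sketch assuming, as above, roughly 1,000 screened player performances per week and a one-sided normal tail; the exact volume figure is an assumption of mine, not from the interview.

```python
# How often would an innocent (non-cheating) player be flagged at each
# threshold, given ~1,000 screened performances per week?
from statistics import NormalDist

performances_per_year = 1000 * 52   # roughly the volume of The Week in Chess

for sigma in (3.5, 4.75, 5.0):
    tail = 1 - NormalDist().cdf(sigma)      # chance an innocent player exceeds it
    odds = 1 / tail
    years_per_false_flag = odds / performances_per_year
    print(f"{sigma} sigma: about 1 in {odds:,.0f}; "
          f"one innocent flag every {years_per_false_flag:.1f} years")
```

Run as written, this gives roughly one innocent flag per month at 3.5 sigma, one every 19 years at 4.75 sigma, and one every 67 years at 5 sigma, close to the figures quoted above.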
So then I was able to point to, well, that is the same as the standard that's used in physics for scientific discovery, but watch out. The Higgs boson has now risen above 7 sigma, so that looks safe. But a video announcing a gravitational wave discovery a year ago begins by knocking on the door of the cosmologist whose theory was supposedly verified, and the guy says it's 5 sigma, but it wasn't. They made a systematic mistake in their modeling of intergalactic dust, and the result is, well, Physics Today said "death knell" for their work in February. So that went away, apparently. So what sort of sigma value do you think we might achieve if we entered an actual algorithm in a tournament and let someone just read the moves directly off the computer? Yeah, then if the algorithm played over 200 moves, I would be pretty confident of getting deviations above 5 sigma. Although, oddly enough, it would not be perfect correspondence, because my work does not try to forensically identify a particular algorithm. Instead, it's testing for the general computer-like nature. And the different chess programs do not agree with each other. Moreover, when they play by themselves, they tend to get themselves into positions that have a lot more options. Even at their own level, their own projected agreement rate would be closer to 50% than the over-60% that we get for the same computers in human games. So that even makes their agreement rates look lower. It's definitely a complication for my model that the computer programs don't agree with each other to such an extent. It imposes upper limits on the sensitivity. But it does seem that I get -- well, the policy is a Z-score above 2.75. This is in my work as a commission member of the World Chess Federation, under new guidelines and regulations that were approved last year. A Z-score of 2.75 is enough to claim statistical support in a case where there is independent evidence, and a Z-score above 4.0 is enough to open an inquiry. We don't, however, have the right of judgment at 5.0; we backed off that because a lot of people are understandably uncomfortable with using statistics to pass judgment. Alone, statistics alone. Yeah, makes sense. So I'm going to link to the open letter on chessprofessionals.org that you'd mentioned, and it sounds like that was from the end of January 2013. Yeah. Well, a lot has happened since then. So what happened is there were more incidents in the spring of 2013, including a junior player being caught with a cell phone in a loo and getting roughed up by his opponent, who happened to be on the Romanian national soccer team. And the World Chess Federation realized that this isn't just an Association of Chess Professionals matter. This affects the entire game. Something needs to be done. So to their great credit, they put resources in and convened the commission. And one of the things we did, oddly enough actually, was loosen a restriction that we felt had been too strict about completely banning cell phones from playing halls. We put other rules in to make sure that players are not able to walk around with their cell phones during the games, or have them in their pocket, where one player was caught getting buzzes from the cell phone on his thigh that he could decode. So you're not allowed to have the cell phone on your person during a game.
But we don't want to completely inconvenience people by making them unable to have their cell phones with them, because they might be coming from work and might not have a car or a room where they could store the cell phone. Sure. So I've been active on the committee, and our committee really constituted itself in terms of operation this year. And I'm right now writing a formal report on one case while dealing with several others, which, happily, are furnishing more input into my new system using multiple programs and showing how consistent they are with each other. Very neat. So I would guess, going back to some of the different algorithms, they could each have different databases that they're accessing in terms of finishing moves, or slightly different pruning strategies, or maybe heuristics baked in, that account for some of the differences. But I was really fascinated by the point you made that they tend to play with many more opportunities open at the same time. Do you think that's sort of the defining characteristic of computer versus human play, the ability to balance so many independent strategies? It might be. Certainly, in practice, it gives me an extra 5% margin. I analogize it to the idea that computers play the role of Spock to our Kirk. Like in a Star Trek episode, McCoy and Kirk want to go in and intervene right away, but Spock questions, "Wait, hold it." I'd actually like to see that enlarged into a useful personality aspect of personal digital assistants in general, that they're not there to make up our minds for us, but rather to show us the options. But yes, that more wait-and-see attitude is the definite property of computer games on balance. They take longer, and they don't have early crises particularly, as human games do. I'm a relative chess novice. If, let's say for some reason, I decided to dedicate myself to the game and I started playing 10 hours a day, six days a week, but rather than studying by, you know, reading classic texts, looking at classic games in books and things, I chose to develop my craft strictly by playing against the best algorithms that are out there, do you think I would develop a style of play that could produce a false positive in your detection? Well, I'm fielding exactly this question on Facebook from someone, like, within the past hour. Okay. It's a common point. It is quite possible that players have become more computer-like, and that might even be reflected in a slight increase in overall quality that I'm registering within the past 10 years. But I've yet to see a demonstration that someone can become really adept at the game solely by playing with a computer. For players under 2000, what is said to develop your skill the most is pattern recognition of positions and their tactics, which you can get by playing over famous games; games played by the Russians in the '50s and '60s are great for this. And I agree with the contention of a documentary, My Brilliant Brain by the BBC and National Geographic, featuring Women's World Champion Susan Polgar, that it's the face recognition and pattern recognition parts of the brain that are activated most when we play chess. So going over games with a computer may promote this ability, but it comes from seeing the patterns, I think, rather than from contending against the algorithm.
And up to the 2000 level, people are saying now that it's more important to understand the basic tactical patterns, tactical quizzes that flash a whole page of "Black to move and win" problems, and that that is the best way to improve. And then when you get over expert level, the deeper parts of strategy fill in. So I imagine it's only really been possible to cheat in this way in the last, let's say, perhaps 15 years, when the algorithms got to the right level, and, well, we didn't have cell phones quite the same way in the '90s, but now we have them very small in our pockets. I guess there were some situations where there was concern that a player was getting a buzzing to indicate the move. Yeah, there's a meme that your iPad now has the power of the world's top supercomputer from 1994. I don't know if that's quite true, but anyway. So yes, that's the problem. Actually, 20 years ago, there was perhaps what was intended as a proof of concept at the World Open. A player who gave the name John von Neumann, the name of a famous computer pioneer, had a computer in his shoe, and he was caught while the tournament was going on. He was able to, I think, defeat or draw against one master-rated player, and I could say various more things about it, but that and another case in the '90s are considered to be the beginnings. But the real beginnings were whispers against the Bulgarian grandmaster Veselin Topalov in the tournament for the World Championship in 2005, allegations which my work does not support. And then in the ensuing World Championship match in 2006 against the Russian, Vladimir Kramnik, Topalov himself brought accusations of cheating based on the coincidence of moves with the then-leading program, Fritz 9, which is made by ChessBase. So Topalov's manager, Silvio Danailov, released an accusing letter citing coincidence statistics but not giving any methodology. Now, I actually know that he used a flawed methodology to generate that. I tried reproducing the results independently, and on four of the five games I didn't. But on the second game of that match, where Topalov was winning brilliantly but then lost his way and eventually lost the game, something which really must have upset him, I do get, reproducibly with two different programs, Rybka and Fritz, that on the second half of that game Vladimir Kramnik matches over 90%, which is an unheard-of human percentage in general. But the explanation I found is that most of Kramnik's moves were absolutely forced, first to stay in the game and later as the only way to keep his advantage. And that's what led to the statistical principle that I talked about earlier. And basically all of my in-depth work with the training data and the numerics and the model development has been just quantifying this all-important qualitative principle. Yeah, that's a really important distinction. I really appreciate that it's in your model. It's like, you know, with ERA: you look at baseball pitchers' ERA and you think, "Oh, they're all the same," whatever. But no, if one pitcher is pitching at Wrigley Field with the wind blowing out and the other pitcher is pitching at Atlanta on a clammy day, that makes a difference in the expected number of runs. So I'm basically finding which way the wind is blowing in the chess positions by using deep computer analysis, in a fight-fire-with-fire way. So I certainly wouldn't ask you to name any names if you didn't want to, or to slander anyone.
But speaking broadly, do you think that there's an undetected problem of cheating, or are these isolated cases? I still think these are isolated cases. I mean, there was an article in the Daily Telegraph two weeks ago with a British grandmaster saying that chess is riddled with cheating, and I've not seen that evidence. I mean, as was voiced in this case over the weekend, you know, people acknowledge, yeah, there is a lot of paranoia about it. It is also true, I must acknowledge, that if a player is disciplined enough to do what's called smart cheating, cheating on only a few moves per game, my stats are probably not going to detect that. The point, though, is that the discipline is an added burden on top of the cheating itself. Are there lots of smart cheaters out there? I've yet to see any credible anecdotal reports. Okay, any observations of something, anything besides people saying, "Oh, this guy had an unbelievable result; here are ten moves that match in a row in the game"? I mean, I do routine screening of all the tournaments, and I can tell you, there are lots of cases of players reeling off ten straight matches to the computer or more. But there are other patterns that do need further investigation, and then the question is, you know, judging the frequency: is this excessive or not? That's a much harder determination to make. So I'm curious to understand a little bit more, from your perspective, why chess has become such an interesting, and I think we can call it unsolved, puzzle. We've been playing it for centuries, and from what I understand, the skill of play has evolved; we're getting better and better at the game. What makes it so complex a game when, in a way, it's a really simple game? Children, as you did, can learn it; maybe not all children achieve the levels you achieved, but children can learn the rules and play the game, mechanically speaking. How is it that such a challenging game emerges out of simple rules? Well, that's true. Here's where my main professional field of computational complexity theory has an input. In a generalized format, where the board could be arbitrarily large and the armies could scale up too, determining which side has the advantage in a given chess position is what's called a polynomial-space-complete problem. In fact, if you remove something called the 50-move rule, then it jumps up to being an exponential-time-complete problem, where we definitely know there is no efficient algorithm to make a foolproof determination. Polynomial-space-complete is a level apparently above NP-complete. So if you've heard about NP and NP-hard and NP-complete problems, like the traveling salesperson problem or tautologies in logic, chess is apparently harder than those. So there is a great kernel of complexity, and yes, the complexity comes about from a very simple specification. It's a universal pattern in my professional field. The thing, though, that makes chess appealing is that along with the relative simplicity of the rules and the complexity of what develops, there is a lot of beauty. You can storm the king's fortress, your pieces raking the board; it's analogous to battle. It gives some of that feel. The game can be really exhilarating to play, especially when you're on the attack, or when you're able to put up a heroic defense. Those are some of the most enjoyable times. Even if you go down in flames and lose, sometimes you enjoy the way that you were able to defend.
And in fact, at a game in the European Championship in Jerusalem in February, one of the most brilliant conceptions of all time happened, where a player just nonchalantly moved his king and let the opponent capture a rook with check. It even looks like he accidentally touched his king and had to move it under the touch-move rule, but it was on purpose. The other guy took the rook, and then, after the guy tucked his king up one square, it was close enough to be effecting an attack with his own king. The guy just had a queen and some pawns against the other guy's king, queen, and rook, and yet the other guy couldn't get out, and it turned into a brilliant victory. So things like that in the lore of chess books make it a storied and beautiful game. Very neat, yeah. So I know we may lose a few of my listeners as we get into some of the deeper, sort of graduate-level topics here, but I've always had a passion for Kolmogorov complexity, and I hope to actually cover it in a little more detail on this show, so I was curious if you could share some of your work on how that relates to chess. Well, interestingly enough, this is technically separate from my cheating work. But one thing: all chess programs share many common features, the type of algorithm, the board representation, and they all use a hashing scheme for chess positions. They don't store the chess position in the form of "knight is on f3, king is on g1, pawn on h6," etc. Instead, what they do is break a chess position down into a code of under 30 bits. You first allocate primary codes of 64 bits each for every combination of one of the six white plus six black types of chess pieces on the 64 squares. And then you just add up, mod two, those vectors, and take 25 or so bits off the end, depending on how large a hash table you've allocated. Now, there's a lot of redundancy. Different positions will have the same code, and that could confuse the program, but in critical points of any given search, it rarely gets confused. And what I'm interested in is the Kolmogorov complexity of those bits that are used to initialize the hashing scheme, and whether there is a demonstrable difference in behavior depending on whether the 50,000 needed bits are generated truly randomly, from atmospheric noise at random.org or the HotBits service, or whether they're done pseudo-randomly, as the programs tend to do. And that makes a big change in their Kolmogorov complexity. So on the Gödel's Lost Letter blog, I wrote a post titled Digital Butterflies, because I also show a digital butterfly effect, where I can reproduce a mistake that chess programs make owing to hash collisions. I can isolate it to the size of the hash table and to other factors. And the question is whether this kind of difference can be harnessed in large scale to tell whether 50,000 bits are random or pseudo-random. So you can find it just by googling the terms "digital butterflies" and "lost letter"; it's an article I wrote coming on three years ago. It references George Marsaglia, who discovered flaws in the pseudo-random generators that were used in the 1960s, showing that when their output was plotted in three dimensions, it had unwanted two-dimensional regularities. Interesting. Yeah, I'll be sure to link to that in the show notes as well. So just on the chance there is a chess cheater listening who is hearing about your work for the first time, how nervous should they be getting? Well, I wouldn't want them to be nervous. I'd want them not to cheat.
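Going back to the hashing scheme described just above, here is a minimal sketch of how such position codes are typically built, in the Zobrist style; the random seed, square numbering, and 25-bit table index are illustrative assumptions, not taken from any particular engine.

```python
# A Zobrist-style position hash: one 64-bit key per (piece type, square),
# XOR ("add up mod two") the keys of the pieces on the board, and keep
# the low-order bits as the hash-table index.
import random

random.seed(2014)                # a pseudo-random choice; engines vary here
PIECES = "PNBRQKpnbrqk"          # 6 white + 6 black piece types

# 12 * 64 = 768 keys of 64 bits each, roughly the 50,000 initialization bits
# mentioned above (real engines add a few extra keys for side to move,
# castling rights, and en passant).
KEYS = {(p, sq): random.getrandbits(64) for p in PIECES for sq in range(64)}

def zobrist_hash(position):
    """position: iterable of (piece letter, square 0..63) pairs."""
    h = 0
    for piece, square in position:
        h ^= KEYS[(piece, square)]      # XOR, i.e. addition mod two
    return h

def table_index(h, bits=25):
    """Keep 25 or so low-order bits as the index into the hash table."""
    return h & ((1 << bits) - 1)

# Toy position: white king on g1 (square 6), knight on f3 (21), black king on g8 (62).
pos = [("K", 6), ("N", 21), ("k", 62)]
h = zobrist_hash(pos)
print(hex(h), table_index(h))
```

Whether those 768-odd keys come from a truly random source like random.org or from a pseudo-random generator, as in the seeded sketch here, is exactly the difference the Digital Butterflies post probes.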
I'm hoping that the idea that there are sophisticated techniques beyond what they might be thinking about will give people pause, a deterrence effect. Absolutely. The one other thing is that during the years 2007 to 2011, there really were no positive cheating cases. There were a couple of negative ones where I gave the usual explanation. And so my work deepened into studying human cognition and decision making and trying to branch out to things like studying the difficulty of test taking. How can you judge the difficulty of a test? Well, I can judge the difficulty of chess positions, and then can I transfer the distributional patterns that result? So there's a large component of my work that is developing human-versus-computer cognitive differences. And you can find on my website a paper from last summer's multidisciplinary preferences workshop, associated with AAAI 2014 in Quebec City, that details some of these. And even one result that isn't in that paper shows that if you have a lot of data, even just a two-hundredths-of-a-point difference between moves is enough for more humans to select the better of the two moves. Differences that would seem inconsequential if you're just talking about one game really do pan out over thousands of positions. Interesting. So before I ask you kind of the final questions, I actually want to go back to one topic. To my understanding, there was a case when you were tracking a potential threat of cheating, and it was actually a power outage that stood in the way of communicating some of this. So would you mind sharing that story? Yes, so that was during what I was saying about the Topalov-Kramnik match, before that match finished. I was all set with my conclusions, saying these games were forcing and it was an illusion. And I was on the server run by ChessBase; I have commentary privileges there. So for the last day, on Friday, October 13, 2006, I was going to go online and trot out my conclusions one chat line at a time and say there was no good scientific methodology and it was gamesmanship; I was loaded for bear. But the night before was the Buffalo October storm, and it knocked out power. I was bailing basements, ours and our neighbors'. And by the time power was restored in the next week, Kramnik had won the playoff, so it was no harm, no foul, and there wasn't so much need for my input. So that was that; I went back to quietly developing my work. Well, I like to wind up my shows by asking my guests for two recommendations. The first I call the benevolent recommendation, a nod to a book or a research paper or a website that you think the listeners would find interesting. The second being the self-serving recommendation, hopefully something that you get direct benefit out of from appearing here. So for a book, I would recommend any of various books that are about the public awareness of statistics, not just public awareness of science. And I can think of Jordan Ellenberg's How Not to Be Wrong, and the book Naked Statistics, I'm forgetting the author, and also Nate Silver's book The Signal and the Noise, which is, you know, maybe at a higher level. And if you want to play chess, there's a chess book I recommend called An Invitation to Chess by Chernev and Harkness. Old but good; it comes in and out of print, but it is the book that most smoothly takes you from absolute beginner to appreciating some points of strategy and even of beauty.
And for the self-serving recommendation, of course, the blog, Gödel's Lost Letter and P=NP. For understanding some of the caveats of my chess work, I would recommend my posts on that blog titled Littlewood's Law and Thirteen Sigma, from two years ago. Littlewood's Law being the realization that if you define a miracle as a million-to-one event, well, we have about a million waking seconds every month and we notice something every second, so therefore everybody can say they see a miracle every month. Well, then you have to say, you know, how was it selected? What kind of miracle? That's where things get deeper. But that's my main caveat against reading too much into unusual occurrences that I find all the time from screening chess games. So the blog, and also my website at Buffalo; just Google my name and Buffalo and you'll find it. Excellent. I'll link to all that in the show notes. Well, lastly, if anyone wants to follow the ongoing saga of claims of cheating and the statistical analysis of such claims, where's the best place they can kind of keep up on the latest and greatest? Well, yeah, we don't make it very public. So again, I have to say my website. There is a closed Facebook group, which talks about individual developments. But no, it's really my website and papers. And there's actually a very good paper by David Barnes, who's a Java textbook author, and Julio Hernandez-Castro, in a security journal, on the caveats of cheating testing. My reaction to all of it is just amen. They mention me, and they don't intend to give the impression that they're criticizing my work, though it might come off that way. But my reaction is just amen. These are all the reasons I separate the screening from the full test, and it points out a lot of the caveats very well. So that's an independent source, apart from me, that's good to read about the general problem, and they even set it in the context of cheating detection in online games, which is a potential branching-out application. Very cool. Well, I want to thank you once again for coming on the show, Dr. Regan. This has been really fun for me and I appreciate your time. Thank you very much. [music]