I used Machine Learning to hack baseball
[indistinct shouting] – No steal. There’s no steal there. I predict not a steal, watch this: …and he stayed. [excited]
He’s gonna steal! This is a steal! If this app works, this kid’s about to steal! [excited squealing]
There he goes! There he goes! Freakin’ works!
It works. Two years ago I came up with an idea for an app where you could decode baseball signs, so you would know when the other team was going to steal, even after just the first inning. Then, in a covert effort to get people more interested in coding and machine learning, I would make the app free and available to everyone.
I’m happy to report… It’s no longer just an idea. Today I’m going to show you exactly how the app works, and we’ll use it in the wild, and then we’re going to talk about machine learning in very simple terms with my buddy Jabril. But first, to set the stage we need to understand the fascinating world of secret baseball signs. It’s the game within the game. Most people know the catcher will give signs to the pitcher when your team is on defense. But when you’re on offense, the third-base coach gives signs to both the batter and the base runner. For example, he could secretly tell the batter to bunt, or to not swing at the next pitch, or he can tell the base runner to steal. And just to be clear, stealing a base is when you start running as soon as the pitch is thrown, instead of waiting for the batter to try and hit it. It’s risky because if the batter doesn’t hit the ball and the catcher is good, he can throw you out at second base. Because knowing the signs is a big advantage, coaches will actually tell their players to watch and see if they can figure out the other team’s signs.
That’s considered fair play and it’s part of the game. The problem is, our brains aren’t great at figuring out complex patterns. So we set out to create an app that would use machine learning to do exactly that. And by we, I mean Jabril actually sat down and wrote the code and I just made sure he had unlimited Cheez-Its and LaCroix. Here’s how it works:
If I saw the coach touch his nose, ears, arm, chin, and so on, I would assign those to letters in the app. Then I’d just watch him sign and record the order. After that, I would just let the app know the outcome: was it a steal or not.
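The recording step described here could be represented roughly as follows. This is only a minimal sketch, not the app's actual code: the names `SIGN_TO_LETTER`, `Observation`, and `record` are hypothetical, and the letters for ears and arm are assumed from the order in which the signs are listed (the video only confirms that A is nose and D is chin).

```python
from dataclasses import dataclass

# Assumed mapping: A = nose and D = chin come from the video;
# B and C are guesses based on the order the signs are listed.
SIGN_TO_LETTER = {"nose": "A", "ears": "B", "arm": "C", "chin": "D"}


@dataclass
class Observation:
    letters: str      # the coach's signs, in order, as letters
    was_steal: bool   # the outcome entered after the play


def record(signs, was_steal):
    """Turn one watched sequence of signs plus its outcome into an Observation."""
    letters = "".join(SIGN_TO_LETTER[sign] for sign in signs)
    return Observation(letters, was_steal)


obs = record(["arm", "nose", "chin", "ears"], was_steal=True)
print(obs)  # Observation(letters='CADB', was_steal=True)
```

Storing each sequence as a plain string of letters keeps later pattern matching simple, since back-to-back signs are just adjacent characters.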
After you do this for enough sequences, the app will start to make predictions. In this case, it’s predicting the combination of “A, D”, or “nose then chin” is their steal sign. This worked well enough on my workbench, so now it’s time for the first real-world test
in a kids-versus-adults Wiffle Ball game. And my friend Sara takes the game of Wiffle Ball pretty seriously, so she was the captain of the kids team, and was signaling for them to steal at just the right moments. [Cheering] – And I wasn’t sure about the ethics of using our app against a bunch of little kids, but they were scoring runs and stealing bases with impunity, and then they got so confident, they started talking trash. [kids screaming]
WE WANT A PITCHER,
NOT A BELLY ITCHER! And no one says that about my pitcher, so at that point the gloves came off. We found it easiest to film her signs with our phone, and then scrub through and capture the order in the app afterwards, as opposed to trying to do it in real time. And I’m very happy to report we cracked their code after only three sequences. And what’s cool is once you know the code, you no longer need the app, because you can just watch for the steal signal. And when we saw it, we alerted our pitcher with a secret sign of our own. [music] [screaming]
Ayyyyy! You’re outta there buddy! And after that the tides of the game shifted and we were able to officially prove adults rule, and kids drool. So our app cracked the code after just three sequences and I’m going to show you them to see if you can figure out the steal sign using just your brain. Here’s the first one: And this was a STEAL. Here’s the second: And this was NO STEAL. And this is the third: And it was a STEAL. Pause and go back if you want to try and figure it out, because I’m about to tell you the answer. According to the app, which Sara later confirmed, their steal sign was only if she touched her hat and then left ear back to back. Everything else was just a decoy. Now before we tell you exactly how the app can figure this out so quickly, we need a little background information. So I’ve cornered my buddy Jabril here and I’m gonna make him give us a super simple explanation of machine learning. Now here’s what you should know about Jabril: He’s basically a genius who taught himself how to code when he was 14. He has an amazing YouTube channel you should check out with videos like this one where he made a video game where the character teaches himself how to navigate any maze using machine learning and neural networks. [Jabril]
– All right. Let’s say we have Timmy here. And Timmy likes certain types of toys, but not others. So in a fake example here, he decides this based on how big the toy is and how complicated it is. So, from small to big, and then over here how many parts it has; from just one piece, to a really complex toy with gears and moving parts and things like that. And so, if we ask Timmy about 20 different toys and start to plot those on the graph, we’ll start to see a pattern. So generally, he likes toys that are big and complicated, but does not like toys that are small and simple. And so, by looking at his past preferences, we can make really good predictions for the future. If you show little Timmy here a toy that is this complicated and this big, we’re confident that he’ll like it before we show it to him because it is inside the “like” boundaries. That’s the big deal with machine learning: We don’t have to take the time and show little Timmy here every toy that’s ever been created and record his answers. After we record some likes and dislikes we’re able to draw some boundaries. And precisely where we draw these boundaries is the secret sauce. In this case, with just two inputs, you can just eyeball it and see where to put the boundaries. But… when you have thousands of inputs that interact with each other, it’s impossible for our brains to comprehend where those boundaries should go. However, it’s pretty trivial for a computer using machine learning. [Mark]
– In doing research for this video, I talked to probably over 50 baseball players and coaches. And when we asked about signs, it was surprising to me how they all basically use the same strategy. See if you can pick up on it. [Coach]
– Every coach has an indicator. – I have an indicator so I could do all this, you know random stuff like this, touch anywhere until I touch this, none of that matters. – So it’ll be an indicator and the next sign, that’s the ‘hot’ sign. So it’s indicator, arm is ‘steal’, if I just do arm, that’s nothing. – Oftentimes, I’ll give this ‘Simon didn’t say’. If I go indicator, and then immediately to anywhere on my arm: Steal. Bunt is to the belt; indicator to the belt. [Mark]
– So basically nothing matters and it’s all a decoy until they touch the indicator, and then the very next sign is the instruction. So after the indicator you might be told to bunt, or to take a pitch, or to steal.
And so, since the steal sign comes immediately after the indicator, we just look at a sequence where steal was recorded (here, that’s sequence one), and we look through that sequence two letters at a time and store those combos. We do the same thing for all the sequences where steal was recorded, and whatever two-letter combo shows up in all of them is their indicator and steal sign. In this case, it’s “A, D”. So if we decode that, that means nose is their indicator and chin is their steal sign.
Now, I have a confession: what I just showed you doesn’t use machine learning at all. It’s just a simple algorithm we realized would work once we discovered that pretty much all teams will use an indicator directly before giving the real sign. Based off all the people we talked to, if you’re trying to decode signs, this simple version should work like 90% of the time. But what about the other 10%, where they do anything other than indicator followed by sign? That’s where you need machine learning, because, if done properly, machine learning can crack any code as long as you give it enough training data.
So to really see how good Jabril’s machine learning app was, I generated some training data based off an insanely complicated steal sign I came up with, to see if he could figure it out. My secret sign was a mustache rub as the indicator, followed by any random sign, and then a tooth tap as the steal sign. I can have up to fifty different signals in each sequence. And then, to throw them off even more, if I ever touch my right eyebrow, it’s not a steal, so you ignore everything in the sequence.
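Both ideas in this stretch fit in a few lines of Python. The sketch below is an illustration under stated assumptions, not the app's real code: `decode_simple` keeps the two-letter combos that appear in every steal sequence, and `is_steal_complicated` labels a sequence under the mustache, any sign, tooth rule with the eyebrow override. The function names and the letters M (mustache rub), T (tooth tap), and E (right eyebrow) are hypothetical, and discarding combos seen in no-steal sequences is an added refinement the video does not mention.

```python
def adjacent_pairs(sequence):
    """All back-to-back sign pairs in one sequence, e.g. 'CADB' -> {'CA', 'AD', 'DB'}."""
    return {sequence[i:i + 2] for i in range(len(sequence) - 1)}


def decode_simple(records):
    """Return the two-letter combos present in every sequence marked as a steal.

    `records` is a list of (letters, was_steal) tuples.
    """
    steal_pairs = [adjacent_pairs(seq) for seq, was_steal in records if was_steal]
    if not steal_pairs:
        return set()
    candidates = set.intersection(*steal_pairs)
    # Assumption (not stated in the video): also rule out combos that
    # appeared in a sequence where nobody stole.
    for seq, was_steal in records:
        if not was_steal:
            candidates -= adjacent_pairs(seq)
    return candidates


def is_steal_complicated(sequence):
    """Label a sequence under the complicated sign (letters M, T, E are hypothetical)."""
    if "E" in sequence:  # right eyebrow anywhere cancels everything
        return False
    # Mustache rub, then exactly one sign of any kind, then tooth tap.
    return any(
        sequence[i] == "M" and sequence[i + 2] == "T"
        for i in range(len(sequence) - 2)
    )


simple_records = [("CADB", True), ("BDAC", False), ("BADC", True)]
print(decode_simple(simple_records))  # {'AD'}

hard_records = [(seq, is_steal_complicated(seq)) for seq in ["MATB", "BMCT", "MATE"]]
print(decode_simple(hard_records))  # set(): no adjacent pair is shared
```

The second call comes back empty because the complicated sign never puts the indicator and the steal sign next to each other, which is the kind of case the simple version cannot handle.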
The eyebrow cancels it even if I’ve already given the steal sign. Now, just to set the stage: if we weren’t going to use machine learning, this would take a normal computer thousands of years to solve, because it’s the same as if you’re asking Timmy about every toy ever made instead of just drawing the boundaries.
Before we see if Jabril can successfully crack my code and how long it will take him, let’s just go one layer deeper than the Timmy toy example and see how machine learning mimics the human brain in creating neural networks that can draw those boundaries in the more complicated case of more than two inputs.
There are three main parts to a neural network, and hang with me here, because I’m going to keep this simple. You have the input layer, which in our case is the signs being given. Then way over here you have the output layer, in our case steal or no steal. And then in the middle we have the hidden layers, and right now that’s just a black box. So if the sign was hat, hat, nose, hand, we would tweak these input knobs like this. Each knob is interconnected with the knobs to its left, so depending on how each of these knobs is turned, some simple math occurs at each node. And when you sum up all of those numbers, you’re left with a number between 0 and 1. If that number is really close to 0, that’s no steal, and if the number is really close to 1, that’s a steal.
So you give it a bunch of training data where, given an input, you know what the output should be. You start with these hidden layer knobs turned in totally random directions, and when you add them all up you get something like 0.55.
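The picture so far (input knobs, a hidden layer, and one output between 0 and 1) can be sketched as a tiny forward pass. This is a toy illustration, not Jabril's actual model: the one-hot encoding, the three hidden nodes, the sigmoid squashing function, the missing bias terms, and the names `encode` and `forward` are all assumptions made to keep the example short.

```python
import math
import random

SIGNS = ["hat", "nose", "hand"]  # hypothetical sign vocabulary


def encode(sequence):
    """One-hot encode each position: one input knob per (position, sign)."""
    vector = []
    for sign in sequence:
        vector.extend(1.0 if sign == known else 0.0 for known in SIGNS)
    return vector


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, output_weights):
    """Weighted sum plus sigmoid at every hidden node, then once more at the output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)))
        for row in hidden_weights
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))


rng = random.Random(0)  # fixed seed so the "random knobs" are repeatable
inputs = encode(["hat", "hat", "nose", "hand"])
hidden_weights = [[rng.uniform(-1, 1) for _ in inputs] for _ in range(3)]
output_weights = [rng.uniform(-1, 1) for _ in range(3)]

score = forward(inputs, hidden_weights, output_weights)
print(f"untrained steal score: {score:.2f}")
```

With random weights the score is always some value strictly between 0 and 1 that means nothing yet. Training is what would move it toward 0 or 1 for known examples.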
Well, that doesn’t make sense. You can’t have half a steal, and more importantly, with the training data you know the answers, and you know that this combination should have been a steal. So you just start tweaking these hidden layer knobs until you start to get outputs that are more correct over time. After going through a bunch of examples where you know what the answer should be, eventually you get to a point where any more tweaking of the knobs just makes it less accurate, so you stop. So now you superglue these three hidden layer knobs into place, because you’ve trained your model. And so now, with a brand-new input where you don’t know the answer, it gives you the correct output, in this case a steal. Of course, this is a simple model, but this scales up, so you could have thousands of inputs and thousands of outputs, and you’re able to discover really complicated relationships.
What’s so cool about this is that this is basically how neural networks in our own brains are set up to learn. Once I’ve been given enough training data to understand the interaction between my hand and arm and keeping something balanced like this, then my model is trained and I superglue those middle knobs in my brain. Then I can introduce a totally new input I’ve never tried before and I still know what to do. I don’t need to be trained on every possible different type of object, because my brain has drawn the boundaries.
Think about that. For a long time, before machine learning and neural networks, computers were just following hard-coded instructions given by humans. Now they can learn like humans, only they can do it much faster and more comprehensively than we can. [indistinct] And that’s why, when I gave Jabril the challenge that would take a normal computer using brute-force methods thousands of years to solve… [Jabril]
– Yo, man. Please tell me that your indicator is a mustache rub, and then, you know, whatever, and then a tooth tap. Is that it? [Mark]
– Jabril’s machine learning algorithm just created the right boundaries and
solved it in less than three minutes. The machine learning model requires more data than our simple version, which can solve it after like three sequences. But the upside is that it will eventually decode any set of signs, as long as you’re capturing the inputs right.
And so now it was time for a real-life test. So I asked my buddy Destin to do some secretive recon work. [Destin]
– Okay, dude, this is the field I used to play at growing up. Today, my son is playing on the Rascals. I’m gonna go film some third-base coaches doing their thing, and we’ll plug all that data into the machine learning algorithm. These kids are yelling at me, wanting to know what my channel is. [Mark]
– And the fact they didn’t recognize him just proved to me that whatever was happening in this region was a totally effective disguise. And thanks to Destin’s great interviews and raw footage, we were able to get even more data to show that the methods totally worked. And while Destin just brazenly set up a tripod and filmed all these third-base coaches… [Destin]
– Yeah, ’cause signs are pretty secret, right? – Yeah, yes, sir. – Can I get in trouble for stealing signs? [Mark]
– …I was a little more nervous and discreet. If you put a GoPro in a cup, you can actually just watch the footage and frame the shot in real time, and it just looks like you’re checking dank memes with a drink in your hand. I’ve actually used this trick a few times, especially on my carnival scam science video. So just like with Destin’s footage, it worked perfectly at the games I went to, and it was just a cool feeling, once you’d cracked their code, to be able to predict exactly what was coming next.
I hope you enjoyed this excuse to learn more about machine learning as much as I did. I will put links to both versions of the app in the video description for you to check out. And of course, if there are specific rules against using technology to steal signs in your league, I am not telling you to break them.
Otherwise, from personal experience, you might risk getting some very important people mad at you.
I recently took a trip to the Bahamas to visit some sharks for an upcoming video. On the way there, I noticed some suspicious YouTube account activity, so I did what you should never do while using unsecured public Wi-Fi, and I changed my master YouTube password, as in the password where, if you had it, you could delete all of my videos. And the reason I was not at all worried to do important things over unsecured public Wi-Fi was because I was using NordVPN. So unlike probably everyone else at the airport, I knew I was fully encrypted and protected and anonymous. It’s a game changer for me when I travel, not only for online protection, but all my favorite websites worked the exact same as if I was at home. And NordVPN is the best because it’s super fast, with thousands of servers in over 61 countries and unlimited bandwidth. So if you find yourself using unsecured public Wi-Fi in hotels or coffee shops or airports, or, given today’s headlines, you’re just ready to start taking privacy more seriously, go to nordvpn.com/markrober or use the link in the video description for 75% off their three-year plan, which works out to less than three bucks a month. So it’s nordvpn.com/markrober, and use the code markrober at checkout so they know I sent you, and they’ll throw in an extra month free. Thanks for watching.