What Makes a Good Feature? – Machine Learning Recipes #3
November 30, 2019 | By Stanley Isaacs | Presenter: Josh Gordon
Category: Articles, Blog
Tags: binary classification, how to pick good features, independent features, introduction to machine learning, machine learning, machine learning algorithms, machine learning examples, machine learning features, machine learning projects, machine learning recipes, machine learning tutorial

100 Comments

giant_neural_network says (April 28, 2016 at 3:28 am): Loved the stacked histogram, a nice way to visualize the different means of the distributions!

Koushik Khan says (April 28, 2016 at 3:52 am): Great series for machine learning! Only one thing: please upload videos within a week.

Samir Fersobe says (April 28, 2016 at 5:46 am): Is this heaven?

Rokesh Jankie says (April 28, 2016 at 8:35 pm): I like the way you are (story)telling this subject. It makes it accessible for many people. Looking forward to "future episodes". 😉

fatih turgel says (April 28, 2016 at 10:33 pm): Great work! Really love your series! 🙂

Alex Leiva says (April 29, 2016 at 12:56 am): WE WANT MORE EPISODES!!!!

Hugo Alvarado says (April 29, 2016 at 2:57 am): Thank you for these videos! I like how they try to be as simple as possible, but still show real tools used "in the field". Tip for other YouTubers: this is another great resource for those looking to learn more about machine learning and data science: https://github.com/donnemartin/data-science-ipython-notebooks

Michael Roberts says (April 29, 2016 at 3:18 am): A bit light on the info here, I think, compared to the last episode.

mitchese1 says (April 29, 2016 at 6:03 pm): This is a really great series; please publish more often/sooner!

Jerome Etienne says (May 1, 2016 at 4:37 am): So clear! Thanks.

Gareth Hall says (May 1, 2016 at 10:28 am): I could honestly watch Josh all day.
He presents really well. Keep up the good-quality content, Josh! 🙂

Joy King says (May 1, 2016 at 6:49 pm): Very cool series, and I appreciate the links to the examples and especially the "article that inspired" it. Extra links like that really help! Thank you!

Luis Leal says (May 2, 2016 at 12:58 am): How do you use categorical features? For example, say we are training a classifier and one of the features is "State" (or maybe "City"). Do you create a mapping table where every state (or city) gets a numerical representation? Or would you solve this at the programming level, looping through the states (or cities) and fitting a classifier for every state (or city)?

Alson Yap says (May 2, 2016 at 2:58 pm): Great job on the video! I can tell that you have taken feedback from previous videos and made adjustments. Thanks for the effort! Will be waiting for the next episode 🙂

Охтеров Егор says (May 3, 2016 at 6:42 am): Amazing. I understand everything perfectly 🙂

VladVlog says (May 4, 2016 at 3:23 am): Thanks for the video. Really appreciate it!

Eddie Imada says (May 4, 2016 at 2:15 pm): Very good explanation! Looking forward to new episodes!

Philip Salvo says (May 6, 2016 at 6:15 pm): Josh, thank you so much to you and your team for building this series! In particular, I really like your "tl;dr" approach and keeping things grounded in accessible, real-world examples. I can't wait to see what comes next!

hieu nguyen canh says (May 10, 2016 at 2:17 am): Thanks!!!! I hope you will explain the "feature engineering" technique in the future.

Tema Z says (May 10, 2016 at 9:44 am): How do you make that pretty cute cow in the terminal?

Matt Siegel says (May 10, 2016 at 7:21 pm): Terrific episode! Those heuristics for feature selection are invaluable. Also, lol @ whoever produces the graphics: the frontmost dog head tilt XD

Siraj Raval says (May 17, 2016 at 8:12 am): Hell yeah! If you guys like machine learning, check out my new ML series on my channel.
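On the categorical-feature question above ("State"/"City"): mapping categories to plain integers imposes an ordering that isn't really there, and training a separate classifier per state is usually unnecessary. A common alternative is one-hot encoding, one 0/1 column per category. A minimal sketch with a made-up state list (pandas users can get the same effect with pd.get_dummies):

```python
import numpy as np

# Hypothetical "state" feature for a few training examples
states = ["NY", "CA", "NY", "TX"]

# Build a fixed, sorted vocabulary of categories
categories = sorted(set(states))          # ['CA', 'NY', 'TX']
index = {c: i for i, c in enumerate(categories)}

# One row per example, one 0/1 column per category:
# no artificial ordering is imposed on the states.
one_hot = np.zeros((len(states), len(categories)), dtype=int)
for row, s in enumerate(states):
    one_hot[row, index[s]] = 1

print(one_hot)
# [[0 1 0]
#  [1 0 0]
#  [0 1 0]
#  [0 0 1]]
```

The resulting matrix can be concatenated with numeric features like height before training.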
Derek Eskens says (May 17, 2016 at 7:57 pm): Great series so far. Considering independent features: for something like a dog, would capturing both height and weight be counterproductive, since they are most likely interrelated?

DragoonGalaxy says (May 19, 2016 at 11:27 am): What's the program you use to write the Python code?

José Carlos Lazarte Aspíllaga says (June 3, 2016 at 4:15 pm): I have limited knowledge of programming and computer science, yet I find this series very approachable and fun.

Jimmy Vivas says (June 6, 2016 at 9:40 pm): We want more of these videos.

Muammar El Khatib says (June 16, 2016 at 10:44 pm): Very clear explanation.

Crash-Test says (June 17, 2016 at 12:59 pm): <3

asiddiqi123 says (June 29, 2016 at 7:27 pm): Is it possible that we don't know what features lie in the data, and we do some processing to find the features we need?

Warren Kushner says (July 6, 2016 at 3:20 pm): Awesome series! Please make more videos!

Sanjaya Kumar Sahoo says (July 12, 2016 at 3:47 pm): Awesome series; it dispels the myth that machine learning is difficult.

MJ Lim says (July 20, 2016 at 4:54 am): This is explained in such a simple and practical way. Loving this series! 🙂

Kelvin Kagia Kim says (July 20, 2016 at 12:23 pm): So awesome. I think machine learning is the easiest topic I have ever come across, having a good background in programming, probability, and statistics.

Diego Lima says (July 30, 2016 at 11:48 pm): I just want you to know that I loved the article reference in this video. Please refer to more nice articles like this.

Stanovich says (August 8, 2016 at 8:58 am): These videos are very helpful. Through them I am learning how to code and understanding happily. How can I bridge the gap between small code examples and real-world problems? Hoping for topics about this 🙂

Stanovich says (August 8, 2016 at 9:34 am): Are there other videos like this tutorial?

Sergio Arroyo says (August 25, 2016 at 11:41 pm): Awesome 😀😀!!!
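On the height/weight question above: correlated features mostly repeat the same information, and you can measure that overlap directly with a correlation coefficient. A small sketch using synthetic, made-up data (not from the video):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dogs: weight is largely driven by height plus noise,
# so the two features carry overlapping information.
height = 24 + 4 * rng.standard_normal(500)
weight = 2.5 * height + rng.standard_normal(500)

# A feature unrelated to height, for contrast
eye_darkness = rng.standard_normal(500)

r_hw = np.corrcoef(height, weight)[0, 1]
r_he = np.corrcoef(height, eye_darkness)[0, 1]

print(round(r_hw, 2))  # close to 1.0: largely redundant with height
print(round(r_he, 2))  # close to 0.0: independent of height
```

When two features are this strongly correlated, keeping both adds little new information; keeping the one that is easier to measure reliably is a reasonable default.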
husain zafar says (August 27, 2016 at 11:09 pm): I think np.random.randn is not restricted to the range (-1, 1), so the error wouldn't be just +/- 4". Btw, excellent videos, and loving your style of explaining!!

Max Titkov says (September 5, 2016 at 10:13 am): Thank you a lot!!!!!!!

raven says (September 5, 2016 at 5:19 pm): Good video series, but the instructor is acting in an unnatural way. He is making forced facial expressions. Too annoying.

Venkat Krishna Turlapati says (October 1, 2016 at 2:13 am): Awesome videos, probably the best ML course by far.

Andrés Colón says (November 13, 2016 at 3:04 am): Good job on these videos!

Harsh Seth says (December 10, 2016 at 3:29 pm): Nice T-shirt! 😛

SCAPE-IT says (January 8, 2017 at 5:42 pm): These vids are great! Thank you and keep it up!

deevioo says (January 17, 2017 at 8:20 pm): What about the ratio between the height of a dog and the width of its head?

Ajinkya Jumbad says (February 15, 2017 at 5:52 pm): Wouldn't latitude and longitude give you accurate distances with some very simple calculations?

AW Crowe says (March 4, 2017 at 7:45 pm): Do a thought experiment… wow, you are making a huge assumption that people know how to use criteria logically to make a decision… we collect information until we are comfortable with our decision. Now, if you are saying collect enough features to know for certain, that is a different question.

AW Crowe says (March 4, 2017 at 7:52 pm): Simplified classifiers, OK. What if feature ratios are significant, or there are sorting stations for the letters with varied release times for the letters… eh, just confusing things. Will the ML program figure out relationships between features if it is supervised, or is it our job to figure out the relationships? Actually, isn't the point that if we have a system with 2500 features and do not know how they are significant, then ML will figure it out?
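The randn point above is right: np.random.randn draws from a standard normal distribution, which is unbounded, so 28 + 4 * randn(...) is not confined to 24–32 inches. Roughly 32% of draws land more than one standard deviation (4") from the mean, which is why the histogram's tails extend past that band. A quick check:

```python
import numpy as np

rng = np.random.default_rng(0)
grey_height = 28 + 4 * rng.standard_normal(100_000)

# Fraction of simulated greyhound heights outside 28 +/- 4 inches:
# for a normal distribution, about 32% lie beyond one standard deviation.
outside = np.mean(np.abs(grey_height - 28) > 4)
print(round(outside, 2))  # ~0.32

# The extremes go well past the +/- 4" band
print(grey_height.max() > 32, grey_height.min() < 24)  # True True
```

This also explains why the plotted heights can reach values like 35 or more even though the code only says 28 + 4.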
Shrestha Diwash says (May 22, 2017 at 12:27 am): It's a great, easy-to-understand series. We want more.

Robert Nsinga says (June 6, 2017 at 6:10 am): Sometimes I see other faces on people… Josh Gordon reminds me of Gustavo Fring from Breaking Bad! OK, back to coding.

Tiago says (June 6, 2017 at 11:17 am): I have created a GitHub repo with all of the code for all of the recipes in this series. I've used Python 3 for all recipes. I've also updated all of the libraries and have added some things to the code here and there. Check it out: https://github.com/TheCoinTosser/MachineLearningGoogleSeries

Dipali Malviya says (June 18, 2017 at 5:53 am): Hey Josh! I want to learn the concepts and standard algorithms of machine learning. Please suggest how I can do this; it would be so helpful. I also want more tutorials in this series. Thanks for this nine-episode series.

abeeweeda says (June 30, 2017 at 7:31 pm): When the max height is 28 + 4, where does 35 come from?

Akarsh Rastogi says (July 2, 2017 at 8:58 am): The decision tree that he follows to decide when and how often he SMILES creeps me out..! 1:06

Mayank Gupta says (July 12, 2017 at 8:01 pm): Hi all, I created a nicely formatted repository containing the code from this video, but updated to work with new packages: https://github.com/officialgupta/MachineLearningRecipes (like this so people can see it!)

Guilherme Iazzetta says (August 1, 2017 at 1:04 am): Hey humans. This speaker isn't a human.

John Gabriel says (August 2, 2017 at 2:44 am): ML is a misnomer for what you have been describing in your videos. No learning of any form takes place by the computer. The processes you have established as part of your libraries for acquiring data and making predictions have ZERO to do with learning. In fact, what you have done is automate some simple processes that are solutions in your mind to the problem of identifying useful features and labeling the same.
https://www.linkedin.com/pulse/what-artificial-intelligence-john-gabriel-1 With regards to features, you can learn what it means for a concept to be well formed here: https://www.linkedin.com/pulse/what-does-mean-concept-well-defined-john-gabriel There is no such thing as AI. Humans are AI and will never create anything that works like the human mind. It took far superior beings (not God, and I am an atheist!) millions of years to create AI in the form of us. The human brain is a feat of galactic engineering. Nothing we as humans can create will ever be anything but a joke compared to the real AI, the human brain. Robots will never think independently or be capable of thought processes like those found in the human brain. Of course, nothing is stopping you from trying. But if you do not follow my advice and realise that you need KATIS, you will no doubt never get close to anything but a joke!

Ernest G. Wilson II says (August 6, 2017 at 5:40 pm):

# Import NumPy
import numpy as np

# Import matplotlib
import matplotlib.pyplot as plt

# 500 of each dog
greyhounds = 500
labs = 500

# Set the dog heights +/- 4" randomly
grey_height = 28 + 4 * np.random.randn(greyhounds)
lab_height = 24 + 4 * np.random.randn(labs)

# Plot
plt.hist([grey_height, lab_height], stacked=True, color=['r', 'b'])

# Launch the results in a window
plt.show()

jean watson says (August 11, 2017 at 12:59 pm): waawww

newcoolvid27 says (August 20, 2017 at 7:00 pm): If you're using Spyder and want a new window to show the plot, go to [Tools > Preferences > IPython console > Graphics > Graphics backend > Backend: Automatic], then restart Spyder.
Saman Sadeghyan says (September 5, 2017 at 11:33 am): Feature selection by #preml: https://github.com/5amron/pre-ml https://www.youtube.com/watch?v=ByLijacuqdQ

Sachin Bhat says (September 30, 2017 at 5:08 am): Hey, I am getting the following error, please help!

Traceback (most recent call last):
  File "dog.py", line 10, in <module>
    plt.hist([grey_height, lab_height], stacked=True, colour=['red', 'blue'])
  File "/home/sachin/.local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 3081, in hist
    stacked=stacked, data=data, **kwargs)
  File "/home/sachin/.local/lib/python2.7/site-packages/matplotlib/__init__.py", line 1898, in inner
    return func(ax, *args, **kwargs)
  File "/home/sachin/.local/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 6389, in hist
    p.update(kwargs)
  File "/home/sachin/.local/lib/python2.7/site-packages/matplotlib/artist.py", line 885, in update
    for k, v in props.items()]
  File "/home/sachin/.local/lib/python2.7/site-packages/matplotlib/artist.py", line 878, in _update_property
    raise AttributeError('Unknown property %s' % k)
AttributeError: Unknown property colour

[email protected] says (October 2, 2017 at 2:19 pm): Doing it in Python 3? Don't want to pause the video and type? Find the code here: https://github.com/akanshajainn/Machine-Learning—Google-Developers

jack sjsjs says (October 25, 2017 at 4:59 pm): You should probably talk more about NumPy at this point.

Zohaib says (November 7, 2017 at 1:08 pm): Has anyone else noticed that he never blinks?

Rajnish Rajput says (November 24, 2017 at 1:46 am): So if I want to make a program that identifies all the dogs in the world, do I have to store data on all the dogs in the world, like height, weight, speed, hair, etc.?

MattieCooper says (January 2, 2018 at 12:42 pm): Exactly what I needed! Thank you so much! Love your presentation!

Danny Hunn says (January 2, 2018 at 7:53 pm): Couldn't you use latitude and longitude to find the Euclidean distance?
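On the latitude/longitude questions in this thread: plain Euclidean distance on raw degrees is distorted, because a degree of longitude covers less ground the further you are from the equator. The standard fix is the great-circle (haversine) distance. A sketch, with approximate city coordinates chosen for illustration:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in km."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * r * asin(sqrt(a))

# New York -> Los Angeles (approximate coordinates)
d = haversine_km(40.71, -74.01, 34.05, -118.24)
print(round(d))  # on the order of 3,900 km
```

For short distances (the letter-sorting example in the video), Euclidean distance on projected coordinates is often good enough; over continental scales the haversine correction matters.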
Bipul Mohanto says (January 6, 2018 at 8:15 am): Complex things in the easiest words; thanks a lot.

theburntcrumpet says (January 28, 2018 at 4:42 pm): I'm a bit late to the party here, but I'd just like to say thanks to the Google Developers channel for putting these videos out there.

Yeshwin Hk says (January 30, 2018 at 4:22 am): This guy is definitely a robot.

Stewie Griffin says (February 26, 2018 at 5:52 am): These episodes are only making me dumber.

Bhavy Khatri says (March 6, 2018 at 10:37 pm): His smile is so motivating.

Jamie Quigley says (March 21, 2018 at 11:40 am): Creepy smiles at the end of each sentence. "Smile MORE, Josh," marketing bellows!

Peter Hepp says (March 29, 2018 at 2:01 am): Seriously, Google… you hired this guy to explain your video… he can hardly keep from smiling, lolololololol. Google, you are some kind of pranksters… better to use FEET, who are you kidding??????

R M says (April 1, 2018 at 4:51 am): What does the greyhound = 500 line actually do?

Denise Dias says (April 28, 2018 at 6:14 pm): Thank you.

Brahm life says (May 4, 2018 at 5:50 pm): Can you have sub-features of features in your decision tree algorithm?

The travel of time says (May 16, 2018 at 12:31 am): Why does my bar graph look way worse aesthetically than yours?

Everything Tech Review says (May 18, 2018 at 3:13 pm): Mine doesn't overlap; is that normal?

ReverseMe says (May 22, 2018 at 2:22 pm): Better to remove inches.

Harendra Singh says (June 18, 2018 at 9:13 pm): This is an awesome series! The best thing ever in ML 😛 (well, not "the best", but yeah!)

MiniGam3s says (August 10, 2018 at 6:45 pm): I wish I could look so happy ^^

Nandish Ajani says (August 20, 2018 at 7:26 am): I tried the same code, but the graph looks very ugly. There are no spaces between the bars. Can anyone please help?

Amey Naik says (September 22, 2018 at 3:28 pm): Wasn't this episode a bit inconclusive?
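On the greyhounds = 500 question above: that line just stores how many synthetic dogs to generate. The number is later passed to np.random.randn, which returns that many random values, so each dog gets one simulated height. A tiny version:

```python
import numpy as np

greyhounds = 3  # how many synthetic dogs to generate (500 in the video)

# randn(n) returns an array of n standard-normal draws;
# scaling by 4 and shifting by 28 turns them into heights.
grey_height = 28 + 4 * np.random.randn(greyhounds)

print(grey_height.shape)  # (3,): one simulated height per dog
```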
Does anyone know the next episode in this series that discusses features?

Abdullah Aghazadah says (October 1, 2018 at 3:03 am): Quick summary of the video:
– Let's say your goal is to develop a program that can distinguish between two breeds of dogs.
– What features do you want your example data to have?
– You want the features to be the "distinguishing" features between the breeds, i.e. features that are very different between the two dog breeds.
– For example, if the two dog breeds tend to have very different heights, you want to use height as a feature in your training data.
– If, on the other hand, the two breeds have about the same distribution of eye colors, you don't want to use eye color as a feature.
– You also don't want to use features that are highly correlated (i.e. that don't bring in new information).
– You want to use simple features, as simple features will require fewer examples to get a decent classifier.
– You want to be careful about adding too many features, especially if they are not "distinguishing" features; they may just by chance be distinguishing in your example data, and your classifier will start basing its predictions on these faulty features.
Key thing to take away from the video: selecting features is extremely important. Select the simple, distinguishing features that bring in new information (i.e. that aren't highly correlated). Thanks so much for these videos!

Avinash Ravi says (October 2, 2018 at 9:18 am): Where can I learn more deeply about ML algorithms with statistics?

Gabriel Enrique Rueda Acevedo says (October 16, 2018 at 12:36 am): How does training work? My program marks an apple as an orange.

ARLEQUINA says (November 29, 2018 at 6:04 am): The graph doesn't show; I only get this message: <Figure size 640x480 with 1 Axes>

Davie Chen says (December 13, 2018 at 11:23 pm): Dem doggos.
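The "don't use highly correlated features" rule from the summary above can be turned into a crude mechanical check: keep a feature only if it is not nearly a duplicate of one already kept. A toy sketch on synthetic, made-up data (the 0.95 threshold and feature names are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Synthetic two-breed toy data
height = np.where(rng.random(n) < 0.5, 28.0, 24.0) + 4 * rng.standard_normal(n)
weight = 2.0 * height + rng.standard_normal(n)  # near-duplicate of height
coat_shade = rng.standard_normal(n)             # unrelated to height

features = {"height": height, "weight": weight, "coat_shade": coat_shade}

# Greedily keep a feature only if it is not highly correlated
# with any feature we have already kept.
kept = []
for name, col in features.items():
    if all(abs(np.corrcoef(col, features[k])[0, 1]) < 0.95 for k in kept):
        kept.append(name)

print(kept)  # ['height', 'coat_shade']: weight is dropped as redundant
```

This is only a filter for redundancy; it says nothing about whether a kept feature is actually informative for the labels, which still needs the "distinguishing distributions" check from the video.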
奇l says (December 19, 2018 at 2:32 am): 。。

Sadiq Sariq says (February 10, 2019 at 6:14 am): What is the difference between np.random.random() and np.random.randn()?

Lucas Lima says (February 23, 2019 at 8:12 pm):

# version 3.7.2
import numpy as np
import matplotlib.pyplot as plt

galgo = 500
labrador = 500

galgo_height = 28 + 4 * np.random.randn(galgo)
labrador_height = 24 + 4 * np.random.randn(labrador)

plt.hist([galgo_height, labrador_height], stacked=True, color=['r', 'b'])
plt.show()

Gary D says (April 1, 2019 at 2:17 am): I loved it, but my dog would prefer a squirrel-detection algorithm.

Bilal Raja says (April 9, 2019 at 2:57 pm): I'm trying to learn two things at once here, Python and machine learning, but I guess it's not too hard, as I already know C#, PHP, etc. ML is also not very hard at first, but it gets a little complicated as you go deeper.

Amo Masi says (April 19, 2019 at 4:47 pm): Hi guys. Code that works now so you can follow:

import numpy as np
import matplotlib.pyplot as plt

greyhounds = 500
labs = 500

grey_height = 28 + 4 * np.random.randn(greyhounds)
lab_height = 24 + 4 * np.random.randn(labs)

plt.hist([grey_height, lab_height], stacked=True, color=['r', 'b'])
plt.show()

Amo Masi says (April 19, 2019 at 5:33 pm): This was incredibly useful, Josh. Thanks.

hejar shahabi says (June 2, 2019 at 12:03 am): Holy god, I've been looking all over YouTube and torrents to find the best package for learning machine learning, and I have watched many videos, but believe me, I am totally in love with your course. This is awesome; you explain it so simply, it is like you are teaching me in my mother tongue. I wish you the best, my man.

Hubert Rozmarynowski says (August 19, 2019 at 2:17 pm): How do you get that pretty data visualisation from matplotlib, like at 2:05?
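On the random-vs-randn question above: np.random.random() draws uniformly from [0, 1), while np.random.randn() draws from a standard normal distribution (mean 0, standard deviation 1, unbounded). A quick comparison:

```python
import numpy as np

np.random.seed(0)

u = np.random.random(100_000)  # uniform draws from [0, 1)
g = np.random.randn(100_000)   # standard normal: mean 0, std 1, unbounded

print(u.min() >= 0 and u.max() < 1)  # True: uniform never leaves [0, 1)
print(g.min() < -3 and g.max() > 3)  # True: normal tails pass +/- 3

print(round(u.mean(), 1))  # 0.5
print(round(g.mean(), 1))  # 0.0
print(round(g.std(), 1))   # 1.0
```

That is why the video's heights use randn: 28 + 4 * randn(...) gives a bell curve centered at 28 with standard deviation 4, rather than a flat band of values.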
Gaurav Tiwari says (August 22, 2019 at 2:30 pm): What if I don't know what a dog is and I need to identify it… My program is just a toddler, and it is learning from the Internet…