Making decisions and finding insights from your data is an essential part of data science, and one algorithm which is widely used for this purpose is the decision tree. Keeping the importance of decision trees in mind, we have come up with this comprehensive course. Now, before we go ahead with the session, I’d like to inform you guys that we’ve launched a completely free platform where you have access to free courses such as AI, cloud, and digital marketing.
Now let’s have a glance at the agenda. We’ll start by understanding what exactly machine learning is and what the decision tree algorithm is in machine learning. Then we’ll go ahead and implement the decision tree algorithm with the R programming language. After that we’ll understand the concepts of entropy and Gini index. Going ahead, we’ll understand what exactly overfitting is and how we can prevent overfitting in decision trees, and finally we’ll understand Shannon’s entropy. So let’s start the session. All right, so we’ll understand machine learning with this little example over here. So what do you see here? What is this exactly? Well, it’s a fish, isn’t it? And what is this? Well, this again is a fish.
And how about this? Well, this too is a fish. Now, how do you know all of these are fish? Well, as a kid you might have come across a picture of a fish, and you would have been told by your kindergarten teacher or your parents that this is a fish, and your brain learned that anything which looks like that is a fish, right? That is how our brain functions. But what about a machine? If I take this image of a fish and feed it to a machine, will it be able to understand that this is actually a fish? Well, this is where machine learning comes in. So what I’ll do is take all of these images of fish and keep on feeding them to this machine until it learns all the features associated with a fish.
Now, when I say all the features associated with a fish, the features would be, let’s say: a fish has two eyes, it has two fins, it has a tail, it has gills, and so on. So this machine would learn all of these features associated with the fish. Once the training is done, it is given new data to determine how much it has learned. So when I say training, first the machine is given training data. Once the training is done and it has learned all of the features associated with the training data, it is given the test data. This is where we are actually testing how much the machine has learned.
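The training-then-testing sequence just described can be sketched in a few lines of Python. This is only an illustration, not part of the demo: the helper name train_test_split, the 25% test fraction, and the image file names are all made up for the example.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and split it into (train, test).

    Hypothetical helper for illustration; the fraction and seed are
    arbitrary choices, not values taken from the session.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(samples)           # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Made-up labeled examples: (image file name, label).
images = [
    (f"image_{i}.png", "fish" if i % 2 == 0 else "not fish")
    for i in range(8)
]

# The machine learns from `train` and is then scored on the unseen `test`.
train, test = train_test_split(images)
print(len(train), "training samples,", len(test), "test samples")
```

With eight samples and a test fraction of 0.25, six samples end up in the training set and two are held back for testing. The key point is that the test samples are never shown to the machine during training.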
So we have given this new image of a fish to the machine, and we’re trying to estimate whether the machine is able to correctly label this image as a fish or not. So this is the underlying concept of machine learning. Now that we know what exactly machine learning is, we’ll have a look at the categories of machine learning. Machine learning can be broadly categorized into supervised learning and unsupervised learning, and we’ll start with the first one, which is supervised learning.
So in supervised learning, you basically have output variables and input variables. The output variable is denoted with y and the input variable is denoted with x. The output variable is also known as the dependent variable, and the input variable is also known as the independent variable. We’re trying to determine how the output variable changes with respect to the input variable, or basically how the dependent variable varies with respect to the independent variable. That’s why you see y = f(x) over here: y would be the dependent variable and x would be the independent variable, and we’re trying to determine whether y varies with x or not. Again, supervised learning can be broadly categorized into regression and classification, and we’ll start with classification.
So as you see over here, classification is basically the process of predicting the class of a new observation. Let’s take this example to understand classification better. Let’s say you want to determine whether a person has cancer or not based on whether the person smokes or not. Over here, whether the person has cancer or not would be the dependent variable, and whether the person smokes or not would be the independent variable. So you’d basically have a data set which would comprise two columns: one column would be whether the person has cancer or not, and the next column would be whether the person smokes or not. Based on the smoking column, we’re trying to determine whether the person has cancer or not. This is basically the underlying method of classification. Now, there is another method known as regression.
So as stated over here, this method is basically used to find a linear relationship among different entities. Over here we have a dependent variable and an independent variable. The dependent variable is denoted with y and the independent variable is denoted with x. Now, what happens in regression is this: let’s say I put in some random value of x; we would have to get a value of y for that. So for an arbitrary value of x, we’re trying to find the corresponding value of y. This is what is basically known as regression. Now we’ll head on to the most important part of the session, which is the decision tree algorithm. The decision tree algorithm is a supervised learning method which is used for both classification and regression purposes.
Right, so you can both classify the data and also use it to find out the relationship between y and x. So let’s understand the decision tree algorithm with this image over here. Let’s say I have this very simple question: I want to know whether I should watch the movie Avengers or not. For this, I have a set of decisions to make. What you have is basically a tree-like structure, and this tree-like structure is the decision tree over here. We’ll start with the first condition, which is whether I like Marvel movies or not. If I like Marvel movies, I’ll come onto the left side of the tree, and then I’ll check the second condition, which is whether I am a fan of Robert Downey Jr.
If I am a fan of Robert Downey Jr., I will definitely watch Avengers. But if I don’t like Robert Downey Jr., then I will not watch Avengers. Similarly, if I don’t like Marvel movies, then the next condition would be whether I’m a DC fanboy or not. If I am a DC fanboy, then I will definitely not watch Avengers, because I’m anti-Marvel. On the other hand, if I’m not a DC fanboy but really like Scarlett Johansson, then I will go and watch Avengers. But if I don’t like Marvel movies, I’m not a DC fan, and I also don’t like Scarlett Johansson, then I’ll not watch Avengers. So this is what happens in the decision tree algorithm.
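To make the tree concrete, the Avengers example can be written as nested conditions in Python. This is a hand-written sketch of the tree described above, not the output of a learning algorithm; the function name will_watch_avengers and its parameter names are made up for the illustration.

```python
def will_watch_avengers(likes_marvel, fan_of_rdj=False,
                        dc_fan=False, likes_scarlett=False):
    """Walk the hand-built decision tree from the Avengers example.

    Each `if` is one node of the tree; each `return` is a leaf.
    All names here are hypothetical and chosen for readability.
    """
    if likes_marvel:
        # Left branch: the only remaining question is Robert Downey Jr.
        if fan_of_rdj:
            return True      # Marvel fan and RDJ fan: watch
        return False         # Marvel fan but not an RDJ fan: skip
    # Right branch: not a Marvel fan.
    if dc_fan:
        return False         # DC fan: definitely skip
    if likes_scarlett:
        return True          # neither Marvel nor DC, but likes Scarlett Johansson: watch
    return False             # none of the above: skip

# One call per leaf of the tree.
print(will_watch_avengers(True, fan_of_rdj=True))
print(will_watch_avengers(True, fan_of_rdj=False))
print(will_watch_avengers(False, dc_fan=True))
print(will_watch_avengers(False, likes_scarlett=True))
print(will_watch_avengers(False))
```

A real decision tree algorithm learns these questions and their order from data, rather than having them written out by hand, but the resulting model is evaluated in exactly this way: start at the root, answer one question per node, and stop at a leaf.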
So we have a set of decisions which we are making over here, and on the basis of this set of decisions it will go ahead and classify the data. This is how the decision tree algorithm works. All right, so this is the basic idea behind the decision tree algorithm. Now we’ll have a comprehensive demo of the decision tree algorithm in both R and Python, and we’ll start with R. We’ll be working with the Carseats data set, and this Carseats data set is actually part of the ISLR package, so I’ll go ahead and load the ISLR package first. To load the package, I will type in library and then give the name of the package, which is ISLR.
So I have loaded the package. Now, this ISLR package basically has some data sets with it, so I’d like to have a glance at the different data sets which are part of this ISLR package. As you see over here, these are the different data sets: you’ve got Auto, Caravan, Carseats, College, Credit, Default, and so on. Out of all of these data sets, I want to work with this Carseats data set, which is basically about the sales of child car seats. Now I have to load this data set, so this is the Carseats data set. As you see over here, it is actually spelled with a capital C. Now I will go ahead and store this in a new object.