Archives: Questions
Dropping variables that are not significant and have a high VIF value
To improve accuracy when fitting a linear model, we drop the input variables/features that are not significant (i.e. p-value > 0.05) and whose variance inflation factor (VIF) is greater than 5. I have run into a case where two input variables are highly correlated with each other and both are significant, so which one should I drop?
Multiple Linear Regression
If the p-value of a specific independent variable is less than 0.05, it is considered a statistically significant variable. How can these variables be extracted from the dataset using Python?
Scaling for numeric variables
Why is scaling of numeric variables done before splitting the data into train and test datasets?
Intercept in linear regression model
Why do we need an intercept in a linear regression model? If we use statsmodels OLS, do we need to add the intercept explicitly, and how do we do it?
Linear regression with multiple variables
I was working on an insurance dataset in Python that had both categorical and numeric variables. To fit a linear regression model, I converted the categorical variables to dummy variables and then scaled the whole data frame. After splitting the data into training and testing sets and fitting the model on the training data, I found that VIF of…
Constant in Logistic Regression
Why do we need to add a constant column to the independent variables when doing logistic regression?
P-Value in Linear Regression
In linear regression, if the p-value of the F-statistic is < 0.05, we reject H0 in favor of Ha, which means that at least one variable has a coefficient different from zero. For individual variables, we consider a variable statistically significant if its p-value is < 0.05. Why is this so?
Model Overfitting or Underfitting
How do I know if my model is overfitting or underfitting?
How can I validate user input in Python
How can I make the program ask for valid inputs instead of crashing when non-sensible data is entered?
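A common pattern for this is to loop until the input parses cleanly, catching the exception that bad input raises. The sketch below is one way to write it; the helper names ask_until_valid and parse_age, and the 0 to 150 age range, are hypothetical choices made for illustration.

```python
def ask_until_valid(prompt, parse, read=input):
    """Keep asking until parse() accepts the reply, then return the parsed value.

    read defaults to the built-in input(); it is a parameter so that the
    function can be exercised without a keyboard.
    """
    while True:
        raw = read(prompt)
        try:
            return parse(raw)
        except ValueError as err:
            # Report the problem and ask again instead of crashing.
            print(f"Invalid input: {err}. Please try again.")


def parse_age(text):
    """Convert text to an int and reject values outside an assumed 0-150 range."""
    age = int(text.strip())  # raises ValueError for non-numeric text
    if not 0 <= age <= 150:
        raise ValueError("age must be between 0 and 150")
    return age
```

In an interactive program this would be called as `age = ask_until_valid("Enter your age: ", parse_age)`. Other kinds of input only need a different parse function that raises ValueError on bad data.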