Algoritmo Lab Forum

Answered
Ketaki Bhide (User)
Asked: June 15, 2021 | In: Linear Regression

Linear regression with multiple variables


I was working on an insurance dataset in Python that had both categorical and numeric variables. To fit a linear regression model, I converted the categorical variables to dummy variables, scaled the whole data frame, split it into training and testing sets, and fitted the model on the training data. I found that the VIF of one variable was more than 5. While looking for a way to reduce this high VIF, I came across the Recursive Feature Elimination (RFE) method, which separates essential variables from non-essential ones, and that particular high-VIF variable came out as essential. My query is: should I drop that variable because of its high VIF, or keep it as it is? Is high VIF the only criterion for dropping a variable?
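
For reference, here is a minimal sketch of the workflow described above, using pandas, scikit-learn, and statsmodels; the file name insurance.csv, the target column charges, and the number of features to keep are placeholders:

    import pandas as pd
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    df = pd.read_csv("insurance.csv")                                # placeholder file name
    X = pd.get_dummies(df.drop(columns="charges"), drop_first=True)  # categorical -> dummy variables
    y = df["charges"]

    # Scale the whole predictor frame, then split into training and testing sets
    X_scaled = pd.DataFrame(StandardScaler().fit_transform(X), columns=X.columns)
    X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3, random_state=42)

    # VIF of each predictor in the training data
    vif = pd.Series(
        [variance_inflation_factor(X_train.values, i) for i in range(X_train.shape[1])],
        index=X_train.columns,
    )
    print(vif.sort_values(ascending=False))

    # Recursive Feature Elimination separates essential from non-essential predictors
    rfe = RFE(LinearRegression(), n_features_to_select=5).fit(X_train, y_train)
    print(X_train.columns[rfe.support_])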

  • 1 Answer
  • 38 Views
  • 0 Followers

    1 Answer

    1. Suchita (SME), Best Answer
      Added an answer on June 16, 2021 at 5:55 am

      If this particular variable is essential, it should be included in the model. You may go ahead and build the model with this variable. However, you should check its correlation with the other predictor variables and drop whichever predictor is highly correlated with it, because highly correlated variables provide similar information and hence lead to multicollinearity.
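
      For example, a quick way to check this (a minimal sketch; X_train is the scaled training predictor frame from your workflow and "bmi" is a placeholder for the variable whose VIF exceeded 5):

        suspect = "bmi"   # placeholder name for the high-VIF variable
        corr = X_train.corr()[suspect].drop(suspect)
        print(corr.reindex(corr.abs().sort_values(ascending=False).index))

        # Predictors with, say, |correlation| > 0.8 are candidates to drop instead of the suspect
        print(corr[corr.abs() > 0.8].index.tolist())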

      Also, after building the model, check whether this particular variable is statistically significant (for example, by looking at its p-value) and take appropriate action in the next version of the model.
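
      A minimal sketch of that check with statsmodels (same placeholder names as above):

        import statsmodels.api as sm

        # Fit OLS on the scaled training data and read off the coefficient p-values
        ols = sm.OLS(y_train, sm.add_constant(X_train)).fit()
        print(ols.summary())           # coefficients, p-values, R-squared, ...
        print(ols.pvalues["bmi"])      # p-value of the suspect variable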

      Additionally, try Lasso regression: it performs feature selection intrinsically by shrinking some coefficients to exactly zero, so you can see whether that model keeps the variable in question.
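
      A minimal sketch with scikit-learn's LassoCV (same placeholder names as above):

        import pandas as pd
        from sklearn.linear_model import LassoCV

        # Lasso shrinks some coefficients to exactly zero, i.e. it selects features intrinsically
        lasso = LassoCV(cv=5, random_state=42).fit(X_train, y_train)
        coefs = pd.Series(lasso.coef_, index=X_train.columns)
        print(coefs[coefs != 0])                     # variables Lasso kept
        print("suspect kept:", coefs["bmi"] != 0)    # whether the high-VIF variable survived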

