Removing variable with big p-value?
I have made a regression with 2 explanatory variables. The summary of that regression shows that one of my variables has a large p-value (0.705). Should I include that variable when writing the y-hat equation?
Tags: statistics, linear-regression, p-value
asked 2 days ago by Camue
2 Answers
This depends on the goal of your analysis. Did you hypothesize that both of your explanatory variables affect the dependent variable? In that case, you shouldn't remove the variable, since you'd be modifying your regression a posteriori (that is, after you've collected your data).
Are you trying to make a descriptive statement about what you're analyzing? For example, are you trying to understand whether education and sex predict income? Similarly, you shouldn't drop the variable, since you'd no longer be able to conclude that one of the two variables has no effect.
Finally, are you trying to make a prediction? In that case, it's appropriate to fit both models and compare their performance. You can do this with a partial F-test (ANOVA).
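To make the last point concrete, here is a minimal sketch (not part of the original answer) of the partial F-test for nested models, using only numpy and synthetic data in which the second predictor has no real effect:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative assumption): x2 has no true effect,
# mimicking a predictor with a large p-value.
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 + rng.normal(size=n)

def rss(X, y):
    """Residual sum of squares from an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

ones = np.ones(n)
X_full = np.column_stack([ones, x1, x2])   # model with both predictors
X_reduced = np.column_stack([ones, x1])    # model without x2

rss_full = rss(X_full, y)
rss_reduced = rss(X_reduced, y)

# Partial F statistic for dropping one predictor:
# F = ((RSS_reduced - RSS_full) / 1) / (RSS_full / (n - p_full))
df_diff = 1
df_full = n - X_full.shape[1]
F = ((rss_reduced - rss_full) / df_diff) / (rss_full / df_full)
print(f"F = {F:.3f}")
```

A small F (compared against an F(1, n-3) critical value) means the extra predictor doesn't improve the fit enough to justify keeping it for prediction.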
answered 2 days ago by fny
This depends on your expected results. In your case, you have only 2 features, so if you remove one of them, there is a real chance you lose important information.
Instead of removing the insignificant feature, try to improve it by detecting anomalies and dropping outliers. A common approach is to plot the covariance (or correlation) matrix to see how related the features are, and to inspect boxplots, adjusting the outlier threshold to obtain more reliable data.
If you have enough data, split it into training, validation, and test sets. You can then refine your model coefficients using the validation set.
Finally, compute summary statistics such as R-squared and p-values, and run comparisons such as an ANOVA test or AIC scores, to decide between the two cases.
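As a sketch of the AIC comparison mentioned above (not part of the original answer; data and the Gaussian-OLS AIC formula up to an additive constant are my own illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (illustrative assumption): x2 is an irrelevant feature.
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + rng.normal(size=n)

def aic(X, y):
    """Gaussian OLS AIC up to an additive constant: n*log(RSS/n) + 2k."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = resid @ resid
    k = X.shape[1]  # number of estimated coefficients
    return len(y) * np.log(rss / len(y)) + 2 * k

ones = np.ones(n)
aic_full = aic(np.column_stack([ones, x1, x2]), y)     # both features
aic_reduced = aic(np.column_stack([ones, x1]), y)      # drop x2

# Lower AIC is preferred; the 2k penalty discourages the extra feature
# unless it improves the fit enough to pay for itself.
print(f"AIC full:    {aic_full:.2f}")
print(f"AIC reduced: {aic_reduced:.2f}")
```

The same two fitted models can then be checked on a held-out test set before committing to either.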
answered yesterday by AnNg