Generalised Linear Models: Binary data

I am currently working on a GLM problem.

My response variable is binary, as are some of my explanatory variables; the others are categorical (e.g. 1: 1 day, 2: 2-3 days, 3: 5+ days, and so forth).
I have coded them as factors.

My question is: I have used the step function and I am left with a model containing many insignificant variables. In this case, do I simply drop these variables and, if not, what should I do instead?
I also tried doing the model selection manually, using the anova function to test whether the differences in deviance were significant, and this gives an answer that is somewhat different from the automatic model selection. Is this to be expected?

How do I go about my model selection, and how can I test whether the functional form of my variables is correct?

Thanks for any help! :)

asked 2 days ago

odesinit

255

You might want to do a search for related terms on stats.stackexchange.com
– shadowtalker
2 days ago

statistics statistical-inference binary logistic-regression

1 Answer

Model selection is an art that draws on many statistical skills and analysis techniques. Generally, if you choose the correct model form and select the variables sensibly, the resulting coefficients will be meaningful and the model will predict the target variable more accurately. You can check this by splitting the data into a training set, a validation set and a test set.
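The split mentioned above can be sketched in a few lines of plain Python. This is only an illustration: the function name `split_indices`, the 60/20/20 proportions and the fixed seed are my assumptions, not something the answer prescribes.

```python
import random

def split_indices(n, frac_train=0.6, frac_valid=0.2, seed=0):
    """Shuffle row indices 0..n-1 and cut them into train/validation/test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)      # seeded so the split is reproducible
    n_train = round(n * frac_train)
    n_valid = round(n * frac_valid)
    train = idx[:n_train]
    valid = idx[n_train:n_train + n_valid]
    test = idx[n_train + n_valid:]        # whatever is left over
    return train, valid, test

# Hypothetical data set with 100 rows.
train, valid, test = split_indices(100)
print(len(train), len(valid), len(test))
```

Fit candidate models on the training rows, compare them on the validation rows, and look at the test rows only once, after the model has been chosen.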

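A GLM has the general form $g(E[y_i])=\beta_0+\sum_{j=1}^{p}\beta_j x_{ij}$, where $g$ is the link function; with the identity link and normal errors this reduces to the linear model $y_i=\beta_0+\sum_{j=1}^{p}\beta_j x_{ij}+\epsilon_i$. We focus mostly on how to choose the right distribution for the random component $Y$ and how best to transform the predictors.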
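For the binary response in the question, this general form is usually specialised to logistic regression. A sketch in my own notation (the symbols $\pi_i$, $p$ and $x_{ij}$ are assumptions, not taken from the answer above):

$$
Y_i \sim \mathrm{Bernoulli}(\pi_i), \qquad
\log\frac{\pi_i}{1-\pi_i} = \beta_0 + \sum_{j=1}^{p}\beta_j x_{ij}, \qquad
\pi_i = \frac{1}{1+\exp\!\Big(-\big(\beta_0 + \sum_{j=1}^{p}\beta_j x_{ij}\big)\Big)}.
$$

Here there is no additive error term: the randomness enters through the Bernoulli distribution of $Y_i$, and the linear predictor describes the log-odds rather than the response itself.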

Predictions will be more accurate if you use the right distribution for the target variable. For example, you can explore the distribution with the Tweedie family, whose power parameter selects the type of distribution: discrete (Poisson), continuous (Normal, Gamma) or mixed (compound Poisson). You can then apply shrinkage methods suited to each type of distribution.

For the predictors $X$, instead of removing an insignificant feature straight away, first try to improve it by detecting anomalies or dropping outliers. Common steps are to plot the covariance (or correlation) matrix to see how strongly the features are related, to inspect boxplots of the continuous features and adjust the outlier threshold, and to expand categorical features into a dummy (indicator) matrix.
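As an illustration of the dummy matrix for a categorical feature, here is a minimal sketch of reference-level (treatment) coding. The function name `dummy_code` and the level labels are hypothetical, loosely modelled on the duration categories in the question.

```python
def dummy_code(values, levels):
    """Return one 0/1 column per non-reference level; levels[0] is the reference."""
    unknown = set(values) - set(levels)
    if unknown:
        raise ValueError(f"unknown levels: {sorted(unknown)}")
    others = levels[1:]                   # the reference level gets no column
    return [[1 if v == lvl else 0 for lvl in others] for v in values]

# Hypothetical factor with three levels.
levels = ["1 day", "2-3 days", "5+ days"]
duration = ["1 day", "2-3 days", "5+ days", "1 day"]

rows = dummy_code(duration, levels)
for label, row in zip(duration, rows):
    print(label, row)
```

Dropping the reference level keeps the dummy columns from being collinear with the intercept; each remaining coefficient is then a contrast against that reference level.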

After that, you can fit the model and analyze the result. Use several statistics and tests to see how well the features explain the target variable, such as R-squared, adjusted R-squared, p-values, ANOVA, likelihood-ratio tests and AIC. Use the validation set (or cross-validation) to improve the model.
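To make the likelihood-based comparisons concrete, here is a small sketch comparing an intercept-only logistic model with one that adds a single binary predictor. The counts are made up for illustration. With only a binary predictor, the fitted probabilities are just the group proportions, so no iterative fitting is needed; that shortcut is an assumption of this toy example and does not hold for general models.

```python
import math

def bernoulli_loglik(successes, n):
    """Maximised log-likelihood of n Bernoulli trials sharing one probability."""
    p = successes / n
    ll = 0.0
    if successes > 0:
        ll += successes * math.log(p)
    if successes < n:
        ll += (n - successes) * math.log(1 - p)
    return ll

# Hypothetical data: (successes, trials) for x = 0 and x = 1.
group0 = (20, 50)
group1 = (35, 50)

# Smaller model: intercept only (1 parameter).
ll_null = bernoulli_loglik(group0[0] + group1[0], group0[1] + group1[1])
# Larger model: intercept + binary x (2 parameters).
ll_full = bernoulli_loglik(*group0) + bernoulli_loglik(*group1)

lr_stat = 2 * (ll_full - ll_null)             # drop in deviance
p_value = math.erfc(math.sqrt(lr_stat / 2))   # chi-square tail probability, 1 df
aic_null = -2 * ll_null + 2 * 1
aic_full = -2 * ll_full + 2 * 2

print(f"deviance drop = {lr_stat:.3f}, p-value = {p_value:.4f}")
print(f"AIC null = {aic_null:.2f}, AIC full = {aic_full:.2f}")
```

The two criteria need not agree. AIC prefers the larger model whenever the deviance drops by more than 2 per extra parameter, whereas a 5% likelihood-ratio test on one degree of freedom needs a drop of about 3.84. Since R's step function uses AIC by default, it tends to keep variables that a deviance test would reject as insignificant.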

Apply the result and the tests above, then repeat the model selection steps until you reach a satisfactory result.

My resources: Non-Life Insurance Pricing with Generalized Linear Models, by Esbjörn Ohlsson and Björn Johansson, and other papers on specific topics.

Hope this helps.

edited 2 days ago

answered 2 days ago

AnNg

374

New contributor
