Linear Relationship vs Correlation
I am new to machine learning, and I'm trying to cover some of the basics. One of the assumptions of linear regression is a linear relationship.
However, on Reddit I was told today that no machine learning model requires a correlation between any of the predictors and the output.
My question is: is there a difference between correlation and a linear relationship?
regression correlation
Regarding the connection between simple linear regression and correlation, there are some useful answers on SE. See e.g. the answers to this Q and this answer.
– statmerkur
Dec 28 '18 at 10:43
edited Dec 27 '18 at 23:23
kjetil b halvorsen
asked Dec 27 '18 at 23:17
Jweir136
1 Answer
> One of the assumptions of linear regression is a linear relationship.
There is a fairly common confusion on this matter that makes the scope of linear regression look narrower than it actually is. In regression analysis, we model the expected value of a response variable $Y_i$ conditional on some regressors $\mathbf{x}_i$. In general, we write the response variable as:
$$Y_i = \mathbb{E}(Y_i|\mathbf{x}_i) + \varepsilon_i \quad \quad \quad \varepsilon_i \equiv Y_i - \mathbb{E}(Y_i|\mathbf{x}_i),$$
where the first part is the true regression function and the second part is the error term. (This model form implies that $\mathbb{E}(\varepsilon_i | \mathbf{x}_i) = 0$.) In a linear regression we assume that the true regression function is a linear function of the parameter vector $\boldsymbol{\beta} = (\beta_0,...,\beta_m)$. This gives us the model form:
$$Y_i = \sum_{k=0}^m \beta_k x_{i,k}^* + \varepsilon_i \quad \quad \quad x_{i,k}^* \equiv f_k(\mathbf{x}_i).$$
You can see from this model form that we can transform the original regressors $\mathbf{x}_i$ via any transform we want (including a non-linear transform). Hence, the important thing to notice is that linear regression does not necessarily assume linearity with respect to the regressor variables. The "linear" in linear regression comes from the fact that the model is linear with respect to the parameters in the regression function. Nonlinear regression occurs when the regression function has one or more parameters that cannot be linearised.
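To make this concrete, here is a minimal sketch in Python (assuming NumPy is available; the simulated data, seed, and coefficient values are made up purely for illustration). The response depends on $x$ only through $x^2$, so the sample correlation between $x$ and $y$ is close to zero, yet an ordinary linear regression on the transformed regressor $f_1(x) = x^2$ recovers the coefficients well.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y depends on x only through x**2, plus small noise.
x = rng.uniform(-3.0, 3.0, size=500)
y = 1.0 + 2.0 * x**2 + rng.normal(0.0, 0.1, size=500)

# Design matrix with columns f_0(x) = 1 and f_1(x) = x**2.
# The model is non-linear in x but linear in (beta_0, beta_1).
X = np.column_stack([np.ones_like(x), x**2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Sample correlation between the raw regressor and the response.
r_xy = np.corrcoef(x, y)[0, 1]

print("estimated coefficients:", beta_hat)
print("correlation between x and y:", r_xy)
```

The estimated coefficients should land near the values used in the simulation, while the correlation between the untransformed $x$ and $y$ stays small. This is one way to see why a linear regression can fit well even when the raw predictor is essentially uncorrelated with the output.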
> ...is there a difference between correlation, and a linear relationship?
Correlation is a measure of the strength of a linear relationship between two variables. It arises naturally in the special case of simple linear regression. If we use a simple linear regression model under standard assumptions then we have a single regressor $x_i$, with no transformation of this variable. The simple linear regression model is:
$$Y_i = \beta_0 + \beta_1 x_{i} + \varepsilon_i \quad \quad \quad \varepsilon_i \sim \text{N}(0, \sigma^2).$$
If we fit the simple linear regression model using ordinary least squares (OLS) estimation (the standard estimation method) then we get a coefficient of determination that is equal to the square of the sample correlation between the $y_i$ and $x_i$ values. This gives a close connection between simple linear regression and sample correlation analysis.
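The identity between the coefficient of determination and the squared sample correlation can be checked numerically. The following is a small sketch in Python (assuming NumPy; the simulated data and the true coefficient values are arbitrary choices for illustration) that fits a simple linear regression by OLS and compares $R^2$ with the square of the Pearson correlation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data from a simple linear model with normal errors.
x = rng.normal(size=200)
y = 0.5 + 1.5 * x + rng.normal(size=200)

# OLS fit of y on an intercept and x.
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Coefficient of determination: 1 - RSS / TSS.
residuals = y - X @ beta_hat
r_squared = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean())**2)

# Sample (Pearson) correlation between x and y.
r_xy = np.corrcoef(x, y)[0, 1]

print("R^2:", r_squared)
print("r^2:", r_xy**2)
```

The two printed values should agree up to floating-point error. Note that the equality relies on the model including an intercept and a single untransformed regressor.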
answered Dec 28 '18 at 1:37
Ben