Linear Regression's Expectation of Prediction Error on a given point in the test set

I'm self-studying the book "The Elements of Statistical Learning", and I'm having trouble deriving equation 2.27. I would appreciate it if anyone could help me with this. To make this question self-contained, I'll list all the information needed for the derivation.

Suppose we have two random variables $Y$ and $X$ with the following relation:

$Y = X^T\beta + \epsilon$, where $\epsilon \sim N(0, \sigma^2)$,

and we fit the data in a training set $\mathcal{T}$ with linear regression by least squares:

$\hat{Y} = X^T \hat{\beta}$, where $\hat{\beta}$ is the estimated parameter, $\hat{Y}$ is our prediction, and $X$ is the input.

We define the Expected Prediction Error (EPE) of a record $(x_0, y_0)$ in the test data as

$EPE(x_0) = E_{y_0|x_0}E_{\mathcal{T}}(y_0 - \hat{y_0})^2$, where $\hat{y_0}$ is our prediction at $x_0$.

According to the book,
$EPE(x_0) = Var(y_0|x_0) + Var_{\mathcal{T}}(\hat{y_0}) + Bias^2(\hat{y_0})$

where $Bias^2(\hat{y_0}) = (E_{\mathcal{T}}\hat{y_0} - y_0)^2$.
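For readers who want to check the quoted decomposition numerically, here is a rough Monte Carlo sketch. Everything concrete in it is an assumption made for illustration and is not taken from the book: the true coefficients `beta`, the test input `x0`, the noise level `sigma`, the sample sizes, and the choice to draw the training inputs from a standard normal. It also measures the bias against the conditional mean $x_0^T\beta$ rather than against a single realized $y_0$, which is my reading of the intended quantity.

```python
import numpy as np


def simulate_epe(n_train=50, p=3, sigma=1.0, n_reps=20000, seed=0):
    """Monte Carlo estimate of EPE(x0) and of the three terms in its decomposition."""
    rng = np.random.default_rng(seed)
    beta = np.arange(1, p + 1, dtype=float)  # hypothetical true coefficients
    x0 = np.ones(p)                          # hypothetical fixed test input

    y0 = np.empty(n_reps)      # test responses drawn at x0
    y0_hat = np.empty(n_reps)  # predictions at x0, one per training set

    for r in range(n_reps):
        # Draw a fresh training set T and fit least squares on it.
        X = rng.normal(size=(n_train, p))
        y = X @ beta + rng.normal(scale=sigma, size=n_train)
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        y0_hat[r] = x0 @ beta_hat
        # Draw the test response independently of the training set.
        y0[r] = x0 @ beta + rng.normal(scale=sigma)

    epe = np.mean((y0 - y0_hat) ** 2)             # left-hand side
    var_y0 = sigma ** 2                           # Var(y0 | x0)
    var_hat = np.var(y0_hat)                      # Var_T(y0_hat)
    bias_sq = (np.mean(y0_hat) - x0 @ beta) ** 2  # squared bias w.r.t. x0^T beta
    return epe, var_y0, var_hat, bias_sq


epe, var_y0, var_hat, bias_sq = simulate_epe()
print("EPE estimate:      ", epe)
print("sum of three terms:", var_y0 + var_hat + bias_sq)
```

The two printed numbers should agree up to simulation noise, and the squared-bias term should be close to zero because least squares is unbiased under this model.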

Here is my derivation:

$EPE(x_0) = E_{y_0|x_0}E_{\mathcal{T}}(y_0 - \hat{y_0})^2$

$= E_{y_0|x_0}E_{\mathcal{T}}(y_0^2 - 2y_0\hat{y_0} + \hat{y_0}^2)$

$= E_{y_0|x_0}(y_0^2) - 2E_{y_0|x_0}(y_0)\,E_{\mathcal{T}}(\hat{y_0}) + E_{\mathcal{T}}(\hat{y_0}^2)$

By the lemma

$Var(X) = E(X^2) - (E(X))^2$ we have

$EPE(x_0) = Var(y_0|x_0) + (E_{y_0|x_0}(y_0))^2 - 2E_{y_0|x_0}(y_0)\,E_{\mathcal{T}}(\hat{y_0}) + Var_{\mathcal{T}}(\hat{y_0}) + (E_{\mathcal{T}}(\hat{y_0}))^2$

$= Var(y_0|x_0) + Var_{\mathcal{T}}(\hat{y_0}) + (E_{\mathcal{T}}(\hat{y_0}) - E_{y_0|x_0}(y_0))^2$, which is very close to the result given in the book: if we can prove $E_{y_0|x_0}(y_0) = y_0$, the proof is complete.

But I have no clue how to prove $E_{y_0|x_0}(y_0) = y_0$, since I don't know what $p(y_0 | x_0)$ is. Could anyone help me with that?

My guess is that, since $y_0 = x_0^T\beta + \epsilon$, $E_{y_0|x_0}(y_0) = E_{y_0|x_0}(x_0^T\beta) + E_{y_0|x_0}(\epsilon) = E_{y_0|x_0}(x_0^T\beta) + \epsilon$, and $x_0^T\beta$ is deterministic given $x_0$, so $E_{y_0|x_0}(y_0) = x_0^T\beta + \epsilon = y_0$, but I'm not sure if this is right.
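One way to probe this guess numerically is to hold $x_0$ fixed, draw many noise values, and average the resulting $y_0$. The sketch below does that; the values of `beta`, `x0`, and `sigma` are made up for illustration. Under the model as stated, $\epsilon$ has mean zero, so the average of $y_0$ should settle near $x_0^T\beta$, while each individual $y_0$ still differs from it by its own $\epsilon$. That would suggest the step $E_{y_0|x_0}(\epsilon) = \epsilon$ is where the guess goes wrong.

```python
import numpy as np

rng = np.random.default_rng(1)

beta = np.array([2.0, -1.0])  # hypothetical true coefficients
x0 = np.array([1.0, 3.0])     # hypothetical fixed test input
sigma = 0.5                   # hypothetical noise standard deviation

# Many independent draws of y0 at the same x0.
eps = rng.normal(scale=sigma, size=200_000)
y0 = x0 @ beta + eps

cond_mean = y0.mean()  # Monte Carlo estimate of E(y0 | x0)
cond_var = y0.var()    # Monte Carlo estimate of Var(y0 | x0)

print("x0^T beta:          ", x0 @ beta)
print("estimated E(y0|x0): ", cond_mean)
print("estimated Var(y0|x0):", cond_var, "vs sigma^2 =", sigma ** 2)
```

If the averages behave as described, the third term in the derivation becomes $(E_{\mathcal{T}}(\hat{y_0}) - x_0^T\beta)^2$, a fixed number that depends on $x_0$ only and not on any single realized $y_0$.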

Thank you very much for your reading!

Possibly related questions:

What is Expected Prediction Error (EPE) a function of?

Expected squared prediction error conditioned on training set

edited Jan 18 at 2:39

asked Jan 18 at 2:31

许94.94 XU

488

statistics linear-regression
