Is the average of betas from Y ~ X and X ~ Y valid?












7












$begingroup$


I am interested in the relationship between two time series variables: $Y$ and $X$. The two variables are related to each other, and it's not clear from theory which one causes the other.

Given this, I have no good reason to prefer the linear regression $Y = \alpha + \beta X$ over $X = \kappa + \gamma Y$.

Clearly there is some relationship between $\beta$ and $\gamma$, though I recall enough statistics to understand that $\beta = 1/\gamma$ is not true. Or perhaps it's not even close? I'm a bit hazy.

The problem is to decide how much of $X$ one ought to hold against $Y$.

I'm considering taking the average of $\beta$ and $1/\gamma$ and using that as the hedge ratio.

Is the average of $\beta$ and $1/\gamma$ a meaningful concept?

And as a secondary question (perhaps this should be another post), what is the appropriate way to deal with the fact that the two variables are related to each other -- meaning that there really isn't an independent and dependent variable?





















$endgroup$








  • 1




    $begingroup$
    The problem is not causality but instead the errors of measurement (it is just that often the dependent variable Y is the one with large measurement error, making "Y = a + B x + error" the common expression). Do you have an idea about the errors in the measurement of X and Y?
    $endgroup$
    – Martijn Weterings
    Jan 6 at 12:04








  • 1




    $begingroup$
    The exact values of $\beta$ and $\gamma$ can be found in this answer of mine to Effect of switching responses and explanatory variables..., and, as you suspect, $\beta$ is not the reciprocal of $\gamma$, and averaging $\beta$ and $1/\gamma$ is not the right way to go. A pictorial view of what $\beta$ and $\gamma$ are minimizing is given in Elvis's answer to the same question, and he introduces a "least rectangles" regression that you might want .....
    $endgroup$
    – Dilip Sarwate
    Jan 6 at 15:43








  • 3




    $begingroup$
    You are in the ideal scenario where the choice of technique has a direct, physically measurable impact; you can simply measure the out-of-sample hedging error for each estimate, and compare them. Also, typically optimal hedging is better handled by using a VECM model (see for example Gatarek & Johansen, 2014, Optimal hedging with the cointegrated vector autoregressive model), which does not require choosing to model Y as a function of X or vice-versa.
    $endgroup$
    – Chris Haug
    Jan 6 at 16:32






  • 1




    $begingroup$
    You might want to look at the geometric mean $\sqrt{\dfrac{\beta}{\gamma}}$ as a possibility (if they are both negative you might take the negative square root). Then look at $\dfrac{s_y}{s_x}$, which should be very similar.
    $endgroup$
    – Henry
    Jan 6 at 18:37








  • 1




    $begingroup$
    @ricardo Note that I specified out-of-sample error, so not the (in-sample) fit of the model. And it is entirely possible for the optimal hedge ratio to change over time (especially if the relationship is not actually linear); that doesn't change the fact that figuring out the best hedging strategy can be most directly done by backtesting the model and observing the results.
    $endgroup$
    – Chris Haug
    Jan 6 at 23:53
















regression regression-coefficients














edited Jan 9 at 22:23 – ricardo
asked Jan 6 at 7:46 – ricardo


















4 Answers


















11












$begingroup$

To see the connection between both representations, take a bivariate Normal vector:
$$
\begin{pmatrix}
X_1 \\
X_2
\end{pmatrix} \sim \mathcal{N} \left( \begin{pmatrix}
\mu_1 \\
\mu_2
\end{pmatrix} , \begin{pmatrix}
\sigma^2_1 & \rho \sigma_1 \sigma_2 \\
\rho \sigma_1 \sigma_2 & \sigma^2_2
\end{pmatrix} \right)
$$

with conditionals
$$X_1 \mid X_2=x_2 \sim \mathcal{N} \left( \mu_1 + \rho \frac{\sigma_1}{\sigma_2}(x_2 - \mu_2),(1-\rho^2)\sigma^2_1 \right)$$
and
$$X_2 \mid X_1=x_1 \sim \mathcal{N} \left( \mu_2 + \rho \frac{\sigma_2}{\sigma_1}(x_1 - \mu_1),(1-\rho^2)\sigma^2_2 \right)$$
This means that
$$X_1=\underbrace{\left(\mu_1-\rho \frac{\sigma_1}{\sigma_2}\mu_2\right)}_\alpha+\underbrace{\rho \frac{\sigma_1}{\sigma_2}}_\beta X_2+\sqrt{1-\rho^2}\sigma_1\epsilon_1$$
and
$$X_2=\underbrace{\left(\mu_2-\rho \frac{\sigma_2}{\sigma_1}\mu_1\right)}_\kappa+\underbrace{\rho \frac{\sigma_2}{\sigma_1}}_\gamma X_1+\sqrt{1-\rho^2}\sigma_2\epsilon_2$$
which means (a) $\gamma$ is not $1/\beta$ and (b) the connection between the two regressions depends on the joint distribution of $(X_1,X_2)$.
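A quick numerical check of these relations may help. The sketch below uses simulated data and hypothetical helper names (`moments`, `beta`, `gamma`, `r` are choices made here, not part of the answer); it assumes ordinary least-squares slopes computed from sample moments, and uses only the Python standard library.

```python
import random

def moments(u, v):
    """Return (sum of squares of u, sum of squares of v, cross-product), all centred."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return suu, svv, suv

random.seed(0)
# Simulated, illustrative data: X_1 depends linearly on X_2 plus independent noise.
x2 = [random.gauss(0.0, 1.0) for _ in range(5000)]
x1 = [2.0 * b + random.gauss(0.0, 1.5) for b in x2]

s11, s22, s12 = moments(x1, x2)
beta = s12 / s22               # slope of the regression of X_1 on X_2
gamma = s12 / s11              # slope of the regression of X_2 on X_1
r = s12 / (s11 * s22) ** 0.5   # sample correlation

print("beta        =", beta)
print("1 / gamma   =", 1.0 / gamma)
print("beta*gamma  =", beta * gamma, " r^2 =", r ** 2)
print("sqrt(b/g)   =", (beta / gamma) ** 0.5, " s_1/s_2 =", (s11 / s22) ** 0.5)
```

For sample OLS slopes the product `beta * gamma` equals the squared sample correlation, so `beta` and `1 / gamma` coincide only when the correlation is perfect; the geometric mean `sqrt(beta / gamma)` reduces to the ratio of the sample standard deviations.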

















$endgroup$













  • $begingroup$
    How would I decide if the average of the two betas is a better measure of the hedge ratio than one or the other?
    $endgroup$
    – ricardo
    Jan 6 at 9:08






  • 4




    $begingroup$
    I have no idea.
    $endgroup$
    – Xi'an
    Jan 6 at 10:20










  • $begingroup$
    @ricardo By measuring the out-of-sample hedging error under each estimate, which is ultimately what you are trying to minimize.
    $endgroup$
    – Chris Haug
    Jan 6 at 16:35
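The out-of-sample comparison suggested in the comment above can be sketched as follows. This is only an illustration on simulated data with a stable linear relationship; the function names, the 50/50 split, and the definition of hedging error as the variance of $Y - hX$ are assumptions made for this sketch, and on real series the ranking of the candidates is an empirical question.

```python
import random

def mean(v):
    return sum(v) / len(v)

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def hedge_error_variance(x, y, h):
    """Sample variance of the hedged position Y - h*X (assumed error measure)."""
    resid = [b - h * a for a, b in zip(x, y)]
    m = mean(resid)
    return sum((e - m) ** 2 for e in resid) / (len(resid) - 1)

random.seed(2)
n = 2000
# Simulated changes in X and Y (illustrative numbers only).
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [1.5 * a + random.gauss(0.0, 1.0) for a in x]

# Estimate on the first half, evaluate on the second half.
x_fit, y_fit = x[: n // 2], y[: n // 2]
x_test, y_test = x[n // 2 :], y[n // 2 :]

beta = ols_slope(x_fit, y_fit)    # Y ~ X
gamma = ols_slope(y_fit, x_fit)   # X ~ Y

candidates = {
    "beta": beta,
    "1/gamma": 1.0 / gamma,
    "average": 0.5 * (beta + 1.0 / gamma),
    "geometric": (beta / gamma) ** 0.5,
}
errors = {name: hedge_error_variance(x_test, y_test, h) for name, h in candidates.items()}
best = min(errors, key=errors.get)

for name in candidates:
    print(name, round(candidates[name], 3), round(errors[name], 3))
print("smallest out-of-sample error:", best)
```

With this particular error measure the in-sample minimiser is $\beta$ by construction, so the interesting question on real data is whether that advantage survives out of sample or under a different loss.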





















3












$begingroup$

Converted from a comment.....



The exact values of $\beta$ and $\gamma$ can be found in this answer of mine to Effect of switching responses and explanatory variables in simple linear regression, and, as you suspect, $\beta$ is not the reciprocal of $\gamma$, and averaging $\beta$ and $\gamma$ (or averaging $\beta$ and $1/\gamma$) is not the right way to go. A pictorial view of what $\beta$ and $\gamma$ are minimizing is given in Elvis's answer to the same question, and in the answer, he introduces a "least rectangles" regression that might be what you are looking for. The comments following Elvis's answer should not be neglected; they relate this "least rectangles" regression to other, previously studied, techniques. In particular, note that Moderator chl points out that this method is of interest when it is not clear which is the predictor variable and which the response variable.















$endgroup$





















    3












    $begingroup$


    $\beta$ and $\gamma$

    As Xi'an noted in his answer, $\beta$ and $\gamma$ are related to each other through the conditional means of $Y|X$ and $X|Y$ (which in turn derive from a single joint distribution). They are not symmetric, in the sense that $\beta \neq 1/\gamma$. This remains true even if you 'know' the true $\sigma$ and $\rho$ instead of using estimates. You have $$\beta = \rho_{XY} \frac{\sigma_Y}{\sigma_X}$$ and $$\gamma = \rho_{XY} \frac{\sigma_X}{\sigma_Y}$$

    or you could say

    $$\beta \gamma = \rho_{XY}^2 \leq 1$$

    See also simple linear regression on wikipedia for computation of the $\beta$ and $\gamma$.

    It is this correlation term that disturbs the symmetry. If $\beta$ and $\gamma$ were simply the ratios of the standard deviations, $\sigma_Y/\sigma_X$ and $\sigma_X/\sigma_Y$, then they would indeed be each other's inverse. The $\rho_{XY}$ term can be seen as modifying this as a sort of regression to the mean.

    • With perfect correlation, $|\rho_{XY}| = 1$, you can fully predict $X$ based on $Y$ or vice versa. The slopes will be each other's reciprocal: $$\beta \gamma = 1$$

    • But with less than perfect correlation, $|\rho_{XY}| < 1$, you cannot make those perfect predictions and the conditional mean will be somewhat closer to the unconditional mean, in comparison to a simple scaling by $\sigma_Y/\sigma_X$ or $\sigma_X/\sigma_Y$. The slopes of the regression lines will be less steep. The slopes will not be each other's reciprocal and their product will be smaller than one: $$\beta \gamma < 1$$




    Is a regression line the right method?



    You may wonder whether these conditional probabilities and regression lines are what you need to determine your ratios of $X$ and $Y$. It is unclear to me how you would wish to use a regression line in the computation of an optimal ratio.

    Below is an alternative way to compute the ratio. This method does have symmetry (i.e. if you switch X and Y then you will get the same ratio).





    Alternative



    Say the yields of bonds $X$ and $Y$ are distributed according to a multivariate normal distribution$^\dagger$ with correlation $\rho_{XY}$ and standard deviations $\sigma_X$ and $\sigma_Y$. Then the yield of a hedge that is a weighted sum of $X$ and $Y$ will be normally distributed:

    $$H = \alpha X + (1-\alpha) Y \sim N(\mu_H,\sigma_H^2)$$

    where $0 \leq \alpha \leq 1$ and with

    $$\begin{array}{rcl}
    \mu_H &=& \alpha \mu_X+(1-\alpha) \mu_Y \\
    \sigma_H^2 &=& \alpha^2 \sigma_X^2 + (1-\alpha)^2 \sigma_Y^2 + 2 \alpha (1-\alpha) \rho_{XY} \sigma_X \sigma_Y \\
    & =& \alpha^2(\sigma_X^2+\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y) + \alpha (-2 \sigma_Y^2+2\rho_{XY}\sigma_X\sigma_Y) +\sigma_Y^2
    \end{array} $$

    The maximum of the mean $\mu_H$ will be at $$\alpha = 0 \text{ or } \alpha=1$$ or does not exist when $\mu_X=\mu_Y$.

    The minimum of the variance $\sigma_H^2$ will be at $$\alpha = 1 - \frac{\sigma_X^2 -\rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2 +\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y} = \frac{\sigma_Y^2-\rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2+\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y} $$

    The optimum will be somewhere in between those two extremes and depends on how you wish to compare losses and gains.

    Note that now there is a symmetry between $\alpha$ and $1-\alpha$. It does not matter whether you use the hedge $H=\alpha_1 X+(1-\alpha_1)Y$ or the hedge $H=\alpha_2 Y + (1-\alpha_2) X$. You will get the same ratios in terms of $\alpha_1 = 1-\alpha_2$.
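The minimum-variance weight and its symmetry can be checked numerically. The following is a minimal sketch with made-up input numbers; the function names are hypothetical, and the code returns the unconstrained minimiser, so it does not enforce $0 \leq \alpha \leq 1$.

```python
def min_variance_weight(var_x, var_y, cov_xy):
    """Weight alpha on X that minimises Var(alpha*X + (1 - alpha)*Y).

    Unconstrained minimiser: the result can fall outside [0, 1].
    """
    denom = var_x + var_y - 2.0 * cov_xy   # equals Var(X - Y)
    if denom == 0.0:
        raise ValueError("Var(X - Y) is zero, so the weight is not unique")
    return (var_y - cov_xy) / denom

def portfolio_variance(alpha, var_x, var_y, cov_xy):
    """Variance of alpha*X + (1 - alpha)*Y."""
    return (alpha ** 2 * var_x
            + (1.0 - alpha) ** 2 * var_y
            + 2.0 * alpha * (1.0 - alpha) * cov_xy)

# Illustrative inputs: sigma_X = 1, sigma_Y = 2, rho_XY = 0.25.
var_x, var_y, cov_xy = 1.0, 4.0, 0.5

alpha_x = min_variance_weight(var_x, var_y, cov_xy)   # weight on X
alpha_y = min_variance_weight(var_y, var_x, cov_xy)   # roles of X and Y swapped

print("weight on X:", alpha_x)
print("weight on Y (roles swapped):", alpha_y)
print("variance at the optimum:", portfolio_variance(alpha_x, var_x, var_y, cov_xy))
```

Swapping the roles of $X$ and $Y$ gives a weight that is exactly one minus the original, which is the symmetry that the two regression slopes lack.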



    Minimal variance case and relation with principal components

    In the minimal variance case (here you actually do not need to assume a multivariate Normal distribution) you get the following hedge ratio as optimum $$\frac{\alpha}{1-\alpha} = \frac{var(Y) - cov(X,Y)}{var(X)-cov(X,Y)}$$ which can be expressed in terms of the regression coefficients $\beta = cov(X,Y)/var(X)$ and $\gamma = cov(X,Y)/var(Y)$. Since $var(Y) - cov(X,Y) = var(Y)(1-\gamma)$ and $var(X) - cov(X,Y) = var(X)(1-\beta)$, and $var(Y)/var(X) = \beta/\gamma$ when $cov(X,Y) \neq 0$, this gives $$\frac{\alpha}{1-\alpha} = \frac{var(Y)\,(1-\gamma)}{var(X)\,(1-\beta)} = \frac{\beta(1-\gamma)}{\gamma(1-\beta)}$$

    In a situation with more than two variables/stocks/bonds you might generalize this to the last (smallest eigenvalue) principal component.





    Variants



    Improvements of the model can be made by using different distributions than multivariate normal. Also you could incorporate the time in a more sophisticated model to make better predictions of future values/distributions for the pair $X,Y$.





    $\dagger$ This is a simplification but it suits the purpose of explaining how one can, and should, perform the analysis to find an optimal ratio without a regression line.

















    $endgroup$









    • 1




      $begingroup$
      I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
      $endgroup$
      – Martijn Weterings
      Jan 7 at 11:41












    • $begingroup$
      I think i understand. The problem is that $1/\rho_{XY} \ne \rho_{XY}$. Indeed, $\rho_{XY}$ often changes quite a bit when we take the inverse. Your alternative is close to the case I am thinking about, but i do want to check one thing: does this allow non-negative holdings? Adopting your terminology, i'd have a unit holding of bond X, and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y ... but it could be 0.2 units or 5 units, depending on the math.
      $endgroup$
      – ricardo
      Jan 7 at 11:42










    • $begingroup$
      long means that i make 1% on a bond if the price increases by ~1%; short means that i lose ~1% on a bond if the price increases by ~1%. So the idea is that i am long one unit of one bond (so i benefit from an appreciation) and am short some amount of the other bond (so i lose from an appreciation).
      $endgroup$
      – ricardo
      Jan 7 at 11:46










    • $begingroup$
      "The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression how you decide about this. How do you define losses and gains and how much do you value them?
      $endgroup$
      – Martijn Weterings
      Jan 7 at 11:46












    • $begingroup$
      Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
      $endgroup$
      – Martijn Weterings
      Jan 7 at 12:04



















    1












    $begingroup$

    Perhaps the approach of "Granger causality" might help. This would help you to assess whether X is a good predictor of Y or whether Y is a better predictor of X. In other words, it tells you whether beta or gamma is the thing to take more seriously. Also, considering that you are dealing with time series data, it tells you how much of the history of X counts towards the prediction of Y (or vice versa).



    Wikipedia gives a simple explanation:
    A time series X is said to Granger-cause Y if it can be shown, usually through a series of t-tests and F-tests on lagged values of X (and with lagged values of Y also included), that those X values provide statistically significant information about future values of Y.



    What you do is the following:




    • regress Y(t) on X(t-1) and Y(t-1)

    • regress Y(t) on X(t-1), X(t-2), Y(t-1), Y(t-2)

    • regress Y(t) on X(t-1), X(t-2), X(t-3), Y(t-1), Y(t-2), Y(t-3)


    Continue for whatever history length might be reasonable. Check the significance of the F-statistics for each regression.
    Then do the same in reverse (so, now regress X(t) on the past values of X and Y) and see which regressions have significant F-values.
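The lagged regressions described above can be sketched as follows. This is a minimal illustration on simulated data, not the R example the answer links to. The helper names (`solve`, `rss`, `granger_f`) are made up for this sketch, and it assumes the usual form of the test, in which a regression of Y(t) on its own lags is compared with one that also includes lags of X. Only the F statistic is computed; a p-value would additionally need the F distribution with `lags` and `n - 2*lags - 1` degrees of freedom.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def rss(rows, target):
    """Residual sum of squares of an OLS fit of target on the columns in rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum((t - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, t in zip(rows, target))

def granger_f(cause, effect, lags):
    """F statistic for H0: lags of `cause` add nothing beyond lags of `effect`."""
    times = range(lags, len(effect))
    target = [effect[t] for t in times]
    # Restricted model: intercept plus own lags of the effect series.
    own = [[1.0] + [effect[t - k] for k in range(1, lags + 1)] for t in times]
    # Unrestricted model: additionally include lags of the candidate cause.
    full = [row + [cause[t - k] for k in range(1, lags + 1)]
            for row, t in zip(own, times)]
    rss_restricted, rss_full = rss(own, target), rss(full, target)
    dof = len(target) - (2 * lags + 1)
    return ((rss_restricted - rss_full) / lags) / (rss_full / dof)

random.seed(1)
n = 500
# Simulated series in which lagged X helps predict Y, but not the other way round.
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [0.0] * n
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + random.gauss(0.0, 1.0)

f_xy = granger_f(x, y, lags=2)   # does X help predict Y?
f_yx = granger_f(y, x, lags=2)   # does Y help predict X?
print("F (X -> Y):", f_xy)
print("F (Y -> X):", f_yx)
```

Repeating the call with `lags=1, 2, 3, ...` corresponds to the growing history lengths in the list above.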



    A very straightforward example, with R code, is found here.
    Granger causality has been critiqued for not actually establishing causality (in some cases). But it seems that your application is really about "predictive causality," which is exactly what the Granger causality approach is meant for.



    The point is that the approach will tell you whether X predicts Y or whether Y predicts X (so you no longer would be tempted to artificially--and incorrectly--compound the two regression coefficients) and it gives you a better prediction (as you will know how much history of X and Y you need to know to predict Y), which is useful for hedging purposes, right?















    $endgroup$













    • $begingroup$
      I have a strong theoretical reason to believe that neither is truly a cause, and that even if one became a cause that it would not remain true over time. So i don't think that Granger causality is the answer in this case. I've upvoted the answer in any case, as it is useful -- esp. the R code.
      $endgroup$
      – ricardo
      Jan 7 at 3:04










    • $begingroup$
      That is why I explicitly mention that "Granger causality has been critiqued for not actually establishing causality (in some cases)." It seems to me that your question is more about establishing "predictive causality," which is what Granger causality is meant for. In addition, Granger's approach uses the information in your time series data, which are a waste not to use if you have them. Of course, you can (should?) re-estimate the effects over time. I expect that the Granger effects are more stable than cross-sectional OLS (you can test this beforehand, using historical data). HTH
      $endgroup$
      – Steve G. Jones
      Jan 7 at 7:04




































    4 Answers









    11












    $begingroup$

    To see the connection between both representations, take a bivariate Normal vector:
    $$
    \begin{pmatrix}
    X_1 \\
    X_2
    \end{pmatrix} \sim \mathcal{N} \left( \begin{pmatrix}
    \mu_1 \\
    \mu_2
    \end{pmatrix} , \begin{pmatrix}
    \sigma^2_1 & \rho \sigma_1 \sigma_2 \\
    \rho \sigma_1 \sigma_2 & \sigma^2_2
    \end{pmatrix} \right)
    $$

    with conditionals
    $$X_1 \mid X_2=x_2 \sim \mathcal{N} \left( \mu_1 + \rho \frac{\sigma_1}{\sigma_2}(x_2 - \mu_2),(1-\rho^2)\sigma^2_1 \right)$$
    and
    $$X_2 \mid X_1=x_1 \sim \mathcal{N} \left( \mu_2 + \rho \frac{\sigma_2}{\sigma_1}(x_1 - \mu_1),(1-\rho^2)\sigma^2_2 \right)$$
    This means that
    $$X_1=\underbrace{\left(\mu_1-\rho \frac{\sigma_1}{\sigma_2}\mu_2\right)}_\alpha+\underbrace{\rho \frac{\sigma_1}{\sigma_2}}_\beta X_2+\sqrt{1-\rho^2}\sigma_1\epsilon_1$$
    and
    $$X_2=\underbrace{\left(\mu_2-\rho \frac{\sigma_2}{\sigma_1}\mu_1\right)}_\kappa+\underbrace{\rho \frac{\sigma_2}{\sigma_1}}_\gamma X_1+\sqrt{1-\rho^2}\sigma_2\epsilon_2$$
    which means (a) $\gamma$ is not $1/\beta$ and (b) the connection between the two regressions depends on the joint distribution of $(X_1,X_2)$.
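
The two slopes above can be checked numerically. What follows is only a minimal sketch, assuming NumPy and illustrative parameter values ($\sigma_1 = 2$, $\sigma_2 = 1$, $\rho = 0.6$) that are not part of the answer; `regression_slopes` is a hypothetical helper name.

```python
import numpy as np

def regression_slopes(x1, x2):
    """Return (beta, gamma, rho): slope of x1 on x2, slope of x2 on x1, correlation."""
    cov = np.cov(x1, x2)            # 2x2 sample covariance matrix
    beta = cov[0, 1] / cov[1, 1]    # regress X_1 on X_2
    gamma = cov[0, 1] / cov[0, 0]   # regress X_2 on X_1
    rho = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
    return beta, gamma, rho

# Illustrative (assumed) parameters of the bivariate normal.
rng = np.random.default_rng(0)
sigma1, sigma2, rho_true = 2.0, 1.0, 0.6
cov_true = [[sigma1**2, rho_true * sigma1 * sigma2],
            [rho_true * sigma1 * sigma2, sigma2**2]]
x1, x2 = rng.multivariate_normal([0.0, 0.0], cov_true, size=50_000).T

beta, gamma, rho = regression_slopes(x1, x2)
print("beta:", beta, "1/gamma:", 1.0 / gamma)
print("beta*gamma:", beta * gamma, "rho^2:", rho**2)
```

With these values $\beta$ should come out near $\rho\,\sigma_1/\sigma_2 = 1.2$ while $1/\gamma$ is near $\sigma_1/(\rho\,\sigma_2) \approx 3.3$, and the product $\beta\gamma$ equals the squared sample correlation.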

















    $endgroup$













    • $begingroup$
      How would I decide if the average of the two betas is a better measure of the hedge ratio than one or the other?
      $endgroup$
      – ricardo
      Jan 6 at 9:08






    • 4




      $begingroup$
      I have no idea.
      $endgroup$
      – Xi'an
      Jan 6 at 10:20










    • $begingroup$
      @ricardo By measuring the out-of-sample hedging error under each estimate, which is ultimately what you are trying to minimize.
      $endgroup$
      – Chris Haug
      Jan 6 at 16:35
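
The out-of-sample comparison suggested in the last comment could look roughly like this. It is a sketch under stated assumptions: the hedged position is taken to be $Y - hX$, the "hedging error" is measured as the variance of that position on held-out data, and the data are synthetic i.i.d. draws (real series would need changes or returns and probably rolling re-estimation). The function name `out_of_sample_hedge_variance` is hypothetical.

```python
import numpy as np

def out_of_sample_hedge_variance(x, y, split=0.5):
    """Fit candidate hedge ratios on the first part of the sample and
    report the variance of the hedged position y - h*x on the remainder."""
    n_fit = int(len(x) * split)
    x_fit, y_fit = x[:n_fit], y[:n_fit]
    x_test, y_test = x[n_fit:], y[n_fit:]

    cov = np.cov(x_fit, y_fit)
    beta = cov[0, 1] / cov[0, 0]     # slope of y on x
    gamma = cov[0, 1] / cov[1, 1]    # slope of x on y
    candidates = {
        "beta": beta,
        "inverse_gamma": 1.0 / gamma,
        "average": 0.5 * (beta + 1.0 / gamma),
    }
    return {name: float(np.var(y_test - h * x_test, ddof=1))
            for name, h in candidates.items()}

# Synthetic example (assumed data-generating process, correlation about 0.6).
rng = np.random.default_rng(1)
n = 4_000
x = rng.normal(0.0, 1.0, n)
y = 1.2 * x + rng.normal(0.0, 1.6, n)

errors = out_of_sample_hedge_variance(x, y)
for name, value in errors.items():
    print(f"{name:>14}: {value:.3f}")
```

On data generated this way the ratio $\beta$ tends to give the smallest residual variance, because it is the variance-minimising coefficient for $Y - hX$; whether that carries over to a particular pair of real series is exactly what the held-out comparison is meant to reveal.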


























    edited Jan 7 at 10:14 by Martijn Weterings

    answered Jan 6 at 8:29 by Xi'an



























    3












    $begingroup$

    Converted from a comment.

    The exact values of $\beta$ and $\gamma$ can be found in this answer of mine to "Effect of switching responses and explanatory variables in simple linear regression", and, as you suspect, $\beta$ is not the reciprocal of $\gamma$, and averaging $\beta$ and $\gamma$ (or averaging $\beta$ and $1/\gamma$) is not the right way to go. A pictorial view of what $\beta$ and $\gamma$ are minimizing is given in Elvis's answer to the same question, and in the answer, he introduces a "least rectangles" regression that might be what you are looking for. The comments following Elvis's answer should not be neglected; they relate this "least rectangles" regression to other, previously studied, techniques. In particular, note that Moderator chl points out that this method is of interest when it is not clear which is the predictor variable and which the response variable.
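
For orientation, here is a small sketch of a symmetric slope of this kind. It assumes that the "least rectangles" line is the one also known as reduced major axis (or geometric mean) regression, with slope $\mathrm{sign}(\mathrm{cov}(X,Y))\,\sigma_Y/\sigma_X$; the linked answer is not reproduced here, so treat that identification, the helper name `least_rectangles_slope`, and the synthetic data as assumptions.

```python
import numpy as np

def least_rectangles_slope(x, y):
    """Symmetric slope of y against x: sign(cov) * sd(y) / sd(x)."""
    c = np.cov(x, y)
    return float(np.sign(c[0, 1]) * np.sqrt(c[1, 1] / c[0, 0]))

# Synthetic, assumed data for illustration only.
rng = np.random.default_rng(2)
x = rng.normal(size=2_000)
y = 0.8 * x + rng.normal(scale=0.5, size=2_000)

slope_yx = least_rectangles_slope(x, y)   # y as a function of x
slope_xy = least_rectangles_slope(y, x)   # x as a function of y
print(slope_yx, slope_xy, slope_yx * slope_xy)
```

Unlike the two ordinary least squares slopes, these two are exact reciprocals, so swapping the roles of the variables describes the same line. In absolute value this slope is the geometric mean of $\beta$ and $1/\gamma$, not their arithmetic average.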















    $endgroup$




























        answered Jan 6 at 23:13 by Dilip Sarwate























            3












            $begingroup$


            $\beta$ and $\gamma$

            As Xi'an noted in his answer, $\beta$ and $\gamma$ are related to each other through the conditional means of $X \mid Y$ and $Y \mid X$ (which in turn derive from a single joint distribution), but they are not symmetric: $\beta \neq 1/\gamma$. This remains true even if you 'knew' the true $\sigma$ and $\rho$ instead of using estimates. You have $$\beta = \rho_{XY} \frac{\sigma_Y}{\sigma_X}$$ and $$\gamma = \rho_{XY} \frac{\sigma_X}{\sigma_Y}$$

            or you could say

            $$\beta \gamma = \rho_{XY}^2 \leq 1$$

            See also simple linear regression on Wikipedia for the computation of $\beta$ and $\gamma$.

            It is this correlation term that breaks the symmetry. If $\beta$ and $\gamma$ were simply the ratios of the standard deviations, $\sigma_Y/\sigma_X$ and $\sigma_X/\sigma_Y$, they would indeed be each other's inverse. The $\rho_{XY}$ factor can be seen as modifying this as a form of regression to the mean.




            • With perfect correlation, $|\rho_{XY}| = 1$, you can fully predict $X$ based on $Y$ or vice versa. The slopes will be each other's reciprocal: $$\beta \gamma = 1$$

            • But with less than perfect correlation, $|\rho_{XY}| < 1$, you cannot make those perfect predictions, and the conditional mean will be somewhat closer to the unconditional mean than a simple scaling by $\sigma_Y/\sigma_X$ or $\sigma_X/\sigma_Y$ would give. The slopes of the regression lines will be less steep. The slopes will not be each other's reciprocal, and their product will be smaller than one: $$\beta \gamma < 1$$




            Is a regression line the right method?



            You may wonder whether these conditional probabilities and regression lines are what you need to determine your ratios of $X$ and $Y$. It is unclear to me how you would wish to use a regression line in the computation of an optimal ratio.

            Below is an alternative way to compute the ratio. This method does have symmetry (i.e., if you switch $X$ and $Y$ you will get the same ratio).





            Alternative



            Say the yields of bonds $X$ and $Y$ follow a multivariate normal distribution$^\dagger$ with correlation $\rho_{XY}$ and standard deviations $\sigma_X$ and $\sigma_Y$. Then the yield of a hedge that is a weighted sum of $X$ and $Y$ will be normally distributed:

            $$H = \alpha X + (1-\alpha) Y \sim N(\mu_H,\sigma_H^2)$$

            where $0 \leq \alpha \leq 1$ and with

            $$\begin{array}{rcl}
            \mu_H &=& \alpha \mu_X+(1-\alpha) \mu_Y \\
            \sigma_H^2 &=& \alpha^2 \sigma_X^2 + (1-\alpha)^2 \sigma_Y^2 + 2 \alpha (1-\alpha) \rho_{XY} \sigma_X \sigma_Y \\
            & =& \alpha^2(\sigma_X^2+\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y) + \alpha (-2 \sigma_Y^2+2\rho_{XY}\sigma_X\sigma_Y) +\sigma_Y^2
            \end{array} $$

            The maximum of the mean $\mu_H$ will be at $$\alpha = 0 \text{ or } \alpha=1$$ or is not unique when $\mu_X=\mu_Y$ (the mean is then constant in $\alpha$).

            The minimum of the variance $\sigma_H^2$ will be at $$\alpha = 1 - \frac{\sigma_X^2 -\rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2 +\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y} = \frac{\sigma_Y^2-\rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2+\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y} $$

            The optimum will be somewhere in between those two extremes and depends on how you wish to compare losses and gains.

            Note that now there is a symmetry between $\alpha$ and $1-\alpha$. It does not matter whether you use the hedge $H=\alpha_1 X+(1-\alpha_1)Y$ or the hedge $H=\alpha_2 Y + (1-\alpha_2) X$. You will get the same ratios in terms of $\alpha_1 = 1-\alpha_2$.
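
The minimum-variance weight and its symmetry can be illustrated with a few lines of code. This is only a sketch: the numbers ($\sigma_X = 1$, $\sigma_Y = 2$, $\rho_{XY} = 0.3$) are made up, and `min_variance_weight` and `hedge_variance` are hypothetical helper names.

```python
import numpy as np

def min_variance_weight(sd_x, sd_y, rho):
    """Weight alpha on X in H = alpha*X + (1-alpha)*Y that minimises var(H)."""
    cov_xy = rho * sd_x * sd_y
    return (sd_y**2 - cov_xy) / (sd_x**2 + sd_y**2 - 2.0 * cov_xy)

def hedge_variance(alpha, sd_x, sd_y, rho):
    """Variance of H = alpha*X + (1-alpha)*Y."""
    return (alpha**2 * sd_x**2 + (1 - alpha)**2 * sd_y**2
            + 2 * alpha * (1 - alpha) * rho * sd_x * sd_y)

sd_x, sd_y, rho = 1.0, 2.0, 0.3                    # assumed, illustrative values
alpha_1 = min_variance_weight(sd_x, sd_y, rho)     # weight on X
alpha_2 = min_variance_weight(sd_y, sd_x, rho)     # roles of X and Y swapped

# Brute-force check of the closed form on a grid over [0, 1].
grid = np.linspace(0.0, 1.0, 1001)
alpha_grid = grid[np.argmin(hedge_variance(grid, sd_x, sd_y, rho))]
print(alpha_1, alpha_2, alpha_grid)
```

Swapping the variables gives $\alpha_2 = 1 - \alpha_1$, so both parameterisations describe the same portfolio. Note that the closed form is the unconstrained minimiser; for other parameter values it can fall outside $[0, 1]$, in which case the constrained optimum sits on the boundary.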



            Minimal variance case and relation with principal components

            In the minimal variance case (here you do not actually need to assume a multivariate Normal distribution) you get the following hedge ratio as optimum $$\frac{\alpha}{1-\alpha} = \frac{var(Y) - cov(X,Y)}{var(X)-cov(X,Y)}$$ which can be expressed in terms of the regression coefficients $\beta = cov(X,Y)/var(X)$ and $\gamma = cov(X,Y)/var(Y)$. Dividing numerator and denominator by $cov(X,Y)$ gives $$\frac{\alpha}{1-\alpha} = \frac{1/\gamma - 1}{1/\beta - 1} = \frac{\beta(1-\gamma)}{\gamma(1-\beta)}$$

            In a situation with more than two variables/stocks/bonds you might generalize this to the last (smallest eigenvalue) principal component.
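
As a rough illustration of the principal-component idea, the sketch below takes an assumed 3x3 covariance matrix (made-up numbers) and extracts the eigenvector with the smallest eigenvalue. One caveat: this minimises the variance over weight vectors of unit Euclidean length, which is a different normalisation from weights that sum to one as in the two-asset formula above. `smallest_component` is a hypothetical helper name.

```python
import numpy as np

def smallest_component(cov):
    """Return (variance, weights) of the unit-norm combination with least variance."""
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvals[0], eigvecs[:, 0]

# Assumed covariance matrix of three strongly correlated yields.
cov = np.array([[1.00, 0.90, 0.80],
                [0.90, 1.00, 0.85],
                [0.80, 0.85, 1.00]])

variance, weights = smallest_component(cov)
print("weights:", weights)
print("variance of combination:", variance)
```

The weights typically have mixed signs, i.e. the combination is long some instruments and short others, and its variance is smaller than that of any single instrument.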





            Variants



            The model can be improved by using distributions other than the multivariate normal. You could also incorporate time in a more sophisticated model to make better predictions of future values/distributions for the pair $X,Y$.

            $\dagger$ This is a simplification, but it suits the purpose of explaining how one can, and should, perform the analysis to find an optimal ratio without a regression line.

















            $endgroup$









            • 1




              $begingroup$
              I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
              $endgroup$
              – Martijn Weterings
              Jan 7 at 11:41












            • $begingroup$
              I think I understand. The problem is that $1/\rho_{XY} \neq \rho_{XY}$. Indeed, $\rho_{XY}$ often changes quite a bit when we take the inverse. Your alternative is close to the case I am thinking about, but I do want to check one thing: does this allow negative holdings? Adopting your terminology, I'd have a unit holding of bond X, and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y ... but it could be 0.2 units or 5 units, depending on the math.
              $endgroup$
              – ricardo
              Jan 7 at 11:42










            • $begingroup$
              Long means that I make ~1% on a bond if the price increases by ~1%; short means that I lose ~1% on a bond if the price increases by ~1%. So the idea is that I am long one unit of one bond (so I benefit from an appreciation) and am short some amount of the other bond (so I lose from an appreciation).
              $endgroup$
              – ricardo
              Jan 7 at 11:46










            • $begingroup$
              "The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression how you decide about this. How do you define losses and gains and how much do you value them?
              $endgroup$
              – Martijn Weterings
              Jan 7 at 11:46












            • $begingroup$
              Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
              $endgroup$
              – Martijn Weterings
              Jan 7 at 12:04
















            3












            $begingroup$


            $beta$ and $gamma$



            As Xi'an noted in his answer the $beta$ and $gamma$ are related to each other by relating to the conditional means $X|Y$ and $Y|X$ (which in their turn relate to a single joint distribution) these are not symmetric in the sense that $beta neq 1/gamma$. This is neither the case if you would 'know' the true $sigma$ and $rho$ instead of using estimates. You have $$beta = rho_{XY} frac{sigma_Y}{sigma_X}$$ and $$gamma = rho_{XY} frac{sigma_X}{sigma_Y}$$



            or you could say



            $$beta gamma = rho_{XY}^2 leq 1$$



            See also simple linear regression on wikipedia for computation of the $beta$ and $gamma$.



            It is this correlation term which sort of disturbs the symmetry. When the $beta$ and $gamma$ would be simply the ratio of the standard deviation $sigma_Y/sigma_X$ and $sigma_X/sigma_Y$ then they would indeed be each others inverse. The $rho_{XY}$ term can be seen as modifying this as a sort of regression to the mean.




            • With perfect correlation $rho_{XY} = 1$ then you can fully predict $X$ based on $Y$ or vice versa. The slopes will be equal $$beta gamma = 1$$

            • But with less than perfect correlation, $rho_{XY} < 1$, you can not make those perfect predictions and the conditional mean will be somewhat closer to the unconditional mean, in comparison to a simple scaling by $sigma_Y/sigma_X$ or $sigma_X/sigma_Y$. The slopes of the regression lines will be less steep. The slopes will be not related as each others reciprocal and their product will be smaller than one $$beta gamma < 1$$




            Is a regression line the right method?



            You may wonder whether these conditional probabilities and regression lines is what you need to determine your ratios of $X$ and $Y$. It is unclear to me how you would wish to use a regression line in the computation of an optimal ratio.



            Below is an alternative way to compute the ratio. This method does have symmetry (ie if you switch X and Y then you will get the same ratio).





            Alternative



            Say, the yields of bonds $X$ and $Y$ are distributed according to a multivariate normal distribution$^dagger$ with correlation $rho_{XY}$ and standard deviations $sigma_X$ and $sigma_Y$ then the yield of a hedge that is sum of $X$ and $Y$ will be normal distributed:



            $$H = alpha X + (1-alpha) Y sim N(mu_H,sigma_H^2)$$



            were $0 leq alpha leq 1$ and with



            $$begin{array}{rcl}
            mu_H &=& alpha mu_X+(1-alpha) mu_Y \
            sigma_H^2 &=& alpha^2 sigma_X^2 + (1-alpha)^2 sigma_Y^2 + 2 alpha (1-alpha) rho_{XY} sigma_X sigma_Y \
            & =& alpha^2(sigma_X^2+sigma_Y^2 -2 rho_{XY} sigma_Xsigma_Y) + alpha (-2 sigma_Y^2+2rho_{XY}sigma_Xsigma_Y) +sigma_Y^2
            end{array} $$



            The maximum of the mean $mu_H$ will be at $$alpha = 0 text{ or } alpha=1$$ or not existing when $mu_X=mu_Y$.



            The minimum of the variance $sigma_H^2$ will be at $$alpha = 1 - frac{sigma_X^2 -rho_{XY}sigma_Xsigma_Y}{sigma_X^2 +sigma_Y^2 -2 rho_{XY} sigma_Xsigma_Y} = frac{sigma_Y^2-rho_{XY}sigma_Xsigma_Y}{sigma_X^2+sigma_Y^2 -2 rho_{XY} sigma_Xsigma_Y} $$



            The optimum will be somewhere in between those two extremes and depends on how you wish to compare losses and gains



            Note that now there is a symmetry between $alpha$ and $1-alpha$. It does not matter whether you use the hedge $H=alpha_1 X+(1-alpha_1)Y$ or the hedge $H=alpha_2 Y + (1-alpha_2) X$. You will get the same ratios in terms of $alpha_1 = 1-alpha_2$.



            Minimal variance case and relation with principle components



            In the minimal variance case (here you actually do not need to assume a multivariate Normal distribution) you get the following hedge ratio as optimum $$frac{alpha}{1-alpha} = frac{var(Y) - cov(X,Y)}{var(X)-cov(X,Y)}$$ which can be expressed in terms of the regression coefficients $beta = cov(X,Y)/var(X)$ and $gamma = cov(X,Y)/var(Y)$ and is as following $$frac{alpha}{1-alpha} = frac{1-beta}{1-gamma}$$



            In a situation with more than two variables/stocks/bonds you might generalize this to the last (smallest eigenvalue) principle component.





            Variants



            Improvements of the model can be made by using different distributions than multivariate normal. Also you could incorporate the time in a more sophisticated model to make better predictions of future values/distributions for the pair $X,Y$.





            $dagger$ This is a simplification but it suits the purpose of explaining how one can, and should, perform the analysis to find an optimal ratio without a regression line.






            share|cite|improve this answer











            $endgroup$









            • 1




              $begingroup$
              I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
              $endgroup$
              – Martijn Weterings
              Jan 7 at 11:41












            • $begingroup$
              I think i understand. The problem is that 1/ρ_{XY} ne p_{XY}$. indeed, $p_{XY}$ often changes quite and bit when we take the inverse. Your alternative is close to the case I am thinking about, but i do want to check one thing: does this allow non-negative holdings? Adopting your terminology, i'd have a unit holding of bond X, and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y ... but it could be 0.2 units or 5 units, depending on the math.
              $endgroup$
              – ricardo
              Jan 7 at 11:42










            • $begingroup$
              long means that i make 1% on a bond if the price increases by ~1%; short means that i lose ~1% on a bond if the price increases by ~1%. So the idea is that i am long one unit of one bond (so i benefit from an appreciation) and am short some amount of the other bond (so i lose from an appreciation).
              $endgroup$
              – ricardo
              Jan 7 at 11:46










            • $begingroup$
              "The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression how you decide about this. How do you define losses and gains and how much do you value them?
              $endgroup$
              – Martijn Weterings
              Jan 7 at 11:46












            • $begingroup$
              Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
              $endgroup$
              – Martijn Weterings
              Jan 7 at 12:04














            3












            3








            3





            $begingroup$


            $beta$ and $gamma$



            As Xi'an noted in his answer the $beta$ and $gamma$ are related to each other by relating to the conditional means $X|Y$ and $Y|X$ (which in their turn relate to a single joint distribution) these are not symmetric in the sense that $beta neq 1/gamma$. This is neither the case if you would 'know' the true $sigma$ and $rho$ instead of using estimates. You have $$beta = rho_{XY} frac{sigma_Y}{sigma_X}$$ and $$gamma = rho_{XY} frac{sigma_X}{sigma_Y}$$



            or you could say



            $$beta gamma = rho_{XY}^2 leq 1$$



            See also simple linear regression on wikipedia for computation of the $beta$ and $gamma$.



            It is this correlation term which sort of disturbs the symmetry. When the $beta$ and $gamma$ would be simply the ratio of the standard deviation $sigma_Y/sigma_X$ and $sigma_X/sigma_Y$ then they would indeed be each others inverse. The $rho_{XY}$ term can be seen as modifying this as a sort of regression to the mean.




            • With perfect correlation $rho_{XY} = 1$ then you can fully predict $X$ based on $Y$ or vice versa. The slopes will be equal $$beta gamma = 1$$

            • But with less than perfect correlation, $rho_{XY} < 1$, you can not make those perfect predictions and the conditional mean will be somewhat closer to the unconditional mean, in comparison to a simple scaling by $sigma_Y/sigma_X$ or $sigma_X/sigma_Y$. The slopes of the regression lines will be less steep. The slopes will be not related as each others reciprocal and their product will be smaller than one $$beta gamma < 1$$




            Is a regression line the right method?



            You may wonder whether these conditional probabilities and regression lines is what you need to determine your ratios of $X$ and $Y$. It is unclear to me how you would wish to use a regression line in the computation of an optimal ratio.



            Below is an alternative way to compute the ratio. This method does have symmetry (ie if you switch X and Y then you will get the same ratio).





            Alternative



            Say the yields of bonds $X$ and $Y$ follow a multivariate normal distribution$^\dagger$ with correlation $\rho_{XY}$ and standard deviations $\sigma_X$ and $\sigma_Y$. Then the yield of a hedge that is a weighted sum of $X$ and $Y$ is also normally distributed:

            $$H = \alpha X + (1-\alpha) Y \sim N(\mu_H,\sigma_H^2)$$

            where $0 \leq \alpha \leq 1$ and

            $$\begin{array}{rcl}
            \mu_H &=& \alpha \mu_X+(1-\alpha) \mu_Y \\
            \sigma_H^2 &=& \alpha^2 \sigma_X^2 + (1-\alpha)^2 \sigma_Y^2 + 2 \alpha (1-\alpha) \rho_{XY} \sigma_X \sigma_Y \\
            & =& \alpha^2(\sigma_X^2+\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y) + \alpha (-2 \sigma_Y^2+2\rho_{XY}\sigma_X\sigma_Y) +\sigma_Y^2
            \end{array} $$

            The maximum of the mean $\mu_H$ is at $$\alpha = 0 \text{ or } \alpha=1$$ (whichever asset has the larger mean). When $\mu_X=\mu_Y$ the mean does not depend on $\alpha$ and there is no unique maximum.

            The minimum of the variance $\sigma_H^2$ is at $$\alpha = 1 - \frac{\sigma_X^2 -\rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2 +\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y} = \frac{\sigma_Y^2-\rho_{XY}\sigma_X\sigma_Y}{\sigma_X^2+\sigma_Y^2 -2 \rho_{XY} \sigma_X\sigma_Y} $$



            The optimum will be somewhere in between those two extremes and depends on how you wish to compare losses and gains.

            Note that now there is a symmetry between $\alpha$ and $1-\alpha$. It does not matter whether you use the hedge $H=\alpha_1 X+(1-\alpha_1)Y$ or the hedge $H=\alpha_2 Y + (1-\alpha_2) X$. You will get the same ratios in terms of $\alpha_1 = 1-\alpha_2$.
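As a rough numerical check of the minimum-variance weight and of the symmetry just described, here is a minimal sketch. The function names and the example values of $\sigma_X$, $\sigma_Y$ and $\rho_{XY}$ are made up for illustration.

```python
def hedge_variance(alpha, sx, sy, rho):
    """Variance of H = alpha*X + (1 - alpha)*Y."""
    return (alpha ** 2 * sx ** 2 + (1 - alpha) ** 2 * sy ** 2
            + 2 * alpha * (1 - alpha) * rho * sx * sy)

def min_variance_weight(sx, sy, rho):
    """Closed-form alpha that minimises hedge_variance (no constraint applied)."""
    denom = sx ** 2 + sy ** 2 - 2 * rho * sx * sy
    return (sy ** 2 - rho * sx * sy) / denom

# Illustrative values only.
sx, sy, rho = 1.0, 2.0, 0.3

alpha_star = min_variance_weight(sx, sy, rho)

# Brute-force search over a grid on [0, 1] as an independent check.
grid = [i / 1000 for i in range(1001)]
alpha_grid = min(grid, key=lambda a: hedge_variance(a, sx, sy, rho))

# Swapping the roles of X and Y gives the complementary weight.
alpha_swapped = min_variance_weight(sy, sx, rho)

print(alpha_star, alpha_grid, alpha_star + alpha_swapped)
```

The closed form is not restricted to $0 \leq \alpha \leq 1$: when $\rho_{XY}$ exceeds $\sigma_Y/\sigma_X$ the numerator turns negative, which would correspond to a negative holding of $X$. Whether that is acceptable depends on the problem at hand.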



            Minimal variance case and relation with principal components



            In the minimal variance case (here you actually do not need to assume a multivariate normal distribution) the optimal hedge ratio is $$\frac{\alpha}{1-\alpha} = \frac{var(Y) - cov(X,Y)}{var(X)-cov(X,Y)}$$ which can be expressed in terms of the regression coefficients $\beta = cov(X,Y)/var(X)$ and $\gamma = cov(X,Y)/var(Y)$. Since $var(Y) - cov(X,Y) = cov(X,Y)\,(1/\gamma - 1)$ and $var(X) - cov(X,Y) = cov(X,Y)\,(1/\beta - 1)$, this gives $$\frac{\alpha}{1-\alpha} = \frac{1/\gamma-1}{1/\beta-1} = \frac{\beta(1-\gamma)}{\gamma(1-\beta)}$$
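A small sanity check of the ratio in both forms, from the moments and from the two regression slopes, using $var(Y) - cov(X,Y) = cov(X,Y)(1/\gamma - 1)$ and $var(X) - cov(X,Y) = cov(X,Y)(1/\beta - 1)$. The numbers and function names are illustrative assumptions.

```python
def ratio_from_moments(var_x, var_y, cov_xy):
    """alpha / (1 - alpha) written with variances and the covariance."""
    return (var_y - cov_xy) / (var_x - cov_xy)

def ratio_from_slopes(beta, gamma):
    """The same ratio written with the two regression slopes."""
    return (1 / gamma - 1) / (1 / beta - 1)

# Illustrative moments only.
var_x, var_y, cov_xy = 1.0, 4.0, 0.5
beta = cov_xy / var_x    # slope of Y ~ X
gamma = cov_xy / var_y   # slope of X ~ Y

r_moments = ratio_from_moments(var_x, var_y, cov_xy)
r_slopes = ratio_from_slopes(beta, gamma)
alpha = r_moments / (1 + r_moments)  # convert the ratio back to a weight

print(r_moments, r_slopes, alpha)
```

Both expressions depend on the covariance being non-zero and on $\beta \neq 1$, so the slope form is a rewriting for convenience rather than a more robust formula.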



            In a situation with more than two variables/stocks/bonds you might generalize this to the last (smallest eigenvalue) principal component.





            Variants



            Improvements of the model can be made by using different distributions than multivariate normal. Also you could incorporate the time in a more sophisticated model to make better predictions of future values/distributions for the pair $X,Y$.





            $\dagger$ This is a simplification but it suits the purpose of explaining how one can, and should, perform the analysis to find an optimal ratio without a regression line.

















            $endgroup$


























            edited Jan 8 at 1:06

























            answered Jan 7 at 9:32









            Martijn Weterings








            • 1




              $begingroup$
              I am sorry, but as a physicist, I know too little about the language (long, short, holdings, etc.) related to stocks, bonds and finance. If you could cast it in simpler language I might be able to understand it and work with it. My answer is just a very simple expression that is unaware of the details and possibilities how to express hedging and stocks, but it shows the basic principle how you can get away from the use of a regression line (go back to first principles, express the model for profit which is at the core instead of using regression lines whose relevance is not directly clear).
              $endgroup$
              – Martijn Weterings
              Jan 7 at 11:41












            • $begingroup$
              I think I understand. The problem is that $1/\rho_{XY} \ne \rho_{XY}$. Indeed, $\rho_{XY}$ often changes quite a bit when we take the inverse. Your alternative is close to the case I am thinking about, but I do want to check one thing: does this allow negative holdings? Adopting your terminology, I'd have a unit holding of bond X and a negative holding of Y. Say long one unit of bond X and short (say) 1.2 units of bond Y ... but it could be 0.2 units or 5 units, depending on the math.
              $endgroup$
              – ricardo
              Jan 7 at 11:42










            • $begingroup$
              Long means that I make ~1% on a bond if the price increases by ~1%; short means that I lose ~1% on a bond if the price increases by ~1%. So the idea is that I am long one unit of one bond (so I benefit from an appreciation) and short some amount of the other bond (so I lose from an appreciation).
              $endgroup$
              – ricardo
              Jan 7 at 11:46










            • $begingroup$
              "The problem is to decide how much of X one ought to hold against Y." My problem with this is that there is no explanation/model/expression how you decide about this. How do you define losses and gains and how much do you value them?
              $endgroup$
              – Martijn Weterings
              Jan 7 at 11:46












            • $begingroup$
              Are there costs associated with being short and long? I imagine that you have a given amount to invest and this limits how much you can be short/long in those bonds. Then based on your previous knowledge you can estimate/determine the distribution of losses/gains for whatever combination on that limit. Finally, based on some function that determines how you value losses and gains (this expresses why/how you hedge) you can decide which combination to choose.
              $endgroup$
              – Martijn Weterings
              Jan 7 at 12:04

























            1












            $begingroup$

            Perhaps the approach of "Granger causality" might help. It would help you to assess whether X is a good predictor of Y or whether Y is a better predictor of X. In other words, it tells you whether beta or gamma is the thing to take more seriously. Also, considering that you are dealing with time series data, it tells you how much of the history of X counts towards the prediction of Y (or vice versa).



            Wikipedia gives a simple explanation:
            A time series X is said to Granger-cause Y if it can be shown, usually through a series of t-tests and F-tests on lagged values of X (and with lagged values of Y also included), that those X values provide statistically significant information about future values of Y.



            What you do is the following:




            • regress Y(t) on X(t-1) and Y(t-1)

            • regress Y(t) on X(t-1), X(t-2), Y(t-1), Y(t-2)

            • regress Y(t) on X(t-1), X(t-2), X(t-3), Y(t-1), Y(t-2), Y(t-3)


            Continue for whatever history length might be reasonable. For each regression, check with an F-test whether the lagged X terms are jointly significant.
            Then do the same in reverse (so, now regress X(t) on the past values of X and Y) and see which regressions have significant F-values.
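The steps above can be sketched in a few lines. This is a minimal illustration on simulated data, not the linked R example: the function names (`ols_rss`, `granger_f`) are hypothetical, the series are generated so that X leads Y by one step, and only the F statistic is computed (turning it into a p-value would need an F-distribution function from a statistics library).

```python
import random

def ols_rss(design, target):
    """Residual sum of squares of an OLS fit, solved via the normal equations."""
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(row[i] * row[j] for row in design) for j in range(k)]
         + [sum(row[i] * t for row, t in zip(design, target))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    coef = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(c * v for c, v in zip(coef, row))) ** 2
               for row, t in zip(design, target))

def granger_f(cause, effect, lags):
    """F statistic comparing 'own lags only' with 'own lags plus lags of cause'."""
    target = effect[lags:]
    restricted, full = [], []
    for t in range(lags, len(effect)):
        own = [effect[t - l] for l in range(1, lags + 1)]
        other = [cause[t - l] for l in range(1, lags + 1)]
        restricted.append([1.0] + own)
        full.append([1.0] + own + other)
    rss_r = ols_rss(restricted, target)
    rss_u = ols_rss(full, target)
    dof = len(target) - (1 + 2 * lags)
    return ((rss_r - rss_u) / lags) / (rss_u / dof)

# Simulated series in which X leads Y by one step (illustrative assumption).
random.seed(0)
x = [random.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + random.gauss(0, 0.1) for t in range(1, 300)]

f_x_to_y = granger_f(x, y, lags=1)
f_y_to_x = granger_f(y, x, lags=1)
for lags in (1, 2, 3):
    print(lags, granger_f(x, y, lags), granger_f(y, x, lags))
```

In this toy setup the statistic for "X helps predict Y" should come out far larger than the one for the reverse direction. With real yield series the picture is usually much less clear-cut, and a tested library routine is preferable to hand-rolled linear algebra.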



            A very straightforward example, with R code, is found here.
            Granger causality has been critiqued for not actually establishing causality (in some cases). But it seems that your application is really about "predictive causality," which is exactly what the Granger causality approach is meant for.



            The point is that the approach will tell you whether X predicts Y or whether Y predicts X (so you no longer would be tempted to artificially--and incorrectly--compound the two regression coefficients) and it gives you a better prediction (as you will know how much history of X and Y you need to know to predict Y), which is useful for hedging purposes, right?















            $endgroup$













            • $begingroup$
              I have a strong theoretical reason to believe that neither is truly a cause, and that even if one became a cause, that would not remain true over time. So I don't think that Granger causality is the answer in this case. I've upvoted the answer in any case, as it is useful, especially the R code.
              $endgroup$
              – ricardo
              Jan 7 at 3:04










            • $begingroup$
              That is why I explicitly mention that "Granger causality has been critiqued for not actually establishing causality (in some cases)." It seems to me that your question is more about establishing "predictive causality," which is what Granger causality is meant for. In addition, Granger's approach uses the information in your time series data, which are a waste not to use if you have them. Of course, you can (should?) re-estimate the effects over time. I expect that the Granger effects are more stable than cross-sectional OLS (you can test this beforehand, using historical data). HTH
              $endgroup$
              – Steve G. Jones
              Jan 7 at 7:04


























            answered Jan 6 at 11:12









            Steve G. Jones



















































