What's a mean field variational family?

















I'm working through variational Bayesian methods at the moment, and I think I have a grasp of the bigger picture. Where I sometimes have trouble is with the exact details of how they can be implemented. Right now, this centres on the idea of a mean-field variational family. Specifically, Blei et al. say the following:

> In this review we focus on the mean-field variational family, where the latent variables are mutually independent and each governed by a distinct factor in the variational density. A generic member of the mean-field variational family is
>
> $$q(z) = \prod_{j=1}^m q_j(z_j)$$

I'm afraid that I can't see how a distribution can be expressed as a product in this way without being reduced to a constant. Clearly, I'm missing something fundamental, but I seem to be going around in circles trying to google the answer.

Can anyone supply some intuition?



















      probability computational-statistics variational-bayes














edited Feb 11 at 8:13 by Ferdi

asked Feb 10 at 19:46 by Lodore66






















          1 Answer












          $begingroup$

Loosely speaking, the mean field family defines a specific class of joint distributions. Here $z$ is actually a vector of $m$ latent variables, $z = (z_1, \ldots, z_m)$. That means that $q(z)$ describes a joint distribution over all of the individual $z_i$'s, and can be written as



$$q(z) = q(z_1, z_2, \ldots, z_m)$$



          We can use the chain rule to factorize this:



$$q(z) = q(z_1)\,q(z_2 \mid z_1) \cdots q(z_m \mid z_1, z_2, \ldots, z_{m-1})$$
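The chain-rule factorization can be checked numerically on a small discrete case. The following is only a sketch: the table `q_joint` is a made-up joint over two variables (not something from the paper), with $z_1$ taking two values and $z_2$ taking three.

```python
import numpy as np

# Hypothetical joint q(z1, z2): rows index z1, columns index z2.
# The numbers are made up for illustration and sum to 1.
q_joint = np.array([[0.10, 0.25, 0.05],
                    [0.30, 0.10, 0.20]])

# Marginal q(z1): sum out z2.
q_z1 = q_joint.sum(axis=1)

# Conditional q(z2 | z1): divide each row by its marginal, so rows sum to 1.
q_z2_given_z1 = q_joint / q_z1[:, None]

# Chain rule: q(z1, z2) = q(z1) * q(z2 | z1).
reconstructed = q_z1[:, None] * q_z2_given_z1
chain_rule_holds = np.allclose(reconstructed, q_joint)
print(chain_rule_holds)
```

Nothing has been approximated at this point; the chain rule is an identity that holds for any joint distribution.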



Now, for this joint distribution to be in the mean field family, we make a simplifying assumption: all of the $z_i$'s are independent of each other, so each conditional $q(z_i \mid z_1, \ldots, z_{i-1})$ reduces to $q(z_i)$. I'll note here that this independence holds only under the variational distribution $q$; the true joint $p(z_1, \ldots, z_m)$ is almost certainly going to have some dependence among the variables. In this sense, we are trading off accuracy (throwing away all dependence between the variables, including all covariances) for some computational benefits.
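To make the trade-off concrete, here is a small sketch with two binary variables. The table `p_joint` is a made-up dependent joint, and `q_mf` is one member of the mean field family, built here from the marginals of `p_joint` purely for illustration. This is an assumption of the example: it is not in general the member that variational inference would select, since that depends on the objective being optimized.

```python
import numpy as np

# Hypothetical dependent joint over two binary variables (rows: z1, cols: z2).
p_joint = np.array([[0.5, 0.1],
                    [0.1, 0.3]])

# One mean field member: the product of the two marginals.
q1 = p_joint.sum(axis=1)   # marginal over z1
q2 = p_joint.sum(axis=0)   # marginal over z2
q_mf = np.outer(q1, q2)    # q(z1, z2) = q1(z1) * q2(z2)

values = np.array([0.0, 1.0])  # the values each variable can take

def covariance(table):
    """Cov(z1, z2) under a 2x2 joint probability table."""
    e1 = (table.sum(axis=1) * values).sum()
    e2 = (table.sum(axis=0) * values).sum()
    e12 = (table * np.outer(values, values)).sum()
    return e12 - e1 * e2

cov_true = covariance(p_joint)  # nonzero: the variables are dependent
cov_mf = covariance(q_mf)       # zero up to rounding: dependence is gone
print(cov_true, cov_mf)
```

The factorized table still sums to one and has the same marginals as `p_joint` in this construction, but the covariance between the two variables has been discarded.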



          Now, if we make that independence assumption, we can see that the joint reduces down to



$$q(z) = q(z_1)\,q(z_2) \cdots q(z_m) = \prod_{i=1}^m q(z_i)$$



This is the form that the mean field family takes. As for your question about how this won't reduce to a constant, I'm not entirely sure what you mean. Each $z_i$ is a random variable and each factor $q(z_i)$ is a density that varies with $z_i$, so the product is a function of $z$ rather than a constant.
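As a quick sanity check that a product of factors is a genuine, non-constant density, here is a sketch with $m = 2$ and two Gaussian factors. The particular means and standard deviations are arbitrary choices made for this example, and the grid bounds are assumed wide enough to capture essentially all of the mass.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def q(z1, z2):
    """Mean field density with hypothetical factors N(0, 1) and N(2, 0.5^2)."""
    return normal_pdf(z1, 0.0, 1.0) * normal_pdf(z2, 2.0, 0.5)

# Evaluate q on a grid covering eight standard deviations around each mean.
grid1 = np.linspace(-8.0, 8.0, 801)
grid2 = np.linspace(-2.0, 6.0, 801)
Z1, Z2 = np.meshgrid(grid1, grid2, indexing="ij")
density = q(Z1, Z2)

# Riemann-sum approximation of the integral of q over the grid.
dz1 = grid1[1] - grid1[0]
dz2 = grid2[1] - grid2[0]
total_mass = density.sum() * dz1 * dz2

print(total_mass)               # close to 1
print(q(0.0, 2.0), q(3.0, 2.0)) # different values at different points
```

The density takes different values at different points and integrates to approximately one, so it is a distribution over $(z_1, z_2)$ and not a constant.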















          $endgroup$













• This is really helpful and has clarified things immensely. What was catching me out was where all the marginal probabilities went; explaining that this is an approximation that trades off accuracy for computability over the joint distribution makes it much more intuitive. Thanks indeed!
  – Lodore66, Feb 10 at 21:16






















answered Feb 10 at 20:06 by snickerdoodles777











