Not until I understand it a bit better. As I said, I’m not very good with these exponential family derivations.

It certainly makes sense, but the issue is whether you can come up with some computable way to do it. The trick to variational inference is that the hairy integral involved in the componentwise hill climbing can be solved in closed form for conjugate priors and has to be approximated elsewhere.
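
To make the conjugate case concrete, here is a rough sketch of componentwise hill climbing on a toy model of my own choosing (not one from the discussion above): data x_i ~ N(mu, 1/tau) with prior mu | tau ~ N(mu0, 1/(lam0 * tau)) and tau ~ Gamma(a0, b0), and a factorized posterior q(mu) q(tau). The function name `cavi_gaussian` and all parameter names are made up for illustration. Because the priors are conjugate, each expectation has a closed form, so no numerical integration is needed.

```python
def cavi_gaussian(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, iters=50):
    """Mean-field updates for a Gaussian with unknown mean and precision.

    Assumed (illustrative) model: q(mu) = N(mu_n, 1/lam_n),
    q(tau) = Gamma(a_n, b_n) with rate parameter b_n.
    """
    n = len(x)
    xbar = sum(x) / n

    # These two quantities do not depend on the other factor,
    # so they are computed once.
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
    a_n = a0 + (n + 1) / 2.0

    # Squared-error term evaluated at the mean of q(mu).
    sq = sum((xi - mu_n) ** 2 for xi in x) + lam0 * (mu_n - mu0) ** 2

    e_tau = a0 / b0  # initial guess for E[tau], taken from the prior
    for _ in range(iters):
        # Update q(mu) given the current E[tau].
        lam_n = (lam0 + n) * e_tau
        # Update q(tau); the (n + lam0) / lam_n term is the variance of
        # q(mu) entering the expected squared errors.
        b_n = b0 + 0.5 * (sq + (n + lam0) / lam_n)
        e_tau = a_n / b_n
    return mu_n, lam_n, a_n, b_n


mu_n, lam_n, a_n, b_n = cavi_gaussian([1.0, 2.0, 3.0, 4.0, 5.0])
print(mu_n, a_n / b_n)  # posterior mean of mu and E[tau] under q
```

With a non-conjugate prior, the expectation behind the `b_n` update would no longer reduce to a formula like this, and that is the point where you would have to approximate.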

Actually, there is another paper, written by a Japanese researcher, whose title I cannot recall. It shows that mean field (including for non-conjugate models), when viewed from the dual problem, amounts to minimizing a Bregman divergence block-coordinate-wise.