Why is jacobian computed in a differentiable manner? #470

raulTrial · 2023-02-26T04:07:27Z

This may be anecdotal, but in my experience there seems to be no benefit in computing gradients through the Jacobian. It makes training somewhat unstable (again, just my observation). Passing the gradient through self._compute_error() is fine. Is there a reason why gradients through the Jacobian are necessary?
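To make the question concrete, here is a minimal sketch of a single Gauss-Newton step in plain PyTorch. It is not Theseus code: the residual `theta * x**2 - 1`, the function `gauss_newton_step`, and the `detach_jacobian` flag are all hypothetical names chosen for illustration. It shows that an outer gradient with respect to `theta` reaches the updated variables through two paths, the error term and the Jacobian, and that detaching the Jacobian cuts the second path and therefore changes the gradient.

```python
import torch


def gauss_newton_step(x, theta, detach_jacobian):
    # Hypothetical residual r(x; theta) = theta * x**2 - 1 (one residual per variable).
    r = theta * x**2 - 1.0
    # Analytic Jacobian dr/dx, which also depends on theta.
    J = torch.diag(2.0 * theta * x)
    if detach_jacobian:
        # Treat the Jacobian as a constant in the backward pass.
        J = J.detach()
    AtA = J.T @ J
    Atb = -(J.T @ r).unsqueeze(1)
    # Solve the normal equations AtA @ dx = Atb for the update.
    dx = torch.linalg.solve(AtA, Atb).squeeze(1)
    return x + dx


x0 = torch.tensor([1.0, 3.0], dtype=torch.float64)
theta = torch.tensor(0.5, dtype=torch.float64, requires_grad=True)

grads = {}
for flag in (False, True):
    x1 = gauss_newton_step(x0, theta, detach_jacobian=flag)
    (g,) = torch.autograd.grad(x1.sum(), theta)
    grads[flag] = g.item()

print("d(sum x1)/d(theta), Jacobian in graph:", grads[False])
print("d(sum x1)/d(theta), Jacobian detached:", grads[True])
```

In this toy problem the two gradients differ, so detaching the Jacobian replaces the exact derivative of the step with an approximation. Whether that approximation helps or hurts training stability in a real problem is an empirical question that this sketch does not settle.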

Also based on my observations, using Cholesky for the dense solver makes the system somewhat unstable when the matrix is not well-conditioned. Replacing it with QR seems to work better, though it is a bit slower (while also using less memory). Would it make sense to offer a more stable solver than Cholesky as a user option?

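import torch

def _solve_sytem(self, Atb: torch.Tensor, AtA: torch.Tensor) -> torch.Tensor:
    # Original Cholesky-based solve:
    # lower = torch.linalg.cholesky(AtA)
    # return torch.cholesky_solve(Atb, lower).squeeze(2)
    # QR-based replacement: factor AtA = Q R, then solve R x = Q^T Atb.
    q, r = torch.linalg.qr(AtA)
    b_solve = q.permute(0, 2, 1) @ Atb
    # Back-substitution on the upper-triangular factor; drop the trailing dim.
    return torch.linalg.solve_triangular(r, b_solve, upper=True)[..., 0]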
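One way the user option asked about here could look is sketched below as a standalone function. This is an assumption for illustration only: `solve_dense` and its `method` argument are hypothetical names and not part of the Theseus API. The sketch uses only standard `torch.linalg` calls and checks that both solvers agree on a small, well-conditioned batch.

```python
import torch


def solve_dense(AtA, Atb, method="cholesky"):
    """Solve AtA @ x = Atb for a batch; AtA is (B, n, n), Atb is (B, n, 1)."""
    if method == "cholesky":
        # Requires AtA to be numerically positive definite.
        lower = torch.linalg.cholesky(AtA)
        return torch.cholesky_solve(Atb, lower).squeeze(2)
    if method == "qr":
        # AtA = Q R, so R x = Q^T Atb; only needs AtA to be invertible.
        q, r = torch.linalg.qr(AtA)
        rhs = q.transpose(1, 2) @ Atb
        return torch.linalg.solve_triangular(r, rhs, upper=True).squeeze(2)
    raise ValueError(f"unknown method: {method}")


# Small full-rank least-squares problem, batch size 1.
A = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]]], dtype=torch.float64)
b = torch.tensor([[[1.0], [2.0], [3.0]]], dtype=torch.float64)
AtA = A.transpose(1, 2) @ A
Atb = A.transpose(1, 2) @ b

x_chol = solve_dense(AtA, Atb, method="cholesky")
x_qr = solve_dense(AtA, Atb, method="qr")
print(torch.allclose(x_chol, x_qr))
```

On a well-conditioned system like this one the two methods return the same solution up to rounding. The differences reported above would only show up as the conditioning of AtA degrades.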

Answered by mhmukadam (Collaborator) · 2023-03-06T20:16:35Z

Hi @raulTrial,

The gradients of the Jacobians are computed for the backward pass (cc @luisenp). Can you point to the place in the code where you think they might not be necessary, or share an example where you see instability? We can potentially help you resolve that.

Yes, Cholesky is just one dense solver that we exposed from what is natively available in torch. We could definitely provide a QR option, as you point out in your example. Would you like to submit a PR to enable this? Thanks!
