
If a job fails very quickly, we never get any logs #79

Open
ihodes opened this issue Dec 8, 2016 · 5 comments
ihodes (Member) commented Dec 8, 2016

No description provided.

ihodes added the bug label Dec 8, 2016
smondet (Member) commented Dec 8, 2016

We already try to save the logs when a job dies: https://github.com/hammerlab/coclobas/blob/master/src/lib/server.ml#L227

What happened in your case? Can you still run describe on it, for example?

ihodes (Member Author) commented Dec 8, 2016

Describe works; the job runs in the Docker container, but it dies immediately (bad CLI args in my shell script) and exits.

opam@e2b78e43fa00:/coclo/_cocloroot/logs/logs/job/522c556b-b975-567e-b254-02d4beadc9ca/commands$ cat 1481229903345_3c1b3504.json
{
  "command": {
    "command": "kubectl logs 522c556b-b975-567e-b254-02d4beadc9ca",
    "stdout": "",
    "stderr":
      "Error from server: Get https://gke-ihodes-coco3-cluster-default-pool-36378887-pskd:10250/containerLogs/default/522c556b-b975-567e-b254-02d4beadc9ca/522c556b-b975-567e-b254-02d4beadc9cacontainer: No SSH tunnels currently open. Were the targets able to accept an ssh-key for user \"gke-e170239faa5e49b2ac95\"?\n",
    "status": [ "Exited", 1 ],
    "exn": null
  }
}
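
So the kubectl logs call itself exited with status 1: GKE could not open the SSH tunnel from the master to the node, and we captured only that error instead of the container's output. A minimal manual sketch, assuming the tunnel error is transient and the pod still exists (the pod name is the one from the log above):

POD=522c556b-b975-567e-b254-02d4beadc9ca
# The "No SSH tunnels currently open" error is sometimes transient while GKE
# re-establishes the master-to-node tunnel (assumption), so retry a few times.
for i in 1 2 3 4 5; do
  kubectl logs "$POD" && break
  sleep 10
done
# If the container exited right away, the previous instance's logs may still
# be retrievable:
kubectl logs --previous "$POD"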

ihodes (Member Author) commented Dec 8, 2016

This may be due to the Google Cloud project metadata limitation; we run out of room at 32 KB or some similarly absurd limit (project-wide).
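
One quick way to check that hypothesis (assumption: the tunnel failure comes from the project-wide metadata, typically the ssh-keys entry, filling up, which is what the "Were the targets able to accept an ssh-key" message hints at):

# Dump the project-wide metadata and see how large it has grown; the
# ssh-keys entry is the usual culprit when GKE tunnels stop working
# (assumption based on the stderr above).
gcloud compute project-info describe --format=json | wc -c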

armish (Member) commented Dec 8, 2016

I also had similar issues where the describe log showed successful allocation of resources and initiation of the job, yet the job failed without any Kubernetes log. For example, when you pass an invalid URL to wget (that is, a poorly constructed URL for --tumor, --rna, or --normal), those fetch jobs also fail fast and leave no trace behind them.
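
A possible workaround sketch until log capture is fixed, assuming we control the fetch command: wrap the wget so its output is saved somewhere that survives the pod, instead of relying on kubectl logs after the fact. The bucket name, paths, and variable names below are made up:

set -o pipefail
# Keep the fetch output in a local file and echo it to stdout as well.
wget -O input.bam "$INPUT_URL" 2>&1 | tee /tmp/fetch.log
status=$?
if [ "$status" -ne 0 ]; then
  # Copy the log somewhere that outlives the pod, so a fast-failing job
  # still leaves a trace (bucket name is hypothetical).
  gsutil cp /tmp/fetch.log "gs://my-debug-bucket/$(hostname)-fetch.log" || true
  exit "$status"
fi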

ihodes (Member Author) commented Dec 8, 2016

This may have been the "ran out of metadata space on GCP" issue again.
