This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

🗑 As of Oct-23 we moved the development of this component to the monorepo. 🗑

Flyte CoPilot

Overview

Flyte CoPilot provides a sidecar that understands the Flyte Metadata Format as specified in FlyteIDL and makes it possible to run arbitrary containers in Flyte. This is achieved using flyte-copilot, a binary that runs in two modes:

  • Downloader - Downloads the metadata and any other data (if configured) to a provided path. In Kubernetes this path could be a shared volume.
  • Sidecar - Monitors the process and uploads any data that the process generates in a prescribed path.

Mode: Downloader

$ flyte-copilot downloader

In Kubernetes, flyte-copilot downloader can be run as an init container with the download volume mounted. This guarantees that the metadata and any data (if configured) are downloaded before the main container starts up.
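To make the init-container arrangement concrete, here is a minimal sketch that builds a pod spec as a plain Python dict. The volume name, container names, and the helper `build_pod_spec` are hypothetical, and the downloader command is shown without the flags a real deployment would need; only the overall shape (init container plus a shared `emptyDir` volume) follows the description above.

```python
def build_pod_spec(main_image: str, copilot_image: str, mount_path: str) -> dict:
    """Return a pod spec in which the downloader runs before the main container.

    All names here ("flyte-data", "downloader", "main") are illustrative.
    """
    # Both containers mount the same volume, so files written by the
    # init container are visible to the main container.
    shared_mount = {"name": "flyte-data", "mountPath": mount_path}
    return {
        "volumes": [{"name": "flyte-data", "emptyDir": {}}],
        # Init containers run to completion before any regular container starts.
        "initContainers": [
            {
                "name": "downloader",
                "image": copilot_image,
                # Real invocations need additional flags, omitted here.
                "command": ["flyte-copilot", "downloader"],
                "volumeMounts": [shared_mount],
            }
        ],
        "containers": [
            {
                "name": "main",
                "image": main_image,
                "volumeMounts": [shared_mount],
            }
        ],
    }
```

Because Kubernetes only starts the regular containers after every init container has exited successfully, the main container can assume its inputs are already on the shared volume.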

Mode: Sidecar

As a sidecar process that runs in parallel with the main container/process, the goal is to:

  1. Identify the main container
  2. Wait for the main container to start up
  3. Wait for the main container to exit
  4. Copy the data to the remote store (especially the metadata)
  5. Exit

$ flyte-copilot sidecar
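The five steps above can be sketched as a small control loop. This is an illustration of the sequence only, not the actual flyte-copilot implementation: `run_sidecar` and its callbacks (`find_main_pid`, `is_alive`, `upload`) are hypothetical names, and the discovery and upload mechanisms are left to the caller.

```python
import time
from typing import Callable, Optional


def run_sidecar(
    find_main_pid: Callable[[], Optional[int]],
    is_alive: Callable[[int], bool],
    upload: Callable[[], None],
    poll_interval: float = 1.0,
    start_timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run the sidecar sequence; return False if the main container never starts."""
    # Steps 1 and 2: identify the main container and wait for it to start.
    deadline = time.monotonic() + start_timeout
    pid = find_main_pid()
    while pid is None:
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval)
        pid = find_main_pid()

    # Step 3: wait for the main container to exit.
    while is_alive(pid):
        sleep(poll_interval)

    # Step 4: copy the data (especially the metadata) to the remote store.
    upload()

    # Step 5: exit.
    return True
```

Passing the discovery and liveness checks in as callbacks keeps the sequence independent of how the main container is actually located, which is the open question discussed in the notes below.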

Raw notes

Solution 1: Poll kubeapi (the Kubernetes API server) for the pod status. Works perfectly fine, but puts too much load on kubeapi.
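As a rough sketch of what each poll would inspect, the function below reads a pod object (already fetched and decoded from JSON) and reports the state and id of one container. The `status.containerStatuses` fields are part of the Kubernetes pod API; the function name `container_status` and the container name passed in are hypothetical, and fetching the pod is deliberately left out.

```python
from typing import Optional, Tuple


def container_status(pod: dict, name: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (state, container_id) for the named container, or None if absent.

    state is one of "waiting", "running", "terminated", or "unknown".
    """
    statuses = pod.get("status", {}).get("containerStatuses") or []
    for cs in statuses:
        if cs.get("name") != name:
            continue
        state = cs.get("state") or {}
        # Exactly one of these keys is normally set on a container state.
        phase = next(
            (k for k in ("waiting", "running", "terminated") if k in state),
            "unknown",
        )
        container_id = cs.get("containerID")
        # The API reports ids as "<runtime>://<id>"; keep only the id part.
        if container_id and "://" in container_id:
            container_id = container_id.split("://", 1)[1]
        return phase, container_id
    return None
```

Calling this in a loop until the state becomes "terminated" is what generates the kubeapi load noted above.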

Solution 2: Create a protocol. The main container writes a _SUCCESS file to a known location and then exits. Problem: in the case of an OOM kill or a random exit the file is never written, so the uploader will be stuck. We could use a timeout? And in the sidecar, just kill the pod when the main container exits unhealthy?
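A minimal sketch of the marker-file idea with the timeout suggested above. The function name `wait_for_marker` and the default poll interval are assumptions; what to do on timeout (for example, killing the pod) is left to the caller.

```python
import os
import time


def wait_for_marker(path: str, timeout: float, poll_interval: float = 0.5) -> bool:
    """Poll until the marker file at `path` exists or `timeout` seconds pass.

    Returns True if the marker appeared, False on timeout. A False result
    covers the OOM / random-exit case where the main container never
    writes the marker.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

The timeout only bounds how long the uploader can be stuck; it cannot distinguish a slow main container from a crashed one, which is the weakness of this approach.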

Solution 3: Use a shared process namespace. This lets all containers in a pod share a single PID namespace, so processes in different containers can see each other.

Problems:
 How to identify the main container?
   - The container id is not known ahead of time, and a container name -> pid mapping is not possible?
   - How to wait for the main container to start up?
      One solution for both: call kubeapi, get the pod info and find the container id.

Note: we can poll the /proc/<pid>/cgroup file (it contains the container id), so we can build a blind container id to pid mapping. We then still need some way to learn the main container's id.
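The blind mapping described in the note could look roughly like the sketch below. It assumes the container id shows up in the cgroup path as a 64-character lowercase hex string, which is common with Docker and containerd but not guaranteed (some cgroup v2 setups expose only `0::/`). The function names, and the choice to keep the lowest pid per container, are illustrative.

```python
import os
import re
from typing import Dict, Optional

# Assumption: container ids are 64 lowercase hex characters.
_CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def container_id_from_cgroup(text: str) -> Optional[str]:
    """Extract a container id from the contents of a /proc/<pid>/cgroup file."""
    for line in text.splitlines():
        # Each line has the form "hierarchy-id:controllers:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        matches = _CONTAINER_ID.findall(parts[2])
        if matches:
            # The innermost (last) id in the path is the container itself.
            return matches[-1]
    return None


def map_containers_to_pids(proc_root: str = "/proc") -> Dict[str, int]:
    """Scan every process and return {container id: lowest pid seen for it}."""
    mapping: Dict[str, int] = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cgroup")) as f:
                cid = container_id_from_cgroup(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if cid is not None:
            pid = int(entry)
            mapping[cid] = min(pid, mapping.get(cid, pid))
    return mapping
```

The mapping is "blind" in the sense of the note: it tells us which pids belong to which container id, but not which of those ids is the main container.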

Once we know the main container, waiting for it to exit is simple and implemented
Copying data is simple and implemented
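
For illustration, waiting for a known pid to exit under a shared process namespace can be as simple as polling for its /proc entry. This is a sketch, not the actual implementation; `pid_alive` and `wait_for_exit` are hypothetical names, and the check ignores pid reuse and zombie processes.

```python
import os
import time


def pid_alive(pid: int, proc_root: str = "/proc") -> bool:
    """Return True while a directory for `pid` exists under proc_root."""
    return os.path.isdir(os.path.join(proc_root, str(pid)))


def wait_for_exit(pid: int, poll_interval: float = 0.5, proc_root: str = "/proc") -> None:
    """Block until the process directory for `pid` disappears."""
    while pid_alive(pid, proc_root):
        time.sleep(poll_interval)
```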