feat: support for similar function
Signed-off-by: vsoch <[email protected]>
vsoch committed Jun 16, 2024
1 parent de4faa9 commit 0aa5836
Showing 38 changed files with 3,071 additions and 2,943 deletions.
14 changes: 7 additions & 7 deletions .github/CONTRIBUTING.md
@@ -4,10 +4,10 @@ This code is licensed under the MPL 2.0 [LICENSE](LICENSE).

# Contributing

When contributing to Singularity Registry Global Client, it is
important to properly communicate the gist of the contribution.
If it is a simple code or editorial fix, simply explaining this
within the GitHub Pull Request (PR) will suffice. But if this is a larger
When contributing to Singularity Registry Global Client, it is
important to properly communicate the gist of the contribution.
If it is a simple code or editorial fix, simply explaining this
within the GitHub Pull Request (PR) will suffice. But if this is a larger
fix or Enhancement, it should be first discussed with the project
leader or developers.

@@ -29,8 +29,8 @@ all your interactions with the project members and users.
4. The project's default copyright and header have been included in any new
source files.
5. All (major) changes to Singularity Registry must be documented in
[docs](docs). If your PR changes a core functionality, please
include clear description of the changes in your PR so that the docs
[docs](docs). If your PR changes a core functionality, please
include clear description of the changes in your PR so that the docs
can be updated, or better, submit another PR to update the docs directly.
6. If necessary, update the README.md.
7. The pull request will be reviewed by others, and the final merge must be
@@ -102,7 +102,7 @@ an incident. Further details of specific enforcement policies may be posted
separately.

Project maintainers, contributors and users who do not follow or enforce the
Code of Conduct in good faith may face temporary or permanent repercussions
Code of Conduct in good faith may face temporary or permanent repercussions
with their involvement in the project as determined by the project's leader(s).

## Attribution
4 changes: 4 additions & 0 deletions .github/dev-requirements.txt
@@ -0,0 +1,4 @@
pre-commit
black==23.3.0
isort
flake8
6 changes: 3 additions & 3 deletions .github/workflows/generate.yaml
@@ -36,7 +36,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
matrix:
image: ["ubuntu", "centos", "rockylinux:9.0", "alpine", "busybox"]

name: Generate Matrix
@@ -62,12 +62,12 @@ jobs:
outfile: ${{ matrix.image }}.json
- name: View Output
run: cat ${{ matrix.image }}.json

test-diffs:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
matrix:
image: [["vanessa/salad", "vanessa-salad.json"]]

name: Generate Matrix
16 changes: 3 additions & 13 deletions .github/workflows/main.yml
@@ -21,19 +21,9 @@ jobs:
with:
files: ./docs/getting_started/ ./docs/index.rst

- name: Lint python code
- name: Lint and format Python code
run: |
export PATH="/usr/share/miniconda/bin:$PATH"
source activate black
pip install black
black --check container_guts
- name: Check imports with pyflakes
run: |
export PATH="/usr/share/miniconda/bin:$PATH"
source activate black
pyflakes container_guts/*.py
pyflakes container_guts/client
pyflakes container_guts/main/client.py
pyflakes container_guts/main/templates.py
pyflakes container_guts/main/container/docker.py
pip install -r .github/dev-requirements.txt
pre-commit run --all-files
25 changes: 25 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,25 @@
exclude: ".all-contributorsrc"
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks:
      - id: check-added-large-files
      - id: check-case-conflict
      - id: check-docstring-first
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: mixed-line-ending

  - repo: local
    hooks:
      - id: black
        name: black
        language: python
        types: [python]
        entry: black

      - id: flake8
        name: flake8
        language: python
        types: [python]
        entry: flake8
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -14,10 +14,10 @@ and **Merged pull requests**. Critical items to know are:
The versions coincide with releases on pip. Only major versions will be released as tags on Github.

## [0.0.x](https://github.com/singularityhub/guts/tree/main) (0.0.x)
- add function for most similar (0.0.16)
- skip /dev in tar (0.0.15)
- diff action and support (0.0.14)
- adding fs (filesystem) extraction support (0.0.13)
- tag should be own file (0.0.12)
- Support for output directory (so path prepared by guts) (0.0.11)
- Initial creation of project (0.0.1)

12 changes: 6 additions & 6 deletions action/diff/action.yaml
@@ -29,11 +29,11 @@ runs:
- name: Guts Diff for ${{ inputs.image }}
id: guts_run
env:
image: ${{ inputs.image }}
outfile: ${{ inputs.outfile }}
outdir: ${{ inputs.outdir }}
database: ${{ inputs.database }}
run: |
image: ${{ inputs.image }}
outfile: ${{ inputs.outfile }}
outdir: ${{ inputs.outdir }}
database: ${{ inputs.database }}
run: |
cmd="guts diff"
if [ "${outfile}" != "" ]; then
cmd="${cmd} --outfile ${outfile}"
@@ -44,7 +44,7 @@ runs:
if [ "${database}" != "" ]; then
cmd="${cmd} --db ${database}"
fi
cmd="${cmd} ${image}"
cmd="${cmd} ${image}"
printf "${cmd}\n"
${cmd}
shell: bash
10 changes: 5 additions & 5 deletions action/manifest/action.yaml
@@ -30,11 +30,11 @@ runs:
- name: Guts for ${{ inputs.image }}
id: guts_run
env:
image: ${{ inputs.image }}
outfile: ${{ inputs.outfile }}
outdir: ${{ inputs.outdir }}
image: ${{ inputs.image }}
outfile: ${{ inputs.outfile }}
outdir: ${{ inputs.outdir }}
includes: ${{ inputs.include }}
run: |
run: |
cmd="guts manifest"
if [ "${outfile}" != "" ]; then
cmd="${cmd} --outfile ${outfile}"
@@ -45,7 +45,7 @@ runs:
for include in ${includes}; do
cmd="${cmd} --include ${include}"
done
cmd="${cmd} ${image}"
cmd="${cmd} ${image}"
printf "${cmd}\n"
${cmd}
shell: bash
18 changes: 16 additions & 2 deletions container_guts/client/__init__.py
@@ -1,7 +1,7 @@
#!/usr/bin/env python

__author__ = "Vanessa Sochat"
__copyright__ = "Copyright 2021-2022, Vanessa Sochat"
__copyright__ = "Copyright 2021-2024, Vanessa Sochat"
__license__ = "MPL 2.0"

import argparse
@@ -69,8 +69,20 @@ def get_parser():
help="Database root (of json files) to use, either filesystem or git URL to clone",
dest="database",
)
similar = subparsers.add_parser(
"similar",
description="calculate similarity of your container against a guts database.",
formatter_class=argparse.RawTextHelpFormatter,
)
for command in [diff, similar]:
command.add_argument(
"--db",
"--database",
help="Database root (of json files) to use, either filesystem or git URL to clone",
dest="database",
)

for command in manifest, diff:
for command in manifest, diff, similar:
command.add_argument(
"-i",
"--include",
@@ -159,6 +171,8 @@ def help(return_code=0):
from .manifest import main
elif args.command == "diff":
from .diff import main
elif args.command == "similar":
from .similar import main

# Pass on to the correct parser
return_code = 0
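
The hunks above register a new `similar` subcommand and attach shared options to several subparsers in a loop. A minimal, standalone sketch of that pattern follows. It is not the project's actual parser: the `guts-sketch` program name, the positional `image` argument, and its help text are assumptions made for illustration, and only the `--db`/`--database` option mirrors the diff.

```python
import argparse


def get_parser():
    """Build a toy parser with `diff` and `similar` subcommands (illustrative only)."""
    parser = argparse.ArgumentParser(prog="guts-sketch")  # hypothetical program name
    subparsers = parser.add_subparsers(dest="command")

    diff = subparsers.add_parser("diff")
    similar = subparsers.add_parser(
        "similar",
        description="calculate similarity of your container against a guts database.",
    )

    # Register each shared option exactly once per subcommand. With the default
    # conflict handler, argparse raises argparse.ArgumentError if the same
    # option string is added to one parser twice.
    for command in [diff, similar]:
        command.add_argument(
            "--db",
            "--database",
            help="Database root (of json files) to use, either filesystem or git URL to clone",
            dest="database",
        )
        # Assumed positional argument, added here so the example parses end to end.
        command.add_argument("image", help="container URI to inspect")
    return parser


args = get_parser().parse_args(["similar", "--db", "./db", "ubuntu"])
print(args.command, args.database, args.image)
```

Defining the option inside the loop keeps the flag names and help text identical across subcommands, which is presumably why the commit uses the same approach.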
4 changes: 2 additions & 2 deletions container_guts/client/diff.py
@@ -1,16 +1,16 @@
__author__ = "Vanessa Sochat"
__copyright__ = "Copyright 2022, Vanessa Sochat"
__copyright__ = "Copyright 2022-2024, Vanessa Sochat"
__license__ = "MPL 2.0"

import json
import os

import container_guts.utils as utils

from ..main import ManifestGenerator


def main(args, parser, extra, subparser):

# Show args to the user
print(" image: %s" % args.image)
print(" outfile: %s" % args.outfile)
4 changes: 2 additions & 2 deletions container_guts/client/manifest.py
@@ -1,16 +1,16 @@
__author__ = "Vanessa Sochat"
__copyright__ = "Copyright 2022, Vanessa Sochat"
__copyright__ = "Copyright 2022-2024, Vanessa Sochat"
__license__ = "MPL 2.0"

import json
import os

import container_guts.utils as utils

from ..main import ManifestGenerator


def main(args, parser, extra, subparser):

# Show args to the user
print(" image: %s" % args.image)
print(" outfile: %s" % args.outfile)
41 changes: 41 additions & 0 deletions container_guts/client/similar.py
@@ -0,0 +1,41 @@
__author__ = "Vanessa Sochat"
__copyright__ = "Copyright 2022-2024, Vanessa Sochat"
__license__ = "MPL 2.0"

import json
import os

import container_guts.utils as utils

from ..main import ManifestGenerator


def main(args, parser, extra, subparser):
    # Show args to the user
    print(" image: %s" % args.image)
    print(" outfile: %s" % args.outfile)
    print(" outdir: %s" % args.outdir)
    print("container tech: %s" % args.container_tech)
    print(" database: %s" % args.database)

    # Derive an initial manifest
    cli = ManifestGenerator(tech=args.container_tech)
    manifests = cli.similar(args.image, database=args.database)
    outfile = None

    # Default to using outfile first, then outdir if defined
    if args.outfile:
        outfile = args.outfile
    elif args.outdir:
        outfile = os.path.join(args.outdir, "%s.json" % cli.save_path(args.image))
        dirname = os.path.dirname(outfile)
        if not os.path.exists(dirname):
            os.makedirs(dirname)

    # If we have an output file, make sure to set step output
    if outfile:
        print(f"Saving to {outfile}...")
        print(f"::set-output name=outfile::{outfile}")
        utils.write_json(manifests, outfile)
    else:
        print(json.dumps(manifests, indent=4))
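
The output handling in `similar.py` prefers an explicit `outfile`, falls back to a path under `outdir`, and otherwise prints to stdout. A small sketch of that selection logic, pulled out into a helper so it can be tested on its own, might look like the following. The helper name `resolve_outfile` and its `name` parameter are hypothetical; `name` stands in for the result of `cli.save_path(args.image)`, whose implementation is not shown in this diff.

```python
import os
import tempfile


def resolve_outfile(outfile=None, outdir=None, name="image"):
    """
    Return the path to write to, or None to signal "print to stdout".

    `name` is a hypothetical placeholder for cli.save_path(image).
    """
    if outfile:
        return outfile
    if outdir:
        path = os.path.join(outdir, "%s.json" % name)
        # exist_ok=True replaces the separate exists() check followed by makedirs()
        os.makedirs(os.path.dirname(path), exist_ok=True)
        return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    # A name containing a slash produces a nested directory under outdir
    print(resolve_outfile(outdir=os.path.join(tmp, "results"), name="vanessa/salad"))
```

Using `exist_ok=True` is a design choice of this sketch, not something the commit does; it avoids a failure if another process creates the directory between the check and the call.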
5 changes: 3 additions & 2 deletions container_guts/defaults.py
@@ -1,10 +1,11 @@
__author__ = "Vanessa Sochat"
__copyright__ = "Copyright 2021-2022, Vanessa Sochat"
__copyright__ = "Copyright 2021-2024, Vanessa Sochat"
__license__ = "MPL 2.0"

import container_guts.utils as utils
import os

import container_guts.utils as utils

# Default database for base image
database = "https://github.com/singularityhub/shpc-guts"

3 changes: 1 addition & 2 deletions container_guts/logger.py
@@ -1,5 +1,5 @@
__author__ = "Vanessa Sochat"
__copyright__ = "Copyright 2021-2022, Vanessa Sochat"
__copyright__ = "Copyright 2021-2024, Vanessa Sochat"
__license__ = "MPL 2.0"

import inspect
@@ -37,7 +37,6 @@ def add_prefix(msg, char=">>"):


class ColorizingStreamHandler(_logging.StreamHandler):

BLACK, RED, GREEN, YELLOW, BLUE, MAGENTA, CYAN, WHITE = range(8)
RESET_SEQ = LogColors.ENDC
COLOR_SEQ = "\033[%dm"
32 changes: 26 additions & 6 deletions container_guts/main/client.py
@@ -1,5 +1,5 @@
__author__ = "Vanessa Sochat"
__copyright__ = "Copyright 2021-2022, Vanessa Sochat"
__copyright__ = "Copyright 2021-2024, Vanessa Sochat"
__license__ = "MPL 2.0"


@@ -8,10 +8,9 @@

from .. import utils
from ..logger import logger

from .database import Database
from .container.decorator import ensure_container
from .container.base import ContainerName
from .container.decorator import ensure_container
from .database import Database


class ManifestGenerator:
@@ -60,7 +59,7 @@ def diff(self, image, database=None):
# Catch the error so we clean up the running container
try:
result = db.diff(self.manifests[image.uri])
except:
except Exception:
self.container.cleanup(image)
return

@@ -70,6 +69,28 @@ def diff(self, image, database=None):
self.container.cleanup(image)
return {image.uri: {"diff": result}}

    @ensure_container
    def similar(self, image, database=None):
        """
        Generate a manifest for an image and similarity scores
        """
        print(f"Generating similarity for {image}")
        tmpdir = self.extract(image, cleanup=False, includes=["paths", "fs"])
        db = Database(database)

        # Catch the error so we clean up the running container
        try:
            result = db.similar(self.manifests[image.uri])
        except Exception:
            self.container.cleanup(image)
            return

        # Only cleans up if was cloned
        db.cleanup()
        shutil.rmtree(tmpdir, ignore_errors=True)
        self.container.cleanup(image)
        return {image.uri: {"similar": result}}

def extract_filesystem(self, root):
"""
List all contents of the filesystem
@@ -181,7 +202,6 @@ def get_manifests(self, root):
for jsonfile in utils.recursive_find(root, "json$"):
data = utils.read_json(jsonfile)
if "manifest" in jsonfile:

continue
print("Found layer config %s" % jsonfile)

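
In the `similar` method added above, the `except Exception` branch calls only `self.container.cleanup(image)` and returns `None`, so the database clone and the temporary directory are released only on the success path. One possible alternative is a `try`/`finally` block that runs cleanup in both cases and lets the error propagate. The sketch below shows the idea with stand-ins: `FakeDatabase`, `similar_with_cleanup`, and the returned score are all hypothetical and are not part of container-guts.

```python
import shutil
import tempfile


class FakeDatabase:
    """Hypothetical stand-in for Database; it only records whether cleanup ran."""

    def __init__(self, fail=False):
        self.fail = fail
        self.cleaned = False

    def similar(self, manifest):
        if self.fail:
            raise ValueError("could not score manifest")
        return {"ubuntu": 1.0}  # made-up score, for illustration only

    def cleanup(self):
        self.cleaned = True


def similar_with_cleanup(db, manifest):
    """Score a manifest and always release temporary resources."""
    tmpdir = tempfile.mkdtemp()
    try:
        return {"similar": db.similar(manifest)}
    finally:
        # Runs on success and on error alike; an exception still propagates.
        db.cleanup()
        shutil.rmtree(tmpdir, ignore_errors=True)


db = FakeDatabase()
print(similar_with_cleanup(db, {}), db.cleaned)
```

Whether to re-raise or to return `None` as the commit does is a judgment call: re-raising surfaces the failure to the caller, while returning `None` leaves the client code to handle a missing result.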