Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix license diffs to match new data format #192

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

Zach-Johnson
Copy link

Fixes data format issues from #191

@Zach-Johnson
Copy link
Author

@AyanSinhaMahapatra I'm not sure how to handle the category field -- it looks like this was some kind of high-level identifier of the type of license and its no longer present in the new format. It seems like it was just used for scoring the delta. I'm not sure how you want to handle this now.

@Zach-Johnson Zach-Johnson marked this pull request as ready for review November 17, 2023 22:20
@Zach-Johnson
Copy link
Author

@AyanSinhaMahapatra I updated the PR to handle the category by pulling the categories out of the license_references field - let me know if that will work. I think this PR is ready to review.

@AyanSinhaMahapatra
Copy link
Member

AyanSinhaMahapatra commented Nov 20, 2023

@Zach-Johnson Thanks++ for the PR. ❤️

I updated the PR to handle the category by pulling the categories out of the license_references field - let me know if that will work

Yes this is indeed the correct way to get these categories, as license reference info is now at codebase-level.

Three things though:

  1. About the test failures: the test fixtures (SCTK scans) are massively outdated, these are mostly from SCTK v2.x which is 4-5 major versions behind current SCTK and these need to be updated. Ideally we should have had scripts to re-generate these fixtures from latest as default, or atleast links/instructions to regenerate them, but that's not the case here. So I'll write something up to update the output format temporarily (similar to migrations) to pass the tests before we replace these with actual scans later. And then you can merge/rebase on that and we can look at any other test failures remaining. 😅
  2. This would not affect the function but should we also change the models like in https://github.com/Zach-Johnson/deltacode/blob/issue-191-license-fixes/src/deltacode/models.py#L244 to reflect more the current data models at https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/models.py so this is cleaner and less confusing?
  3. We should add some integration tests which would install deltacode with latest SCTK/commoncode to check various functions

  1. and 3, is out of scope for this PR and I'll take care of this, otherwise the changes seem to be correct, but we have to bring the tests back to passing and then apply these changes to verify eveything is working as expected before merging this in. Should not be too long, thanks again for the PR.

@Zach-Johnson
Copy link
Author

@AyanSinhaMahapatra I updated the models. I will wait on the test updates to rebase!

@pombredanne
Copy link
Member

@Zach-Johnson gentle ping... unless @AyanSinhaMahapatra needs to do something first?

@Zach-Johnson
Copy link
Author

@Zach-Johnson gentle ping... unless @AyanSinhaMahapatra needs to do something first?

I was under the impression that @AyanSinhaMahapatra was going to work on fixing up all the tests!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants