Empty license field for sourcearchive components #585

Open
qtomlinson opened this issue Jul 2, 2024 · 5 comments

Comments

@qtomlinson
Collaborator

qtomlinson commented Jul 2, 2024

Based on the Exploratory Data Analysis conducted by Aleksandrs Volodjkins on October 2, 2023, 92.6% of sourcearchive components have an EMPTY license expression.
image

Efforts have since been made to rectify this for sourcearchive components. If a license file is present in the META-INF of the sourcearchive, a recent commit aids in identifying the license. For more details, please refer to issue #533.

However, for sourcearchive components that do not contain a license file, such as sourcearchive/mavencentral/org.osgi/osgi.annotation/8.1.0, the license field remains empty. The Maven repository from which the sourcearchive was downloaded also provides a pom.xml file, which contains license information we could use.

Proposal

The POM Reference for Maven recommends that "A project should list licenses that apply directly to this project". The license information in the project's pom.xml is already used to determine the license for Maven packages. The same information could be used for the project's sourcearchive components, giving insight into their license and reducing the number of empty license fields.
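To make the proposal concrete, here is a minimal sketch of reading the `<licenses>` section from a pom.xml. It is not the crawler's actual implementation: the function name `extract_pom_licenses` and the sample POM (`com.example:demo`) are hypothetical and used only for illustration. It reads only the licenses declared directly in the given POM.

```python
import xml.etree.ElementTree as ET

# Namespace declared by standard Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical sample POM, for illustration only.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>demo</artifactId>
  <version>1.0.0</version>
  <licenses>
    <license>
      <name>MIT License</name>
      <url>https://opensource.org/licenses/MIT</url>
    </license>
  </licenses>
</project>
"""


def extract_pom_licenses(pom_xml: str) -> list:
    """Return the licenses declared directly in a POM as a list of dicts."""
    root = ET.fromstring(pom_xml)
    # Try the namespaced form first, then fall back to POMs without a namespace.
    nodes = root.findall("m:licenses/m:license", POM_NS) or root.findall(
        "licenses/license"
    )
    licenses = []
    for node in nodes:
        entry = {}
        for child in node:
            # Strip the "{namespace}" prefix that ElementTree adds to tag names.
            tag = child.tag.split("}", 1)[-1]
            if child.text and child.text.strip():
                entry[tag] = child.text.strip()
        licenses.append(entry)
    return licenses


if __name__ == "__main__":
    print(extract_pom_licenses(SAMPLE_POM))
```

A POM without a `<licenses>` section yields an empty list, in which case the license field would stay empty as it does today.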

Special Case Handling

During our discussions, it was noted that there are instances where different licenses apply to binary and source releases. How should we approach these cases? Further investigation and examples may be required.

@qtomlinson
Collaborator Author

@ariel11 @capfei @jeffwilcox @bduranc @Jeffrey-Luszcz @sgustafsson @elrayle Appreciate your feedback on this!

@sgustafsson

I like the proposal.
I'd also be interested in examples of Maven artifacts where the license differs between binary and source release.

@bduranc

bduranc commented Jul 5, 2024

This proposal sounds good (I actually thought we were already looking at pom.xml license attribute for sourcearchive).

The only example I can think of at this time where the binary license may differ is VS Code (the pre-built binary available on the MS website is under terms different from those of the source on the project's GitHub). If any others come to mind, I'll add them here.

Would the sourcearchive's pom.xml reliably reflect a difference in license information? The safest option may just be to take what is listed in the POMs for the respective sourcearchive and binary artifacts.

@qtomlinson
Collaborator Author

qtomlinson commented Jul 17, 2024

An additional instance is available at https://clearlydefined.io/definitions/maven/mavencentral/com.azure/azure-storage-blob/12.20.0, where the MIT license is determined from the pom.xml of the project.
In comparison, for the corresponding source component at https://clearlydefined.io/definitions/sourcearchive/mavencentral/com.azure/azure-storage-blob/12.20.0, the declared license is empty.

@qtomlinson
Collaborator Author

@yashkohli88 Persisting manifest information in the tool result on the crawler side can be found at this commit. The clearlydefined.js in the service repo needs to be updated to consume manifest data for sourcearchive (addSourceArchiveData, similar to the consumption of manifest data in addMavenData).
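As a rough illustration of the consumption step described above, the sketch below shows the intended data flow in Python. The real service is JavaScript and its code may look quite different. Everything here is an assumption made for illustration: the `manifest.summary.licenses` layout of the tool result, the `NAME_TO_SPDX` lookup table, the `NOASSERTION` fallback, and the choice to join multiple licenses with `OR`.

```python
# Hypothetical lookup from POM license names to SPDX identifiers.
NAME_TO_SPDX = {
    "MIT License": "MIT",
    "The MIT License (MIT)": "MIT",
    "Apache License, Version 2.0": "Apache-2.0",
}


def add_source_archive_data(result: dict, data: dict) -> None:
    """Set result["licensed"]["declared"] from manifest licenses, if any.

    Assumes the tool result `data` stores POM licenses under
    data["manifest"]["summary"]["licenses"] as a list of {"name": ...} dicts.
    """
    summary = data.get("manifest", {}).get("summary", {})
    licenses = summary.get("licenses", [])
    # Map each license name to an SPDX id; unknown names become NOASSERTION.
    spdx_ids = [
        NAME_TO_SPDX.get(lic["name"], "NOASSERTION")
        for lic in licenses
        if lic.get("name")
    ]
    if not spdx_ids:
        return  # No manifest license: leave the declared license untouched.
    # Joining with OR is an assumption; a POM does not state how
    # multiple licenses combine.
    result.setdefault("licensed", {})["declared"] = " OR ".join(spdx_ids)


if __name__ == "__main__":
    definition = {}
    tool_result = {"manifest": {"summary": {"licenses": [{"name": "MIT License"}]}}}
    add_source_archive_data(definition, tool_result)
    print(definition)
```

When the tool result carries no manifest licenses, the definition is left unchanged, so a license detected by other means (for example a license file in META-INF) would not be overwritten with an empty value.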
