Auto labeling using GPT-4 #35

imenelydiaker · 2024-06-18T11:57:35Z

Prompt for labeling with GPT-4 variants
Script to compare labels from GPT with human annotations
Update all entries with username: "auto" or "gpt"
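
As a rough illustration of the comparison step listed above, here is a minimal sketch of one way GPT labels could be scored against human annotations. The data layout (a dict mapping image id to a set of concept names), the function name `label_agreement`, and the choice of metrics are all assumptions made for this example; they are not taken from the script in this PR.

```python
from typing import Dict, Set


def label_agreement(
    gpt: Dict[str, Set[str]], human: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Compare two label sets on the image ids they share (hypothetical helper).

    Returns the mean per-image Jaccard overlap and the exact-match rate.
    """
    shared = sorted(gpt.keys() & human.keys())
    if not shared:
        raise ValueError("no images in common")

    jaccard_sum = 0.0
    exact = 0
    for image_id in shared:
        g, h = gpt[image_id], human[image_id]
        union = g | h
        # Two empty label sets count as full agreement.
        jaccard_sum += len(g & h) / len(union) if union else 1.0
        exact += int(g == h)

    return {
        "mean_jaccard": jaccard_sum / len(shared),
        "exact_match": exact / len(shared),
    }


# Made-up labels, for illustration only.
gpt_labels = {"img_1": {"cat", "outdoor"}, "img_2": {"dog"}}
human_labels = {"img_1": {"cat"}, "img_2": {"dog"}, "img_3": {"car"}}
scores = label_agreement(gpt_labels, human_labels)
print(scores)
```

Images labeled by only one side (`img_3` here) are ignored rather than counted as disagreements; whether that is the right choice depends on how the votes are stored.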

Summary by Sourcery

This pull request adds a new script for pre-labeling datasets using GPT-4. The script downloads metadata and votes, processes images, generates labels using the OpenAI API, and saves or uploads the results to the Hugging Face Hub.

New Features:
- Introduced a script for pre-labeling the dataset using GPT-4, which includes functionality to download metadata and votes, process images, and generate labels using the OpenAI API.
Enhancements:
- Added functionality to save and upload metadata and votes to the Hugging Face Hub.

sourcery-ai · 2024-06-18T11:57:41Z

Reviewer's Guide by Sourcery

This pull request introduces a new script auto_labeling_using_llm.py to automate the pre-labeling of datasets using GPT-4. The script includes functions for handling metadata, retrieving votes, computing concepts, and making requests to the OpenAI API. The main function orchestrates the entire labeling process and saves the results locally or pushes them to the hub.

File-Level Changes

| Files | Changes |
| --- | --- |
| `scripts/auto_labeling_using_llm.py` | Introduced a new script to automate dataset labeling using GPT-4, including functions for metadata handling, OpenAI API requests, and result processing. |

sourcery-ai

Hey @imenelydiaker - I've reviewed your changes and they look great!

Here's what I looked at during the review

🟡 General issues: 5 issues found
🟢 Security: all looks good
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good

sourcery-ai · 2024-06-18T11:58:45Z

scripts/auto_labeling_using_llm.py

+"""
+
+import os
+import random


issue: Unused import 'random'

The 'random' module is imported but not used anywhere in the script. Consider removing it to keep the code clean.

sourcery-ai · 2024-06-18T11:58:45Z

scripts/auto_labeling_using_llm.py

+            )
+
+            pred = response.choices[0].message.content
+            pred = pred[pred.rfind("{"):pred.rfind("}")]


issue (bug_risk): Potential off-by-one error

The slicing operation might exclude the closing brace '}'. Consider using 'pred.rfind("}") + 1' to include it.
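
To make the off-by-one concrete, here is a small sketch of the suggested fix. The helper name `extract_json_object` and the sample reply string are hypothetical, and parsing the result with `json.loads` is an assumption about what the script does next.

```python
import json


def extract_json_object(pred: str) -> dict:
    """Parse the span from the last "{" to the last "}" as JSON (hypothetical helper)."""
    start = pred.rfind("{")
    end = pred.rfind("}")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    # Slice ends are exclusive, so add 1 to keep the closing brace.
    return json.loads(pred[start:end + 1])


# Made-up model reply, for illustration only.
reply = 'Here are the labels: {"concepts": ["cat", "outdoor"]}'

truncated = reply[reply.rfind("{"):reply.rfind("}")]  # slicing as in the diff
print(truncated)  # the closing brace is missing
print(extract_json_object(reply))
```

Like the line under review, this uses `rfind("{")`, so it only handles a flat object: with nested braces the slice would start at the innermost opening brace and fail to parse.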

Xmaster6y

Additionally, add an image limit for debugging, e.g. labelling only the first 10 (or 10 random) images.
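
A minimal sketch of such a debug limit is shown below. The function name `limit_items` and its `limit`, `shuffle`, and `seed` parameters are assumptions for illustration; the script's actual arguments may differ.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def limit_items(
    items: Sequence[T],
    limit: Optional[int] = None,
    shuffle: bool = False,
    seed: int = 0,
) -> List[T]:
    """Return at most `limit` items: the first ones, or a seeded random sample."""
    if limit is not None and limit < 0:
        raise ValueError("limit must be non-negative")
    if limit is None or limit >= len(items):
        return list(items)
    if shuffle:
        # A dedicated Random instance keeps the sample reproducible across runs.
        return random.Random(seed).sample(list(items), limit)
    return list(items[:limit])


# Hypothetical image ids, for illustration only.
image_ids = [f"img_{i}" for i in range(100)]
first_ten = limit_items(image_ids, limit=10)
random_ten = limit_items(image_ids, limit=10, shuffle=True)
print(first_ten)
```

Seeding the sample means a debugging run labels the same images each time, which makes it easier to compare outputs between prompt changes.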

imenelydiaker added 7 commits June 17, 2024 10:08

add script for labelling with GPT4

98aad61

update autolaveking script

884713b

update script

31b1995

fix call of openai request

d8555b5

update response format in prompt

e777219

add icl exmaple manually

98a3bee

add model_name argument

6defc43

imenelydiaker had a problem deploying to ci-base June 18, 2024 11:57 — with GitHub Actions Failure

imenelydiaker requested a review from Xmaster6y June 18, 2024 11:57

imenelydiaker marked this pull request as draft June 18, 2024 11:58

sourcery-ai bot reviewed Jun 18, 2024

Fix concepts parsing in prompt

9b859fc

imenelydiaker had a problem deploying to ci-base June 18, 2024 13:50 — with GitHub Actions Failure

Xmaster6y reviewed Jun 18, 2024

scripts/auto_labeling_using_llm.py (outdated, resolved)

Xmaster6y reviewed Jun 18, 2024

scripts/auto_labeling_using_llm.py (outdated, resolved)

Xmaster6y reviewed Jun 18, 2024

scripts/auto_labeling_using_llm.py (outdated, resolved)

Xmaster6y requested changes Jun 18, 2024

Update scripts/auto_labeling_using_llm.py

df8abbc

Co-authored-by: Yoann Poupart <[email protected]>

imenelydiaker had a problem deploying to ci-base July 10, 2024 19:57 — with GitHub Actions Failure

Update scripts/auto_labeling_using_llm.py

b84de92

Co-authored-by: Yoann Poupart <[email protected]>

imenelydiaker had a problem deploying to ci-base July 10, 2024 19:58 — with GitHub Actions Failure

Update scripts/auto_labeling_using_llm.py

2b556fc

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>

imenelydiaker had a problem deploying to ci-base July 10, 2024 19:58 — with GitHub Actions Failure

Update scripts/auto_labeling_using_llm.py

2fa181c

Co-authored-by: Yoann Poupart <[email protected]>

imenelydiaker had a problem deploying to ci-base July 10, 2024 19:59 — with GitHub Actions Failure
