feat: Add Parquet format to data page #629

Open
wants to merge 11 commits into
base: main

jeremyarancio
Contributor

What

  • We recently converted the JSONL dump to the Parquet format, which is more lightweight and better suited to analytical queries.
  • The dataset is hosted on the Hugging Face platform.
  • This commit adds a section to the data web page so that database users can load the Parquet file.
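
To illustrate why a column-oriented format tends to suit analytical queries better than JSONL, here is a minimal sketch in plain Python. It is a toy column layout, not Parquet itself, and it does not touch the real dataset: the sample records, the field names (`code`, `product_name`, `energy_kcal`), and the helper functions are all hypothetical.

```python
import json

# Hypothetical sample records standing in for lines of a JSONL dump.
JSONL_SAMPLE = """\
{"code": "001", "product_name": "Oat drink", "energy_kcal": 45}
{"code": "002", "product_name": "Dark chocolate", "energy_kcal": 580}
{"code": "003", "product_name": "Apple juice", "energy_kcal": 46}
"""


def parse_jsonl(text):
    """Parse JSONL text into a list of row dicts (row-oriented layout)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def to_columns(rows):
    """Pivot row dicts into one list per field (column-oriented layout)."""
    keys = []
    for row in rows:
        for key in row:
            if key not in keys:
                keys.append(key)
    # Rows that lack a field get None in that column.
    return {key: [row.get(key) for row in rows] for key in keys}


def mean_energy(columns):
    """Aggregate a single column without reading any other field."""
    values = [v for v in columns["energy_kcal"] if v is not None]
    return sum(values) / len(values)


rows = parse_jsonl(JSONL_SAMPLE)
columns = to_columns(rows)
print(columns["code"])
print(mean_energy(columns))
```

With the row layout, an aggregate over one field still has to parse every full record. With the column layout, only the one list is read, which is the access pattern a columnar file format is designed around.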

Part of

  • PR: Add CLI command to convert and push JSONL to Huggingface

@jeremyarancio changed the title from "Add Parquet format to data page" to "feat: Add Parquet format to data page" on Oct 28, 2024
lang/aa/texts/data.html (outdated, resolved)
@jeremyarancio
Contributor Author

@teolemon
I integrated your feedback.
Let me know if it's good

Member

@teolemon left a comment

@jeremyarancio you're editing the aa page, you should edit the en/texts/data.html page

@teolemon added the Data label on Nov 5, 2024
@jeremyarancio
Contributor Author

> @jeremyarancio you're editing the aa page, you should edit the en/texts/data.html page

Indeed! It's fixed

lang/en/texts/data.html (outdated, resolved)
@jeremyarancio
Contributor Author

jeremyarancio commented Nov 14, 2024

@teolemon
Is the PR good to merge?

lang/en/texts/data.html (outdated, resolved)
lang/en/texts/data.html (outdated, resolved)
jeremyarancio and others added 2 commits November 17, 2024 14:48
Co-authored-by: Pierre Slamich <[email protected]>
Co-authored-by: Pierre Slamich <[email protected]>
Member

@teolemon left a comment

LGTM @jeremyarancio 👌 you can merge, and it will be deployed to .net within minutes.

Labels
Projects
Status: In Progress
3 participants