Mixer validator #215
Commits on Sep 10, 2024
- Adding script that validates if mixer config is well formatted and has everything in place
  Masha Iureva authored and committed on Sep 10, 2024 (b724d5f)
Commits on Sep 13, 2024
- Add S3 path validation with boto3 existence check
  Masha Iureva authored and committed on Sep 13, 2024 (1954932)
Commits on Sep 16, 2024
- Adding check of the files, trying to run jq expressions on them and see if both files and jq expressions are valid
  Masha Iureva authored and committed on Sep 16, 2024 (9ebe5f1)
Commits on Sep 25, 2024
- Add S3 path validation, sampling, and doc-attribute alignment checks
  Masha Iureva authored and committed on Sep 25, 2024 (67adadf)
- adding logic to split jsonpath expressions into pieces and check them
  Masha Iureva authored and committed on Sep 25, 2024 (4391805)
- Added JsonPath syntax evaluation, started working on sampling docs and checking their content
  Masha Iureva authored and committed on Sep 25, 2024 (2885e7e)
Commits on Sep 27, 2024
- Adding logic to check if all doc and corresponding attributes files contain correct fields and same amount of lines
  Masha Iureva authored and committed on Sep 27, 2024 (82920f8)
Commits on Oct 3, 2024
- Adding functionality to check if filters in config and attribute files match
  Masha Iureva authored and committed on Oct 3, 2024 (8745c8d)
- updating filter checking logic to focus on filters missing from the mixer config
  Masha Iureva authored and committed on Oct 3, 2024 (564cee6)
Commits on Oct 7, 2024
- adding logic to run jq and jsonpath filters on small set of docs to see if they work or fail
  Masha Iureva authored and committed on Oct 7, 2024 (5e7e3d4)
Commits on Oct 9, 2024
- refactored to use smart_open and added logic to download sample files to a temp folder
  Masha Iureva authored and committed on Oct 9, 2024 (445bfef)
Commits on Oct 11, 2024
- added logic to sample lines from doc and apply filters to it, refactored main, added logic to download sample files and work with them locally
  Masha Iureva authored and committed on Oct 11, 2024 (83481ac)
- Adding clean up logic to delete sample files after the run
  Masha Iureva authored and committed on Oct 11, 2024 (890de88)
- Merge branch 'main' of https://github.com/allenai/dolma into mixer-validator
  Masha Iureva authored and committed on Oct 11, 2024 (50763bd)
Commits on Oct 14, 2024
- adding test configs for mixer validator
  Masha Iureva authored and committed on Oct 14, 2024 (001fd04)
- Masha Iureva authored and committed on Oct 14, 2024 (15a7104)
Commits on Oct 16, 2024
- addressing comments, splitting script into smaller files, moving test configs to test folder, adding a couple of helper functions
  Masha Iureva authored and committed on Oct 16, 2024 (b740e45)
Commits on Oct 17, 2024
- adding --verbose flag and support for .env variables
  Masha Iureva authored and committed on Oct 17, 2024 (c1708e2)
- Masha Iureva authored and committed on Oct 17, 2024 (2aca6a3)
Commits on Oct 21, 2024
- updating types in function definitions, updating README
  Masha Iureva authored and committed on Oct 21, 2024 (d10de44)
- Masha Iureva authored and committed on Oct 21, 2024 (e59c64b)
- Masha Iureva authored and committed on Oct 21, 2024 (07c2367)
- deleting the initial version of the script
  Masha Iureva authored and committed on Oct 21, 2024 (6941ac6)
- Commit f019fef
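
Several commits in this PR describe checking that each documents file and its corresponding attributes file contain the expected fields and the same number of lines. The sketch below illustrates that kind of check; it is not the code from this PR. The function names (`read_jsonl`, `check_alignment`) and the required field names (`id` and `text` for documents, `id` and `attributes` for attribute rows) are assumptions made for illustration, and it reads local JSONL files only, with no S3 access.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator, List

# Assumed field names, chosen for illustration only.
DOC_FIELDS = ("id", "text")
ATTR_FIELDS = ("id", "attributes")


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line (plain or gzipped file)."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def check_alignment(doc_path: Path, attr_path: Path) -> List[str]:
    """Return a list of problems; an empty list means the pair looks aligned."""
    problems: List[str] = []
    docs = list(read_jsonl(doc_path))
    attrs = list(read_jsonl(attr_path))

    # Both files should describe the same number of documents.
    if len(docs) != len(attrs):
        problems.append(
            f"line count mismatch: {len(docs)} docs vs {len(attrs)} attribute rows"
        )

    # Compare rows pairwise up to the length of the shorter file.
    for i, (doc, attr) in enumerate(zip(docs, attrs)):
        for field in DOC_FIELDS:
            if field not in doc:
                problems.append(f"row {i}: document is missing field '{field}'")
        for field in ATTR_FIELDS:
            if field not in attr:
                problems.append(f"row {i}: attribute row is missing field '{field}'")
        if doc.get("id") != attr.get("id"):
            problems.append(f"row {i}: id mismatch between document and attributes")
    return problems
```

Calling `check_alignment(Path("docs.jsonl"), Path("attrs.jsonl"))` on a sampled pair returns the list of problems found. Returning a list rather than raising on the first error lets a validator report every issue in one run.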