wip feat(reader): add a new column for days since last reply in an issue or PR #18
base: main
Conversation
```diff
@@ -86,6 +88,7 @@ async def run_async(self):

         for level, df in data_segmented.items():
             table = df.reset_index(drop=True).to_markdown()
+            print(table)
```
I forgot to delete this line. I was printing the table at first, just to check the generated output.
Thank you for working on that @tintayadev! I'll be reviewing it later today.
Hi @tintayadev, sorry for taking so long to give you some feedback. First of all, thanks for working on this!

I think we should try to get the data from all open PRs/issues of a given repository, because looking only at the latest one can be misleading. Say a repository is usually quick to answer PRs/issues but the latest one is pretty inactive: that would give it a bad placement when applying the criteria. If we can get all the data, we can apply a more precise criterion.

Also, I think we should split one query per method/function and compose them. What I mean is, the query you would be using, for example:

```graphql
query {
  repository(owner: "conda-forge", name: "conda-forge-pinning-feedstock") {
    name
    nameWithOwner
    owner {
      url
      login
    }
    pullRequests(first: 45, states: OPEN) { # this will come from the query we already have
      edges {
        node {
          comments(last: 1) {
            nodes {
              lastEditedAt
            }
          }
        }
      }
    }
  }
}
```

So you would be looping through the `edges`. With all this data you could build a dataframe with `repo_name`, `3_days_since_last_reply`, `more_than_10_days_since_last_reply`, plus some grouping where we would have the count of PRs from that repo that fall into each bucket. After that, we would join this new dataframe to the dataframe we got from the query we already have.

It may sound complicated, but you did the data-processing heavy lifting already. The reason for composing the queries, separating them by method, is that a single query would become too cluttered as the requirements from issues grow; this way you can compose different queries to achieve different needs.
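The dataframe-building step suggested above could be sketched like this; the row data, column names, and bucket thresholds are assumptions for illustration, not the actual implementation:

```python
from datetime import datetime, timezone

import pandas as pd

# Hypothetical raw rows: one per open PR, with the lastEditedAt of its
# last comment as returned by the GraphQL query above.
rows = [
    {"repo_name": "conda-forge/conda-forge-pinning-feedstock",
     "last_reply_at": "2024-01-10T12:00:00Z"},
    {"repo_name": "conda-forge/conda-forge-pinning-feedstock",
     "last_reply_at": "2023-12-25T12:00:00Z"},
]
now = datetime(2024, 1, 12, tzinfo=timezone.utc)  # fixed "now" for the example

df = pd.DataFrame(rows)
df["days_since_last_reply"] = (now - pd.to_datetime(df["last_reply_at"])).dt.days

# Bucket each PR, then count PRs per repo per bucket.
df["within_3_days"] = df["days_since_last_reply"] <= 3
df["more_than_10_days"] = df["days_since_last_reply"] > 10
counts = df.groupby("repo_name")[["within_3_days", "more_than_10_days"]].sum()

# `counts` could then be joined onto the repository dataframe from the
# existing query, e.g. repos_df.join(counts, on="repo_name").
print(counts)
```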
Solves #9
Pull Request description
In `reader.py`, a query has been implemented to get the dates of the most recent reply on an issue and on a PR. We take the most recent of those dates and, using `datetime`, compute the number of days since that date. `report.py` has also been modified, in the function `apply_criteria`, to incorporate the new information about the repos.

How to test these changes
...
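The day count described in the PR description (take the most recent date between the last issue reply and the last PR reply, then count days since it) can be sketched as follows; the function and parameter names are illustrative, not the actual `reader.py` code:

```python
from datetime import datetime, timezone


def days_since_last_reply(issue_reply, pr_reply, now=None):
    """Days since the most recent of two ISO-8601 reply timestamps."""
    now = now or datetime.now(timezone.utc)
    latest = max(
        datetime.fromisoformat(issue_reply.replace("Z", "+00:00")),
        datetime.fromisoformat(pr_reply.replace("Z", "+00:00")),
    )
    return (now - latest).days


# Example with a fixed "now" so the result is deterministic:
print(days_since_last_reply(
    "2024-01-01T00:00:00Z", "2024-01-05T00:00:00Z",
    now=datetime(2024, 1, 12, tzinfo=timezone.utc),
))  # → 7
```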
Pull Request checklists
This PR is a:
About this PR:
Author's checklist:
Additional information
Reviewer's checklist
Copy and paste this template for your review's note: