Feature Type
- [ ] Extending the existing CI checks for package versioning to account for optional extras now defined in setup.cfg
Problem Description
After #47336 added optional_extras in setup.cfg, we do not have a programmatic check that versions are aligned across all the places they are specified in pandas.
@JMBurley & @mroeschke active in prior thread.
Feature Description
A pre-commit check that ensures that the min versions in setup.cfg are aligned with the other places where we specify min versions (this would mean augmenting scripts/validate_min_versions_in_sync.py).
Alternative Solutions
N/A. Simplest to extend existing actions to cover new usage.
Additional Context
No response
Comment From: imnik11
Hi, is this issue about fixing the versioning problem for the dependencies pinned in options.extras_require (hypothesis>=5.5.3, pytest>=6.0, pytest-xdist>=1.31, pytest-asyncio>=0.17.0), or about the versions of all the dependencies mentioned in setup.cfg?
Comment From: mroeschke
This is an issue to follow up https://github.com/pandas-dev/pandas/pull/47336 such that the dependency versions in setup.cfg are synced with the other files that specify minimum dependencies.
Comment From: kostyafarber
Hey, interested in taking this on? Any ideas on where to get started?
Comment From: imnik11
Yes, I have previously worked on creating CI actions that check code formatting, builds, etc. whenever there is a push to a PR, but for this one I am still trying to understand the flow in order to get started.
Comment From: MarcoGorelli
I'd suggest checking how scripts/validate_min_versions_in_sync.py works, and trying to make something similar. You can use https://docs.python.org/3/library/configparser.html to parse .cfg files.
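As a rough illustration of the configparser suggestion, here is a minimal sketch of reading the minimum versions out of an options.extras_require section. The `test` extra name, the inline sample text, and the helper name `get_versions_from_setup_cfg` are assumptions made for this example; the pins are the ones quoted earlier in this thread, and the real setup.cfg layout may differ.

```python
import configparser

# Hypothetical setup.cfg fragment (assumption: the extra is called "test").
SETUP_CFG = """
[options.extras_require]
test =
    hypothesis>=5.5.3
    pytest>=6.0
    pytest-xdist>=1.31
    pytest-asyncio>=0.17.0
"""


def get_versions_from_setup_cfg(text: str) -> dict[str, str]:
    """Map each package in options.extras_require to its minimum version."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    versions = {}
    # Each extra holds a multi-line value with one requirement per line.
    for _extra, requirements in parser["options.extras_require"].items():
        for line in requirements.splitlines():
            line = line.strip()
            if not line:
                continue
            # Only ">=" pins are handled here; other specifiers yield "".
            package, _, version = line.partition(">=")
            versions[package.strip().lower()] = version.strip()
    return versions


setup_optional = get_versions_from_setup_cfg(SETUP_CFG)
```

For a file on disk, `parser.read("setup.cfg")` could replace `read_string`. The resulting dict has the same package-to-version shape that could then be compared against the other sources.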
Comment From: mroeschke
Noting that this issue is blocked by https://github.com/pandas-dev/pandas/pull/47336 being merged first.
Comment From: kostyafarber
Does anybody have an idea how to handle the symmetric difference for more than two sets? The relevant line in scripts/validate_min_versions_in_sync.py is:
diff = set(ci_optional.items()).symmetric_difference(code_optional.items())
From my understanding, we would want to add the dependencies extracted from setup.cfg into that line. I'm not sure how to chain them together.
Comment From: kostyafarber
Would something like diff = set(ci_optional.items() ^ code_optional.items() ^ setup_optional.items()) work?
Comment From: JMBurley
diff = a.items() ^ b.items() ^ c.items() or any similar chained XOR will not work. For one, any item that exists in all of a, b and c would be incorrectly flagged as missing (because you compute the difference between a and b and then compare that result against c).
Instead, subtract the intersection from the union: diff = (a | b | c) - (a & b & c). I'm not sure why the sample uses .items() to work with (key, value) tuples, as I suspect we only need the keys or the values (not both). Regardless, with some modification we can do something based on the union minus the intersection of the sets.
Based on experience, my key goal here would be clear error messages that help users find the problem, so I'd then compare the diff against the package list from each source file to identify which file is missing which package, and print that as part of the failure message.
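To make the point above concrete, here is a small sketch comparing the chained XOR with the union-minus-intersection approach, plus one possible way to build a per-file failure message. The sample data, the source labels, and the helper name `describe_mismatches` are hypothetical and not taken from the existing script.

```python
# Hypothetical package -> min version mappings from three sources.
sources = {
    "ci": {"pytest": "6.0", "hypothesis": "5.5.3"},
    "code": {"pytest": "6.0", "hypothesis": "5.5.3"},
    "setup.cfg": {"pytest": "6.0", "hypothesis": "6.0.0"},
}

item_sets = [set(versions.items()) for versions in sources.values()]
a, b, c = item_sets

# Chained XOR keeps every item that appears an odd number of times, so
# ("pytest", "6.0"), which is consistent in all three sources, is flagged.
chained_xor = a ^ b ^ c

# Union minus intersection keeps only items that are not shared by all sources.
diff = (a | b | c) - (a & b & c)


def describe_mismatches(sources: dict[str, dict[str, str]]) -> list[str]:
    """Return one message per mismatched package, listing each source's pin."""
    item_sets = [set(versions.items()) for versions in sources.values()]
    diff = set.union(*item_sets) - set.intersection(*item_sets)
    messages = []
    for package in sorted({name for name, _ in diff}):
        found = ", ".join(
            f"{source}: {versions.get(package, 'missing')}"
            for source, versions in sources.items()
        )
        messages.append(f"{package} -> {found}")
    return messages


for message in describe_mismatches(sources):
    print(message)
```

Grouping the diff by package name and showing every source's value (or "missing") is one way to tell the reader which file to edit, rather than only reporting that some mismatch exists.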
Comment From: JMBurley
PS. @mroeschke, I suggest we add this to the 2.0 milestone. Apologies for the ping on admin; I don't have permissions to add milestones or tags.
Comment From: kostyafarber
Thanks @JMBurley, I'll look into it.
Comment From: kostyafarber
^ Was meant to make it a draft