Follow-up on issues https://github.com/pandas-dev/pandas/issues/56804, https://github.com/pandas-dev/pandas/issues/59458 and https://github.com/pandas-dev/pandas/issues/58063.
pandas has a script for validating docstrings:
https://github.com/pandas-dev/pandas/blob/2244402942dbd30bdf367ceae49937c179e42bcb/ci/code_checks.sh#L112-L128
Currently, some methods fail the docstring validation check.
The task here is:
* take 2-4 methods
* run: scripts/validate_docstrings.py <method-name>
* run pytest pandas/tests/scalar/test_nat.py::test_nat_doc_strings
* fix the docstrings according to whatever error is reported
* remove those methods from the code_checks.sh script
* commit, push, open pull request
Example:
scripts/validate_docstrings.py pandas.Timedelta.ceil
pandas.Timedelta.ceil fails with the SA01 and ES01 errors
################################################################################
################################## Validation ##################################
################################################################################
2 Errors found for `pandas.Timedelta.ceil`:
ES01 No extended summary found
SA01 See Also section not found
Please don't comment take, as multiple people can work on this issue. You also don't need to ask for permission to work on this; just comment with the methods you are going to work on.
If you're a new contributor, please check the contributing guide.
Comment From: TheRockStarDBA
take
Comment From: doshi-kevin
Working on -i "pandas.Timedelta.asm8 SA01" \
Comment From: doshi-kevin
Hey @natmokval, I tried running the 'scripts/validate_docstrings.py pandas.Timedelta.ceil' command in my terminal, but I don't know why it doesn't execute.
Comment From: NavaneetthaSundararaj
I would like to work on:
-i "pandas.TimedeltaIndex.nanoseconds SA01" \
-i "pandas.TimedeltaIndex.seconds SA01" \
-i "pandas.TimedeltaIndex.to_pytimedelta RT03,SA01" \
Comment From: ivonastojanovic
I'll work on these:
-i "pandas.Timedelta.components SA01" \
-i "pandas.Timedelta.floor SA01" \
-i "pandas.Timedelta.max PR02" \
-i "pandas.Timedelta.min PR02" \
Comment From: doshi-kevin
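Hey @ivonastojanovic, could you please tell me how to solve this issue? I really want to be a contributor here, but for some reason the scripts/validate_docstrings.py command doesn't work. (I have tried putting different functions in place. If anyone could help me... :)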
Comment From: ivonastojanovic
> Hey @ivonastojanovic, could you please tell me how to solve this issue? I really want to be a contributor here, but for some reason the scripts/validate_docstrings.py command doesn't work. (I have tried putting different functions in place. If anyone could help me... :)

Hey @doshi-kevin, when you run scripts/validate_docstrings.py pandas.Timedelta.ceil, what do you get as output? You should get something like this:
$ scripts/validate_docstrings.py pandas.Timedelta.ceil
+ /home/codespace/.python/current/bin/ninja
[1/1] Generating write_version_file with a custom command
################################################################################
###################### Docstring (pandas.Timedelta.ceil) ######################
################################################################################
Return a new Timedelta ceiled to this resolution.
Parameters
----------
freq : str
Frequency string indicating the ceiling resolution.
It uses the same units as class constructor :class:`~pandas.Timedelta`.
Examples
--------
>>> td = pd.Timedelta('1001ms')
>>> td
Timedelta('0 days 00:00:01.001000')
>>> td.ceil('s')
Timedelta('0 days 00:00:02')
################################################################################
################################## Validation ##################################
################################################################################
2 Errors found for `pandas.Timedelta.ceil`:
ES01 No extended summary found
SA01 See Also section not found
At the bottom you see 2 errors. You shouldn't do anything about ES01, but you should fix SA01. The pandas.Timedelta.ceil method is missing a 'See Also' section, which you should add. Try to locate the method first and then add references to related methods, classes, and attributes to the 'See Also' section.
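For anyone unsure what such a section looks like, here is a minimal sketch in the numpydoc layout. The function `ceil_seconds` is hypothetical and is not part of pandas; it only shows where 'See Also' sits relative to the other sections and how each entry pairs a name with a short description.

```python
import math


def ceil_seconds(seconds: float) -> int:
    """
    Round a duration in seconds up to the next whole second.

    Parameters
    ----------
    seconds : float
        Duration to round up.

    Returns
    -------
    int
        Smallest whole number of seconds not less than ``seconds``.

    See Also
    --------
    math.ceil : Round a number up to the nearest integer.
    math.floor : Round a number down to the nearest integer.

    Examples
    --------
    >>> ceil_seconds(1.001)
    2
    """
    # Hypothetical helper used only to carry the example docstring.
    return math.ceil(seconds)
```

For a real pandas method, the entries would point at related pandas objects instead, for example the matching floor and round methods.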
Comment From: doshi-kevin
That's the problem: I forked the repo, kept it upstream and ran this command. Nothing happens after that; the terminal just stops... Maybe I need to fork the repo once again.
Comment From: doshi-kevin
I don't understand why this is happening, the code does not give me any output.
Comment From: fbourgey
Working on
"pandas.Timedelta.round SA01"
"pandas.Timedelta.to_timedelta64 SA01"
"pandas.Timedelta.total_seconds SA01"
"pandas.Timedelta.view SA01"
Comment From: uditbaliyan
Sorry, I forgot to mention which methods I was working on, but I have created a pull request for this method: "pandas.Timedelta.round SA01" @fbourgey
Comment From: amanlai
Working on
-i "pandas.Timedelta.to_numpy PR01" \
-i "pandas.TimedeltaIndex.components SA01" \
-i "pandas.TimedeltaIndex.microseconds SA01" \
Comment From: ammar-qazi
Working on
-i "pandas.Timedelta.ceil SA01" \
I installed a whole operating system for this :).
I guess @uditbaliyan already completed this.
Comment From: amanlai
@ammar-qazi If you have time, could you take pandas.Timedelta.resolution? I initially wanted to take it but I can't find its docstring.
Comment From: ammar-qazi
Thank you, @amanlai. I'll take pandas.Timedelta.resolution then.
Comment From: ammar-qazi
@ivonastojanovic I might have resolved the following in my commit for pandas.Timedelta.resolution.
-i "pandas.Timedelta.max PR02" \
-i "pandas.Timedelta.min PR02" \
That said, I'm not sure because there's no definitive docstring, as @amanlai already mentioned.
Min, max, and resolution are probably attributes defined by cdef class _Timedelta(timedelta) and MinMaxReso. As a result, I don't know exactly how to handle it.
Alternatively, we can get the docs to show up after compilation by adding the docstring to timedeltas.pyi instead of timedeltas.pyx. However, I don't think that's the recommended way.
Comment From: saldanhad
working on #59914
-i "pandas.TimedeltaIndex.to_pytimedelta RT03,SA01" \
Comment From: Manju080
Working on -i "pandas.Timedelta.components SA01" \
Comment From: Ravenin7
@doshi-kevin I was having the same problem, but writing "python" before scripts/validate_docstrings.py fixed it for me.
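The same workaround can be expressed from inside Python. This is only a sketch: the helper name `build_validate_command` is made up for illustration, while the script path and method name are the ones used in this thread. It assumes you run it from the root of a pandas checkout with a working development build.

```python
import subprocess
import sys


def build_validate_command(method_name: str) -> list[str]:
    # Put the running interpreter in front of the script, which is the
    # programmatic equivalent of typing "python" before the script path.
    return [sys.executable, "scripts/validate_docstrings.py", method_name]


if __name__ == "__main__":
    # Hypothetical usage; only meaningful inside a pandas checkout.
    cmd = build_validate_command("pandas.Timedelta.ceil")
    result = subprocess.run(cmd, check=False)
    print("exit code:", result.returncode)
```

Invoking the interpreter explicitly avoids depending on the script's executable bit or shebang line, which may be why the bare command appears to do nothing on some setups.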
Comment From: Ravenin7
@ammar-qazi Are you still working on the following, or can I take them?
-i "pandas.Timedelta.resolution PR02" \
-i "pandas.Timedelta.max PR02" \
-i "pandas.Timedelta.min PR02" \
It appears that your PR was waiting for review, then closed because you didn't update it with the suggested fixes.
Comment From: ammar-qazi
@Ravenin7 It didn't receive a review for quite a while, so I forgot about it. I just tried once again.
Comment From: ammar-qazi
@Ravenin7 On second thoughts, you can take it. I'm not 100% sure what's the right way to document attributes created by MinMaxReso, so I'd love to see how you approach it.
Comment From: Ravenin7
@ammar-qazi I couldn't push the requested changes to your PR due to lack of permissions, so I have created a new PR #60283 to address them.
Comment From: ammar-qazi
@Ravenin7 Thank you. That said, it still leads to a docstring validation error, and I don't know the solution to that. That's why I left it.
Comment From: thaoto22
I can work on the following:
-i "pandas.Timedelta.to_numpy PR01" \
-i "pandas.TimedeltaIndex.microseconds SA01" \
Comment From: olivecon
I can work on the following "pandas.RangeIndex.start", "pandas.RangeIndex.step", "pandas.RangeIndex.stop"
(edit) Seems these have already been fixed. I will instead look into: pandas.Timestamp.resolution PR02, and pandas.util.hash_pandas_object PR07,SA01
Comment From: AnthonyPrudent
I will look into the following:
-i "pandas.TimedeltaIndex.components SA01" \
-i "pandas.Timedelta.view SA01" \
Comment From: Cheath2
Hello. I will work on the following:
-i "pandas.Timedelta.floor SA01" \
-i "pandas.Timedelta.round SA01" \
Comment From: j-hendricks
Does the issue with pandas.Timedelta.max and pandas.Timedelta.min lie in the fact that scripts/validate_docstrings.py thinks these are methods, when they are actually scalars? It does not make sense for these constants to have parameters (PR02).
I think the issue has to do with the following code in doc/source/user_guide/timedeltas.rst:
pandas represents ``Timedeltas`` in nanosecond resolution using
64 bit integers. As such, the 64 bit integer limits determine
the ``Timedelta`` limits.
.. ipython:: python

   pd.Timedelta.min
   pd.Timedelta.max
Do you all agree that scripts/validate_docstrings.py is reading pd.Timedelta.min and pd.Timedelta.max as methods, and editing this rst file would resolve the PR02 issue?
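One way to probe this question without touching pandas is sketched below. `Duration` is a hypothetical stand-in class, not pandas code. It shows that a class-level constant which is itself an instance is not callable, and that asking for its docstring falls back to the class docstring, including the Parameters section. Whether scripts/validate_docstrings.py follows this path for pandas.Timedelta.min and pandas.Timedelta.max is an assumption, not something confirmed in this thread.

```python
import inspect


class Duration:
    """
    Hypothetical stand-in for a scalar type such as Timedelta.

    Parameters
    ----------
    value : int
        Length of the duration.
    """

    def __init__(self, value: int) -> None:
        self.value = value


# Class-level constants that are themselves instances of the class.
Duration.min = Duration(-10)
Duration.max = Duration(10)

# The constants are plain objects, not callables, so they take no parameters.
print(callable(Duration.max))

# An instance has no docstring of its own, so the lookup resolves to the
# class docstring, which documents constructor parameters.
print(inspect.getdoc(Duration.max) == inspect.getdoc(Duration))
```

If the validator behaves the same way, it would compare the class's documented parameters against an object that has no signature, which would be consistent with a PR02 report.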
Comment From: j-hendricks
I have found the issue with pandas.Timedelta.max! scripts/validate_docstrings.py is not correctly interpreting the docstring for pandas.Timedelta.
Below is the docstring for Timedelta. When you run python scripts/validate_docstrings.py pandas.Timedelta.max you get PR02 Unknown parameters {'unit', 'value', '**kwargs'}, but as you can see below, these parameters are properly documented.
Could someone please provide some advice on how to pinpoint the issue? Could python scripts/validate_docstrings.py be misreading __new__? Or perhaps it's confusing Timedelta with _Timedelta? Open to any ideas!
```
class Timedelta(_Timedelta):
    """
    Represents a duration, the difference between two dates or times.

    Timedelta is the pandas equivalent of python's ``datetime.timedelta``
    and is interchangeable with it in most cases.

    Parameters
    ----------
    value : Timedelta, timedelta, np.timedelta64, str, int or float
        Input value.
    unit : str, default 'ns'
        If input is an integer, denote the unit of the input.
        If input is a float, denote the unit of the integer parts.
        The decimal parts with resolution lower than 1 nanosecond are ignored.

        Possible values:

        * 'W', or 'D'
        * 'days', or 'day'
        * 'hours', 'hour', 'hr', or 'h'
        * 'minutes', 'minute', 'min', or 'm'
        * 'seconds', 'second', 'sec', or 's'
        * 'milliseconds', 'millisecond', 'millis', 'milli', or 'ms'
        * 'microseconds', 'microsecond', 'micros', 'micro', or 'us'
        * 'nanoseconds', 'nanosecond', 'nanos', 'nano', or 'ns'.

        .. deprecated:: 3.0.0

            Allowing the values `w`, `d`, `MIN`, `MS`, `US` and `NS` to denote units
            are deprecated in favour of the values `W`, `D`, `min`, `ms`, `us` and `ns`.

    **kwargs
        Available kwargs: {days, seconds, microseconds,
        milliseconds, minutes, hours, weeks}.
        Values for construction in compat with datetime.timedelta.
        Numpy ints and floats will be coerced to python ints and floats.

    See Also
    --------
    Timestamp : Represents a single timestamp in time.
    TimedeltaIndex : Immutable Index of timedelta64 data.
    DateOffset : Standard kind of date increment used for a date range.
    to_timedelta : Convert argument to timedelta.
    datetime.timedelta : Represents a duration in the datetime module.
    numpy.timedelta64 : Represents a duration compatible with NumPy.

    Notes
    -----
    The constructor may take in either both values of value and unit or
    kwargs as above. Either one of them must be used during initialization

    The ``.value`` attribute is always in ns.

    If the precision is higher than nanoseconds, the precision of the duration is
    truncated to nanoseconds.

    Examples
    --------
    Here we initialize Timedelta object with both value and unit

    >>> td = pd.Timedelta(1, "D")
    >>> td
    Timedelta('1 days 00:00:00')

    Here we initialize the Timedelta object with kwargs

    >>> td2 = pd.Timedelta(days=1)
    >>> td2
    Timedelta('1 days 00:00:00')

    We see that either way we get the same result
    """

    _req_any_kwargs_new = {"weeks", "days", "hours", "minutes", "seconds",
                           "milliseconds", "microseconds", "nanoseconds"}

    def __new__(cls, value=_no_input, unit=None, **kwargs):
        if value is _no_input:
            if not len(kwargs):
                raise ValueError("cannot construct a Timedelta without a "
                                 "value/unit or descriptive keywords "
                                 "(days,seconds....)")

            kwargs = {key: _to_py_int_float(kwargs[key]) for key in kwargs}
```