This is a general issue, meant as a starting point for (new) contributors who want to improve the docstrings (many docstrings have missing parts, and improvements on this front are always welcome).

Some general points of attention for docstrings in pandas. Each docstring should have:

  • Clear explanation of the function/method/attribute's purpose and what it does
  • Clear explanation of all parameter values
  • State the return value and type (xref #17114)
  • Some examples
  • Note: examples should be runnable, so they should include creation of the example DataFrame/Series; `import pandas as pd` does not need to be included (xref #3439)
  • Closer adherence to the numpydoc docstring standard (style, "See Also" section, documentation of all arguments)
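As a rough illustration of these points, here is a minimal sketch of a numpydoc-style docstring. The function `clip_values` is a hypothetical helper, not part of pandas, and it works on plain lists so that the example runs without pandas installed. The same layout applies to pandas functions, where the Examples section would create its own DataFrame/Series.

```python
import doctest


def clip_values(values, lower=None, upper=None):
    """
    Limit each value in a sequence to a given range.

    Values below `lower` are replaced by `lower`, and values above
    `upper` are replaced by `upper`.

    Parameters
    ----------
    values : list of float
        Input values. The list itself is not modified.
    lower : float, optional
        Minimum allowed value. If None, no lower bound is applied.
    upper : float, optional
        Maximum allowed value. If None, no upper bound is applied.

    Returns
    -------
    list of float
        New list with the same length as `values`.

    See Also
    --------
    min : Return the smallest item of an iterable.
    max : Return the largest item of an iterable.

    Examples
    --------
    The example creates its own input data, so it can be run as-is.

    >>> data = [1.0, 5.0, 10.0]
    >>> clip_values(data, lower=2.0, upper=8.0)
    [2.0, 5.0, 8.0]

    With only an upper bound:

    >>> clip_values(data, upper=4.0)
    [1.0, 4.0, 4.0]
    """
    result = []
    for value in values:
        if lower is not None and value < lower:
            value = lower
        if upper is not None and value > upper:
            value = upper
        result.append(value)
    return result


if __name__ == "__main__":
    # Run the examples in the docstring above as tests.
    doctest.testmod()
```

Running the file executes the Examples section through `doctest`, which is one way to confirm that the examples are actually runnable.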

Quite a few docstrings are still lacking on some of these points (in particular examples), and PRs to improve this are always welcome.

Approach: pick a function, check the docstring, and if needed make a PR with updates!
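To make the "check the docstring" step a bit more concrete, the sketch below lists which numpydoc sections a docstring is missing. The helper `missing_sections` and the section list are assumptions made for illustration; this is not a pandas tool, and it only looks for section titles followed by a dashed underline.

```python
import inspect

# Sections this sketch looks for; adjust to taste (an assumption, not a pandas rule).
REQUIRED_SECTIONS = ("Parameters", "Returns", "Examples")


def missing_sections(func, required=REQUIRED_SECTIONS):
    """Return the numpydoc section titles that the docstring of `func` lacks."""
    doc = inspect.getdoc(func) or ""
    lines = doc.splitlines()
    found = set()
    # A numpydoc section is a title line followed by a line made only of dashes.
    for title, underline in zip(lines, lines[1:]):
        stripped = underline.strip()
        if stripped and set(stripped) == {"-"}:
            found.add(title.strip())
    return [name for name in required if name not in found]


def add(a, b):
    """
    Add two numbers.

    Parameters
    ----------
    a, b : int
        Values to add.

    Returns
    -------
    int
        Sum of `a` and `b`.
    """
    return a + b


def undocumented(a, b):
    return a - b


print(missing_sections(add))
print(missing_sections(undocumented))
```

Here `add` is reported as missing only its Examples section, while `undocumented` is missing all three, which makes it a good candidate for a PR.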

Comment From: ResidentMario

How about writing docstrings for internal routines?

In my admittedly brief experience, it is hard to know which chunk of code you should be using for whatever particular task. In part this is because pandas is such a sprawling library and it takes experience to know where things are, but also in part because IMO there's little internal documentation on the architecture.

The Block classes are one such thing for me, for example: I'm still not totally sure what they're there for.

Now I realize that you have to be careful with this because methods change and "no docstring is better than a wrong docstring", but I still feel like a little bit more direction would be helpful. 😅

Comment From: jorisvandenbossche

@ResidentMario Yes, of course, that is certainly something where pandas is often lacking, and contributions there are also welcome! But let's keep this issue focused on public functions (the 'quality' requirements there are more stringent, and internal functions are not always the best place to start for new contributors).

Comment From: MarianaBazely

Hi all, I have been learning Python during my "free" time and I am loving every single minute of it. I try to come up with ideas for projects, so I can keep learning. I have been using Pandas more and more and I really enjoy it. I have been thinking of ways to help with open source as a beginner and I have read plenty of blogs that say "Find a project that you feel committed to, etc.". There's this deep painful love relationship between Pandas and me. 💘 I would like to help somehow. I constantly have to google how to do the things I am thinking of, and I know some of the docs need examples of how to use them, etc. So I thought that maybe I can help with that? At least to start with!

I guess I am trying to introduce myself and show my interest in helping. Maybe I am just checking if my plan is a good idea!

(Sorry for the poorly written comment - English is not my first language!)

😀

Comment From: jorisvandenbossche

@PukkaPad That sounds great! Certainly check out the contributing docs: http://pandas.pydata.org/pandas-docs/stable/contributing.html, and if you have any questions, ask here or on the pandas-dev gitter channel https://gitter.im/pydata/pandas

Comment From: MarianaBazely

Hi, @jorisvandenbossche! I wasn't expecting to start having problems already, but I don't really know what to do. Could you please help me? I am trying to do the 'Build the documentation' step. I don't think it's working. I have followed the guide, so I have done the following steps already:

  • Fork

  • Creating a development environment

  • Install requirements

My understanding is that I have to activate my environment (`source activate pandas_dev`) and then run `python make.py html` in order to build the documentation. The text is not very clear to me, to be honest. But the requirements are installed in the pandas_dev environment, so it makes sense that I should activate it. I could be very wrong. http://pandas.pydata.org/pandas-docs/stable/contributing.html#building-the-documentation

So I tried to run `python make.py html`. This is what I get:

    (pandas_dev) Marianas-MBP:doc MarianaSouza$ python make.py html
    Converting source/html-styling.ipynb
    No module named jupyter_client.manager
    Failed to convert source/html-styling.ipynb
    Running Sphinx v1.5.1
    Exception occurred while building, starting debugger:
    Traceback (most recent call last):
      File "/Users/MarianaSouza/miniconda2/envs/pandas_dev/lib/python2.7/site-packages/Sphinx-1.5.1-py2.7.egg/sphinx/cmdline.py", line 295, in main
        opts.warningiserror, opts.tags, opts.verbosity, opts.jobs)
      File "/Users/MarianaSouza/miniconda2/envs/pandas_dev/lib/python2.7/site-packages/Sphinx-1.5.1-py2.7.egg/sphinx/application.py", line 163, in __init__
        confoverrides or {}, self.tags)
      File "/Users/MarianaSouza/miniconda2/envs/pandas_dev/lib/python2.7/site-packages/Sphinx-1.5.1-py2.7.egg/sphinx/config.py", line 134, in __init__
        execfile_(filename, config)
      File "/Users/MarianaSouza/miniconda2/envs/pandas_dev/lib/python2.7/site-packages/Sphinx-1.5.1-py2.7.egg/sphinx/util/pycompat.py", line 129, in execfile_
        exec_(code, globals)
      File "/Users/MarianaSouza/miniconda2/envs/pandas_dev/lib/python2.7/site-packages/six.py", line 699, in exec_
        exec("""exec _code_ in _globs_, _locs_""")
      File "<string>", line 1, in <module>
      File "conf.py", line 17, in <module>
      File "/Users/MarianaSouza/Code/projects/pandas_dev/pandas-mariana/pandas/__init__.py", line 35, in <module>
        "the C extensions first.".format(module))
    ImportError: C extension: No module named tslib not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.

    > /Users/MarianaSouza/Code/projects/pandas_dev/pandas-mariana/pandas/__init__.py(35)<module>()
    -> "the C extensions first.".format(module))
    (Pdb)

Is there something else I should do? Am I missing something obvious? I thought of creating a branch afterwards, once I find the document that needs help. I tried to run `python setup.py build_ext --inplace --force` with no success :(

Thank you

Comment From: jreback

you need to run `python setup.py build_ext --inplace` in the forked directory (i.e. where pandas was downloaded from git).

did that not work?

Comment From: MarianaBazely

That was what I was doing, running from pandas-mariana. I googled the error and I had to install cython (`pip install cython`). Now `python setup.py build_ext --inplace` seems to work, but I got warnings.

I tried `python make.py html` again, running it from my environment in the doc/ directory. I get an error: `*** ImportError: No module named matplotlib`. So I decided to install all pandas dependencies, just to see if that would fix it: `conda install -n pandas_dev -c pandas --file ci/requirements_all.txt`

And I get a PackageNotFoundError, which seems to be an open issue: https://github.com/conda/conda/issues/4860 This is the error: PackageNotFoundError: Package not found: Conda could not find '

I would love to keep trying if I knew what else I could do. I really want to have it working so I can help.

Should I uninstall miniconda and start over with anaconda instead?

Comment From: jorisvandenbossche

@PukkaPad Can you ask it at gitter? (to keep this issue about docstrings) https://gitter.im/pydata/pandas

Comment From: jorisvandenbossche

BTW, it is better to use `-c conda-forge` instead of `-c pandas` (this should be updated in the docs, I think)

Comment From: bsipocz

@jorisvandenbossche - you mention that while all docstring examples are expected to be runnable, the pandas imports are not required. Could I suggest still including them? (I ran into this when copy-pasting an example from the API docs today. While it was trivial to see why I got the error, it would have been more convenient if the import had been included.)
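The trade-off can be sketched with the standard-library `doctest` module. In the sketch below, the function `spread`, the helper `run_examples`, and the alias `m` (standing in for an alias like `pd`) are all made up for illustration. The docstring example passes when the runner injects the alias, but fails with a `NameError` when it is run without it, which is what happens on a plain copy-paste.

```python
import doctest
import math


def spread(values):
    """
    Return the difference between the largest and smallest value.

    Examples
    --------
    >>> spread([2, 4, 9])
    7
    >>> spread([m.floor(2.5), 9])
    7
    """
    return max(values) - min(values)


def run_examples(func, namespace):
    """Run the docstring examples of `func`; return (failures, tries)."""
    runner = doctest.DocTestRunner(verbose=False)
    finder = doctest.DocTestFinder()
    for test in finder.find(func, func.__name__, module=False, globs=namespace):
        # Discard the textual failure report; only the counts are used here.
        runner.run(test, out=lambda text: None)
    return runner.failures, runner.tries


# The alias `m` is injected by the runner, so the docstring needs no import.
with_alias = run_examples(spread, {"spread": spread, "m": math})
# Without the alias, the second example raises NameError and counts as a failure.
without_alias = run_examples(spread, {"spread": spread})
print(with_alias, without_alias)
```

Injecting the alias keeps docstrings short, while including the import in each example makes them copy-paste friendly; which matters more is a judgment call.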

Comment From: Milind220

Hi, I'd like to look into contributing to this. I've never made a PR before, and I'd love to get started. This looks like the kind of issue that I could actually help with :) Is this still open?

EDIT: While browsing the repository, I found that the min, max, sum and prod functions defined here do not have docstrings. Is this a valid issue to solve? I could add the docstrings :)

Comment From: chikausa

Hello! I was wondering if you're still accepting contributions on this. If it's alright, could I take up this issue to try contributing some docstrings?

Comment From: chikausa

take

Comment From: MarcoGorelli

we OK to close this issue? it's too generic and not-actionable at this point, and there's already the validate_docstrings checks for docstrings

Comment From: MarianaBazely

Yes please

On Wed, 19 Apr 2023 at 09:12, Marco Edward Gorelli @.***> wrote:

we OK to close this issue? it's too generic and not-actionable at this point, and there's already the validate_docstrings checks for docstrings

-- Mariana

Comment From: mroeschke

Yeah let's close in favor of more specific issues targeting specific docstrings