I'm looking into the code to see if this change would be a major or minor undertaking. I don't know what your opinion is about it.
I have a large data frame and a complicated map function I want to apply to all rows. Calling apply along axis 1 has been running for an hour now, and I have no idea whether it will finish an hour from now, tomorrow, or next week. I think some progress reporting would be instrumental in helping a user decide quickly if it's reasonable to use pandas or if moving to another library / stack is worthwhile.
Comment From: max-sixty
If you want to see progress without doing much, write a for loop over the rows and print something after each iteration.
But have you considered using Dask? I don't think adding this to pandas is a small undertaking, and there are a few projects built explicitly for parallelizing and long-running jobs, primarily Dask.
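A minimal sketch of the for-loop suggestion above, assuming the mapped function returns one scalar per row. The helper name `apply_with_progress` and its `report_every` parameter are made up for illustration and are not part of pandas; the remaining-time figure is a naive linear extrapolation.

```python
import sys
import time

import pandas as pd


def apply_with_progress(df, func, report_every=1000, out=sys.stderr):
    """Hypothetical stand-in for df.apply(func, axis=1) that reports progress.

    Assumes func returns a scalar for each row.
    """
    total = len(df)
    results = []
    start = time.monotonic()
    for i, (_, row) in enumerate(df.iterrows(), start=1):
        results.append(func(row))
        # Report every `report_every` rows and once more on the last row.
        if i % report_every == 0 or i == total:
            elapsed = time.monotonic() - start
            remaining = elapsed / i * (total - i)  # naive linear estimate
            print(
                f"{i}/{total} rows, {elapsed:.1f}s elapsed, "
                f"~{remaining:.1f}s remaining",
                file=out,
            )
    return pd.Series(results, index=df.index)


# Toy data standing in for the large frame from the question.
df = pd.DataFrame({"a": range(5), "b": range(5, 10)})
totals = apply_with_progress(df, lambda row: row["a"] + row["b"], report_every=2)
```

This is unlikely to be faster than `apply` along axis 1, but it gives an early estimate of how long the full run would take, which is the decision the original question is about.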
Comment From: melvyniandrag
Thanks for the reply! I've heard about that for a while but still haven't used it. I'll give it a shot.
Comment From: jreback
several options here: http://stackoverflow.com/questions/18603270/progress-indicator-during-pandas-operations-python