Pandas ENH: Do not exclude Boolean columns from DataFrame.describe()

It is sometimes useful to know the percentage of True values in a Boolean column. For example, when a DataFrame holds the results of fitting many models, seeing the proportion of successful fits up front is helpful. Currently, describe() omits Boolean columns by default:

In [1]: df = pd.DataFrame({'c1' : list('abcdef'),
                           'c2' : [True, False, True, True, False, False],
                           'c3' : np.random.randn(6),
                           'c4' : np.arange(6)})
        df.describe()
Out[1]:
        c3          c4
count   6.000000    6.000000
mean    0.088425    2.500000
std     1.633860    1.870829
min     -2.337804   0.000000
25%     -0.973194   1.250000
50%     0.487471    2.500000
75%     1.231870    3.750000
max     1.873493    5.000000

Expected Output

In [2]: df['c2'] = df['c2'].astype('int64')
        df.describe()
Out[2]: 
        c2          c3          c4
count   6.000000    6.000000    6.000000
mean    0.500000    0.088425    2.500000
std     0.547723    1.633860    1.870829
min     0.000000    -2.337804   0.000000
25%     0.000000    -0.973194   1.250000
50%     0.500000    0.487471    2.500000
75%     1.000000    1.231870    3.750000
max     1.000000    1.873493    5.000000
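
The reason the mean row is the interesting one here: True counts as 1 and False as 0, so the mean of a Boolean column is the fraction of True values. A minimal sketch, reusing the c2 values from above (the variable names are only for illustration):

```python
import pandas as pd

# Same Boolean values as column c2 above.
c2 = pd.Series([True, False, True, True, False, False], name="c2")

# True counts as 1 and False as 0, so the mean is the fraction of True values
# and the sum is the number of True values.
fraction_true = c2.mean()
count_true = int(c2.sum())
print(fraction_true, count_true)
```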

Comment From: jreback

This requires explicit inclusion, since bool is generally treated as categorical rather than numeric.

In [4]: df.describe(include=['bool'])
Out[4]: 
          c2
count      6
unique     2
top     True
freq       3
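
Boolean and numeric columns can also be requested together in one call. The sketch below assumes a reasonably recent pandas and rebuilds the example frame from the report. Each column only fills the rows that apply to its kind, so c2 gets count/unique/top/freq and the numeric rows are NaN for it.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the report.
df = pd.DataFrame({'c1': list('abcdef'),
                   'c2': [True, False, True, True, False, False],
                   'c3': np.random.randn(6),
                   'c4': np.arange(6)})

# Ask for Boolean and numeric columns together; c1 (object) is still left out.
summary = df.describe(include=['bool', np.number])
print(summary)
```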

Getting the numeric summary for Boolean columns generically is very easy.

In [6]: df.select_dtypes(include=['bool']).astype('int').describe()
Out[6]: 
             c2
count  6.000000
mean   0.500000
std    0.547723
min    0.000000
25%    0.000000
50%    0.500000
75%    1.000000
max    1.000000
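
To get the layout shown under Expected Output, with the Boolean column next to the other numeric columns, the same cast can be applied only to the Boolean columns before calling describe(). This is a sketch, not part of pandas: describe_with_bools is a hypothetical helper name.

```python
import numpy as np
import pandas as pd


def describe_with_bools(frame):
    """Hypothetical helper: default describe() with Boolean columns cast to 0/1."""
    bool_cols = frame.select_dtypes(include=['bool']).columns
    # astype with a per-column mapping returns a new frame; the input is unchanged.
    converted = frame.astype({col: 'int64' for col in bool_cols})
    return converted.describe()


df = pd.DataFrame({'c1': list('abcdef'),
                   'c2': [True, False, True, True, False, False],
                   'c3': np.random.randn(6),
                   'c4': np.arange(6)})

result = describe_with_bools(df)
print(result)
```

The original frame keeps its bool dtype, so df.describe(include=['bool']) still gives the categorical summary when that is what is wanted.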