Code Sample, a copy-pastable example if possible
# toy data
x = [0, 1, 1, 1, 2, 3, 4]
y = pd.qcut(sorted(np.unique(x), reverse=True), 5, labels=np.arange(5, 0, -1)) # size 5 only!
table = pd.DataFrame()
table['col_1'] = x # size 7
table['col_2'] = y # size 5!
Problem description
I'm assigning vectors with mismatched sizes to the dataframe. I'm not even specifying the index to assign it. This became a problem because it didn't raise any error. I was expecting it to raise an error or a warning. This can be a problem for code using table
.
Expected Output
Raise some error upon this mismatched assignment. The error can be generated if I only call:
table
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/usr/local/lib/python3.4/dist-packages/IPython/core/formatters.py in __call__(self, obj)
309 method = get_real_method(obj, self.print_method)
310 if method is not None:
--> 311 return method()
312 return None
313 else:
/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in _repr_html_(self)
608
609 return self.to_html(max_rows=max_rows, max_cols=max_cols,
--> 610 show_dimensions=show_dimensions, notebook=True)
611 else:
612 return None
/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in to_html(self, buf, columns, col_space, header, index, na_rep, formatters, float_format, sparsify, index_names, justify, bold_rows, classes, escape, max_rows, max_cols, show_dimensions, notebook, decimal, border)
1606 decimal=decimal)
1607 # TODO: a generic formatter wld b in DataFrameFormatter
-> 1608 formatter.to_html(classes=classes, notebook=notebook, border=border)
1609
1610 if buf is None:
/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in to_html(self, classes, notebook, border)
698 border=border)
699 if hasattr(self.buf, 'write'):
--> 700 html_renderer.write_result(self.buf)
701 elif isinstance(self.buf, compat.string_types):
702 with open(self.buf, 'w') as f:
/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in write_result(self, buf)
1022 indent += self.indent_delta
1023 indent = self._write_header(indent)
-> 1024 indent = self._write_body(indent)
1025
1026 self.write('</table>', indent)
/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in _write_body(self, indent)
1181 self._write_hierarchical_rows(fmt_values, indent)
1182 else:
-> 1183 self._write_regular_rows(fmt_values, indent)
1184 else:
1185 for i in range(len(self.frame)):
/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in _write_regular_rows(self, fmt_values, indent)
1215 row = []
1216 row.append(index_values[i])
-> 1217 row.extend(fmt_values[j][i] for j in range(ncols))
1218
1219 if truncate_h:
/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in <genexpr>(.0)
1215 row = []
1216 row.append(index_values[i])
-> 1217 row.extend(fmt_values[j][i] for j in range(ncols))
1218
1219 if truncate_h:
IndexError: list index out of range
Out[33]:
col_1 col_2
0 0 1
1 1 2
2 1 3
3 1 4
4 2 5
5 3
6 4
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-101-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.19.2
nose: 1.3.1
pip: 9.0.1
setuptools: 32.3.1
Cython: None
numpy: 1.11.0
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.0
pytz: 2012c
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: 1.2.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: None
pandas_datareader: None
Comment From: chris-b1
Thanks for the report - this is a duplicate of #13696 - PR to fix welcome.
Comment From: jreback
thanks, yes this is a bug, duplicate of #13696 pull-requests to fix are welcome.
I think this should raise as this is a non-labeled array.
@chris-b1 beat me too it!