Pandas No error when mismatched vectors are assigned to pandas.DataFrame

Code Sample, a copy-pastable example if possible

# toy data
x = [0, 1, 1, 1, 2, 3, 4]
y = pd.qcut(sorted(np.unique(x), reverse=True), 5, labels=np.arange(5, 0, -1)) # size 5 only! 
table = pd.DataFrame()
table['col_1'] = x # size 7
table['col_2'] = y # size 5!

Problem description

I'm assigning vectors with mismatched sizes to the dataframe. I'm not even specifying the index to assign it. This became a problem because it didn't raise any error. I was expecting it to raise an error or a warning. This can be a problem for code using table.

Expected Output

Raise some error upon this mismatched assignment. The error can be generated if I only call:

table

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/usr/local/lib/python3.4/dist-packages/IPython/core/formatters.py in __call__(self, obj)
    309             method = get_real_method(obj, self.print_method)
    310             if method is not None:
--> 311                 return method()
    312             return None
    313         else:

/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in _repr_html_(self)
    608 
    609             return self.to_html(max_rows=max_rows, max_cols=max_cols,
--> 610                                 show_dimensions=show_dimensions, notebook=True)
    611         else:
    612             return None

/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in to_html(self, buf, columns, col_space, header, index, na_rep, formatters, float_format, sparsify, index_names, justify, bold_rows, classes, escape, max_rows, max_cols, show_dimensions, notebook, decimal, border)
   1606                                            decimal=decimal)
   1607         # TODO: a generic formatter wld b in DataFrameFormatter
-> 1608         formatter.to_html(classes=classes, notebook=notebook, border=border)
   1609 
   1610         if buf is None:

/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in to_html(self, classes, notebook, border)
    698                                       border=border)
    699         if hasattr(self.buf, 'write'):
--> 700             html_renderer.write_result(self.buf)
    701         elif isinstance(self.buf, compat.string_types):
    702             with open(self.buf, 'w') as f:

/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in write_result(self, buf)
   1022         indent += self.indent_delta
   1023         indent = self._write_header(indent)
-> 1024         indent = self._write_body(indent)
   1025 
   1026         self.write('</table>', indent)

/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in _write_body(self, indent)
   1181                 self._write_hierarchical_rows(fmt_values, indent)
   1182             else:
-> 1183                 self._write_regular_rows(fmt_values, indent)
   1184         else:
   1185             for i in range(len(self.frame)):

/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in _write_regular_rows(self, fmt_values, indent)
   1215             row = []
   1216             row.append(index_values[i])
-> 1217             row.extend(fmt_values[j][i] for j in range(ncols))
   1218 
   1219             if truncate_h:

/usr/local/lib/python3.4/dist-packages/pandas/formats/format.py in <genexpr>(.0)
   1215             row = []
   1216             row.append(index_values[i])
-> 1217             row.extend(fmt_values[j][i] for j in range(ncols))
   1218 
   1219             if truncate_h:

IndexError: list index out of range

Out[33]:
   col_1 col_2
0      0     1
1      1     2
2      1     3
3      1     4
4      2     5
5      3      
6      4

Output of `pd.show_versions()`

INSTALLED VERSIONS ------------------ commit: None python: 3.4.3.final.0 python-bits: 64 OS: Linux OS-release: 3.13.0-101-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: C LANG: en_US.UTF-8 LOCALE: None.None pandas: 0.19.2 nose: 1.3.1 pip: 9.0.1 setuptools: 32.3.1 Cython: None numpy: 1.11.0 scipy: 0.18.1 statsmodels: None xarray: None IPython: 5.1.0 sphinx: None patsy: None dateutil: 2.0 pytz: 2012c blosc: None bottleneck: None tables: None numexpr: None matplotlib: 1.5.3 openpyxl: None xlrd: None xlwt: 1.2.0 xlsxwriter: None lxml: None bs4: None html5lib: 0.999 httplib2: None apiclient: None sqlalchemy: None pymysql: None psycopg2: None jinja2: 2.8.1 boto: None pandas_datareader: None

Comment From: chris-b1

Thanks for the report - this is a duplicate of #13696 - PR to fix welcome.

Comment From: jreback

thanks, yes this is a bug, duplicate of #13696 pull-requests to fix are welcome.

I think this should raise as this is a non-labeled array.

@chris-b1 beat me too it!

Pandas No error when mismatched vectors are assigned to pandas.DataFrame

Code Sample, a copy-pastable example if possible

Problem description

Expected Output

Output of pd.show_versions()

Output of `pd.show_versions()`