Pandas Error on multiIndex

Code Sample, a copy-pastable example if possible

compare = pd.MultiIndex.from_product([df_agr['TEXT1'],df_mat['TEXT1']]).to_series()
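
For reference, here is a minimal sketch of what this call is meant to produce on small inputs. The two frames below are made-up stand-ins for `df_agr` and `df_mat`, not the reporter's data, and they deliberately have different lengths.

```python
import pandas as pd

# Hypothetical stand-ins for the reporter's df_agr and df_mat (different lengths on purpose).
df_agr = pd.DataFrame({"TEXT1": ["OP-001BR1(4) MA Offshore Day Rate",
                                 "Hypothetical agreement B"]})
df_mat = pd.DataFrame({"TEXT1": ["CHAIN TAIL 76mm",
                                 "5-LINK ADAPTOR NRV4-84mm",
                                 "ANCHOR-12 TONS STEVPRIS MK5"]})

# Cartesian product of the two columns, turned into a Series of (agreement, material) tuples.
compare = pd.MultiIndex.from_product([df_agr["TEXT1"], df_mat["TEXT1"]]).to_series()

# One entry per pair, i.e. len(df_agr) * len(df_mat) entries in total.
print(len(compare))
print(compare.iloc[0])
```

The result has one entry per pair, so its length is the product of the two column lengths and grows quickly for large frames.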

Problem description

I'm creating a Series from a MultiIndex so that I can compare every pair of values and compute fuzzy-match scores, but the call raises the following error:

ValueError                                Traceback (most recent call last)
 in ()
----> 1 compare = pd.MultiIndex.from_product([df_agr['KTEXT1'],df_mat['TXZ01']]).to_series()

C:\Anaconda3\lib\site-packages\pandas\indexes\multi.py in from_product(cls, iterables, sortorder, names)
    934         categoricals = [Categorical.from_array(it, ordered=True)
    935                         for it in iterables]
--> 936         labels = cartesian_product([c.codes for c in categoricals])
    937
    938         return MultiIndex(levels=[c.categories for c in categoricals],

C:\Anaconda3\lib\site-packages\pandas\tools\util.py in cartesian_product(X)
     37     return [np.tile(np.repeat(np.asarray(com._values_from_object(x)), b[i]),
     38                     np.product(a[i]))
---> 39             for i, x in enumerate(X)]
     40
     41

C:\Anaconda3\lib\site-packages\pandas\tools\util.py in (.0)
     37     return [np.tile(np.repeat(np.asarray(com._values_from_object(x)), b[i]),
     38                     np.product(a[i]))
---> 39             for i, x in enumerate(X)]
     40
     41

C:\Anaconda3\lib\site-packages\numpy\core\fromnumeric.py in repeat(a, repeats, axis)
    395     except AttributeError:
    396         return _wrapit(a, 'repeat', repeats, axis)
--> 397     return repeat(repeats, axis)
    398
    399

ValueError: negative dimensions are not allowed

Expected Output

|   | Agreement | Material | ratio | token_sort_ratio | partial_ratio | token_set_ratio |
|---|-----------|----------|-------|------------------|---------------|-----------------|
| 0 | OP-001BR1(4) MA Offshore Day Rate | TIE,CABLE,NYLON,780 X 8.9 MM,50/PACK | 14 | 15 | 41 | 41 |
| 1 | OP-001BR1(4) MA Offshore Day Rate | CHAIN TAIL 76mm | 12 | 20 | 21 | 21 |
| 2 | OP-001BR1(4) MA Offshore Day Rate | 5-LINK ADAPTOR NRV4-84mm | 21 | 21 | 36 | 36 |
| 3 | OP-001BR1(4) MA Offshore Day Rate | 5-LINK ADAPTOR NRV4-76mm | 21 | 21 | 32 | 32 |
| 4 | OP-001BR1(4) MA Offshore Day Rate | ANCHOR-12 TONS STEVPRIS MK5 | 23 | 22 | 37 | 37 |

Output of `pd.show_versions()`

# Paste the output here pd.show_versions() here

Comment From: TomAugspurger

Could you simplify your example to make it reproducible?

Comment From: sp3234

Every row in df_agr['TEXT1'] should be compared with all the rows in the df_mat['TEXT1'] column. As you can see in the expected output, "OP-001BR1(4) MA Offshore Day Rate" is compared with every row of the df_mat['TEXT1'] column. ratio, token_sort_ratio, partial_ratio and token_set_ratio are calculated columns.

It works fine if the two DataFrames have the same shape.

Simple Ratio
===========
fuzz.ratio("this is a test", "this is a test!")
97

Partial Ratio
===========
fuzz.partial_ratio("this is a test", "this is a test!")
100

Token Sort Ratio
===========
fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
100

Token Set Ratio
===========
fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
100
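
The row-by-row comparison described above can also be sketched without a MultiIndex, using `itertools.product` from the standard library. Note the assumptions: the input lists are made up, and `simple_ratio` is a hypothetical stand-in built on `difflib.SequenceMatcher`, not fuzzywuzzy's `fuzz.ratio`. Swapping in the four `fuzz` functions would give the four score columns.

```python
from difflib import SequenceMatcher
from itertools import product


def simple_ratio(a, b):
    """Stand-in scorer (assumption): similarity on a 0..100 scale via difflib."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


# Made-up inputs of different lengths, standing in for the two TEXT1 columns.
agreements = ["OP-001BR1(4) MA Offshore Day Rate"]
materials = ["CHAIN TAIL 76mm", "5-LINK ADAPTOR NRV4-84mm"]

# One row per (agreement, material) pair, scored as it is generated.
rows = [
    {"Agreement": a, "Material": m, "ratio": simple_ratio(a, m)}
    for a, m in product(agreements, materials)
]

for row in rows:
    print(row)
```

Because the pairs are generated lazily and scored one at a time, this approach does not require the two inputs to have the same length.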

Comment From: jreback

@sp3234 need a copy-pastable example that one can simply run. pls follow the instructions and populate pd.show_versions() as well.