Feature Type

  • [x] Adding new functionality to pandas

  • [ ] Changing existing functionality in pandas

  • [ ] Removing existing functionality in pandas

Problem Description

This should be a simple follow-up to https://github.com/pandas-dev/pandas/issues/9471, enabling support for alignment with method='nearest'.

Since fillna internally uses interpolate, which already supports method='nearest', this might work right away, though it will require extensive testing.

Feature Description

The new feature could be implemented by extending the current alignment functionality in pandas to support method='nearest'. This would allow the user to align two Series or DataFrames by their indices, using the value at the nearest available index label when no exact match is found. The basic idea can already be expressed with reindex, which supports method='nearest' on a monotonic index:

def align_nearest(df1, df2):
    # Reindex df1 onto df2's index; each label missing from df1 takes the
    # value at df1's nearest index label (df1.index must be monotonic)
    return df1.reindex(df2.index, method='nearest')

This functionality could be added as a method to the existing pandas.DataFrame and pandas.Series objects, integrating smoothly into the current API.
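To make the intended semantics concrete, here is a minimal sketch of a two-sided version, assuming the result should mirror what align returns (both objects on a shared index). The helper name align_nearest_pair and its join argument are hypothetical and only illustrate the idea; this is not an existing pandas API, and it assumes both indices are unique and monotonic.

```python
import pandas as pd


def align_nearest_pair(left, right, join="outer"):
    """Hypothetical helper: put both objects on a shared index, filling
    labels that are missing from either side with the value at the
    nearest existing label."""
    if join == "outer":
        index = left.index.union(right.index)
    elif join == "inner":
        index = left.index.intersection(right.index)
    elif join == "left":
        index = left.index
    elif join == "right":
        index = right.index
    else:
        raise ValueError(f"unsupported join: {join!r}")
    # reindex(method='nearest') requires a monotonic index on each object
    return (
        left.reindex(index, method="nearest"),
        right.reindex(index, method="nearest"),
    )


s1 = pd.Series([10.0, 20.0, 30.0], index=[0, 10, 20])
s2 = pd.Series([1.0, 2.0], index=[3, 16])

a1, a2 = align_nearest_pair(s1, s2)
print(a1)
print(a2)
```

With these inputs the shared index is [0, 3, 10, 16, 20], and neither result contains NaN because every label has a nearest neighbour on each side. How ties between two equally distant labels should be resolved is left open here.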

Alternative Solutions

An alternative solution would be to use the existing interpolate method with method='nearest' (which requires SciPy), applied to the DataFrame or Series to fill the gaps that alignment introduces. Additionally, third-party libraries could be used for more complex nearest matching: fuzzywuzzy for approximate matching of string labels, or scipy.spatial for nearest-neighbor search on numeric data.

import pandas as pd
from fuzzywuzzy import process

# Example using fuzzywuzzy to find nearest match
df1 = pd.DataFrame([...])
df2 = pd.DataFrame([...])
df1['nearest'] = df1['index_column'].apply(lambda x: process.extractOne(x, df2['index_column'])[0])
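For numeric or datetime keys, another workaround that needs no third-party library is pd.merge_asof with direction='nearest'. The sketch below uses made-up column names (key, left_val, right_val) purely for illustration. It matches on a column rather than aligning the index, so it is not a drop-in replacement for align.

```python
import pandas as pd

# Both frames must be sorted by the key column for merge_asof
left = pd.DataFrame({"key": [0, 10, 20], "left_val": [10.0, 20.0, 30.0]})
right = pd.DataFrame({"key": [3, 16], "right_val": [1.0, 2.0]})

# For each row of `left`, take the row of `right` whose key is closest
matched = pd.merge_asof(left, right, on="key", direction="nearest")
print(matched)
```

Here key 0 is matched to 3, while keys 10 and 20 are both matched to 16. The result keeps only the rows of the left frame, so it behaves like a left join, not the outer join that align uses by default.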

However, native support within pandas would likely be more efficient and user-friendly.

Additional Context