Pandas Support additional parameters in read_xml

Is your feature request related to a problem?

I would like to use pandas read_xml function to parse shallow xml data about products and related data. My data has numeric like identifiers (ean), which are automatically casted to numbers in read_xml, and there after I can not use them to identify records in the data. There is a simple solution: support and forward extra kwargs to _parse and then TextParser. dtype=str solves the problem for me.


from pandas.io.xml import _parse
data = _parse(xml_str, './*', None, False, False, None, 'utf-8', 'lxml', None, 'infer', None, dtype=str)

->

from pandas import read_xml
data = read_xml(xml_str, dtype=str)

Without dtype=str: Pandas Support additional parameters in read_xml

If I try to convert data data after parsing with astype(str) I get strings like "5999548048023.0" With dtype=str: Pandas Support additional parameters in read_xml

Comment From: ParfaitG

Did you try current dev version of pandas? In forthcoming pandas 1.5, pandas.read_xml will support dtype parameter including converters. See dev docs.

Comment From: mroeschke

Appears dtype and converters should suffice the need here so closing. If it doesn't meet the original use case we can reopen