JSON, Text, Markdown, PDF Page, PDF Paragraph, Tika (DOCX, PPTX, HTML…)

Can the ETL Pipeline support more ways to read file content, such as a byte[], an HTTP URL, or an uploaded MultipartFile, so that the text content can be read directly?

Comment From: alexcheng1982

You need to provide your own implementations of DocumentReader.
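
To make the suggestion above concrete, here is a minimal sketch of what such a custom reader could look like. It assumes the reader contract boils down to "supply a list of documents", and it uses only the JDK: `InputStreamTextReader` and `fromBytes` are hypothetical names, and plain `String` values stand in for Spring AI `Document` objects, so this is not the library's own API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a custom reader; not part of Spring AI.
// It supplies plain strings so the sketch needs only the JDK (Java 9+);
// a real implementation would return Spring AI Document objects instead.
class InputStreamTextReader implements Supplier<List<String>> {

    // Opens a fresh stream on every read, so the reader can be reused.
    private final Supplier<InputStream> source;

    InputStreamTextReader(Supplier<InputStream> source) {
        this.source = source;
    }

    // Convenience factory for in-memory content, e.g. the bytes of an upload.
    static InputStreamTextReader fromBytes(byte[] bytes) {
        return new InputStreamTextReader(() -> new ByteArrayInputStream(bytes));
    }

    @Override
    public List<String> get() {
        // Read the whole stream and decode it as UTF-8 (an assumption;
        // a real reader would likely make the charset configurable).
        try (InputStream in = source.get()) {
            String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            return List.of(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Taking a stream supplier rather than a single stream is a design choice: the same reader then covers a byte[], an uploaded file's input stream, or a stream opened from a URL, without a separate class for each source.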

Comment From: markpollack

TextReader(Resource resource) takes a Spring Resource, so that could handle byte[] and URLs, for example when they are wrapped in a ByteArrayResource or UrlResource. Let me know, @OnceCrazyer, if that works for your use case.