HTML Text Extractor Utility

Extracts unstructured text from scientific papers published as HTML files.


TrafilaturaSectionLoader

 TrafilaturaSectionLoader (file_path:str)

Load HTML files and parse them with trafilatura into sections.
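The implementation of `TrafilaturaSectionLoader` is not shown here, so the following is only a minimal sketch of how section-level loading could work, not the class's actual code. It assumes that `trafilatura.extract` is called with `output_format="xml"` and that headings appear as `<head>` elements and paragraphs as `<p>` elements in that output. The helper names `split_sections` and `load_sections` are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_sections(xml_text):
    """Group trafilatura-style XML into (heading, text) pairs.

    Assumption: headings are <head> elements and paragraphs are <p> elements.
    Text that appears before the first heading gets an empty heading.
    """
    root = ET.fromstring(xml_text)
    sections = []
    heading, paragraphs = "", []
    for elem in root.iter():
        if elem.tag == "head":
            # A new heading closes the section collected so far.
            if heading or paragraphs:
                sections.append((heading, "\n".join(paragraphs)))
            heading, paragraphs = "".join(elem.itertext()).strip(), []
        elif elem.tag == "p":
            text = "".join(elem.itertext()).strip()
            if text:
                paragraphs.append(text)
    # Flush the last open section.
    if heading or paragraphs:
        sections.append((heading, "\n".join(paragraphs)))
    return sections


def load_sections(file_path):
    """Hypothetical loader: read an HTML file and return its sections."""
    import trafilatura  # third-party dependency, imported lazily

    html = Path(file_path).read_text(encoding="utf-8")
    xml_text = trafilatura.extract(html, output_format="xml")
    # trafilatura.extract returns None when no main content is found.
    return split_sections(xml_text) if xml_text else []
```

Calling `load_sections("paper.html")` (a placeholder path) would return a list of `(heading, text)` tuples, one per detected section. Keeping the XML parsing in its own function lets the grouping logic be checked without installing trafilatura.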