HTML Text Extractor Utility
Extracts unstructured text from scientific papers published as HTML files.
TrafilaturaSectionLoader
TrafilaturaSectionLoader (file_path:str)
Load HTML files and parse them with trafilatura into sections.
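The loader's implementation is not shown here, so the following is only a minimal sketch of how section-wise loading with trafilatura might look. The function names `split_into_sections` and `load_sections` are hypothetical. It also assumes that trafilatura's XML output marks headings with `<head>` elements and paragraphs with `<p>` elements.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_into_sections(xml_text: str) -> list[dict]:
    """Group paragraph text under the most recent heading.

    Assumes headings are <head> elements and paragraphs are <p> elements.
    """
    root = ET.fromstring(xml_text)
    # The first entry collects any text that appears before the first heading.
    sections = [{"title": "", "text": []}]
    for el in root.iter():  # document order
        # Join nested inline text and normalise whitespace.
        content = " ".join("".join(el.itertext()).split())
        if el.tag == "head":
            sections.append({"title": content, "text": []})
        elif el.tag == "p" and content:
            sections[-1]["text"].append(content)
    # Drop empty sections and join paragraphs with newlines.
    return [
        {"title": s["title"], "text": "\n".join(s["text"])}
        for s in sections
        if s["title"] or s["text"]
    ]


def load_sections(file_path: str) -> list[dict]:
    """Hypothetical stand-in for the loader: HTML file in, sections out."""
    # Imported lazily so split_into_sections works without trafilatura installed.
    import trafilatura

    html = Path(file_path).read_text(encoding="utf-8")
    xml_text = trafilatura.extract(html, output_format="xml")
    if xml_text is None:  # extract() returns None when no main content is found
        return []
    return split_into_sections(xml_text)
```

Under these assumptions, calling `load_sections` with the path to a paper's HTML file would return a list of dictionaries, one per section, each with a `title` and a `text` key. List items, tables, and other non-paragraph elements are ignored in this sketch.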