HTML Text Extractor Utility
Extracts unstructured text from scientific papers published as HTML files.
TrafilaturaSectionLoader
TrafilaturaSectionLoader (file_path:str)
Load HTML files and parse them with trafilatura into sections.
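The source does not show how the loader is implemented, so the following is only a minimal sketch of one way an HTML file could be split into sections with trafilatura. The helper names `split_sections` and `load_sections` are hypothetical and are not part of `TrafilaturaSectionLoader`. The sketch assumes that trafilatura's XML output marks headings as `<head>` elements and paragraphs as `<p>` elements, and it groups each paragraph under the nearest preceding heading.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_sections(xml_text):
    """Group <p> text under the nearest preceding <head> element.

    Hypothetical helper: assumes headings are <head> and paragraphs are <p>.
    """
    root = ET.fromstring(xml_text)
    sections = []
    current = {"title": "", "text": []}
    for el in root.iter():  # document order
        if el.tag == "head":
            # A new heading closes the section collected so far.
            if current["title"] or current["text"]:
                sections.append(current)
            current = {"title": "".join(el.itertext()).strip(), "text": []}
        elif el.tag == "p":
            text = "".join(el.itertext()).strip()
            if text:
                current["text"].append(text)
    if current["title"] or current["text"]:
        sections.append(current)
    return [{"title": s["title"], "text": "\n".join(s["text"])} for s in sections]


def load_sections(file_path):
    """Read an HTML file, extract its main content as XML, and split it."""
    import trafilatura  # third-party dependency, imported only when needed

    html = Path(file_path).read_text(encoding="utf-8")
    xml_text = trafilatura.extract(html, output_format="xml", include_comments=False)
    if xml_text is None:  # trafilatura found no extractable content
        return []
    return split_sections(xml_text)
```

Under these assumptions, `load_sections("paper.html")` would return a list of dictionaries with `title` and `text` keys, one per section; the file name here is only a placeholder. Text that appears before the first heading is collected into a section with an empty title.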