HTML to text
html2text is a Python package that converts a page of HTML into clean, easy-to-read plain ASCII text. The resulting text is also valid Markdown (a text-to-HTML format).
Installation and Setup
pip install html2text
Document Transformer
See a usage example.
from langchain.document_transformers import Html2TextTransformer