📄️ Beautiful Soup
Beautiful Soup is a
📄️ Google Cloud Document AI
Document AI is a document understanding platform from Google Cloud to
📄️ Doctran: extract properties
We can extract useful features of documents using the
📄️ Doctran: interrogate documents
Documents used in a vector store knowledge base are typically stored in
📄️ Doctran: language translation
Comparing documents through embeddings has the benefit of working across
📄️ Google Translate
Google Translate is a multilingual
📄️ HTML to text
html2text is a Python package
📄️ Nuclia
Nuclia automatically indexes your unstructured
📄️ OpenAI metadata tagger
It can often be useful to tag ingested documents with structured