From bf6a15e9b5eb89820bf82c04cbe934bf62fb8617 Mon Sep 17 00:00:00 2001
From: KennyZhang1 <90438893+KennyZhang1@users.noreply.github.com>
Date: Sat, 1 Feb 2025 01:23:26 -0500
Subject: [PATCH] Kennyzhang/docintel docs (#312)

* updated docs to include doc intelligence

* include reference to doc intel setup docs
---
 README.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/README.md b/README.md
index 6bc91e6..76a4d3f 100644
--- a/README.md
+++ b/README.md
@@ -33,12 +33,20 @@ Or use `-o` to specify the output file:
 markitdown path-to-file.pdf -o document.md
 ```
 
+To use Document Intelligence conversion:
+
+```bash
+markitdown path-to-file.pdf -o document.md -d -e ""
+```
+
 You can also pipe content:
 
 ```bash
 cat path-to-file.pdf | markitdown
 ```
 
+More information about how to set up an Azure Document Intelligence Resource can be found [here](https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/how-to-guides/create-document-intelligence-resource?view=doc-intel-4.0.0)
+
 ### Python API
 
 Basic usage in Python:
@@ -51,6 +59,16 @@ result = md.convert("test.xlsx")
 print(result.text_content)
 ```
 
+Document Intelligence conversion in Python:
+
+```python
+from markitdown import MarkItDown
+
+md = MarkItDown(docintel_endpoint="")
+result = md.convert("test.pdf")
+print(result.text_content)
+```
+
 To use Large Language Models for image descriptions, provide `llm_client` and `llm_model`:
 
 ```python