diff --git a/README.md b/README.md
index 3bd1581..865d5a5 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,22 @@ It presently supports:
 - Various other text-based formats (csv, json, xml, etc.)
 - ZIP (Iterates over contents and converts each file)
 
+# Installation
+
+You can install `markitdown` using pip:
+
+```sh
+pip install markitdown
+```
+
+or from source:
+
+```sh
+pip install -e .
+```
+
+
+# Usage
 The API is simple:
 
 ```python
@@ -24,6 +40,18 @@ result = markitdown.convert("test.xlsx")
 print(result.text_content)
 ```
 
+You can also configure markitdown to use Large Language Models to describe images. To do so, pass the `mlm_client` and `mlm_model` parameters to the `MarkItDown` object, matching your specific client.
+
+```python
+from markitdown import MarkItDown
+from openai import OpenAI
+
+client = OpenAI()
+md = MarkItDown(mlm_client=client, mlm_model="gpt-4o")
+result = md.convert("example.jpg")
+print(result.text_content)
+```
+
 ## Contributing
 
 This project welcomes contributions and suggestions. Most contributions require you to agree to a
@@ -38,6 +66,21 @@ This project has adopted the [Microsoft Open Source Code of Conduct](https://ope
 For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
 contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
 
+### Running Tests
+
+To run the tests for this project, use the following commands:
+
+```sh
+hatch shell
+hatch test
+```
+
+### Running Pre-commit Checks
+
+```sh
+pre-commit run --all-files
+```
+
 ## Trademarks
 
 This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft