Merge branch 'main' into main

Josh XT
2024-12-14 23:09:30 -05:00
committed by GitHub


@@ -14,6 +14,22 @@ It presently supports:
- Various other text-based formats (csv, json, xml, etc.)
- ZIP (Iterates over contents and converts each file)
## Installation
You can install `markitdown` using pip:
```sh
pip install markitdown
```
Or install it from source, from the repository root:
```sh
pip install -e .
```
## Usage
The API is simple:
```python
@@ -24,6 +40,18 @@ result = markitdown.convert("test.xlsx")
print(result.text_content)
```
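As a rough sketch of how the same `convert(...)` and `text_content` pattern might be applied to several files at once, the helper below collects the Markdown text for each path. Note that `convert_many` is a hypothetical name introduced here for illustration and is not part of markitdown; it only assumes that the callable it receives returns an object with a `text_content` attribute.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def convert_many(paths: Iterable[str], convert: Callable[[str], object]) -> Dict[str, str]:
    """Convert each path and return the Markdown text keyed by file name.

    `convert` is any callable returning an object with a `text_content`
    attribute (an assumption of this sketch), such as the `convert` method
    of a `MarkItDown` instance.
    """
    results: Dict[str, str] = {}
    for path in paths:
        result = convert(path)
        # Key by the bare file name; files sharing a name overwrite each other.
        results[Path(path).name] = result.text_content
    return results
```

With markitdown installed, this could be called as `convert_many(["test.xlsx"], MarkItDown().convert)`. Passing the converter in as a callable keeps the helper independent of any particular client configuration.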
You can also configure markitdown to use a Large Language Model to describe images. To do so, pass the `mlm_client` and `mlm_model` parameters to the `MarkItDown` object, using values appropriate for your specific client.
```python
from markitdown import MarkItDown
from openai import OpenAI
client = OpenAI()
md = MarkItDown(mlm_client=client, mlm_model="gpt-4o")
result = md.convert("example.jpg")
print(result.text_content)
```
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a
@@ -38,6 +66,21 @@ This project has adopted the [Microsoft Open Source Code of Conduct](https://ope
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
### Running Tests
To run the tests for this project, use the following commands:
```sh
hatch shell
hatch test
```
### Running Pre-commit Checks
```sh
pre-commit run --all-files
```
## Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft