Merge branch 'main' into main
25
README.md
@@ -1,5 +1,7 @@
# MarkItDown

[](https://pypi.org/project/markitdown/)

The MarkItDown library is a utility tool for converting various files to Markdown (e.g., for indexing, text analysis, etc.)

It presently supports:
@@ -12,6 +14,7 @@ It presently supports:
- Audio (EXIF metadata, and speech transcription)
- HTML (special handling of Wikipedia, etc.)
- Various other text-based formats (csv, json, xml, etc.)
- ZIP (Iterates over contents and converts each file)
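
The ZIP entry in the list above can be pictured with a minimal sketch that uses only the Python standard library. This is not MarkItDown's own implementation: `convert_zip_entries`, the `convert` callback, and `as_section` are hypothetical names introduced purely for illustration.

```python
import io
import zipfile
from typing import Callable, Dict


def convert_zip_entries(data: bytes, convert: Callable[[str, bytes], str]) -> Dict[str, str]:
    """Apply `convert` to every regular file in a ZIP archive.

    `convert` is a hypothetical callback taking (entry name, raw bytes) and
    returning Markdown text; it stands in for a real per-format converter.
    """
    results: Dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():  # directory entries carry no content
                continue
            results[info.filename] = convert(info.filename, archive.read(info))
    return results


def as_section(name: str, raw: bytes) -> str:
    """Toy converter (an assumption): decoded text under a heading named after the entry."""
    text = raw.decode("utf-8", errors="replace")
    return f"## {name}\n\n{text}\n"
```

Passing the converter in as a callback keeps the archive traversal separate from any format-specific logic, so the same loop could serve whichever converters are available.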

# Installation

@@ -27,7 +30,6 @@ or from the source
pip install -e .
```


# Usage
The API is simple:

@@ -39,7 +41,26 @@ result = markitdown.convert("test.xlsx")
print(result.text_content)
```

You can also configure markitdown to use Large Language Models to describe images. To do so you must provide mlm_client and mlm_model parameters to MarkItDown object, according to your specific client.
To use this as a command-line utility, install it and then run it like this:

```bash
markitdown path-to-file.pdf
```

This will output Markdown to standard output. You can save it like this:

```bash
markitdown path-to-file.pdf > document.md
```

You can pipe content to standard input by omitting the argument:

```bash
cat path-to-file.pdf | markitdown
```
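
The same pipe-and-redirect pattern can be driven from Python. The following is a rough sketch built on the standard `subprocess` module; `pipe_through` and `save_markdown` are hypothetical helper names, and the sketch assumes only what is stated above: that the command reads from standard input when given no argument and writes its result to standard output.

```python
import subprocess
import sys
from typing import Sequence


def pipe_through(command: Sequence[str], payload: bytes) -> bytes:
    """Send `payload` to the command's stdin and return what it writes to stdout."""
    completed = subprocess.run(
        list(command),
        input=payload,           # bytes delivered on standard input
        stdout=subprocess.PIPE,  # capture standard output
        check=True,              # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


def save_markdown(command: Sequence[str], source_path: str, target_path: str) -> None:
    """Equivalent of `cat source | command > target` for a stdin-reading command."""
    with open(source_path, "rb") as source:
        output = pipe_through(command, source.read())
    with open(target_path, "wb") as target:
        target.write(output)
```

Assuming the `markitdown` executable is on the `PATH`, `save_markdown(["markitdown"], "path-to-file.pdf", "document.md")` would mirror the shell commands above.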


You can also configure markitdown to use Large Language Models to describe images. To do so, you must provide the `mlm_client` and `mlm_model` parameters to the `MarkItDown` object, according to your specific client.

```python
from markitdown import MarkItDown
