* Added an initial minimal MCP server for MarkItDown
* Added STDIO default option.
* Added a Dockerfile, and updated the README accordingly. Also added instructions for Claude Desktop
* Pin mcp version.
* Optionally preserve base64 strings in markdown (_CustomMarkdownify) and pptx
* Add parameter support to the other converters
* Fix linter errors
* Use **kwargs to pass the keep_data_uri parameter.
* Add module CLI vector tests
* Fixed formatting, and adjusted tests.
Adjusts warning filters to be more contextual
Updates dependencies for magika and youtube-transcript-api
Updates the version to 0.1.0a5 in __about__.py
* Refactored tests.
* Fixed CI errors, and included misc tests.
* Omit mskanji from streaminfo test.
* Omit mskanji from no hints test.
* Log results of debugging in comments (linked to Magika issue)
* Added docs as to when to use misc tests.
* refactor(docker): remove unnecessary root user
The USER root directive isn't needed directly after FROM
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* fix(docker): use generic nobody nogroup default instead of uid gid
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* fix(docker): build app from source locally instead of installing package
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* fix(docker): use correct files in dockerignore
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* chore(docker): don't install recommended packages with git
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* fix(docker): run apt as non-interactive
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
* Update Dockerfile to new package structure, and fix streaming bugs.
---------
Signed-off-by: Sebastian Yaghoubi <sebastianyaghoubi@gmail.com>
Co-authored-by: afourney <adamfo@microsoft.com>
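The `keep_data_uri` entries above describe an opt-in flag that is forwarded through keyword arguments. The sketch below is a rough illustration of that idea, not MarkItDown's actual code: `render_markdown`, `convert`, and the regular expression are hypothetical, and the default of `False` is an assumption.

```python
import re

# Hypothetical pattern: a markdown image whose target is a data URI.
# Group 2 captures the prefix up to the comma (e.g. "data:image/png;base64").
_DATA_URI_IMG = re.compile(r"(!\[[^\]]*\]\()(data:[^,)]*),[^)]*(\))")


def render_markdown(markdown: str, **kwargs) -> str:
    """Truncate embedded data URIs unless keep_data_uri=True is passed."""
    # Assumption: the flag defaults to False, so payloads are dropped by default.
    if kwargs.get("keep_data_uri", False):
        return markdown
    # Keep the media-type prefix and replace the base64 payload with "...".
    return _DATA_URI_IMG.sub(r"\1\2...\3", markdown)


def convert(markdown: str, **kwargs) -> str:
    """Hypothetical outer entry point that forwards options untouched."""
    return render_markdown(markdown, **kwargs)
```

Forwarding `**kwargs` lets the outer call accept the flag without every intermediate function having to name it in its signature.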
* Sort PPTX shapes to be read in top-to-bottom, left-to-right order
Referenced from 39bef65b31/pptx2md/parser.py (L249)
* Update README.md
* Fixed formatting.
* Added missing import
* Updated DocumentConverter interface
* Updated all DocumentConverter classes
* Added support for various new audio files.
* Updated sample plugin to new DocumentConverter interface.
* Updated project README with notes about changes, and use-cases.
* Updated DocumentConverter documentation.
* Move priority to outside DocumentConverter, allowing them to be reprioritized, and keeping the DocumentConverter interface simple.
---------
Co-authored-by: Kenny Zhang <kzhang678@gmail.com>
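The PPTX entry above orders shapes top-to-bottom, then left-to-right. A minimal sketch of that ordering follows, assuming each shape exposes `top` and `left` offsets (as python-pptx shapes do). The `Shape` class is a stand-in for illustration, and sorting shapes without a position first is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Shape:
    """Stand-in for a slide shape with optional position offsets."""
    name: str
    top: Optional[int]
    left: Optional[int]


def sort_shapes(shapes: List[Shape]) -> List[Shape]:
    """Return shapes in top-to-bottom, then left-to-right reading order."""
    def key(shape: Shape):
        # Assumption: shapes with no position sort before positioned ones.
        top = float("-inf") if shape.top is None else shape.top
        left = float("-inf") if shape.left is None else shape.left
        return (top, left)

    return sorted(shapes, key=key)
```

Sorting on the `(top, left)` tuple means `left` only breaks ties between shapes that share the same `top` value.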
Initialize `res` at the beginning of `_convert`. If the first converter raised an exception, `res` was never initialized, and checking `if res is not None` raised an error.
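
The pattern behind this fix can be sketched as follows. This is a simplified illustration, not the actual `_convert` body: `convert_first_success` and its error handling are hypothetical.

```python
def convert_first_success(converters, data):
    """Try each converter in order and return the first non-None result."""
    # Initialize up front so the check below is safe even when the
    # first converter raises before anything is assigned.
    res = None
    errors = []
    for converter in converters:
        try:
            res = converter(data)
        except Exception as exc:  # broad on purpose in this sketch
            errors.append(exc)
        if res is not None:
            return res
    raise ValueError(f"no converter succeeded ({len(errors)} failed)")
```

Without the `res = None` line, a failure in the first converter would leave `res` unbound, and the `if res is not None` check would itself raise `UnboundLocalError`.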