* Add support for Path objects in MarkItDown conversion methods
* Remove unnecessary blank line in test_markitdown_exiftool function
* Remove unnecessary blank line in test_markitdown_exiftool function
* remove pathlib path in test file
---------
Co-authored-by: afourney <adamfo@microsoft.com>
Co-authored-by: gagb <gagb@users.noreply.github.com>
Use `textwrap.dedent()` to allow indented cli-helpdoc in `__main__.py` file. The indentation increases readability, while `textwrap.dedent` helps maintain the same functionality without breaking code.
fix: prevent path traversal vulnerabilities in ZipConverter
Added a secure check for path traversal vulnerabilities in the ZipConverter class.
Now validates extracted file paths using `os.path.commonprefix` to ensure all files
remain within the intended extraction directory. Raises a `ValueError` if a
path traversal attempt is detected.
- Normalized file paths using `os.path.normpath`.
- Added specific exception handling for `zipfile.BadZipFile` and traversal errors.
- Ensured cleanup of extracted files after processing when `cleanup_extracted` is enabled.
Replaced all occurrences of mlm_client and mlm_model with llm_client and llm_model for consistent terminology when referencing Large Language Models (LLMs).
- Implemented IpynbConverter class for converting Jupyter Notebook (.ipynb) files into Markdown format.
- Supports markdown cells, code cells and raw cells.
- First markdown heading is used as the title if no title is found in notebook metadata.
- Created a test notebook (`test_notebook.ipynb`) to verify the functionality of the converter.