orhan/test - test - AydilekNet Git Repo

Author	SHA1	Message	Date
Soulter	1123392306	fix: support -o param to avoid encoding issues (#116) * perf: cli supports -o param * doc: update README --------- Co-authored-by: gagb <gagb@users.noreply.github.com>	2024-12-20 14:43:00 -08:00
SigireddyBalasai	5276616ba1	Added support to use Pathlib (#93) * Add support for Path objects in MarkItDown conversion methods * Remove unnecessary blank line in test_markitdown_exiftool function * Remove unnecessary blank line in test_markitdown_exiftool function * remove pathlib path in test file --------- Co-authored-by: afourney <adamfo@microsoft.com> Co-authored-by: gagb <gagb@users.noreply.github.com>	2024-12-20 14:12:48 -08:00
Sugato Ray	08a25345e3	[feat]: add support for type-hinting for PEP-561	2024-12-20 02:37:10 +00:00
Sugato Ray	613825d5b3	[feat]: add support for type-hinting for PEP-561	2024-12-20 02:12:24 +00:00
Sugato Ray	6f3c762526	Merge branch 'main' into update_commandline_help	2024-12-18 17:50:07 -05:00
Sugato Ray	356e895306	update formatting with pre-commit	2024-12-18 21:45:23 +00:00
gagb	5fc70864f2	Run pre-commit	2024-12-18 11:46:39 -08:00
Sugato Ray	39410d01df	Update CLI helpdoc formatting to allow indentation in code Use `textwrap.dedent()` to allow indented cli-helpdoc in `__main__.py` file. The indentation increases readability, while `textwrap.dedent` helps maintain the same functionality without breaking code.	2024-12-18 14:22:58 -05:00
Joel Esler	6e4caac70d	Safeguard against path traversal for ZipConverter fix: prevent path traversal vulnerabilities in ZipConverter Added a secure check for path traversal vulnerabilities in the ZipConverter class. Now validates extracted file paths using `os.path.commonprefix` to ensure all files remain within the intended extraction directory. Raises a `ValueError` if a path traversal attempt is detected. - Normalized file paths using `os.path.normpath`. - Added specific exception handling for `zipfile.BadZipFile` and traversal errors. - Ensured cleanup of extracted files after processing when `cleanup_extracted` is enabled.	2024-12-18 13:12:55 -05:00
gagb	362214323e	Merge branch 'main' into feature/fix-code-comments	2024-12-17 16:38:47 -08:00
afourney	9e546a8588	Merge branch 'main' into main	2024-12-17 15:37:28 -08:00
Adam Fourney	8d5f16ecd2	Fixed formatting.	2024-12-17 15:27:06 -08:00
afourney	a571021199	Merge branch 'main' into main	2024-12-17 15:12:59 -08:00
afourney	9add517510	Merge branch 'main' into feature/fix-code-comments	2024-12-17 14:56:16 -08:00
Adam Fourney	9518c01d4e	Bump version.	2024-12-17 13:51:13 -08:00
Adam Fourney	95188a4a27	Merge main.	2024-12-17 13:46:26 -08:00
Adam Fourney	03a7843a0a	Added deprecation warnings for mlm_* arguments.	2024-12-17 13:22:48 -08:00
Adam Fourney	248d64edd0	Added llm tests to the local test set.	2024-12-17 12:13:19 -08:00
Lee Bush	05a49ca129	fix incorrect comments for "bail if not ..." for WAV and image cases.	2024-12-17 08:10:53 -07:00
Soulter	752fbd333c	feat: add tests of rss convertor	2024-12-17 22:45:27 +08:00
Soulter	7dc2695b96	feat: support convert atom to markdown	2024-12-17 21:44:50 +08:00
Soulter	53fad6eb31	feat: add rss converter	2024-12-17 21:22:27 +08:00
Om Gupta	60c4a62917	Merge branch 'microsoft:main' into main	2024-12-17 10:33:40 +05:30
Om Gupta	3eb8cf385b	Merge branch 'main' of https://github.com/AumGupta/markitdown	2024-12-17 10:24:30 +05:30
Om Gupta	8c91c11ea8	pre-commit run	2024-12-17 10:24:25 +05:30
gagb	ad29122592	run precommit	2024-12-16 18:09:48 -08:00
gagb	898bfd4774	Merge branch 'main' into main	2024-12-16 18:00:26 -08:00
gagb	825d3bbb77	Merge branch 'main' into issue#65	2024-12-16 17:09:53 -08:00
gagb	874eba6265	Merge branch 'main' into patch-2	2024-12-16 16:59:22 -08:00
gagb	c3fa2934b9	Run pre-commit	2024-12-16 16:56:52 -08:00
kevinbabou	33638f1fe6	feature: add argument parsing and setup.py file for cli tool capability	2024-12-16 16:28:44 -08:00
gagb	dbc727615d	Merge branch 'main' into main	2024-12-16 15:48:49 -08:00
gagb	b0115cf971	Merge branch 'main' into youtube-transcript-languages	2024-12-16 15:47:38 -08:00
gagb	980abd3a60	Merge branch 'main' into main	2024-12-16 15:24:58 -08:00
afourney	afaff11ef0	Merge branch 'main' into main	2024-12-16 14:40:58 -08:00
afourney	e7636656d8	Merge branch 'main' into support-comments-in-docx	2024-12-16 14:23:14 -08:00
afourney	ddc1bebea4	Merge branch 'main' into patch-2	2024-12-16 14:20:16 -08:00
afourney	12ce5e95b2	Merge branch 'main' into feature/add-pptx-chart-support	2024-12-16 14:06:14 -08:00
gagb	9e6a19987b	Merge branch 'main' into main	2024-12-16 13:51:39 -08:00
CharlesCNorton	ed651aeb16	Fix LLM terminology in code Replaced all occurrences of mlm_client and mlm_model with llm_client and llm_model for consistent terminology when referencing Large Language Models (LLMs).	2024-12-16 16:23:52 -05:00
Om Gupta	a3208f2bd0	feat: Add IpynbConverter - Implemented IpynbConverter class for converting Jupyter Notebook (.ipynb) files into Markdown format. - Supports markdown cells, code cells and raw cells. - First markdown heading is used as the title if no title is found in notebook metadata. - Created a test notebook (`test_notebook.ipynb`) to verify the functionality of the converter.	2024-12-17 01:00:41 +05:30
Divit	ad01da308d	fix issue #65	2024-12-16 21:48:33 +05:30
narumi	695100d5d8	Support specifying YouTube transcript language	2024-12-16 13:16:00 +08:00
SH4DOW4RE	1559d9d163	pre-commit ran	2024-12-15 22:15:20 +01:00
SH4DOW4RE	b7f5662ffd	PR: Catching pydub's warning of ffmpeg or avconv missing	2024-12-15 17:29:14 +01:00
Ville Puuska	0a7203b876	add style_map prop to MarkItDown class	2024-12-15 17:23:57 +02:00
Ville Puuska	0704b0b6ff	pass 'style_map' kwarg to mammoth when converting docx	2024-12-15 16:59:21 +02:00
sakasegawa	0dd4e95584	Remove _is_chart	2024-12-15 21:14:58 +09:00
sakasegawa	93130b5ba5	Add PPTX chart support	2024-12-15 20:42:55 +09:00
Divyansh Singh	52b723724c	Fix character decoding issues with text-like files	2024-12-15 10:37:59 +05:30
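
Commit 6e4caac70d describes a path traversal check for ZipConverter: normalize each extracted path, confirm it stays inside the extraction directory, and raise a `ValueError` otherwise. The following is a minimal sketch of that kind of check, not the code from the commit. The function name `safe_extract_path` and its parameters are hypothetical. The commit message mentions `os.path.commonprefix`; this sketch swaps in `os.path.commonpath`, because `commonprefix` compares strings character by character and would treat a sibling directory such as `out_evil` as being inside `out`.

```python
import os


def safe_extract_path(extract_dir: str, member_name: str) -> str:
    """Return the absolute target path for a zip member.

    Hypothetical helper for illustration; raises ValueError if the member
    would be written outside extract_dir.
    """
    base = os.path.abspath(extract_dir)
    # normpath collapses ".." segments; abspath anchors the result.
    target = os.path.abspath(os.path.normpath(os.path.join(base, member_name)))
    # commonpath compares whole path components, unlike commonprefix.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"Path traversal detected in zip member: {member_name!r}")
    return target
```

A caller would typically run this on every name in `zipfile.ZipFile.namelist()` before extracting, and handle `zipfile.BadZipFile` separately for corrupt archives.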

57 Commits