Releases: docling-project/docling

v2.53.0

17 Sep 13:59

Feature

Fix

  • Handle empty result from RapidOCR to avoid crash (#2264) (609d902)

Documentation

v2.52.0

11 Sep 16:11

Feature

  • Enrichment steps on all convert pipelines (incl. docx, html, etc.) (#2251) (2c91234)

Fix

  • Add missing features in ThreadedStandardPdfPipeline (#2252) (0700af2)
  • Address deprecation warnings of dependencies (#2237) (c696549)

Documentation

  • Add an example of RAG with OpenSearch (#2238) (f8cc545)
  • Add instructions for using Docling with MCP to README (#2219) (e5cd702)
  • Document VLM support requirement in extraction example (#2231) (55f5f37)

v2.51.0

05 Sep 13:01

Feature

  • Updating default parameters to get better performance with docling-parse (#2208) (b49d1ad)
  • Updated the backend for new docling-parse (#2187) (b3d7542)

Documentation

v2.50.0

03 Sep 11:39

Feature

Fix

  • html: Access to variable not yet declared (#2171) (293e81b)

v2.49.0

01 Sep 16:39

Feature

  • [Beta] Extraction with schema (#2138) (9f4bc5b)
  • msexcel: Set ContentLayer.INVISIBLE for invisible sheet (#1876) (a283ccf)

Fix

  • pypdfium2: Fix OCR bounding box misalignment caused by mismatched rotation metadata (#2039) (4d94e38)
  • Translation example (#2166) (9f0286b)
  • Extend offline mode for rapidocr fonts (#2155) (9904d14)

Documentation

v2.48.0

26 Aug 05:29

Feature

Fix

  • html: Preserve code blocks in list items (#2131) (fa3327e)

v2.47.1

23 Aug 14:11

Fix

v2.47.0

22 Aug 14:15

Feature

  • CLI: Option to download arbitrary HuggingFace model (#2123) (cdf079d)
  • Batching support for VLMs in transformers backend, add initial vLLM backend (#2094) (3c660c0)
  • html: Support formatting tags in HTML texts (#2111) (94fcc46)

Fix

  • Improve numbered list detection for msword docs (#2100) (3f03709)

Documentation

v2.46.0

20 Aug 15:25

Feature

Fix

  • html: Parse footer tag as a group in furniture content layer (#2106) (c5f2e2f)

Performance

  • Clean up resources with docling-parse v4, no parsed_page output by default (#2105) (5f57ff2)
  • Speed up function _parse_orientation (#1934) (8820b55)

v2.45.0

18 Aug 10:25

Feature

  • Add backend for METS with Google Books profile (#1989) (31087f3)
  • html: Support in-line anchor tags in HTML texts (#1659) (9687297)
  • vlm: Ability to preprocess VLM response (#1907) (5f050f9)

Documentation