Public API
The top-level docwow module exposes seven functions covering the most common workflows.
docwow.open(source)
Parse a DOCX file or a docwow HTML string into a `docwow.api.document.DocumentWrapper`.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `str \| Path \| bytes` | A file path ( | required |
Returns:
| Type | Description |
|---|---|
| `DocumentWrapper` | A `docwow.api.document.DocumentWrapper`. |
docwow.to_html(source, page_view=False)
Convert a DOCX file to a self-contained HTML string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `str \| Path \| bytes` | Path to a | required |
| `page_view` | `bool` | When True, styles the output as a physical page and adds | `False` |
Returns:
| Type | Description |
|---|---|
| `str` | UTF-8 HTML string produced by :func: |
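
A minimal sketch of the call above, assuming the package is installed and importable as `docwow`. The helper name `docx_to_html_file` and the file names are hypothetical; only `docwow.to_html(source, page_view=...)` is taken from this reference.

```python
from pathlib import Path


def docx_to_html_file(src, dest, page_view=False):
    """Convert the DOCX at `src` to HTML and save it at `dest` (hypothetical helper)."""
    # Imported lazily so this module can be loaded without docwow installed.
    import docwow

    html = docwow.to_html(src, page_view=page_view)
    out = Path(dest)
    # The reference describes the result as a UTF-8 HTML string,
    # so the file is written with an explicit UTF-8 encoding.
    out.write_text(html, encoding="utf-8")
    return out
```

A call such as `docx_to_html_file("report.docx", "report.html", page_view=True)` would then produce a page-styled HTML file next to the source document.
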
docwow.to_docx(html, target=None, *, is_foreign_html=False, fetch_images=False, fetch_external_css=False)
Convert an HTML string to a DOCX file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `html` | `str \| bytes` | HTML string or bytes. | required |
| `target` | `str \| Path \| None` | Optional output path. When provided, the bytes are also written to disk. | `None` |
| `is_foreign_html` | `bool` | Set to | `False` |
| `fetch_images` | `bool` | When | `False` |
| `fetch_external_css` | `bool` | When | `False` |
Returns:
| Type | Description |
|---|---|
| `bytes` | Raw DOCX bytes (a valid ZIP archive). |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If |
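
The following sketch shows one way to call `docwow.to_docx` defensively. It assumes `docwow` is importable; the helper names `is_zip_archive` and `html_to_docx` are hypothetical. Because the condition that triggers `ValueError` is not spelled out here, the code catches the exception without assuming its cause.

```python
import io
import zipfile


def is_zip_archive(data: bytes) -> bool:
    """Return True if `data` parses as a ZIP archive."""
    return zipfile.is_zipfile(io.BytesIO(data))


def html_to_docx(html, target=None):
    """Convert HTML to DOCX bytes; return None if docwow rejects the input."""
    # Imported lazily so this module can be loaded without docwow installed.
    import docwow

    try:
        data = docwow.to_docx(html, target=target)
    except ValueError:
        # The reference lists ValueError; the exact trigger is not assumed here.
        return None
    # Sanity check based on the documented return value (a valid ZIP archive).
    if not is_zip_archive(data):
        raise RuntimeError("to_docx returned bytes that are not a ZIP archive")
    return data
```

The ZIP check is optional; it only restates the documented return contract and can be dropped once the conversion is trusted.
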
docwow.parse_docx(source)
Parse a DOCX file and return a Document.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `str \| Path \| bytes` | Path to a .docx file (str or Path), or raw bytes of the ZIP archive (useful in tests and web upload handlers). | required |
Returns:
| Type | Description |
|---|---|
| `Document` | A fully populated Document ready for rendering. |
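
As a rough illustration of the two input forms described above, the sketch below parses a DOCX from raw bytes and from a path. It assumes `docwow` is importable; `parse_upload` and `parse_file` are hypothetical helper names, not part of the library.

```python
from pathlib import Path


def parse_upload(payload):
    """Parse an in-memory DOCX payload, e.g. the body of a file upload."""
    # Imported lazily so this module can be loaded without docwow installed.
    import docwow

    if not isinstance(payload, (bytes, bytearray)):
        raise TypeError("expected the raw bytes of a .docx file")
    # Raw bytes are passed directly, so no temporary file is needed.
    return docwow.parse_docx(bytes(payload))


def parse_file(path):
    """Parse a DOCX file on disk; str and Path are both documented as accepted."""
    import docwow

    return docwow.parse_docx(Path(path))
```
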
docwow.parse_html(source)
Parse a docwow HTML string back into a Document model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `str \| bytes` | HTML produced by | required |
Returns:
| Type | Description |
|---|---|
| `Document` | A `Document` … styles, and numbering reflect the content of the HTML. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the HTML does not contain a |
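
A possible round trip through HTML is sketched below. It assumes `docwow` is importable and, as a labeled assumption, that the string returned by `docwow.to_html` is acceptable input for `docwow.parse_html`. The helper name `docx_to_document_via_html` is hypothetical.

```python
def docx_to_document_via_html(src):
    """Convert a DOCX to HTML, then parse that HTML back into a Document."""
    # Imported lazily so this module can be loaded without docwow installed.
    import docwow

    html = docwow.to_html(src)
    try:
        # Assumption: HTML produced by to_html is valid input for parse_html.
        return docwow.parse_html(html)
    except ValueError as exc:
        # parse_html is documented to raise ValueError for unsuitable HTML.
        raise RuntimeError(f"HTML generated from {src!r} could not be parsed") from exc
```

Wrapping the `ValueError` keeps the original exception available as `__cause__` while giving callers a message that names the source file.
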
docwow.render_document(doc, embed_images=True, page_view=False)
Render a Document to a complete, self-contained HTML string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `doc` | `Document` | The document model to render. | required |
| `embed_images` | `bool` | When True (default), images are embedded as base64 data URIs. When False, a placeholder src is used (useful for testing without large base64 blobs). | `True` |
| `page_view` | `bool` | When True, adds CSS that styles the document as a physical page (gray background, drop shadow) and injects an | `False` |
Returns:
| Type | Description |
|---|---|
| `str` | A UTF-8 HTML string starting with `<!DOCTYPE html>`. |
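
To make the `embed_images` trade-off concrete, the sketch below shows the general shape of a base64 data URI using only the standard library, plus a hypothetical wrapper that disables embedding for tests. `to_data_uri` and `render_for_tests` are illustrative names; the exact markup docwow emits is not shown in this reference and is not assumed here.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode bytes as a base64 data URI (general form, not docwow's own code)."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def render_for_tests(doc):
    """Render without embedded images to keep test output small."""
    # Imported lazily so this module can be loaded without docwow installed.
    import docwow

    return docwow.render_document(doc, embed_images=False)


print(to_data_uri(b"hi", "image/png"))  # data:image/png;base64,aGk=
```

Base64 encodes every 3 bytes as 4 characters, so embedded images grow by roughly a third, which is why a placeholder `src` is convenient in test fixtures.
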
docwow.write_docx(doc, target=None)
Write a Document to a DOCX byte string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `doc` | `Document` | The document model to serialise. | required |
| `target` | `str \| Path \| None` | Optional file path. When provided, the bytes are also written to disk. | `None` |
Returns:
| Type | Description |
|---|---|
| `bytes` | The raw DOCX bytes (a valid ZIP archive). |
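
A short sketch of a parse-and-rewrite cycle, assuming `docwow` is importable. The helper name `resave_docx` is hypothetical; it relies only on `docwow.parse_docx(source)` and `docwow.write_docx(doc, target=...)` as documented here.

```python
def resave_docx(src, dest):
    """Parse the DOCX at `src` and write it back out to `dest`."""
    # Imported lazily so this module can be loaded without docwow installed.
    import docwow

    doc = docwow.parse_docx(src)
    # Per the reference, the bytes are returned and also written to `dest`.
    data = docwow.write_docx(doc, target=dest)
    return data
```

Passing `target=None` instead would keep the result in memory only, which suits web responses that stream the bytes directly.
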