Document Conversion

Plone Intranet renders previews of office documents (LibreOffice, MS Office). This is not only beautiful, but also very practical since it enables you to visually recognize the “right” document you were looking for.

docconv.client

ploneintranet.docconv.client generates previews for office documents.

How it works

When a content object is added, an event handler (see handlers.py) triggers the preview generation. The previews are generated asynchronously using the built-in Celery integration. Preview generation requires docsplit (and dependencies, including libreoffice for office document support) to be installed. Upon completion a PDF version of the object is stored in the document annotations.

Once the PDF is available, the previews are generated with the external program pdf2svg the first time they are requested.
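The lazy generation described above can be sketched as follows. This is a minimal illustration, not the package's own code: the function names and the dict used as a cache are assumptions made for the example. The only external fact relied on is the pdf2svg command line, which takes an input PDF, an output SVG and an optional page number.

```python
import subprocess


def pdf2svg_command(pdf_path, svg_path, page=1):
    """Build the argument list for converting one PDF page to SVG."""
    # pdf2svg usage: pdf2svg <input.pdf> <output.svg> [<page number>]
    return ["pdf2svg", pdf_path, svg_path, str(page)]


def render_page(pdf_path, svg_path, page=1):
    """Run pdf2svg; raises CalledProcessError if the conversion fails."""
    subprocess.run(pdf2svg_command(pdf_path, svg_path, page), check=True)


def get_preview(cache, page, render):
    """Return the preview for a page, rendering it only on first request.

    `cache` is a hypothetical dict-like store; `render` is any callable
    that takes a page number and returns the rendered preview.
    """
    if page not in cache:
        cache[page] = render(page)
    return cache[page]
```

Later requests for the same page are served from the cache, so the external program runs at most once per page.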

There are cases in which the SVG preview generated by pdf2svg might be undesirable.

For example, if the system detects that the produced SVG is too large, it will fall back to creating an SVG with an embedded, smaller PNG.
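One way to picture this fallback is the sketch below. The size threshold, the function names and the way the PNG is obtained are all assumptions for illustration; the source does not state how the system decides that an SVG is too large.

```python
import base64

# Hypothetical threshold; the real limit is not given in this document.
MAX_SVG_BYTES = 1024 * 1024


def wrap_png_in_svg(png_bytes, width, height):
    """Return an SVG document that embeds the PNG as a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    # Plain `href` is SVG 2; older renderers may need `xlink:href` instead.
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" '
        'width="{w}" height="{h}" viewBox="0 0 {w} {h}">'
        '<image width="{w}" height="{h}" '
        'href="data:image/png;base64,{data}"/></svg>'
    ).format(w=width, h=height, data=encoded)


def choose_preview(svg_text, png_bytes, width, height, limit=MAX_SVG_BYTES):
    """Keep the pdf2svg output unless it exceeds the size limit."""
    if len(svg_text.encode("utf-8")) <= limit:
        return svg_text
    return wrap_png_in_svg(png_bytes, width, height)
```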

Also, some scanners have been reported to produce PDFs that pdf2svg cannot handle correctly (see https://gitlab.freedesktop.org/poppler/poppler/-/issues/226).

The affected PDF creators can be listed in the registry record ploneintranet.docconv.bad_pdf_creators.
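A check against such a list might look like the sketch below. It assumes that the registry value is a list of strings and that matching is a case-insensitive substring test on the PDF's creator metadata; neither detail is confirmed by this document, and the function name is hypothetical.

```python
def should_use_pdf2svg(creator, bad_creators):
    """Return False when the PDF creator matches a listed bad creator.

    Assumption: case-insensitive substring matching; blank entries are
    ignored so that an empty registry line cannot match everything.
    """
    creator = (creator or "").strip().lower()
    for bad in bad_creators:
        bad = bad.strip().lower()
        if bad and bad in creator:
            return False
    return True
```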

HTML to PDF conversion using wkhtmltopdf

The program wkhtmltopdf can be used to convert HTML documents into PDFs.

The PDF files produced by wkhtmltopdf will maintain the Quaive look and feel because they will include Plone Intranet’s CSS styles.

If wkhtmltopdf is not available or is disabled, the conversion will be executed using docsplit.

This has been tested on:

  1. Ubuntu Bionic and newer (apt install wkhtmltopdf)
  2. Ubuntu Xenial (installing via dpkg the package you can fetch from https://downloads.wkhtmltopdf.org/0.12/0.12.5/wkhtmltox_0.12.5-1.xenial_amd64.deb)
  3. NixOS (installing wkhtmltopdf-0.12.4, with patched Qt)

To enable the document conversion using wkhtmltopdf, edit the value of the registry record ploneintranet.docconv.wkhtmltopdf.options.

You can either set it to an empty value or to a meaningful value like --margin-bottom 2cm --margin-left 2cm --margin-right 2cm --margin-top 3cm --disable-javascript --viewport-size 10000x10000

To disable it, the registry record value should be disabled. See the wkhtmltopdf extended help (wkhtmltopdf -H) for the available options.
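As a rough sketch of how an options string like the one above could be turned into a command line: the function name is hypothetical, and representing the "disabled" state as None is an assumption made for this example. An empty string means "enabled with no extra options", matching the description above.

```python
import shlex


def wkhtmltopdf_command(options, source, target):
    """Build a wkhtmltopdf argument list, or return None when disabled.

    Assumption: `options` is None when the registry record is disabled;
    the caller would then fall back to docsplit.
    """
    if options is None:
        return None
    # wkhtmltopdf usage: wkhtmltopdf [options] <input> <output>
    return ["wkhtmltopdf"] + shlex.split(options) + [source, target]
```

Using shlex.split keeps each option and its value as separate arguments, so the command can be passed to subprocess without going through a shell.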

Attachments

ploneintranet.attachments stores previews for office documents.

How it works

Make a content type support attachments by having it implement IAttachmentStoragable. The provided adapter is used to add and retrieve values:

>>> storage = IAttachmentStorage(obj)
>>> storage.add('test.doc', attachment_obj)
>>> retrieved = storage.get('test.doc')

To list the ids of available attachments:

>>> storage.keys()

To delete an attachment:

>>> storage.remove('test.doc')

Security

Attachment previews on normal Plone objects, like Files, can be uploaded via the @@upload-attachments helper view, which is protected by cmf.AddPortalContent, and accessed via the @@attachments helper view, which is protected by zope2.View.

Attachments for microblog StatusUpdates follow a more convoluted route. They’re uploaded via @@upload-statusupdate-attachments, which is protected by ploneintranet.microblog.AddStatusUpdate. This means that even normal users who are not allowed to add content to a specific context, like the site root, will be able to add attachments and previews on that context.

While composing a StatusUpdate, the attachments are temporarily stored on the context, i.e. the workspace or site root where the posting widget is shown. This enables showing previews even before submitting a new post. When the post is submitted, the attachments are stored as an annotation on the actual StatusUpdate and the temporary attachment on the context is removed. There’s an additional garbage collection routine that makes sure no stale temporary attachments older than one day remain behind.
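The one-day cleanup rule can be illustrated with the sketch below. It is not the package's garbage collection routine: the function name and the mapping from attachment id to creation time are assumptions for the example, and only the one-day age limit comes from the text above.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=1)  # age limit stated in the text above


def stale_attachment_ids(created, now, max_age=MAX_AGE):
    """Return the ids of temporary attachments older than `max_age`.

    `created` is a hypothetical mapping of attachment id to the
    datetime at which the temporary attachment was stored.
    """
    return sorted(
        att_id for att_id, stamp in created.items() if now - stamp > max_age
    )
```

A cleanup job could then pass each returned id to storage.remove() on the context.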

In the initial temporary stage, the status attachments can be accessed by the normal @@attachments helper view on the microblog context, which is protected by the View permission on the context.

Todo

Even “private” workspaces currently allow View for any logged-in user. That will be locked down to only workspace members in the near future.

After the StatusUpdate is stored, its attachments can be retrieved via the @@status-attachments view, which is protected by ploneintranet.microblog.ViewStatusUpdate and is defined on INavigationRoot (top-level stream) and on IMicroblogContext (workspace stream).