docs: document the backup/restore ZIP archive format #604
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,320 @@ | ||||||
| .. _backup-restore-format: | ||||||
|
|
||||||
| Backup / Restore Format | ||||||
| ======================= | ||||||
|
|
||||||
| The ``backup_restore`` applet lets you export a learning package (V2 content | ||||||
| library) to a portable ZIP archive and restore it on the same or a different | ||||||
| Open edX instance. | ||||||
|
|
||||||
| .. contents:: Contents | ||||||
| :local: | ||||||
| :depth: 2 | ||||||
|
|
||||||
| Overview | ||||||
| -------- | ||||||
|
|
||||||
| A backup ZIP is a self-contained snapshot of one learning package. It captures | ||||||
|
Contributor
We should clarify the difference between a Learning Package and a Library. Namely, a Library has one and only one Learning Package where it stores its content, but Learning Packages can also stand alone. The restore process creates a temporary Learning Package that the user can review, and later associates that Learning Package with a newly created Library.
||||||
| every component, collection, container (sections / subsections / units), and | ||||||
|
Contributor
Suggested change
|
||||||
| static asset. For each component and container, only the current draft and | ||||||
| published versions are exported — the full version history is not preserved. | ||||||
|
|
||||||
| The archive uses `TOML <https://toml.io>`_ for all metadata files and keeps the | ||||||
| actual XBlock content as XML (the same ``block.xml`` format Studio has always | ||||||
|
Contributor
Suggested change
In modulestore, the XML files are not named
Contributor
Also, it's probably worth noting that the naming is different: in courses, each component would be exported with its block_id as the name of the file. That's usually a machine-generated ID (since that's the default in Split), but sometimes it's a meaningful identifier when authored by hand. For our export format, the OLX is always |
||||||
| used). This makes backups both machine-readable and human-inspectable. | ||||||
|
|
||||||
| .. note:: | ||||||
|
|
||||||
| The current archive ``format_version`` is **1**. Future incompatible changes | ||||||
| to the schema will increment this number so that tooling can detect them | ||||||
| before attempting a restore. | ||||||
|
|
||||||
| Exporting a Package | ||||||
| ------------------- | ||||||
|
|
||||||
| Management command (recommended for operators):: | ||||||
|
|
||||||
| python manage.py lp_dump <package_ref> output.zip | ||||||
| python manage.py lp_dump <package_ref> output.zip --username admin --origin_server cms.example.com | ||||||
|
|
||||||
| Python API:: | ||||||
|
|
||||||
| from openedx_content.api import create_zip_file | ||||||
|
|
||||||
| create_zip_file( | ||||||
| package_ref="lib:MyOrg:MyLibrary", | ||||||
| path="/tmp/my_library.zip", | ||||||
| user=request.user, # optional – recorded in package.toml | ||||||
| origin_server="cms.example.com", # optional | ||||||
| ) | ||||||
|
|
||||||
| Restoring a Package | ||||||
| ------------------- | ||||||
|
|
||||||
| Management command:: | ||||||
|
|
||||||
| python manage.py lp_load output.zip <username> | ||||||
|
|
||||||
| Python API:: | ||||||
|
|
||||||
| from openedx_content.api import load_learning_package | ||||||
|
|
||||||
| result = load_learning_package(path="/tmp/my_library.zip") | ||||||
| if result["status"] == "error": | ||||||
| print(result["log_file_error"].getvalue()) | ||||||
|
|
||||||
| .. note:: | ||||||
|
|
||||||
| ``load_learning_package`` accepts an optional ``package_ref`` argument. | ||||||
| When provided it overrides the ``key`` stored in ``package.toml``, which | ||||||
| is useful when importing a library under a new reference. | ||||||
|
Comment on lines +69 to +70
Contributor
We should use stronger language here. It's really dangerous to trust the archive for either the package_ref or the user, and callers should explicitly pass those to |
||||||
|
|
||||||
| Archive Structure | ||||||
| ----------------- | ||||||
|
|
||||||
| :: | ||||||
|
|
||||||
| <package>.zip | ||||||
| ├── package.toml # library metadata + archive metadata | ||||||
| ├── collections/ | ||||||
| │ └── <collection-key>.toml # one file per collection | ||||||
| └── entities/ | ||||||
| ├── <container-slug>.toml # sections, subsections, units | ||||||
| └── xblock.v1/ | ||||||
| └── <block-type>/ # e.g. html, problem, video | ||||||
| ├── <uuid>.toml # entity metadata + version list | ||||||
| └── <uuid>/ | ||||||
| └── component_versions/ | ||||||
| └── v<N>/ | ||||||
| ├── block.xml # XBlock content (XML) | ||||||
| └── static/ # media assets referenced by block.xml | ||||||
|
|
||||||
| File Format Reference | ||||||
| --------------------- | ||||||
|
|
||||||
| package.toml | ||||||
| ~~~~~~~~~~~~ | ||||||
|
|
||||||
| Located at the root of the archive. Contains two sections: | ||||||
|
|
||||||
| ``[meta]`` — archive metadata (not restored to the database, for inspection only): | ||||||
|
|
||||||
| .. list-table:: | ||||||
| :header-rows: 1 | ||||||
| :widths: 25 15 60 | ||||||
|
|
||||||
| * - Field | ||||||
| - Required | ||||||
| - Description | ||||||
| * - ``format_version`` | ||||||
| - yes | ||||||
| - Integer schema version; currently ``1`` | ||||||
| * - ``created_by`` | ||||||
| - no | ||||||
| - Username of the operator who ran the export | ||||||
| * - ``created_by_email`` | ||||||
| - no | ||||||
| - Email address of the exporting user | ||||||
| * - ``created_at`` | ||||||
| - yes | ||||||
| - UTC timestamp when the archive was created | ||||||
| * - ``origin_server`` | ||||||
| - no | ||||||
| - Free-form string identifying the origin CMS instance (typically a | ||||||
| hostname or URL; stored as-is with no format validation) | ||||||
|
|
||||||
| ``[learning_package]`` — library data (restored to the database, with caveats: ``key`` may be overridden by the caller and ``updated`` is not applied during restore): | ||||||
|
|
||||||
| .. list-table:: | ||||||
| :header-rows: 1 | ||||||
| :widths: 25 15 60 | ||||||
|
|
||||||
| * - Field | ||||||
| - Required | ||||||
| - Description | ||||||
| * - ``title`` | ||||||
| - yes | ||||||
| - Human-readable name of the library | ||||||
| * - ``key`` | ||||||
| - yes | ||||||
| - Package reference string, e.g. ``lib:MyOrg:MyLib`` | ||||||
| * - ``description`` | ||||||
| - yes | ||||||
| - Free-text description (may be blank) | ||||||
| * - ``created`` | ||||||
| - yes | ||||||
| - UTC timestamp when the library was originally created | ||||||
| * - ``updated`` | ||||||
| - yes | ||||||
| - UTC timestamp of the library's last modification (written to the | ||||||
| archive for reference; **not** applied during restore) | ||||||
|
|
||||||
| Example:: | ||||||
|
|
||||||
| [meta] | ||||||
| format_version = 1 | ||||||
| created_by = "lp_user" | ||||||
| created_by_email = "lp_user@example.com" | ||||||
| created_at = 2025-10-05T18:23:45.180535Z | ||||||
| origin_server = "cms.test" | ||||||
|
|
||||||
| [learning_package] | ||||||
| title = "Library test" | ||||||
| key = "lib:WGU:LIB_C001" | ||||||
| description = "" | ||||||
| created = 2025-08-19T04:25:10.988166Z | ||||||
| updated = 2025-08-19T04:25:10.988166Z | ||||||
|
|
||||||
| Component entity TOML (``entities/xblock.v1/<type>/<uuid>.toml``) | ||||||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||
|
|
||||||
| Each XBlock component gets one TOML file. | ||||||
|
|
||||||
| ``[entity]``: | ||||||
|
|
||||||
| .. list-table:: | ||||||
| :header-rows: 1 | ||||||
| :widths: 25 15 60 | ||||||
|
|
||||||
| * - Field | ||||||
| - Required | ||||||
| - Description | ||||||
| * - ``can_stand_alone`` | ||||||
| - yes | ||||||
| - Whether this component can be used independently (almost always ``true``) | ||||||
| * - ``key`` | ||||||
| - yes | ||||||
| - Entity reference in the form ``xblock.v1:<type>:<uuid>`` | ||||||
| * - ``created`` | ||||||
| - yes | ||||||
| - UTC creation timestamp | ||||||
|
|
||||||
| ``[entity.draft]`` / ``[entity.published]`` — each contains ``version_num`` | ||||||
| pointing at the current draft or published ``[[version]]`` entry respectively. | ||||||
| ``[entity.draft]`` is absent when the entity has no draft. | ||||||
| ``[entity.published]`` is **always present** — when the entity has no | ||||||
| published version it is written as an empty table with an explanatory comment | ||||||
| (see the container example below). | ||||||
|
|
||||||
| ``[[version]]`` — at most two entries: the current draft version first, then | ||||||
| the current published version if it differs from draft. The full version | ||||||
| history is not stored. | ||||||
|
|
||||||
| .. list-table:: | ||||||
| :header-rows: 1 | ||||||
| :widths: 25 15 60 | ||||||
|
|
||||||
| * - Field | ||||||
| - Required | ||||||
| - Description | ||||||
| * - ``title`` | ||||||
| - yes | ||||||
| - Display name of the component at this version | ||||||
| * - ``version_num`` | ||||||
| - yes | ||||||
| - Monotonically increasing integer starting at 1 | ||||||
|
|
||||||
| Example:: | ||||||
|
|
||||||
| [entity] | ||||||
| can_stand_alone = true | ||||||
| key = "xblock.v1:html:e32d5479-9492-41f6-9222-550a7346bc37" | ||||||
| created = 2025-08-19T04:25:43.685529Z | ||||||
|
|
||||||
| [entity.draft] | ||||||
| version_num = 5 | ||||||
|
|
||||||
| [entity.published] | ||||||
| version_num = 4 | ||||||
|
|
||||||
| # ### Versions | ||||||
|
|
||||||
| [[version]] | ||||||
| title = "Text" | ||||||
| version_num = 5 | ||||||
|
|
||||||
| [[version]] | ||||||
| title = "Text" | ||||||
| version_num = 4 | ||||||
|
|
||||||
| Container entity TOML (``entities/<slug>.toml``) | ||||||
|
Contributor
We should explain what a |
||||||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||
|
|
||||||
| Sections, subsections, and units share the same base structure with an | ||||||
| additional ``[entity.container.<type>]`` marker (``section``, ``subsection``, | ||||||
| or ``unit``) and a ``[version.container]`` table that lists child keys. | ||||||
|
|
||||||
| Example (section):: | ||||||
|
|
||||||
| [entity] | ||||||
| can_stand_alone = true | ||||||
| key = "section1-8ca126" | ||||||
| created = 2025-09-04T22:51:40.919872Z | ||||||
|
|
||||||
| [entity.draft] | ||||||
| version_num = 2 | ||||||
|
|
||||||
| [entity.published] | ||||||
| # unpublished: no published_version_num | ||||||
|
|
||||||
| [entity.container.section] | ||||||
|
|
||||||
| # ### Versions | ||||||
|
|
||||||
| [[version]] | ||||||
| title = "Section1" | ||||||
| version_num = 2 | ||||||
|
|
||||||
| [version.container] | ||||||
| children = ["subsection1-48afa3"] | ||||||
|
|
||||||
| Collection TOML (``collections/<key>.toml``) | ||||||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||
|
|
||||||
| .. list-table:: | ||||||
| :header-rows: 1 | ||||||
| :widths: 25 15 60 | ||||||
|
|
||||||
| * - Field | ||||||
| - Required | ||||||
| - Description | ||||||
| * - ``title`` | ||||||
| - yes | ||||||
| - Collection display name | ||||||
| * - ``key`` | ||||||
| - yes | ||||||
| - Unique key within the library | ||||||
| * - ``description`` | ||||||
| - yes | ||||||
| - Free-text description (may be blank) | ||||||
| * - ``created`` | ||||||
| - yes | ||||||
| - UTC creation timestamp | ||||||
| * - ``entities`` | ||||||
| - yes | ||||||
| - List of entity reference strings (``xblock.v1:<type>:<uuid>``) | ||||||
|
|
||||||
| Example:: | ||||||
|
|
||||||
| [collection] | ||||||
| title = "Collection test1" | ||||||
| key = "collection-test" | ||||||
| description = "" | ||||||
| created = 2025-08-19T04:25:27.754968Z | ||||||
| entities = [ | ||||||
| "xblock.v1:html:e32d5479-9492-41f6-9222-550a7346bc37", | ||||||
| "xblock.v1:problem:256739e8-c2df-4ced-bd10-8156f6cfa90b", | ||||||
| ] | ||||||
|
|
||||||
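The entity references in a collection use the same ``xblock.v1:<type>:<uuid>`` form as component keys. Assuming the three key segments map one-to-one onto the directory layout shown under Archive Structure (an inference from the tree, not something the document states outright), a reference can be turned into archive paths as sketched here. Both function names are hypothetical.

```python
from pathlib import PurePosixPath


def component_toml_path(entity_key: str) -> PurePosixPath:
    """Return the assumed archive path of a component's entity TOML file."""
    namespace, block_type, local_id = entity_key.split(":", 2)
    if namespace != "xblock.v1":
        raise ValueError(f"not an xblock.v1 component key: {entity_key!r}")
    return PurePosixPath("entities", namespace, block_type, f"{local_id}.toml")


def block_xml_path(entity_key: str, version_num: int) -> PurePosixPath:
    """Return the assumed archive path of block.xml for one component version."""
    component_dir = component_toml_path(entity_key).with_suffix("")
    return component_dir / "component_versions" / f"v{version_num}" / "block.xml"


key = "xblock.v1:html:e32d5479-9492-41f6-9222-550a7346bc37"
print(component_toml_path(key))
print(block_xml_path(key, 5))
```

ZIP member names always use forward slashes, which is why ``PurePosixPath`` is used rather than the platform-dependent ``Path``.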
| XBlock content (``component_versions/v<N>/block.xml``) | ||||||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||||||
|
|
||||||
| Standard XBlock XML, identical to what Studio stores internally. Static assets | ||||||
|
Contributor
There is a difference in HTMLBlock storage. Namely, we don't currently support storing a separate HTML file, so we inline the HTML with CDATA. In courses, we'd have a tiny XML file for the HTMLBlock that pointed to the HTML file. This is a limitation of our XBlock serialization, but one I hope we can fix before Willow.
||||||
| (images, PDFs, etc.) referenced with ``/static/<filename>`` in the XML are | ||||||
| stored alongside the XML under ``component_versions/v<N>/static/``. | ||||||
|
|
||||||
| Example ``block.xml``:: | ||||||
|
|
||||||
| <html display_name="Text"> | ||||||
| <![CDATA[<p>Hello <img src="/static/me.png" alt="Me" /></p>]]> | ||||||
| </html> | ||||||
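For readers who want to check which files a ``block.xml`` expects under ``static/``, here is a small sketch that extracts ``/static/<filename>`` references using only the standard library. The function ``static_references`` and the regular expression are assumptions made for this illustration; the actual restore code may locate assets differently.

```python
import re
import xml.etree.ElementTree as ET

BLOCK_XML = '''<html display_name="Text">
<![CDATA[<p>Hello <img src="/static/me.png" alt="Me" /></p>]]>
</html>'''

# Capture the filename after /static/, stopping at quotes, whitespace, ')' or '>'.
STATIC_REF = re.compile(r"""/static/([^"'\s)>]+)""")


def static_references(block_xml: str) -> list[str]:
    """Return unique /static/ filenames found in element text and attributes."""
    root = ET.fromstring(block_xml)
    # The parser exposes CDATA payloads as ordinary text, so itertext() sees them.
    chunks = list(root.itertext())
    for element in root.iter():
        chunks.extend(element.attrib.values())
    names = []
    for chunk in chunks:
        for name in STATIC_REF.findall(chunk):
            if name not in names:
                names.append(name)
    return names


print(static_references(BLOCK_XML))  # ['me.png']
```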
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,3 +10,4 @@ Django app for modeling and authoring course content structures. | |
|
|
||
| decisions/index | ||
| api_reference | ||
| backup_restore | ||
We're intentionally using "backup/restore" to distinguish it from the incremental import/export functionality that we plan to add in the future.