By MW in mediawiki — 21 Apr 2024

Building Structured Wikis with the Cargo Extension in MediaWiki

Why Cargo matters for structured wikis

When you start a MediaWiki project that needs more than a handful of flat pages, you quickly feel the pinch of unstructured data. Infoboxes are nice, but pulling a list of all pages that share a certain field? That used to be a nightmare. Cargo steps in as a lightweight bridge between templates and a real relational store, letting you declare a table, store rows, and query them without writing SQL.

In practice the extension feels like a friendly librarian who keeps a tidy catalog behind the scenes while you continue to edit pages in the familiar wikitext style. It works with the default MediaWiki database, so you don’t need a separate DB server – just a few extra tables that Cargo creates automatically.

Getting your feet wet – the quick‑start cycle

Imagine you want a personal library wiki: one set of pages for books, another for authors. Each book page should record title, author(s), genre, year, and page count; each author page should hold name, country, and a roll‑call of their books. Cargo makes this scenario as simple as defining two templates.

1. Declare the tables

Every Cargo‑driven template starts with a #cargo_declare call inside a <noinclude> tag. It tells Cargo the table name, the columns that exist, and their types. The first declaration below belongs in Template:Book, the second in Template:Author.

<noinclude>
{{#cargo_declare:
  _table=Books
| Authors=List (,) of Page
| Genres=List (,) of String
| YearOfPublication=Date
| NumberOfPages=Integer
}}
</noinclude>

<noinclude>
{{#cargo_declare:
  _table=Authors
| Country=String
}}
</noinclude>

Note the subtlety: the page name itself is automatically saved as _pageName, so you don’t have to repeat the title field inside the template.

2. Store the data

Once the table schema is declared, the same template needs a #cargo_store call inside the <includeonly> section. This call stores a row for every page that uses the template.

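<includeonly>{{#cargo_store:_table=Books
| Authors={{{Authors|}}} | Genres={{{Genres|}}}
| YearOfPublication={{{YearOfPublication|}}} | NumberOfPages={{{NumberOfPages|}}} }}
{| class="infobox"
! Author(s)
| {{#arraymap:{{{Authors|}}}|,|x|{{#formredlink:form=Author|target=x}}}}
|-
! Genre(s)
| {{{Genres|}}}
|-
! Year of publication
| {{{YearOfPublication|}}}
|-
! Number of pages
| {{{NumberOfPages|}}}
|}</includeonly>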

<includeonly>
{{#cargo_store:_table=Authors
| Country={{{Country|}}} }}
{| class="infobox"
! Country of origin
| {{{Country|}}}
|-
! Books
| {{#cargo_query:
    tables=Books
  | fields=_pageName
  | where=Authors HOLDS '{{PAGENAME}}'
  | format=ulist}}
|}
</includeonly>

These snippets look much like ordinary infobox markup. The difference is the #cargo_store call, which renders nothing on the page but runs each time the page is saved and writes the page’s row into the “Books” or “Authors” table.

3. Query the data

Fetching a list of all books by a given author, or every author born in a specific country, is a matter of using #cargo_query. The query can be placed on any wiki page – a special “list” page, a sidebar widget, or even inside another template.

{{#cargo_query:
  tables=Books
| fields=Authors, Genres, YearOfPublication
| where=Authors HOLDS 'J.K. Rowling'
| order by=YearOfPublication ASC
| format=template
| template=BookSummary}}

In this example the BookSummary template would receive the three fields and render them however you like – a simple bullet list, a table, or even a carousel of cover images.
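As a minimal sketch, Template:BookSummary could look like the following. It assumes Cargo’s default behaviour for format=template, where the queried fields arrive as unnamed parameters in the order given in fields (so {{{1}}} is Authors, {{{2}}} is Genres, and {{{3}}} is YearOfPublication). The bullet layout is purely illustrative.

```wikitext
<!-- Hypothetical Template:BookSummary: one bullet per result row -->
<includeonly>* {{{1|}}} ({{{3|}}}): {{{2|}}}</includeonly>
```

If you would rather refer to the fields by name, Cargo documents a named args=yes option for this format; check that it is available in your version before relying on it.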

Putting it together – a real‑world page walk‑through

Below is a stripped‑down version of a Book page in our personal library wiki.

== {{PAGENAME}} ==
{{Book
| Authors=J.K. Rowling
| Genres=Fantasy, Adventure
| YearOfPublication=1997-06-26
| NumberOfPages=309
}}

When this page is saved, Cargo silently records a row like:

{
  "_pageName": "Harry Potter and the Philosopher's Stone",
  "Authors": ["J.K. Rowling"],
  "Genres": ["Fantasy", "Adventure"],
  "YearOfPublication": "1997-06-26",
  "NumberOfPages": 309
}

And an Author page looks like this:

== {{PAGENAME}} ==
{{Author
| Country=United Kingdom
}}

Because the #cargo_query clause inside the Author template pulls every book that lists the current page as an author, the author page automatically shows a list such as:

* Harry Potter and the Philosopher's Stone
* Harry Potter and the Chamber of Secrets
* ...

No manual update needed. Add a new book, and the author’s bibliography grows on its own.

Advanced tricks you might not see in the quick‑start guide

Aggregating queries. Put an aggregate function such as GROUP_CONCAT in the fields parameter of #cargo_query and pair it with group by to collect comma‑separated lists in a single field.
Computed fields. Declare a column of type Integer or Float and fill it in the #cargo_store call with a parser function such as #expr, optionally guarded by #if, that calculates the value when the page is saved.
Full‑text search. Declare a column with the Searchtext type and query it with the MATCHES operator to run full‑text searches without building a separate search index.
Export to CSV. Set format=csv on a query and Cargo renders a link for downloading the results as a CSV file.
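The aggregation and CSV items above can be sketched as two queries against the Books table. Both are illustrative: the Titles alias is made up here, and GROUP_CONCAT only works if it is among the SQL functions your Cargo configuration allows.

```wikitext
<!-- One row per publication date, with every title from that date joined into one field -->
{{#cargo_query:
  tables=Books
| fields=YearOfPublication, GROUP_CONCAT(_pageName)=Titles
| group by=YearOfPublication
| format=table
}}

<!-- The Books table offered for download; format=csv is expected to render a link, not inline text -->
{{#cargo_query:
  tables=Books
| fields=_pageName, Authors, YearOfPublication, NumberOfPages
| format=csv
}}
```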

These capabilities let you treat Cargo as a mini‑data‑warehouse sitting inside MediaWiki, without the overhead of a dedicated BI platform.
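For the computed‑field idea, a minimal sketch might add a hypothetical ReadingHours column to the Books table and fill it from NumberOfPages at save time. It assumes the ParserFunctions extension is installed (for #expr and #if) and an arbitrary pace of 40 pages per hour; neither the column nor the pace comes from Cargo itself.

```wikitext
<noinclude>
<!-- Declaration: the existing Books columns plus the hypothetical ReadingHours -->
{{#cargo_declare:
  _table=Books
| Authors=List (,) of Page
| Genres=List (,) of String
| YearOfPublication=Date
| NumberOfPages=Integer
| ReadingHours=Float
}}
</noinclude><includeonly>
<!-- Storage: compute ReadingHours only when a page count was supplied -->
{{#cargo_store:
  _table=Books
| Authors={{{Authors|}}}
| Genres={{{Genres|}}}
| YearOfPublication={{{YearOfPublication|}}}
| NumberOfPages={{{NumberOfPages|}}}
| ReadingHours={{#if: {{{NumberOfPages|}}} | {{#expr: {{{NumberOfPages}}} / 40 round 1 }} }}
}}
</includeonly>
```

Because the value is calculated when the page is saved, queries can sort or filter on ReadingHours like any other stored column.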

Performance considerations

Cargo stores data in ordinary MySQL/MariaDB tables, so the same indexing tricks you’d use for any other wiki table apply here. By default Cargo creates an index on _pageName and on any column you flag as primary=YES in the #cargo_declare line. If you expect heavy queries on a column like Genres, add an explicit index:

{{#cargo_declare:
  _table=Books
| Genres=List (,) of String
| _index=Genres
}}

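Remember to recreate the table after modifying a declaration, either with the “Recreate data” action on the template page or with Cargo’s cargoRecreateData.php maintenance script. Both rebuild the table and its indexes from the current declaration.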

Interoperability with Semantic MediaWiki

If you already run Semantic MediaWiki (SMW), Cargo can coexist peacefully. Both extensions use template‑based data entry, but Cargo’s syntax is a bit more concise, and its queries are SQL‑like clauses passed to a single parser function. If you decide to switch, Cargo’s documentation includes a guide to migrating existing SMW properties to Cargo tables.

Typical pitfalls (and how to dodge ’em)

Forgetting the <noinclude> wrapper. Without it the #cargo_declare line ends up on every transclusion, creating duplicate tables and slowing down saves.
Mixing list separators. Cargo expects commas for List (,) types; using semicolons in the template will store the whole string as one entry.
Over‑using #cargo_query on high‑traffic pages. Each call runs a database query whenever the page is parsed; keep heavy queries on a dedicated list page rather than in a template that many pages transclude.
Using field names with spaces. Cargo column identifiers must be “safe”: no spaces, and no leading underscores, which are reserved for built‑in fields such as _pageName.

These slip‑ups happen to most newcomers. A quick double‑check of your template code usually catches them before they snowball.
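On the separator pitfall in particular: if your editors naturally type semicolons, one option is to declare the delimiter to match rather than fight it. A sketch, assuming the Books columns from earlier and that every Book page uses the same delimiter:

```wikitext
<noinclude>
<!-- Genres is now split on semicolons; Authors still uses commas -->
{{#cargo_declare:
  _table=Books
| Authors=List (,) of Page
| Genres=List (;) of String
| YearOfPublication=Date
| NumberOfPages=Integer
}}
</noinclude>

<!-- Matching call on a Book page -->
{{Book
| Authors=J.K. Rowling
| Genres=Fantasy; Adventure
| YearOfPublication=1997-06-26
| NumberOfPages=309
}}
```

Whichever delimiter you pick, the one in the declaration and the one typed into the template must agree, or the whole value lands in a single list entry.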

Wrapping up

Building a structured wiki doesn’t have to feel like assembling a jigsaw puzzle blindfolded. Cargo hands you a clear set of building blocks – declare, store, query – and lets you stay in the comfortable wikitext world you already know. Whether you’re cataloging a library, tracking conference speakers, or managing a product inventory, the extension turns ordinary pages into a relational backbone without pulling you out of the MediaWiki ecosystem.

Give Cargo a look‑over, sketch a couple of templates, and you’ll see how fast the data starts answering your own questions. The result is a wiki that not only displays information but also *knows* it.