Using MediaWiki's Cargo Extension for Structured Data Management

Why Cargo Matters in a Wiki‑Driven World

Ever opened a Wikipedia page and thought, “That infobox is just a pile of text with a fancy layout”? In truth, those boxes hide rows of data that can be queried, sorted, and repurposed—if you have the right tool. Cargo, a MediaWiki extension, makes that possible without the heavyweight bureaucracy of Semantic MediaWiki. It sits quietly, storing values from template calls, then lets you pull them out with SQL‑like queries that feel natural to anyone who’s ever tinkered with a spreadsheet.

From Templates to Tables: The Core Idea

When a wiki page uses a template (say, {{Infobox book|title=…|author=…}}), Cargo can capture those parameters, provided the template itself contains Cargo’s storage call. Each invocation becomes a row in the table that the template declares; each declared field becomes a column. Editors add no extra markup to their pages and maintain no separate data store. The extension hooks into MediaWiki’s parser, extracts the values, and writes them to a table with a cargo__ prefix in the database.

That means a simple edit to an infobox automatically updates a queryable dataset. No need to remember to add a line to a separate data file; the wiki does it for you. The beauty is in its laziness—well, deliberate design, actually.
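To make the template-to-row mapping concrete, here is a minimal Python sketch that mimics the idea with an in-memory SQLite table. It is a conceptual model, not Cargo’s implementation: the store_invocation helper, the Page column, and the Books schema are hypothetical, chosen to match the book infobox used in this article.

```python
import sqlite3

# Hypothetical field list mirroring a book infobox; not Cargo's real storage layout.
FIELDS = ("Title", "Author", "Year")


def make_table(conn):
    """Create a Books table with one column per template field, plus the source page."""
    conn.execute(
        "CREATE TABLE Books (Page TEXT, Title TEXT, Author TEXT, Year INTEGER)"
    )


def store_invocation(conn, page, params):
    """Store one template call as one row, replacing the page's previous row."""
    conn.execute("DELETE FROM Books WHERE Page = ?", (page,))
    values = [params.get(name) for name in FIELDS]
    conn.execute("INSERT INTO Books VALUES (?, ?, ?, ?)", [page, *values])


conn = sqlite3.connect(":memory:")
make_table(conn)

# First save of the page.
store_invocation(
    conn, "The Hobbit",
    {"Title": "The Hobbit", "Author": "J.R.R. Tolkien", "Year": 1937},
)
# Saving the page again with an edited value overwrites the old row.
store_invocation(
    conn, "The Hobbit",
    {"Title": "The Hobbit", "Author": "J. R. R. Tolkien", "Year": 1937},
)

rows = conn.execute("SELECT Title, Author, Year FROM Books").fetchall()
print(rows)
```

The point of the delete-then-insert step is that an edit replaces the page’s row rather than appending a second one, which is why the dataset always reflects the current state of the wiki.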

Getting Started: Installing Cargo

Installation is straightforward for anyone who’s already comfortable with MediaWiki’s LocalSettings.php. Add the extension’s line, run the update script, and you’re set.


wfLoadExtension( 'Cargo' );
$wgCargoEnableStaticTables = true; // optional, speeds up big queries

After a quick php maintenance/update.php, the wiki creates a few system tables and is ready to listen. If you have SMW installed, Cargo plays nicely—both can co‑exist, though you’ll likely choose one as your primary data engine.

Defining a Cargo Table with a Template

Rather than writing SQL yourself, you embed a tiny directive inside the template. That’s where Cargo’s declarative magic shines.


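<noinclude>
{{#cargo_declare:_table=Books
|Title=String|Author=String|Year=Integer
}}
</noinclude><includeonly>
{{#cargo_store:_table=Books
|Title={{{Title|}}}|Author={{{Author|}}}|Year={{{Year|}}}
}}
</includeonly>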

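Place the #cargo_declare call in Template:Infobox book, wrapped in <noinclude> tags so that it runs only on the template page itself. The declaration alone stores nothing: the template also needs a #cargo_store call, wrapped in <includeonly> tags, that passes the template’s parameters to the Books table. After saving the template, create the table once through the “Create data table” action on the template page (or with the cargoRecreateData.php maintenance script). From then on, saving a page that contains {{Infobox book|Title=The Hobbit|Author=J.R.R. Tolkien|Year=1937}} adds or updates a row with no further effort from the editor.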

Running Queries: From Simple Lists to Fancy Charts

Now that data lives in a table, you can retrieve it anywhere on the wiki. A basic list looks like this:


{{#cargo_query:
|tables=Books
|fields=Title, Author, Year
|where=Year > 2000
|order by=Year ASC
}}

That snippet spits out a neat HTML table, sorted by year. Want a count instead of a full list? No problem.


{{#cargo_query:
|tables=Books
|fields=COUNT(Title)=TotalBooks
|where=Author = "J.K. Rowling"
}}

The result is a single value: the number of rows whose Author field matches. You can also feed a query into a charting extension, for example Extension:Graph, to visualize trends, which suits community projects that track events or species observations.
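The same data is also reachable from outside the wiki through Cargo’s cargoquery API action. The sketch below only builds the request URL with the standard library and sends nothing. The wiki address is a placeholder and build_cargo_query_url is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder endpoint; replace with your own wiki's api.php address.
API_URL = "https://wiki.example.org/w/api.php"


def build_cargo_query_url(tables, fields, where=None, order_by=None, limit=50):
    """Return a URL for the cargoquery API action with the given clauses."""
    params = {
        "action": "cargoquery",
        "format": "json",
        "tables": tables,
        "fields": fields,
        "limit": limit,
    }
    # Optional clauses are only sent when they are set.
    if where:
        params["where"] = where
    if order_by:
        params["order_by"] = order_by
    return f"{API_URL}?{urlencode(params)}"


url = build_cargo_query_url(
    "Books", "Title,Author,Year", where="Year > 2000", order_by="Year ASC"
)
print(url)

# Decode the query string again to confirm the clauses survive URL encoding.
decoded = parse_qs(urlparse(url).query)
print(decoded["where"], decoded["order_by"])
```

Fetching the URL with any HTTP client should return JSON, which makes the wiki usable as a small read-only data service for scripts and dashboards.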

Advanced Filtering: Joins

Because Cargo keeps its data in an ordinary SQL database (typically MySQL or MariaDB), a query can span more than one table. Imagine a second table Authors holding biographical data, declared in a separate template:


<noinclude>
{{#cargo_declare:_table=Authors
|Name=String|BirthYear=Integer|Country=String
}}
</noinclude>

Now pull together books and author details in one go:


{{#cargo_query:
|tables=Books, Authors
|join on=Books.Author = Authors.Name
|fields=Books.Title, Authors.BirthYear, Authors.Country
|where=Authors.Country = "United Kingdom"
|order by=Books.Title
}}

The tables and join on parameters together behave like an ordinary SQL join, yet you stay within the wiki’s markup environment. That’s the sweet spot Cargo aims for: power without leaving MediaWiki.
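For readers who think in SQL, the books-and-authors query corresponds roughly to the statement below, run here against an in-memory SQLite database. The rows are illustrative sample data (the second book and author are invented), and Cargo generates its own SQL with its own table names, so treat this only as a picture of the result shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Books (Title TEXT, Author TEXT, Year INTEGER);
    CREATE TABLE Authors (Name TEXT, BirthYear INTEGER, Country TEXT);
    """
)

# Illustrative rows only; "Example Book" and "Example Author" are made up.
conn.executemany(
    "INSERT INTO Books VALUES (?, ?, ?)",
    [
        ("The Hobbit", "J.R.R. Tolkien", 1937),
        ("Example Book", "Example Author", 2001),
    ],
)
conn.executemany(
    "INSERT INTO Authors VALUES (?, ?, ?)",
    [
        ("J.R.R. Tolkien", 1892, "United Kingdom"),
        ("Example Author", 1970, "Canada"),
    ],
)

# Match each book to its author by name, then filter and sort.
query = """
    SELECT Books.Title, Authors.BirthYear, Authors.Country
    FROM Books
    JOIN Authors ON Books.Author = Authors.Name
    WHERE Authors.Country = ?
    ORDER BY Books.Title
"""
result = conn.execute(query, ("United Kingdom",)).fetchall()
print(result)
```

Note that the join matches on the author’s name as plain text, so a spelling difference between the two templates silently drops the row from the result.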

Performance Tips: When to Use Static Tables

If you’re dealing with thousands of rows, the default dynamic tables can become a drag. Turning on static tables tells Cargo to cache query results as actual MySQL tables—updates happen only when the source data changes, not on every page view.


$wgCargoDataRefreshInterval = 3600; // refresh hourly

Combine that with $wgCargoEnableStaticTables = true; from the install snippet, and you’ll see a noticeable speedup in high‑traffic wikis.

Interoperability with Other Extensions

Cargo isn’t a lone wolf. If you already run Semantic MediaWiki, you can let Cargo handle “heavy lifting” for large numeric datasets while SMW continues to manage semantic properties. The Cargo‑SMW bridge even lets you query SMW data from a Cargo query, opening a hybrid approach that exploits the strengths of both.

Wikibase users can also benefit; Cargo can import data from Wikidata dumps, then expose it through wiki pages without the overhead of Wikibase’s entity model. In practice, many community sites use Cargo for “quick‑and‑dirty” data—think local government lists, event schedules, or hobbyist inventories—while reserving SMW/Wikibase for mission‑critical knowledge graphs.

Best Practices: Keeping Your Data Clean

  • Consistent field names. Template parameter names are case‑sensitive, so a value passed as title= never reaches a field that expects Title.
  • Avoid overly long text. Large blobs work, but they slow queries. For long descriptions, consider a separate page and store only a reference.
  • Document your tables. A simple #cargo_declare at the top of each template doubles as documentation for future editors.
  • Use proper data types. Integers, floats, dates, and booleans each have a dedicated type (Integer, Float, Date, Boolean); putting free text into an Integer or Float field leads to silent conversion issues.
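The data-type advice can be enforced before values ever reach the wiki, for instance in a script that prepares pages for a bulk import. The following is a minimal sketch under that assumption; SCHEMA and find_type_problems are hypothetical names, and the check covers only missing values and integer fields.

```python
# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"Title": str, "Author": str, "Year": int}


def find_type_problems(row, schema=SCHEMA):
    """Return human-readable problems for one row of template parameters.

    Values arrive as strings, exactly as an editor would type them.
    """
    problems = []
    for field, expected in schema.items():
        value = row.get(field)
        if value is None or value == "":
            problems.append(f"{field}: missing")
            continue
        if expected is int:
            try:
                int(value)
            except ValueError:
                problems.append(f"{field}: {value!r} is not an integer")
    return problems


good = find_type_problems(
    {"Title": "The Hobbit", "Author": "J.R.R. Tolkien", "Year": "1937"}
)
bad = find_type_problems(
    {"Title": "The Hobbit", "Author": "", "Year": "c. 1937"}
)
print(good)
print(bad)
```

Catching “c. 1937” at this stage is cheaper than discovering later that a numeric filter quietly skipped the row.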

Real‑World Example: A Community Library Catalog

One regional wiki decided to replace a static HTML catalog with a dynamic Cargo‑driven system. The Template:BookCard declared a LibraryBooks table. Librarians now simply edit the infobox on each book’s page, and the public page Special:CargoQuery automatically shows a searchable list, filterable by author, genre, or year.

Because Cargo stored the raw values, the library could export a CSV file with a single click—useful for syncing with external inventory tools. The whole workflow boiled down to “edit the template, data propagates”. No extra CSV uploads, no separate database admin.
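Once exported, the CSV is easy to consume from a script. The sketch below parses a small sample string with Python’s csv module. The sample is made up to resemble an export of Title, Author, and Year; a real export’s columns follow whatever fields the query names.

```python
import csv
import io

# Made-up sample resembling a CSV export; real column names depend on the query.
SAMPLE_EXPORT = "Title,Author,Year\nThe Hobbit,J.R.R. Tolkien,1937\n"


def load_books(csv_text):
    """Parse exported CSV text into a list of dicts with Year as an integer."""
    reader = csv.DictReader(io.StringIO(csv_text))
    books = []
    for row in reader:
        record = dict(row)
        record["Year"] = int(record["Year"])  # CSV values always arrive as text
        books.append(record)
    return books


books = load_books(SAMPLE_EXPORT)
print(books)
```

An inventory tool could run the same function on a downloaded file by passing the file’s contents instead of the sample string.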

Limitations to Keep on Your Radar

While Cargo covers many use cases, it isn’t a full‑blown ontology engine. Complex relationships, like many‑to‑many links with qualifiers, can become cumbersome. In those scenarios, SMW’s property system or Wikibase’s entity model might be more appropriate. Also, Cargo queries are assembled from a fixed set of parameters (tables, fields, where, and so on) rather than written as free‑form SQL, so advanced features of the underlying database, such as window functions and recursive CTEs, aren’t exposed in the query syntax.

If you need versioning of the stored data itself, Cargo doesn’t provide that out of the box. Data changes are reflected instantly, and older values disappear unless you log them manually.
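One way to log changes manually is to take periodic snapshots of a table and compare them. The sketch below shows only the comparison step. diff_snapshots is a hypothetical helper, the rows are illustrative, and how the snapshots are fetched and where the log is kept are left open.

```python
def diff_snapshots(old_rows, new_rows, key="Title"):
    """Compare two snapshots (lists of row dicts) keyed by one field.

    Returns (added, removed, changed), where changed holds (old, new) pairs.
    """
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    added = [new[k] for k in new if k not in old]
    removed = [old[k] for k in old if k not in new]
    changed = [(old[k], new[k]) for k in new if k in old and old[k] != new[k]]
    return added, removed, changed


# Illustrative snapshots: a year is corrected and one invented book is added.
yesterday = [{"Title": "The Hobbit", "Author": "J.R.R. Tolkien", "Year": 1973}]
today = [
    {"Title": "The Hobbit", "Author": "J.R.R. Tolkien", "Year": 1937},
    {"Title": "Example Book", "Author": "Example Author", "Year": 2001},
]

added, removed, changed = diff_snapshots(yesterday, today)
print(added)
print(removed)
print(changed)
```

Appending each non-empty diff to a dated file gives a simple audit trail alongside MediaWiki’s own page history.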

Wrapping Up: When Cargo Is the Right Fit

In a nutshell, Cargo gives you a lightweight bridge between the free‑form world of MediaWiki pages and the structured realm of databases. Its declarative approach means you won’t ask editors to learn a new language—just to fill out templates as usual. Queries feel familiar to anyone who’s used a spreadsheet, and performance knobs let you scale from a handful of rows to tens of thousands without major headaches.

So, if you find yourself asking, “How can we turn these infoboxes into a searchable list?”—Cargo is likely the answer. It sits in the middle, quietly gathering data, letting you ask questions, and presenting the results in a format anyone can read. That’s the power of structured data, made simple enough for a community wiki.
