Integrating the MediaWiki REST API: Building Dynamic Applications

Why the MediaWiki REST API matters (and why you might care)

Picture this: you’re building a mobile app that lets people browse Wikipedia articles offline, or a dashboard that pulls the latest revision stats for a set of internal wiki pages. You could hack together a bunch of action=parse calls, wrestle with the output by hand, and pray your bot won’t hit a rate limit. Or you could simply talk to MediaWiki the way modern web services expect you to – via its REST API.

Since MediaWiki 1.35 the REST API has been baked into core, offering JSON (and sometimes HTML) payloads, clean URL patterns and OAuth‑ready authentication. In practice this means you can write a single fetch request in JavaScript, a curl line in a Bash script, or a Python requests call and get back predictable data. No more parsing wikitext manually, no more guessing which action parameter to use. Just clean, versioned endpoints that behave like any other HTTP service you’ve interacted with.

Getting your hands dirty – a quick curl demo

Let’s start with the most “hello‑world”‑ish thing you can do: search for a page about Earth on English Wikipedia.

curl "https://en.wikipedia.org/w/rest.php/v1/search/page?q=earth&limit=1"

The response is a tidy JSON object, something like:

{
  "pages": [
    {
      "title": "Earth",
      "excerpt": "<span>Earth</span> is the third planet ..."
    }
  ],
  "next": null
}

That’s it. No action=query gymnastics, no format=json fiddling. The REST endpoints return JSON by default – no extra headers needed for the common case.

Peeking under the hood – URL structure & versioning

All endpoints share a predictable pattern:

https://{project}.org/w/rest.php/v{major}/{resource_path}
  • {project} – the domain you’re targeting (e.g., en.wikipedia.org).
  • {major} – the API version, currently v1. MediaWiki follows semantic versioning, so when a breaking change lands they’ll bump the major number.
  • {resource_path} – the thing you want, like page/Main_Page/history or search/page.

Because the version lives in the path (instead of a header), you can safely cache the responses on a CDN without worrying that a future change will silently break your app.
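The pattern above is easy to capture in a tiny helper. A sketch – buildRestUrl is a hypothetical name, not part of any MediaWiki client library:

```javascript
// Build a MediaWiki REST API URL from its three moving parts.
// buildRestUrl is a hypothetical helper, not an official client API.
function buildRestUrl(project, resourcePath, major = 1) {
  // Encode each path segment individually, so the slashes that
  // separate segments survive but slashes *inside* a title do not.
  const encoded = resourcePath
    .split('/')
    .map(encodeURIComponent)
    .join('/');
  return `https://${project}/w/rest.php/v${major}/${encoded}`;
}

console.log(buildRestUrl('en.wikipedia.org', 'search/page'));
// → https://en.wikipedia.org/w/rest.php/v1/search/page
```

Per-segment encoding means a title like "Main Page" comes out as page/Main%20Page without mangling the rest of the path.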

Fetching page content – the most common use‑case

If you need the raw wikitext or the rendered HTML for a page, the endpoints live under /page/{title}. Here’s how you’d grab the HTML of the “Main Page”:

curl "https://en.wikipedia.org/w/rest.php/v1/page/Main_Page/html"

And for the wikitext version:

curl "https://en.wikipedia.org/w/rest.php/v1/page/Main_Page"

The /html suffix returns the rendered HTML directly, while the plain /page/{title} endpoint returns a JSON object whose source field holds the raw wikitext. The JSON response looks something like:

{
  "title": "Main Page",
  "id": 15580374,
  "source": "{{Main Page}}"
}

Getting media files – not just text

Wikimedia Commons is a treasure chest of images, audio and video. The REST API can fetch a file’s metadata and a direct download URL. Example:

curl "https://commons.wikimedia.org/w/rest.php/v1/file/Example.jpg"

This returns a JSON block with original, preferred, and thumbnail entries, each carrying a direct url along with fields like size, width, height, and mediatype.
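Pulling a download URL out of that response is a one-liner. A minimal sketch – the sample payload below is abbreviated and illustrative, not a verbatim API response, and downloadUrl is a hypothetical helper:

```javascript
// Abbreviated, illustrative shape of a /v1/file/{title} response.
const fileInfo = {
  title: 'Example.jpg',
  original:  { url: 'https://upload.wikimedia.org/orig/Example.jpg', width: 1600 },
  thumbnail: { url: 'https://upload.wikimedia.org/thumb/120px-Example.jpg', width: 120 }
};

// Prefer the thumbnail when one exists, fall back to the original.
// downloadUrl is a hypothetical helper, not part of any library.
function downloadUrl(info) {
  return (info.thumbnail ?? info.original)?.url ?? null;
}

console.log(downloadUrl(fileInfo));
```

The optional chaining keeps the helper safe on pages that have no renditions at all – it just returns null.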

Authentication – OAuth is the way (but not the only way)

If you’re only reading public data, you can go ahead without any credentials. As soon as you want to edit a page, upload a file, or see hidden revisions, you’ll need to prove who you are. Wikimedia wikis (and any wiki with the OAuth extension installed) support the standard three‑legged OAuth flow.

In short:

  1. Register your app on the target wiki (via Special:OAuthConsumerRegistration).
  2. Redirect the user to the wiki’s authorization endpoint.
  3. Exchange the received code for an access_token.

Once you have the token, you add it to your requests as an Authorization: Bearer <token> header. The API will automatically respect the user’s permissions, returning only what that user is allowed to see.

For quick prototyping you can also use a bot password (a dedicated credential generated via Special:BotPasswords) – but that’s discouraged for anything beyond a throw‑away script.
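Wiring the token into a request is just a matter of headers. A sketch – the token value is a placeholder and authenticatedOptions is a hypothetical helper:

```javascript
// Build fetch options carrying an OAuth bearer token and a polite User-Agent.
// authenticatedOptions is a hypothetical helper; the token is a placeholder.
function authenticatedOptions(token) {
  return {
    headers: {
      'Authorization': `Bearer ${token}`,
      'User-Agent': 'MyApp/1.0 (https://mydomain.example; myemail@example.com)'
    }
  };
}

// Usage with any fetch implementation:
// fetch('https://en.wikipedia.org/w/rest.php/v1/page/Sandbox', authenticatedOptions(token));
```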

Extensions can expose their own REST endpoints

One of the neat things about MediaWiki’s REST layer is that extensions can plug in extra routes. An extension could, for instance, add a /page/{title}/extract route that serves a plain‑text excerpt. If you develop a custom extension, you register a handler in its extension.json (note: this is JSON, not PHP, and routes are declared as a list of objects):

"RestRoutes": [
    {
        "path": "/myextension/v1/hello/{name}",
        "class": "MyExtension\\Rest\\HelloHandler"
    }
]

Then your handler returns whatever JSON you like – perfect for marrying wiki data with a bespoke backend.

Rate‑limits, user‑agents and good‑citizen behavior

Wikimedia sites enforce a “be nice” policy. The docs say there’s no hard request‑limit, but automated traffic that looks abusive will be throttled or blocked. Two simple habits keep you in the clear:

  • Always send a descriptive User-Agent header. Something like MyApp/1.0 (https://mydomain.example; myemail@example.com) lets admins contact you if something goes sideways.
  • Respect the Retry‑After header if you receive a 429 Too Many Requests response. Back off for the indicated seconds before trying again.

In practice, a 1‑second pause between successive calls is more than enough for most client‑side apps.
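Honoring Retry‑After takes only a few lines: the header carries either a delta in seconds or an HTTP date. A sketch – retryDelayMs is a hypothetical helper, not part of any library:

```javascript
// Convert a Retry-After header value into milliseconds to wait.
// The header is either a delta in seconds ("120") or an HTTP date.
// retryDelayMs is a hypothetical helper, not part of any library.
function retryDelayMs(retryAfter, now = Date.now()) {
  if (retryAfter == null) return 1000;      // no header: default 1-second pause
  const seconds = Number(retryAfter);
  if (!Number.isNaN(seconds)) return seconds * 1000;
  const date = Date.parse(retryAfter);      // HTTP-date form
  return Number.isNaN(date) ? 1000 : Math.max(0, date - now);
}

console.log(retryDelayMs('120'));  // → 120000
```

On a 429, sleep for retryDelayMs(res.headers.get('Retry-After')) before retrying.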

Putting it together – a tiny Node.js example

Below is a short script that searches for a term, grabs the first result’s title, fetches the page content, and prints the first 200 characters. It uses the fetch API built into Node.js 18 and later – no extra packages required.

async function getFirstParagraph(term) {
  // Search for the term
  const searchRes = await fetch(
    `https://en.wikipedia.org/w/rest.php/v1/search/page?q=${encodeURIComponent(term)}&limit=1`
  );
  const searchData = await searchRes.json();
  const title = searchData.pages?.[0]?.title;
  if (!title) throw new Error('No pages found');

  // Get page content (raw wikitext)
  const pageRes = await fetch(
    `https://en.wikipedia.org/w/rest.php/v1/page/${encodeURIComponent(title)}`
  );
  const pageData = await pageRes.json();

  // Extract a snippet
  const snippet = pageData.source?.slice(0, 200).replace(/\n/g, ' ');
  console.log(`First 200 chars of ${title}:`);
  console.log(snippet);
}

getFirstParagraph('Earth').catch(console.error);

Run it with node script.js and you’ll see a short excerpt of the Earth article. The same pattern works for any other endpoint – just swap the URL.

Common pitfalls (and how to dodge them)

  • Mixing up content types. The /html variants return raw HTML, not JSON – calling res.json() on them will throw. Double‑check which representation each endpoint serves. That’s a sneaky source of bugs.
  • URL‑encoding titles. Spaces, slashes and special characters must be escaped, otherwise the request ends up 404. Use encodeURIComponent() in JavaScript or urlencode in PHP.
  • Assuming version stability. While v1 is stable, new minor releases can add optional fields. Write your code defensively – check for field existence before using it.
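The encoding pitfall in particular deserves a concrete look – article titles really do contain slashes:

```javascript
// Titles with spaces or slashes must be encoded per path segment.
const title = 'AC/DC';   // a real article title containing a slash

// Wrong: the raw slash splits the title into two path segments.
const broken = `https://en.wikipedia.org/w/rest.php/v1/page/${title}`;

// Right: encodeURIComponent escapes the slash (and spaces, etc.).
const ok = `https://en.wikipedia.org/w/rest.php/v1/page/${encodeURIComponent(title)}`;

console.log(ok);  // → https://en.wikipedia.org/w/rest.php/v1/page/AC%2FDC
```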

Beyond the basics – what you can build

Now that the fundamentals are in place, let your imagination run. Here are a few ideas that people have already turned into real‑world tools:

  1. Offline Wikipedia readers. Cache /page JSON responses on a device, then render them with a lightweight Markdown‑to‑HTML converter.
  2. Revision dashboards. Pull /page/{title}/history for a list of edits, aggregate stats, and display a heatmap of editing activity.
  3. Multi‑wiki search aggregators. Query /search/page on several language wikis in parallel, merge results, and present a unified list.
  4. Custom bots. Using OAuth, write a bot that automates repetitive edits (e.g., fixing template syntax) while respecting the same rate‑limits as human editors.
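The multi‑wiki aggregator idea fits in a dozen lines with Promise.all. A sketch – searchUrls and searchAllWikis are hypothetical names, and the fetch implementation is injectable so the logic can be exercised without the network:

```javascript
// Fan a search query out to several language wikis.
// searchUrls / searchAllWikis are hypothetical names, not a real library API.
function searchUrls(langs, term, limit = 5) {
  return langs.map(lang =>
    `https://${lang}.wikipedia.org/w/rest.php/v1/search/page` +
    `?q=${encodeURIComponent(term)}&limit=${limit}`
  );
}

// Fetch all of them concurrently and tag each hit with its wiki.
async function searchAllWikis(langs, term, fetchImpl = fetch) {
  const responses = await Promise.all(
    searchUrls(langs, term).map(url => fetchImpl(url).then(r => r.json()))
  );
  return responses.flatMap((data, i) =>
    (data.pages ?? []).map(page => ({ lang: langs[i], ...page }))
  );
}
```

Because the requests run in parallel, total latency is roughly that of the slowest wiki, not the sum of all of them.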

Wrapping up – a few final thoughts

Integrating the MediaWiki REST API isn’t just a “nice‑to‑have” feature. It’s a practical, future‑proof way to make your wiki data talk to the rest of the web. With clean JSON, versioned URLs, OAuth support, and an extension‑friendly design, the API gives you the building blocks for everything from tiny widgets to full‑blown analytics platforms.

So, next time you find yourself reaching for action=query and wrestling with XML, pause. Ask yourself: “Is there a REST endpoint for this?” If the answer is yes (and it usually is), go ahead and give it a try. You’ll probably end up with less code, fewer bugs, and a lot more confidence that your app will keep working when MediaWiki ships the next major release.
