Building Custom Reports in MediaWiki Using Lua Modules
Why Lua modules matter for MediaWiki reporting
At first glance a MediaWiki installation looks like a giant collection of wikitext pages, nothing more than a glorified blog. In practice, though, the platform hides a surprisingly powerful scripting layer called Scribunto. With Scribunto you can write Lua code that lives inside Module: pages and is callable from any other page via {{#invoke:…}}. This dynamic bridge is what turns a static encyclopedia into a live data‑driven reporting engine.
In my own wiki‑farm we needed a way to pull together page‑view stats, recent edit summaries and even external JSON feeds—all without touching the PHP core. The answer? A handful of well‑structured Lua modules. The rest of this post walks through the thought process, the essential building blocks, and a concrete example that you can adapt to any domain.
Getting the basics right
First things first: make sure Scribunto is installed and enabled. In LocalSettings.php you should see something like:
wfLoadExtension( 'Scribunto' );
$wgScribuntoDefaultEngine = 'luastandalone';
If you’re on MediaWiki 1.34 or newer the extension ships with the standard download, so you only need to load it and make sure your host allows execution of the Lua binary (or has the LuaSandbox PHP extension installed, in which case set the engine to 'luasandbox'). Once that’s confirmed, you can create a Module:ReportHelper page and start coding.
The data model – what you’ll be crunching
When we talk about “reports” in the wiki context we usually mean one of three things:
- Page‑metadata collections – titles, categories, timestamps.
- Statistics from the core API – page‑view counters, edit counts.
- External data sources – JSON from a public API, CSV files stored on the wiki.
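Stock Scribunto doesn’t ship with an HTTP client or a general API query function. Out of the box, mw.title and mw.site expose page metadata and site statistics, and mw.text.jsonDecode parses JSON that is already stored on the wiki. The mw.http.request and mw.site:query calls used in the rest of this post are not part of stock Scribunto; they assume a site-specific extension that exposes them to Lua, so check what your installation provides before copying the code.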
Setting up a reusable helper module
Below is a skeleton that most reporters end up customizing. It abstracts away the boilerplate for fetching a list of pages in a given category and for pulling raw JSON from a URL.
local p = {}

--- Return a list of page titles belonging to a category.
--- NOTE: mw.site:query is not part of stock Scribunto; this assumes a
--- site-specific binding that forwards its parameters to the action API.
function p.getPagesInCategory( category )
    local result = {}
    local cont = nil
    repeat
        local params = {
            action = "query",
            list = "categorymembers",
            cmtitle = "Category:" .. category,
            cmlimit = "500",
            cmcontinue = cont,
        }
        local response = mw.site:query( params )
        -- Guard against an empty or malformed response.
        local members = response and response.query and response.query.categorymembers or {}
        for _, page in ipairs( members ) do
            table.insert( result, page.title )
        end
        cont = response and response.continue and response.continue.cmcontinue or nil
    until not cont
    return result
end

--- Fetch JSON from url and decode it. Errors are caught and returned
--- as { error = "..." }, keeping the report robust.
--- NOTE: mw.http.request is likewise an assumed, site-specific binding
--- that returns the response body as a string.
function p.fetchJson( url )
    local ok, data = pcall( mw.http.request, url )
    if not ok or not data then
        return { error = "Unable to fetch " .. url }
    end
    local success, decoded = pcall( mw.text.jsonDecode, data )
    if not success then
        return { error = "Invalid JSON from " .. url }
    end
    return decoded
end

return p
Notice the liberal use of pcall. Error‑handling in Lua is a bit like walking a tightrope over a construction site—one slip and the whole page collapses. By catching errors early we keep the final report tidy, even if an external service hiccups.
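To make the pcall pattern concrete outside the wiki, here is a minimal sketch in plain Lua. The riskyFetch function and the safeCall wrapper are hypothetical names invented for this illustration; riskyFetch simply stands in for any call that may raise an error.

```lua
-- Hypothetical stand-in for a call that may fail (e.g. a remote fetch).
local function riskyFetch( url )
    if not url:match( "^https://" ) then
        error( "unsupported URL: " .. url )
    end
    return '{"status":"ok"}'
end

-- Run fn(...) under pcall and always return a table:
-- { value = ... } on success, { error = "..." } on failure.
local function safeCall( fn, ... )
    local ok, result = pcall( fn, ... )
    if not ok then
        return { error = tostring( result ) }
    end
    return { value = result }
end

local good = safeCall( riskyFetch, "https://example.org/feed.json" )
local bad = safeCall( riskyFetch, "ftp://example.org/feed.json" )
```

Because both outcomes come back as a table, the rendering code can check for an error field and print a short notice instead of letting the whole page fail.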
Putting it together – a sample “Monthly Edit Summary” report
Imagine you need a monthly snapshot of all edits made to pages in Category:Published‑Articles. The report should include:
- The total number of edits.
- The top five editors by edit count.
- A list of pages that have not been edited in the last 30 days.
The following module does exactly that. It calls the helper above, runs a second API query for revisions, and finally renders the result as HTML headings and lists using mw.html.
local report = {}
local helper = require( "Module:ReportHelper" )

--- Main entry point used by {{#invoke:MonthlyEditReport|generate}}
function report.generate( frame )
    local category = "Published-Articles"
    local pages = helper.getPagesInCategory( category )

    -- Data gathering: one revisions query per page feeds all three sections.
    -- NOTE: mw.site:query is not part of stock Scribunto (assumed site-specific binding).
    local total = 0
    local lastEdit = {}
    local editorFreq = {}
    for _, title in ipairs( pages ) do
        local revs = mw.site:query{
            action = "query",
            prop = "revisions",
            titles = title,
            rvprop = "timestamp|user",
            rvlimit = "max",
        }
        -- "pages" is keyed by page ID unless formatversion=2 is requested,
        -- so take the first (and only) entry instead of indexing [1].
        local _, pageInfo = next( revs.query.pages )
        local revisions = pageInfo and pageInfo.revisions or {}
        total = total + #revisions
        -- Revisions arrive newest first, so [1] is the latest edit.
        lastEdit[title] = revisions[1] and revisions[1].timestamp or "Never"
        for _, rev in ipairs( revisions ) do
            local user = rev.user or "(hidden)"
            editorFreq[user] = ( editorFreq[user] or 0 ) + 1
        end
    end

    -- Aggregation: top five editors by edit count
    local ranked = {}
    for user, cnt in pairs( editorFreq ) do
        table.insert( ranked, { user = user, cnt = cnt } )
    end
    table.sort( ranked, function( a, b ) return a.cnt > b.cnt end )
    local topEditors = {}
    for i = 1, math.min( 5, #ranked ) do
        topEditors[i] = ranked[i]
    end

    -- Aggregation: pages with no edit in the last 30 days
    local stale = {}
    local lang = mw.getContentLanguage()
    local thirtyDaysAgo = os.time() - 30 * 24 * 60 * 60
    for _, title in ipairs( pages ) do
        local ts = lastEdit[title]
        -- formatDate( "U", ts ) converts the API timestamp to Unix time.
        if ts == "Never" or tonumber( lang:formatDate( "U", ts ) ) < thirtyDaysAgo then
            table.insert( stale, title )
        end
    end

    -- Rendering
    local out = mw.html.create()
    out:tag( "h3" ):wikitext( "Monthly Edit Summary for " .. category )
    out:tag( "p" ):wikitext( "Total edits: " .. total )
    out:tag( "h4" ):wikitext( "Top 5 editors" )
    local editorList = out:tag( "ul" )
    for _, e in ipairs( topEditors ) do
        editorList:tag( "li" ):wikitext( string.format( "%s (%d edits)", e.user, e.cnt ) )
    end
    out:tag( "h4" ):wikitext( "Pages not edited in the last 30 days" )
    local staleList = out:tag( "ul" )
    for _, t in ipairs( stale ) do
        staleList:tag( "li" ):wikitext( "[[" .. t .. "]]" )
    end
    return tostring( out )
end

return report
Take a minute to skim the code. You’ll see three distinct sections: data gathering, aggregation, and rendering. That separation is intentional; it makes the module easier to test (for example from the debug console on the module’s edit page, where the module’s exports are available as p) and simple to extend. For example, you could add a chart that visualizes edit spikes, provided your wiki has a charting extension installed.
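One way to push that separation further is to pull the aggregation into a pure function that never touches the wiki, so it can be exercised with hand-made data. The sketch below is only an illustration: topN and its name/count fields are hypothetical and not part of the module above.

```lua
-- Return the n most frequent keys of a { key = count } table as a list of
-- { name = key, count = count }, sorted by count in descending order.
-- Ties are broken alphabetically so the output is deterministic.
local function topN( freq, n )
    local entries = {}
    for name, count in pairs( freq ) do
        entries[#entries + 1] = { name = name, count = count }
    end
    table.sort( entries, function( a, b )
        if a.count ~= b.count then
            return a.count > b.count
        end
        return a.name < b.name
    end )
    local top = {}
    for i = 1, math.min( n, #entries ) do
        top[i] = entries[i]
    end
    return top
end

local sample = { Alice = 12, Bob = 7, Carol = 12, Dave = 1 }
local top = topN( sample, 3 )
```

Since the function only takes and returns plain tables, the same code runs unchanged in a local Lua interpreter and inside a module.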
Calling the module from a wiki page
With the module in place, the actual report page is just a one‑liner:
{{#invoke:MonthlyEditReport|generate}}
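When a reader visits the page, MediaWiki runs the Lua code server‑side and injects the generated HTML. Because rendered pages are held in the parser cache, the report is recomputed when the page is edited, purged, or its cache entry expires rather than on every single view. No JavaScript, no extra CSS, just plain wikitext plus a dash of Lua magic.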
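The one‑liner hard‑codes nothing itself, but the module does: the category is fixed in the Lua source. A minimal sketch of reading it from an #invoke parameter instead might look like this. The getCategory helper and the category parameter name are assumptions made for the example; frame.args is the standard table through which Scribunto passes #invoke arguments as strings.

```lua
local DEFAULT_CATEGORY = "Published-Articles"

-- Read the "category" argument from the frame, falling back to the default
-- when it is missing or blank. Arguments passed via #invoke arrive as strings.
local function getCategory( frame )
    local category = frame.args and frame.args.category
    if category == nil or category:match( "^%s*$" ) then
        return DEFAULT_CATEGORY
    end
    return category
end

-- A plain table is enough to stand in for the frame when testing locally.
local fakeFrame = { args = { category = "Drafts" } }
local chosen = getCategory( fakeFrame )
```

On the wiki side the call would then become {{#invoke:MonthlyEditReport|generate|category=Drafts}}, with generate calling getCategory( frame ) instead of using a fixed string.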
Performance tips you won’t find in the official docs
- Cache aggressively. Keep static lookup tables in a data module and load them with mw.loadData, which evaluates the module once per page render instead of once per #invoke. If you have Semantic MediaWiki installed, you can also persist computed values through its mw.smw library so they survive across page loads and cut API traffic.
- Chunk large result sets. The MediaWiki API limits cmlimit to 500 for regular users; if you need more, paginate with cmcontinue, as shown in the helper.
- Avoid string concatenation in loops. Lua’s table.concat is far faster than the naive result = result .. piece pattern.
- Watch out for timezone quirks. API timestamps are in UTC. If you compare one against a local “30‑day ago” value, first convert both sides to the same timezone.
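Two of these tips are easy to show in plain Lua. The sketch below uses hypothetical helper names (bulletList, isStale) and assumes timestamps in the API’s usual ISO 8601 form such as 2024-05-01T12:00:00Z; it is an illustration of the idea, not a drop‑in replacement for the module.

```lua
-- Build a wikitext bullet list with table.concat instead of repeated "..".
local function bulletList( titles )
    local lines = {}
    for i, title in ipairs( titles ) do
        lines[i] = "* [[" .. title .. "]]"
    end
    return table.concat( lines, "\n" )
end

-- ISO 8601 UTC timestamps in the same format sort lexicographically,
-- so a plain string comparison against a UTC cutoff is enough.
-- The "!" prefix makes os.date format the time in UTC rather than local time.
local function isStale( isoTimestamp, now, days )
    local cutoff = os.date( "!%Y-%m-%dT%H:%M:%SZ", now - days * 24 * 60 * 60 )
    return isoTimestamp < cutoff
end

-- "now" is passed in explicitly so the check is reproducible in tests.
local now = 1700000000 -- 2023-11-14T22:13:20Z
local old = isStale( "2023-10-01T00:00:00Z", now, 30 )
local recent = isStale( "2023-11-01T00:00:00Z", now, 30 )
```

Keeping both sides of the comparison in UTC sidesteps the timezone problem entirely, and passing the current time as an argument avoids hidden dependence on the server clock.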
Real‑world use cases that sparked the idea
On a community‑run documentation wiki we built a “License compliance” report that scanned every page for a {{License}} template, aggregated the license types, and presented a colored pie chart. The same pattern works for “Open issues per project”, “Top contributors in the last sprint”, or “List of pages missing a ‘References’ section”. All you need is a way to tag the pages (usually via categories or hidden templates) and a module that pulls that tag, tallies the results, and spits out HTML.
Common pitfalls (and how to sidestep them)
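1. Forgot to allow the remote domain when pulling remote JSON. The Lua side may silently get an empty string back. Adding the domain to the allowlist of whichever extension makes the outbound request fixes it; note that $wgAllowExternalImagesFrom only governs inline external images and does not apply to data fetches.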
2. Ran into “module load error: recursive require”. That usually means two modules call each other. Break the cycle by moving shared logic into a third “utils” module.
3. Performance spikes after a big edit war. The API query for revisions can explode. Limit rvlimit to a reasonable number (e.g., 100) and cache the totals instead of recomputing each time.
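For the third pitfall, the simplest form of caching is to make sure the same lookup never runs twice within one render. Here is a minimal memoization sketch in plain Lua; memoize and countRevisions are hypothetical names, and countRevisions returns a placeholder value rather than running a real query.

```lua
-- Wrap a one-argument function so each distinct argument is computed once.
-- The cache lives only as long as the wrapped function does; in a wiki
-- module that typically means a single #invoke, not across page views.
local function memoize( fn )
    local cache = {}
    return function( key )
        local hit = cache[key]
        if hit == nil then
            hit = fn( key )
            cache[key] = hit
        end
        return hit
    end
end

-- Hypothetical expensive lookup; the counter shows how often it really runs.
local calls = 0
local function countRevisions( title )
    calls = calls + 1
    return #title -- placeholder result standing in for a real query
end

local cachedCount = memoize( countRevisions )
local first = cachedCount( "Main Page" )
local second = cachedCount( "Main Page" )
```

This only removes duplicate work inside one run; results that should outlive a page render still need a persistent store on the wiki side.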
Where to go from here
If you’re already comfortable with the basics, consider blending Lua modules with the Data Transfer extension to export CSV files, or pair them with VisualEditor to let non‑technical users tweak report parameters without touching raw wikitext.
In short, Lua modules give you a programmable backbone inside MediaWiki—something that feels half‑baked at first, but once the first report works, you’ll start seeing every page as a data source. That’s the magic: a wiki that not only stores knowledge, but also analyses it on the fly.