How to Set Up CirrusSearch (Elasticsearch) for MediaWiki: Installation, Configuration, and Performance Tuning

Overview

CirrusSearch replaces MediaWiki's built-in database-backed search with a full-text engine powered by Elasticsearch. This guide walks through the required software, extension installation, basic configuration, index creation, and a few tuning tips that keep a medium-size wiki responsive.

Prerequisites

  • MediaWiki ≥ 1.39 (this guide targets 1.39 with Elasticsearch 7.10.2, the version the CirrusSearch extension page lists for that release).
  • PHP compiled with the cURL extension.
  • Java 11 (OpenJDK) – only needed if you want Elasticsearch to run on a system JVM; the official 7.x packages bundle their own JDK.
  • Root or sudo access on the host where Elasticsearch will run.

1. Install Elasticsearch

Download the official Debian package and install it. The steps apply to Debian-based systems (Debian, Ubuntu); the final verification command assumes jq is installed:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-amd64.deb
sudo dpkg -i elasticsearch-7.10.2-amd64.deb
sudo systemctl enable --now elasticsearch.service
curl -s http://127.0.0.1:9200 | jq .

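The JSON response should show "number" : "7.10.2". Cluster status is not part of this response; query http://127.0.0.1:9200/_cluster/health for it. Green or yellow is acceptable here, since a single node reports yellow as soon as an index has replicas it cannot place.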
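
The check can also be scripted. The following is a minimal sketch, assuming curl and sed are available; the function names are hypothetical, and the status is read from the _cluster/health endpoint rather than from the root response.

```shell
#!/usr/bin/env bash
# Extract the "number" field from the Elasticsearch root response read on stdin.
es_version_from_json() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Extract the "status" field from a _cluster/health response read on stdin.
es_status_from_json() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Query a node and succeed only if it reports 7.10.2 and a non-red status.
check_es() {
    local base="${1:-http://127.0.0.1:9200}" version status
    version=$(curl -fsS "$base" | es_version_from_json)
    status=$(curl -fsS "$base/_cluster/health" | es_status_from_json)
    echo "version=${version:-unknown} status=${status:-unknown}"
    [ "$version" = "7.10.2" ] && [ -n "$status" ] && [ "$status" != "red" ]
}
```

Calling check_es with no arguments targets http://127.0.0.1:9200; pass another base URL if Elasticsearch listens elsewhere.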

2. Install the MediaWiki extensions

Both Elastica (the PHP client) and CirrusSearch are required. Download the releases that match your MediaWiki version from their extension pages on mediawiki.org. After extracting, the directory structure should look like:

/var/www/wiki/extensions/Elastica
/var/www/wiki/extensions/CirrusSearch

If you install from Git, run Composer inside each extension to pull PHP dependencies:

cd /var/www/wiki/extensions/Elastica
composer install --no-dev
cd ../CirrusSearch
composer install --no-dev
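
Before enabling the extensions, it can help to confirm that both directories landed in the right place. This is a small sketch with a hypothetical function name; it only checks for the extension.json file that wfLoadExtension() reads, not for Composer dependencies.

```shell
#!/usr/bin/env bash
# Report whether each required extension directory contains an extension.json.
# Prints one line per extension and returns non-zero if any is missing.
check_extensions() {
    local wiki_root="$1" missing=0 ext
    for ext in Elastica CirrusSearch; do
        if [ -f "$wiki_root/extensions/$ext/extension.json" ]; then
            echo "ok      $ext"
        else
            echo "missing $ext"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_extensions /var/www/wiki
```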

3. Enable the extensions in LocalSettings.php

Append the following lines near the end of the file:

wfLoadExtension( 'Elastica' );
wfLoadExtension( 'CirrusSearch' );
$wgDisableSearchUpdate = true; // temporarily stop live updates
$wgSearchType = 'CirrusSearch'; // route searches to Cirrus (results stay empty until step 5 completes)

Save the file and verify the extensions appear on Special:Version.

4. Create the Elasticsearch index

Run the maintenance script that writes the index mapping and settings:

php extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --startOver

When the script finishes you should see messages like “Validating mappings… ok”.
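
To confirm that the indices now exist, the _cat/indices endpoint can be filtered by wiki. A minimal sketch follows; the function name and the wiki ID "wikidb" are placeholders, and it assumes CirrusSearch prefixes its index names with the wiki ID followed by an underscore.

```shell
#!/usr/bin/env bash
# Filter `_cat/indices` output (read on stdin, no header row) down to the
# indices of one wiki. Default columns are: health status index uuid ...
# Prints "<index name> <health>" for every index whose name starts with "<wiki_id>_".
cirrus_indices() {
    local wiki_id="$1"
    awk -v prefix="${wiki_id}_" 'index($3, prefix) == 1 { print $3, $1 }'
}

# Usage against a live node:
#   curl -fsS 'http://127.0.0.1:9200/_cat/indices' | cirrus_indices wikidb
```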

5. Populate the index

First re‑enable updates, then bootstrap the content.

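# Enable updates again by commenting out the setting (run from the wiki root).
# The dollar sign is escaped so it is always read as a literal character.
sed -i 's/^\$wgDisableSearchUpdate/# &/' LocalSettings.php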

Now run the two indexing passes:

php extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks --indexOnSkip
php extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipParse

Both commands report pages indexed per second; a 10 k‑page wiki typically finishes in a few minutes.
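
On larger wikis the first pass can be split into page-ID ranges so that an interrupted run does not have to start over. The sketch below only prints the commands; the function name is hypothetical, the maximum page ID has to be looked up separately (for example with SELECT MAX(page_id) FROM page), and it assumes the --fromId and --toId options of ForceSearchIndex.php. Adjacent ranges share a boundary ID, which at worst re-indexes one page.

```shell
#!/usr/bin/env bash
# Print one ForceSearchIndex.php command per page-ID range.
# $1 = highest page ID to cover, $2 = number of IDs per chunk.
chunk_commands() {
    local max_id="$1" chunk="$2" from=0 to
    while [ "$from" -lt "$max_id" ]; do
        to=$(( from + chunk ))
        if [ "$to" -gt "$max_id" ]; then
            to="$max_id"
        fi
        echo "php extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks --indexOnSkip --fromId $from --toId $to"
        from="$to"
    done
}

# Usage: chunk_commands 12000 5000 > first-pass.sh && sh first-pass.sh
```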

6. Performance tuning

Heap size

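Set the JVM heap to about half the available RAM, but keep it below roughly 30 GB so the JVM can keep using compressed object pointers. Give the minimum and maximum the same value in /etc/elasticsearch/jvm.options (or in a file under /etc/elasticsearch/jvm.options.d/). For example, on a host with 8 GB of RAM: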

-Xms4g
-Xmx4g

Restart Elasticsearch afterwards.
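
The sizing rule (half of RAM, capped at 30 GB) is simple enough to compute in a helper. This is an illustrative sketch with hypothetical function names; it works in megabytes and prints the two option lines without writing to any file.

```shell
#!/usr/bin/env bash
# Suggest a heap size in MB: half of total RAM, capped at 30 GB.
suggest_heap_mb() {
    local total_mb="$1" cap_mb=$(( 30 * 1024 )) half
    half=$(( total_mb / 2 ))
    if [ "$half" -gt "$cap_mb" ]; then
        half="$cap_mb"
    fi
    echo "$half"
}

# Print matching -Xms/-Xmx lines for a given amount of RAM in MB.
heap_options() {
    local heap
    heap=$(suggest_heap_mb "$1")
    printf -- '-Xms%sm\n-Xmx%sm\n' "$heap" "$heap"
}

# Usage on Linux, reading total RAM from /proc/meminfo:
#   heap_options "$(awk '/MemTotal/ { print int($2 / 1024) }' /proc/meminfo)"
```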

Shards and replicas

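For a wiki under 50 k pages a single primary shard per index is sufficient. CirrusSearch configures the shard count per index type, and the replica setting is an auto-expand range rather than a fixed number. Adjust both in LocalSettings.php if you need more, then re-run UpdateSearchIndexConfig.php:

$wgCirrusSearchShardCount = [ 'content' => 1, 'general' => 1 ]; // one primary shard per index
$wgCirrusSearchReplicas = '0-2'; // no replicas on a single node, up to two as nodes are added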
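
On a single-node cluster a replica shard has nowhere to go and stays unassigned, which turns the cluster yellow. The sketch below lists such shards from _cat/shards output; the function name is hypothetical, and it assumes the default column order (index, shard, prirep, state).

```shell
#!/usr/bin/env bash
# Read `_cat/shards` output on stdin and print "<index> <shard> <p|r>" for
# every shard whose state column is UNASSIGNED.
unassigned_shards() {
    awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}

# Usage against a live node:
#   curl -fsS 'http://127.0.0.1:9200/_cat/shards' | unassigned_shards
```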

Rescore profiles

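CirrusSearch ships with several built-in rescore profiles. The "classic_noboostlinks" profile behaves like the classic profile without the incoming-links boost, which makes it a reasonable starting point for a small wiki where link counts say little about relevance: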

$wgCirrusSearchRescoreProfile = 'classic_noboostlinks';

Fine‑tune the $wgCirrusSearchWeights array to give more weight to titles or headings:

$wgCirrusSearchWeights = [
    'title'          => 20,
    'heading'        => 5,
    'text'           => 1,
    'file_text'      => 15, // if PdfHandler is installed
];

Timeouts

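Short timeouts keep search requests from tying up PHP workers when a node misbehaves. In recent CirrusSearch releases the client-side timeout is an array keyed by query type, so check the extension's settings documentation for your version before copying this:

$wgCirrusSearchConnectionAttempts = 3;
$wgCirrusSearchClientSideSearchTimeout['default'] = 5; // seconds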

Job queue

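CirrusSearch relies on MediaWiki's job queue to push page updates into the index. A Redis-backed queue handles high edit rates well, but JobQueueRedis only works in daemonized mode, which means jobs must be executed by the separate mediawiki/services/jobrunner service. Example configuration:

$wgJobTypeConf['default'] = [
    'class'       => 'JobQueueRedis',
    'redisServer' => '127.0.0.1:6379',
    'redisConfig' => [],
    'daemonized'  => true, // required; jobs are run by the jobrunner service
    'checkDelay'  => true,
];
$wgJobRunRate = 0; // web requests no longer execute jobs themselves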

7. Verify the setup

Visit Special:Search and perform a search. To confirm that CirrusSearch is answering, append &cirrusDumpQuery to the URL to see the Elasticsearch query it builds, or &cirrusDumpResult to see the raw response.
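
For repeated checks, the debug URL can be assembled in a small helper and fetched with curl. This is a sketch only: the function name and the example host are placeholders, and the encoding handles spaces but no other reserved characters.

```shell
#!/usr/bin/env bash
# Build a Special:Search URL carrying one CirrusSearch debug parameter.
# $1 = wiki URL ending in index.php, $2 = search terms,
# $3 = debug parameter (defaults to cirrusDumpResult).
debug_search_url() {
    local base="$1" terms="$2" param="${3:-cirrusDumpResult}"
    # Minimal encoding: spaces become "+".
    printf '%s?title=Special:Search&search=%s&%s\n' "$base" "${terms// /+}" "$param"
}

# Usage:
#   curl -fsS "$(debug_search_url https://wiki.example.org/index.php 'main page' cirrusDumpQuery)"
```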

8. Ongoing maintenance

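  • When upgrading MediaWiki, re-run UpdateSearchIndexConfig.php. If the mapping changed, either rebuild from scratch with --startOver (this empties the index, so repeat step 5 afterwards) or reindex in place with --reindexAndRemoveOk --indexIdentifier now.
  • Monitor Elasticsearch heap usage and GC logs; raise -Xms and -Xmx together if you see frequent long GC pauses or OutOfMemoryError entries such as "GC overhead limit exceeded".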
  • Periodically run php maintenance/runJobs.php to flush pending updates.

With these steps your wiki will have a responsive, feature‑rich search powered by Elasticsearch.
