Importing XML dumps into MediaWiki

I tried to import a large XML dump from Wikipedia into MediaWiki, but the built-in import feature failed and generated numerous error messages. After Googling around for a long time, I came up with this solution:

1. Remove the <restrictions>...</restrictions> tags from your XML file. Some version inconsistencies appear to cause this problem.
To remove the tags, run this command in a shell. sed prints the cleaned XML to standard output, so redirect it to a new file:
sed 's/<restrictions>.*<\/restrictions>//g' test.txt > test-clean.txt

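2. For large XML files, make the following change in includes/parser/Preprocessor_DOM.php.
Around line 107 (the exact line number may differ between MediaWiki versions), change

$result = $dom->loadXML( $xml );

to

$result = $dom->loadXML( $xml, 1<<19 );

The value 1<<19 is libxml's LIBXML_PARSEHUGE option, which relaxes the parser's built-in size limits.
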
3. Run importDump.php from the command line on the server, from MediaWiki's maintenance directory:
php importDump.php <filename.xml>
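
For step 1, here is a minimal sketch of the tag removal written as a filter, so it can be tried on a small sample before touching a real dump. The function name strip_restrictions and the sample input are made up for illustration. The pattern uses [^<]* instead of .* so that it cannot swallow other tags if two <restrictions> elements ever share a line; this assumes the element contains only plain text.

```shell
# Hypothetical helper: remove <restrictions>...</restrictions> elements.
# Reads XML on standard input and writes the cleaned XML to standard output.
strip_restrictions() {
    sed 's/<restrictions>[^<]*<\/restrictions>//g'
}

# Try it on a one-line sample (made-up data, not a real dump).
printf '%s\n' '<page><restrictions>edit=sysop</restrictions><id>1</id></page>' \
    | strip_restrictions
```

On a real dump it would be used as: strip_restrictions < test.txt > test-clean.txt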

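For step 2, the line number is likely to drift between MediaWiki releases, so it may be easier to search for the call than to count lines. The sketch below prints the numeric value of 1<<19 and then looks for loadXML in the file. The relative path is an assumption: it only works when run from the MediaWiki root directory.

```shell
# Numeric value of the option passed to loadXML() in step 2.
flag=$((1 << 19))
echo "$flag"

# Assumed path, relative to the MediaWiki root; adjust as needed.
file="includes/parser/Preprocessor_DOM.php"
if [ -f "$file" ]; then
    # Print every loadXML call together with its line number.
    grep -n 'loadXML' "$file"
else
    echo "not found: $file (run this from the MediaWiki root)"
fi
```
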
Posted in Computing, Notes
