Wikipedia:AutoWikiBrowser/Create new pages

There are at least two different scenarios for creating new pages in Wikimedia:

  1. You need to create similar pages based on a common format, with only a few pieces of information differing from page to page.
    For example, creating stub pages for many cities or plants, with a minimum of information.
  2. You need to import full-fledged pages into Wikimedia.

For the first scenario, you can use the CSVLoader plugin developed by Ganeshk, assuming that you have a CSV data file with the basic pieces of information to assemble into articles. A step-by-step tutorial for creating pages this way is available at Wikipedia:CSVLoader/Walkthrough.

For the second scenario, follow the walkthrough below. The procedure uses a custom batch script that passes the generated content for each article to AutoWikiBrowser (AWB) through the External processing option.

Prerequisites

To import pages generated by a third-party program into Wikipedia using AWB, you need:

  • a computer with Microsoft Windows (Vista or later)
  • AWB installed and working
  • a registered user with approval to use AWB (see details here)
  • the pages that you need to import, saved as text files:
    • each page saved in a different text file
    • each text file named after the article to be created (_first_article_name_.txt, _second_article_name_.txt, etc.). Article names can contain spaces and non-English characters.
    • the text files are saved in UTF-8 format if they contain non-English characters
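
The UTF-8 requirement can be checked before starting an import. The following is a minimal Python sketch, not part of AWB; the function name `find_non_utf8` is hypothetical, and it assumes all article files sit directly in one folder with a `.txt` extension.

```python
from pathlib import Path


def find_non_utf8(folder):
    """Return the names of .txt files in `folder` that are not valid UTF-8."""
    bad = []
    for path in sorted(Path(folder).glob("*.txt")):
        try:
            # "utf-8-sig" also accepts files that start with a byte order mark.
            path.read_bytes().decode("utf-8-sig")
        except UnicodeDecodeError:
            bad.append(path.name)
    return bad
```

An empty result means every file decoded cleanly; any name returned should be re-saved as UTF-8 before it is imported.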

With Batch

Create a list of articles

You need a file with the list of article names to be created. This text file should also be saved in UTF-8 format if it contains non-English characters.

If you don't have this list generated by an external tool, you can build it by opening a Command Prompt console in the folder where you have the article files and typing:

chcp 65001
for %i in (*.txt) do @echo %~ni >> AWBlist.txt

Type these commands directly at the console prompt rather than saving them in a batch file: the single-percent form %i only works interactively, while a batch file needs %%i. To open a console in the current folder, hold SHIFT, right-click in the folder and select "Open console".
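
If Python is available, the same list can be produced with a short script. This is only a sketch under stated assumptions: the function name `build_article_list` is hypothetical, and the list is written into the same folder as the article files. Unlike the one-liner above, it overwrites the list instead of appending to it and leaves out the list file itself, which the `*.txt` pattern would otherwise match once AWBlist.txt exists.

```python
from pathlib import Path


def build_article_list(folder, list_name="AWBlist.txt"):
    """Write one article title per line, taken from the .txt file names."""
    folder = Path(folder)
    titles = sorted(
        p.stem
        for p in folder.glob("*.txt")
        if p.name.lower() != list_name.lower()  # do not list the list file itself
    )
    # Always written as UTF-8 so that non-English titles survive.
    (folder / list_name).write_text(
        "".join(title + "\n" for title in titles), encoding="utf-8"
    )
    return titles
```

Calling `build_article_list(r"C:\_folder_with_my_files_")` would write AWBlist.txt in that folder and return the titles it found.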

Create the batch script

In the folder where AWB is installed, create a batch script file with the name script.bat and with the following content:

@echo off
chcp 65001
setlocal enableextensions enabledelayedexpansion
set concat=
for %%x in (%*) do set concat=!concat! %%x
set concat=%concat:~1%
copy /Y "C:\_folder_with_my_files_\%concat%.txt" iofile.txt 
echo %concat% >> log.txt

Make sure that you replace _folder_with_my_files_ with the actual folder name where you have stored the article text files.


What the script does:

  • sets the console code page to UTF-8 (65001), to be able to handle articles with non-English characters
  • enables the delayed expansion of variables (this is needed for the next step)
  • concatenates the parameters it receives into a single string; this is needed if the article names contain spaces
  • removes the leading space
  • copies the text file named article_name.txt from your source folder to iofile.txt in the AWB folder
  • writes the name of the copied article to a log file in the AWB folder; in case of issues, you can use this file to check whether the script ran, whether it derived the name correctly, etc.
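
For readers who find batch syntax hard to follow, the same steps can be sketched in Python. This is an illustration of the logic only, not a drop-in replacement for script.bat: the function name `stage_article` and its parameters are hypothetical, and it assumes the title arrives as a list of space-separated arguments.

```python
import shutil
from pathlib import Path


def stage_article(args, source_dir, awb_dir):
    """Mimic script.bat: rebuild the title, copy its file, log the title."""
    # Rejoin the arguments, since a title with spaces arrives in several pieces.
    title = " ".join(args)
    source = Path(source_dir) / (title + ".txt")
    # Overwrite iofile.txt in the AWB folder, like "copy /Y".
    shutil.copyfile(source, Path(awb_dir) / "iofile.txt")
    # Append the title to log.txt so that each run can be traced.
    with open(Path(awb_dir) / "log.txt", "a", encoding="utf-8") as log:
        log.write(title + "\n")
    return title
```

If the source file does not exist, `shutil.copyfile` raises an error here, whereas the batch script simply leaves iofile.txt unchanged; the log file is the place to look in either case.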

You can use the batch script above as it is, but it will briefly open a Windows console window for each article, which steals the focus. To avoid this, AWB must run the script minimized, through a shortcut.

Create a shortcut to script.bat in the same folder where AWB is installed (drag script.bat with the right mouse button to an empty space in the folder, then choose Create shortcut here). Then change the properties of the script - Shortcut.lnk file so that it runs minimized (right-click on it and choose Properties, then change the Run option from Normal Window to Minimized).

Setup AWB

After you start AWB, go to the menu Tools -> External processing. In the dialog window, set the following:

  • check Enabled
  • Program or script = the full path to the script - Shortcut.lnk file (C:\_awb_folder_\script - Shortcut.lnk)
  • Arguments = %%title%%
  • Input/Output file = C:\_awb_folder_\iofile.txt
  • check Pass article as file

Load list and start

To load the list of articles to be imported into AWB, use the list that you created in the first step.

Wikipedia:AutoWikiBrowser/User_manual#Make_list

In the Make list panel, choose Source: Text file (UTF-8), then click Make list. Choose the AWBlist.txt file that you generated. The list of articles should appear in the panel below.


Other settings:

  • Automatic changes: leave as default or change as you need
  • More: Append/Prepend text: you need to check this option but leave the text box below it empty (otherwise the script is not called)
  • Skip: it makes sense to choose Skip if Page Exists, so that existing pages are not overwritten
  • Bots: you can check AutoSave if you use a bot account and want to import automatically without further checks


Then click Start and off you go.

Using a custom module

AWB lets you write a custom C# module that finds the source file for each page while the list is processed. The steps are:

  1. Create all the files you want to upload; putting them in a separate folder is highly recommended
  2. Create the list of pages you want to create, using a batch script or any other method (even by hand)
  3. Create and set up the custom module, as explained below
  4. Click Start

Load the article list

The easiest way to upload pages from your computer is to group all the source pages in the same folder and to create a list where each article name is the same as the file name. When the list is ready, open AWB and select Make list, Source, Text file (both UTF-8 and Windows 1252 / ANSI work). Then select the file containing the list.
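
Because every title in the list must match a file name, it can help to compare the two before starting. The following Python sketch is one way to do that; the function name `titles_without_file` is hypothetical, and it assumes a UTF-8 list and source files named `<title>.txt`.

```python
from pathlib import Path


def titles_without_file(list_path, folder):
    """Return titles from the list that have no matching <title>.txt file."""
    # "utf-8-sig" tolerates a byte order mark at the start of the list.
    text = Path(list_path).read_text(encoding="utf-8-sig")
    titles = [line.strip() for line in text.splitlines() if line.strip()]
    return [t for t in titles if not (Path(folder) / (t + ".txt")).is_file()]
```

An empty result means every listed article has a source file; any title returned would find no file to read during the run.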

Create and enable a custom module

Next, load a custom module. Each time an article in the list is loaded, the module is called; it has to find the file corresponding to the article's name and return the text to add. Here is an example of a module that reads the file from "C:\files\" and returns its content.

Click Tools, Make module, then paste the following code:

public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip)
{
	Skip = false;
	// Text of the new page, read from the source file
	string text = "";
	Summary = "";
	try
	{
		// Verbatim string (@"..."): backslashes are literal, so each one is written once
		text = System.IO.File.ReadAllText(@"C:\files\" + ArticleTitle + ".txt");
		Summary = "page successfully created.";
	}
	catch 
	{
		// The source file is missing or unreadable: skip the page instead of saving empty text
		Skip = true;
	}
	// Finally, return the text of the page
	return text;
}

In the module's window, click Make module and check Enabled. Then return to the main window and click Start.