scripts package¶
The scripts folder contains predefined, ready-to-use scripts.
Scripts are only available in Pywikibot if it is installed in directory mode, not as a site package. They can be run from the command line using the pwb wrapper script:
python pwb.py <global options> <name_of_script> <options>
Every script provides a -help option which shows all available options, their explanation and usage examples. Global options will be shown by -help:global or using:
python pwb.py -help
The advantages of the pwb.py wrapper script are:
check for framework and script dependencies and show a warning if a package is missing or outdated or if the Python release does not fit
check whether the user-config.py config file is available and offer to create it by starting the generate_user_files.py script
enable global options even if a script does not support them
start private scripts located in the userscripts sub-folder
find a script even if the given script name does not match a filename, e.g. due to a spelling mistake
Subpackages¶
add_text script¶
This is a bot to add text to the top or bottom of a page.
By default the text is added to the bottom, above the categories and interwiki links.
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-text Define what text to add. "\n" are interpreted as newlines.
-textfile Define a textfile name which contains the text to add
-summary Define the summary to use
-up If used, put the text at the top of the page
-always If used, the bot won't ask if it should add the specified
text
-major If used, the edit will be saved without the "minor edit" flag
-talkpage Put the text onto the talk page instead
-talk
-excepturl Check for the text in the rendered HTML page instead of
           the wikitext.
-noreorder Avoid reordering cats and interwiki
Example
Append ‘hello world’ to the bottom of the sandbox:
python pwb.py add_text -page:Wikipedia:Sandbox \
    -summary:"Bot: pywikibot practice" -text:"hello world"
Add a template to the top of the pages with ‘category:catname’:
python pwb.py add_text -cat:catname -summary:"Bot: Adding a template" \
    -text:"{{Something}}" -except:"\{\{([Tt]emplate:\|)[Ss]omething" -up
Command used on it.wikipedia to put the template in pages without any category:
python pwb.py add_text -except:"\{\{([Tt]emplate:\|)[Cc]ategorizzare" \
    -text:"{{Categorizzare}}" -excepturl:"class='catlinks'>" -uncat \
    -summary:"Bot: Aggiungo template Categorizzare"
class scripts.add_text.AddTextBot(**kwargs: Any)[source]¶
Bases: pywikibot.bot.AutomaticTWSummaryBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot
A bot which adds a text to a page.
Only accept ‘generator’ and options defined in available_options.
- Parameters
kwargs – bot options
- Keyword Arguments
generator – a generator processed by run method
summary_key = 'add_text-adding'¶
property summary_parameters¶
Return a dictionary of all parameters for i18n.
Line breaks are replaced by a dash.
update_options: Dict[str, Any] = {'always': False, 'minor': True, 'regex_skip_url': '', 'reorder': True, 'summary': '', 'talk_page': False, 'text': '', 'textfile': '', 'up': False}¶
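The command-line examples above can also be reproduced from Python. A minimal sketch, assuming Pywikibot is installed in directory mode so that the scripts package is importable; the option names are taken from update_options above:

import pywikibot
from pywikibot import pagegenerators
from scripts.add_text import AddTextBot

site = pywikibot.Site('en', 'wikipedia')
# Wrap the page in a preloading generator, as the -page option would do.
gen = pagegenerators.PreloadingGenerator(
    [pywikibot.Page(site, 'Wikipedia:Sandbox')])
bot = AddTextBot(generator=gen, text='hello world',
                 summary='Bot: pywikibot practice')
bot.run()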
scripts.add_text.main(*argv: str) → None[source]¶
Process command line arguments and invoke bot.
If argv is an empty list, sys.argv is used.
- Parameters
argv – Command line arguments
scripts.add_text.parse(argv: tuple, generator_factory: pywikibot.pagegenerators.GeneratorFactory) → dict[source]¶
Parse our arguments and return a dictionary with their values.
- Parameters
argv – input arguments to be parsed
generator_factory – factory that will determine the pages to edit
- Returns
dictionary with our parsed arguments
- Raises
ValueError – invalid arguments received
archivebot script¶
archivebot.py - discussion page archiving bot
usage:
python pwb.py archivebot [OPTIONS] TEMPLATE_PAGE
The bot examines backlinks (Special:WhatLinksHere) to TEMPLATE_PAGE, then goes through all pages (unless a specific page is specified using options) and archives old discussions. This is done by breaking a page into threads, then scanning each thread for timestamps. Threads older than a specified threshold are then moved to another page (the archive), which can be named either based on the thread's name, or the name can contain a counter which is incremented when the archive reaches a certain size.
The transcluded template may contain the following parameters:
{{TEMPLATE_PAGE
|archive =
|algo =
|counter =
|maxarchivesize =
|minthreadsleft =
|minthreadstoarchive =
|archiveheader =
|key =
}}
Meanings of parameters are:
archive Name of the page to which archived threads will be put.
Must be a subpage of the current page. Variables are
supported.
algo Specifies the maximum age of a thread. Must be
in the form old(<delay>) where <delay> specifies
the age in seconds (s), hours (h), days (d),
weeks (w), or years (y) like 24h or 5d. Default is
old(24h).
counter The current value of a counter which could be assigned as
variable. Will be updated by bot. Initial value is 1.
maxarchivesize The maximum archive size before incrementing the counter.
Value can be given with appending letter like K or M
which indicates KByte or MByte. Default value is 200K.
minthreadsleft Minimum number of threads that should be left on a page.
Default value is 5.
minthreadstoarchive The minimum number of threads to archive at once. Default
value is 2.
archiveheader Content that will be put on new archive pages as the
header. This parameter supports the use of variables.
Default value is {{talkarchive}}
key A secret key that (if valid) allows archives not to be
subpages of the page being archived.
Variables below can be used in the value for “archive” in the template above:
%(counter)d the current value of the counter
%(year)d year of the thread being archived
%(isoyear)d ISO year of the thread being archived
%(isoweek)d ISO week number of the thread being archived
%(semester)d semester term of the year of the thread being archived
%(quarter)d quarter of the year of the thread being archived
%(month)d month (as a number 1-12) of the thread being archived
%(monthname)s localized name of the month above
%(monthnameshort)s first three letters of the name above
%(week)d week number of the thread being archived
The ISO calendar starts with the Monday of the week which has at least four days in the new Gregorian year. If January 1st falls on Monday through Thursday, the first week of the year starts with the Monday of that week, which may lie in the previous year if January 1st is not itself a Monday. If it falls on Friday through Sunday, the following week is the first week of the year. So up to three days at the start of January may still be counted as belonging to the previous ISO year.
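The standard library illustrates this rule; date.isocalendar() returns the ISO year, week and weekday used by the %(isoyear)d and %(isoweek)d variables:

from datetime import date

# 2021-01-01 was a Friday, so it still belongs to ISO year 2020, week 53;
# ISO week 1 of 2021 starts on Monday 2021-01-04.
print(date(2021, 1, 1).isocalendar())  # (year=2020, week=53, weekday=5)
print(date(2021, 1, 4).isocalendar())  # (year=2021, week=1, weekday=1)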
Options (may be omitted):
-help show this help message and exit
-calc:PAGE calculate key for PAGE and exit
-file:FILE load list of pages from FILE
-force override security options
-locale:LOCALE switch to locale LOCALE
-namespace:NS only archive pages from a given namespace
-page:PAGE archive a single PAGE, default ns is a user talk page
-salt:SALT specify salt
exception scripts.archivebot.AlgorithmError(arg: str)[source]¶
Bases: scripts.archivebot.MalformedConfigError
Invalid specification of archiving algorithm.
exception scripts.archivebot.ArchiveBotSiteConfigError(arg: str)[source]¶
Bases: pywikibot.exceptions.Error
An error originating from archivebot's on-site configuration.
exception scripts.archivebot.ArchiveSecurityError(arg: str)[source]¶
Bases: scripts.archivebot.ArchiveBotSiteConfigError
Page title is not a valid archive of page being archived.
The page title is neither a subpage of the page being archived, nor does it match the key specified in the archive configuration template.
class scripts.archivebot.DiscussionPage(source, archiver, params=None)[source]¶
Bases: pywikibot.page.Page
A class that represents a single page of discussion threads.
Feed threads to it and run an update() afterwards.
feed_thread(thread: scripts.archivebot.DiscussionThread, max_archive_size: tuple) → bool[source]¶
Append a new thread to the archive.
class scripts.archivebot.DiscussionThread(title: str, timestripper: pywikibot.textlib.TimeStripper)[source]¶
Bases: object
An object representing a discussion thread on a page.
It represents something that is of the form:
== Title of thread ==
Thread content here. ~~~~
:Reply, etc. ~~~~
exception scripts.archivebot.MalformedConfigError(arg: str)[source]¶
Bases: scripts.archivebot.ArchiveBotSiteConfigError
There is an error in the configuration template.
exception scripts.archivebot.MissingConfigError(arg: str)[source]¶
Bases: scripts.archivebot.ArchiveBotSiteConfigError
The config is missing in the header.
It's in one of the threads or transcluded from another page.
class scripts.archivebot.PageArchiver(page, template, salt: str, force: bool = False)[source]¶
Bases: object
A class that encapsulates all archiving methods.
- Parameters
page (pywikibot.Page) – a page object to be archived
template (pywikibot.Page) – a template with configuration settings
salt – salt value
force – override security value
algo = 'none'¶
get_archive_page(title: str, params=None) → scripts.archivebot.DiscussionPage[source]¶
Return the page for archiving.
If it doesn't exist yet, create and cache it. Also check for security violations.
should_archive_thread(thread: scripts.archivebot.DiscussionThread) → Optional[tuple][source]¶
Check whether a thread has to be archived.
- Returns
the archiving reason as a tuple of localization args
class scripts.archivebot.TZoneUTC[source]¶
Bases: datetime.tzinfo
Class building a UTC tzinfo object.
scripts.archivebot.calc_md5_hexdigest(txt, salt) → str[source]¶
Return md5 hexdigest computed from text and salt.
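For illustration, an MD5 hex digest over text plus salt can be computed with hashlib; the exact way calc_md5_hexdigest combines its two arguments is an assumption here:

import hashlib

def md5_hexdigest(txt: str, salt: str) -> str:
    # NOTE: the real combination order/separator may differ;
    # this is illustrative only.
    return hashlib.md5((txt + salt).encode('utf-8')).hexdigest()

print(md5_hexdigest('Talk:Example/Archive 1', 'mysalt'))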
scripts.archivebot.checkstr(string: str) → tuple[source]¶
Return the key and duration extracted from the string.
- Parameters
string – a string defining a time period
Examples:
300s - 300 seconds
36h - 36 hours
7d - 7 days
2w - 2 weeks (14 days)
1y - 1 year
- Returns
key and duration extracted from the string
scripts.archivebot.main(*args: str) → None[source]¶
Process command line arguments and invoke bot.
If args is an empty list, sys.argv is used.
- Parameters
args – command line arguments
scripts.archivebot.str2localized_duration(site, string: str) → str[source]¶
Localise a shorthand duration.
Translates a duration written in the shorthand notation (e.g. "24h", "7d") into an expression in the local wiki language ("24 hours", "7 days").
scripts.archivebot.str2size(string: str) → tuple[source]¶
Return a size for a shorthand size.
Accepts a string defining a size:
1337 - 1337 bytes
150K - 150 kilobytes
2M - 2 megabytes
- Returns
a tuple (size, unit), where size is an integer and unit is 'B' (bytes) or 'T' (threads).
scripts.archivebot.str2time(string: str, timestamp=None) → datetime.timedelta[source]¶
Return a timedelta for a shorthand duration.
- Parameters
string – a string defining a time period
timestamp (datetime.datetime) – a timestamp to calculate a more accurate duration offset used by years
Examples:
300s - 300 seconds
36h - 36 hours
7d - 7 days
2w - 2 weeks (14 days)
1y - 1 year
- Returns
the corresponding timedelta object
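A quick sketch of these helpers in an interactive session, assuming a directory-mode install; the commented results follow the signatures and descriptions above and are illustrative rather than authoritative:

from scripts.archivebot import checkstr, str2size, str2time

checkstr('36h')   # splits into unit key and duration, e.g. ('h', '36')
str2size('150K')  # size in bytes with unit 'B' (150 KByte)
str2size('10T')   # a thread count with unit 'T'
str2time('7d')    # datetime.timedelta(days=7)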
scripts.archivebot.template_title_regex(tpl_page: pywikibot.page.Page) → Pattern[source]¶
Return a regex that matches to variations of the template title.
It supports the transcluding variant as well as localized namespaces and case-insensitivity depending on the namespace.
- Parameters
tpl_page (pywikibot.page.Page) – The template page
basic script¶
An incomplete sample script
This is not a complete bot; rather, it is a template from which simple bots can be made. You can rename it to mybot.py, then edit it in whatever way you want.
Use the global -simulate option for test purposes. No changes will be made to the live wiki.
The following parameters are supported:
-always The bot won't ask for confirmation when putting a page
-text: Use this text to be added; otherwise 'Test' is used
-replace: Don't add text but replace it
-top Place additional text on top of the page
-summary: Set the action summary message for the edit.
All settings can be made either by giving options on the command line or in a settings file, which is scripts.ini by default. If you don't want the default values, add any option you want to change to that settings file below the [basic] section, like:
[basic] ; inline comments start with semicolon
# This is a comment line. Assignments may be done with '=' or ':'
text: A text with line break and
continuing on next line to be put
replace: yes ; yes/no, on/off, true/false and 1/0 is also valid
summary = Bot: My first test edit with pywikibot
Every script has its own section with the script name as header.
In addition, the following generators and filters are supported but cannot be set in the settings file:
This script supports use of pywikibot.pagegenerators
arguments.
class scripts.basic.BasicBot(site: Union[Any, bool] = True, **kwargs: Any)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ConfigParserBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot, pywikibot.bot.AutomaticTWSummaryBot
An incomplete sample bot.
- Variables
summary_key – Edit summary message key. The message that should be used is placed on /i18n subdirectory. The file containing these messages should have the same name as the caller script (i.e. basic.py in this case). Use summary_key to set a default edit summary message.
Create a SingleSiteBot instance.
- Parameters
site – If True it’ll be set to the configured site using pywikibot.Site.
summary_key = 'basic-changing'¶
update_options: Dict[str, Any] = {'replace': False, 'summary': None, 'text': 'Test', 'top': False}¶
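The same defaults can be exercised from Python. A minimal sketch, assuming a directory-mode install; the options mirror the command line parameters above:

import pywikibot
from pywikibot import pagegenerators
from scripts.basic import BasicBot

site = pywikibot.Site()
gen = pagegenerators.PreloadingGenerator(
    [pywikibot.Page(site, 'Wikipedia:Sandbox')])
# 'text', 'top' and 'summary' correspond to -text, -top and -summary.
bot = BasicBot(generator=gen, text='Test', summary='Bot: my first test edit')
bot.run()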
blockpageschecker script¶
A bot to remove stale protection templates from pages that are not protected
Very often sysops protect pages for a set time but then forget to remove the warning template. This script is useful if you want to remove those stale warnings left on these pages.
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-protectedpages Check all the protected pages; useful when you have no
                categories or when you have problems with them. (Add the
                namespace after ":" where you want to check - default checks
                all protected pages.)
-moveprotected Same as -protectedpages, for moveprotected pages
-always Doesn't ask every time whether the bot should make the change.
Do it always.
-show When the bot can't delete the template from the page (wrong
      regex or something like that) it will ask you if it should
      show the page in your browser.
      (Attention: pages included may give false positives!)
-move The bot will check whether the page is also protected against
      moving, not only against editing
Examples:
python pwb.py blockpageschecker -always
python pwb.py blockpageschecker -cat:Geography -always
python pwb.py blockpageschecker -show -protectedpages:4
category script¶
Script to manage categories
Syntax:
python pwb.py category action [-option]
where action can be one of these
add - mass-add a category to a list of pages.
remove - remove category tag from all pages in a category.
move - move all pages in a category to another category.
tidy - tidy up a category by moving its pages into subcategories.
tree - show a tree of subcategories of a given category.
listify - make a list of all of the articles that are in a category.
and option can be one of these
Options for “add” action:
-person - Sort persons by their last name.
-create - If a page doesn't exist, do not skip it, create it instead.
-redirect - Follow redirects.
If action is “add”, the following options are supported:
This script supports use of pywikibot.pagegenerators
arguments.
Options for “listify” action:
-append - This appends the list to the current page that is already
existing (appending to the bottom by default).
-overwrite - This overwrites the current page with the list even if
something is already there.
-showimages - This displays images rather than linking them in the list.
-talkpages - This outputs the links to talk pages of the pages to be
listified in addition to the pages themselves.
-prefix:# - You may specify a list prefix like "#" for a numbered list or
any other prefix. Default is a bullet list with prefix "*".
Options for “remove” action:
-nodelsum - This specifies not to use the custom edit summary as the
deletion reason. Instead, it uses the default deletion reason
for the language, which is "Category was disbanded" in
English.
Options for “move” action:
-hist - Creates a nice wikitable on the talk page of target category
that contains detailed page history of the source category.
-nodelete - Don't delete the old category after move.
-nowb - Don't update the Wikibase repository.
-allowsplit - If that option is not set, it only moves the talk and main
page together.
-mvtogether - Only move the pages/subcategories of a category, if the
target page (and talk page, if -allowsplit is not set)
doesn't exist.
-keepsortkey - Use sortKey of the old category also for the new category.
If not specified, sortKey is removed.
An alternative method to keep sortKey is to use -inplace
option.
Options for “listify” and “tidy” actions:
-namespaces Filter the articles in the specified namespaces. Separate
-namespace  multiple namespace numbers or names with commas. Examples:
-ns         -ns:0,2,4
            -ns:Help,MediaWiki
Options for several actions:
-rebuild - Reset the database.
-from: - The category to move from (for the move option)
Also, the category to remove from in the remove option
Also, the category to make a list of in the listify option.
-to: - The category to move to (for the move option).
- Also, the name of the list to make in the listify option.
NOTE: If the category names have spaces in them you may need to use
a special syntax in your shell so that the names aren't treated as
separate parameters. For instance, in BASH, use single quotes,
e.g. -from:'Polar bears'.
-batch - Don't prompt to delete emptied categories (do it
automatically).
-summary: - Pick a custom edit summary for the bot.
-inplace - Use this flag to change categories in place rather than
rearranging them.
-recurse - Recurse through all subcategories of categories.
-pagesonly - While removing pages from a category, keep the subpage links
and do not remove them.
-match - Only work on pages whose titles match the given regex (for
move and remove actions).
-depth: - The max depth limit beyond which no subcategories will be
listed.
For the actions tidy and tree, the bot will store the category structure locally in category.dump. This saves time and server load, but if it uses these data later, they may be outdated; use the -rebuild parameter in this case.
For example, to create a new category from a list of persons, type:
python pwb.py category add -person
and follow the on-screen instructions.
Or to do it all from the command-line, use the following syntax:
python pwb.py category move -from:US -to:"United States"
This will move all pages in the category US to the category United States.
class scripts.category.CategoryAddBot(generator, newcat=None, sort_by_last_name=False, create=False, comment='', follow_redirects=False, editSummary='[deprecated name of comment]', dry=NotImplemented)[source]¶
Bases: scripts.category.CategoryPreprocess
A robot to mass-add a category to a list of pages.
sorted_by_last_name(catlink, pagelink) → pywikibot.page.Page[source]¶
Return a Category with key that sorts persons by their last name.
- Parameters
catlink – The Category to be linked.
pagelink – the Page to be placed in the category.
Trailing words in brackets will be removed. Example: If category_name is 'Author' and pl is a Page to [[Alexandre Dumas (senior)]], this function will return this Category: [[Category:Author|Dumas, Alexandre]].
class scripts.category.CategoryDatabase(rebuild=False, filename='category.dump.bz2')[source]¶
Bases: object
Temporary database saving pages and subcategories for each category.
This prevents loading the category pages over and over again.
dump(filename=None) → None[source]¶
Save the dictionaries to disk if not empty.
Pickle the contents of the dictionaries superclassDB and catContentDB if at least one is not empty. If both are empty, removes the file from the disk.
If the filename is None, it'll use the filename determined in __init__.
getArticles(cat) → set[source]¶
Return the list of pages for a given category.
Saves this list in a temporary database so that it won't be loaded from the server next time it's required.
getSubcats(supercat) → set[source]¶
Return the list of subcategories for a given supercategory.
Saves this list in a temporary database so that it won't be loaded from the server next time it's required.
property is_loaded¶
Return whether the contents have been loaded.
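A sketch of how the cache is typically used, assuming a directory-mode install; the method names follow the documentation above:

import pywikibot
from scripts.category import CategoryDatabase

site = pywikibot.Site('en', 'wikipedia')
cat_db = CategoryDatabase()  # loads category.dump.bz2 when first needed
cat = pywikibot.Category(site, 'Category:Geography')
subcats = cat_db.getSubcats(cat)   # fetched once, then served from cache
articles = cat_db.getArticles(cat)
cat_db.dump()                      # persist the cache to disk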
class scripts.category.CategoryListifyRobot(catTitle, listTitle, editSummary, append=False, overwrite=False, showImages=False, subCats=NotImplemented, *, talkPages=False, recurse=False, prefix='*', namespaces=None)[source]¶
Bases: object
Create a list containing all of the members in a category.
class scripts.category.CategoryMoveRobot(oldcat, newcat=None, batch=False, comment='', inplace=False, move_oldcat=True, delete_oldcat=True, title_regex=None, history=False, pagesonly=False, deletion_comment=0, move_comment=None, wikibase=True, allow_split=False, move_together=False, keep_sortkey=None, oldCatTitle='[deprecated name of oldcat]', newCatTitle='[deprecated name of newcat]', batchMode='[deprecated name of batch]', editSummary='[deprecated name of comment]', inPlace='[deprecated name of inplace]', moveCatPage='[deprecated name of move_oldcat]', deleteEmptySourceCat='[deprecated name of delete_oldcat]', titleRegex='[deprecated name of title_regex]', withHistory='[deprecated name of history]')[source]¶
Bases: scripts.category.CategoryPreprocess
Change or remove the category from the pages.
If the new category is given, the category is changed from the old to the new one. Otherwise the category is removed from the page, and the category itself is deleted if it's empty.
By default the operation applies to pages and subcategories.
Store all given parameters in the objects attributes.
- Parameters
oldcat – The move source.
newcat – The move target.
batch – If True the user does not have to confirm the deletion.
comment – The edit summary for all pages where the category is changed, and also for moves and deletions if not overridden.
inplace – If True the categories are not reordered.
move_oldcat – If True the category page (and talkpage) is copied to the new category.
delete_oldcat – If True the oldcat page and talkpage are deleted (or nominated for deletion) if it is empty.
title_regex – Only pages (and subcats) with a title that matches the regex are moved.
history – If True the history of the oldcat is posted on the talkpage of newcat.
pagesonly – If True only move pages, not subcategories.
deletion_comment – Either string or special value: DELETION_COMMENT_AUTOMATIC: use a generated message, DELETION_COMMENT_SAME_AS_EDIT_COMMENT: use the same message for delete that is used for the edit summary of the pages whose category was changed (see the comment param above). If the value is not recognized, it’s interpreted as DELETION_COMMENT_AUTOMATIC.
move_comment – If set, uses this as the edit summary on the actual move of the category page. Otherwise, defaults to the value of the comment parameter.
wikibase – If True, update the Wikibase item of the old category.
allow_split – If False only moves page and talk page together.
move_together – If True moves the pages/subcategories only if page and talk page could be moved or both source page and target page don’t exist.
DELETION_COMMENT_AUTOMATIC = 0¶
DELETION_COMMENT_SAME_AS_EDIT_COMMENT = 1¶
static check_move(name, old_page, new_page) → bool[source]¶
Return if the old page can be safely moved to the new page.
- Parameters
name (str) – Title of the new page
old_page (pywikibot.page.BasePage) – Page to be moved
new_page (pywikibot.page.BasePage) – Page to be moved to
- Returns
True if it is possible to move the page, False if not
class scripts.category.CategoryPreprocess(follow_redirects=False, edit_redirects=False, create=False, **kwargs)[source]¶
Bases: pywikibot.bot.BaseBot
A class to prepare a list of pages for robots.
determine_template_target(page) → pywikibot.page.Page[source]¶
Return template page to be categorized.
Categories for templates can be included in the <includeonly> section of the template doc page.
Also the doc page can be changed by doc template parameter.
TODO: decide if/how to enable/disable this feature.
- Parameters
page (pywikibot.Page) – Page to be processed.
- Returns
Page to be categorized.
determine_type_target(page) → Optional[pywikibot.page.Page][source]¶
Return page to be categorized by type.
- Parameters
page (pywikibot.Page) – Existing, missing or redirect page to be processed.
- Returns
Page to be categorized.
class scripts.category.CategoryTidyRobot(cat_title: str, cat_db, namespaces=None, comment: Optional[str] = None)[source]¶
Bases: pywikibot.bot.Bot, scripts.category.CategoryPreprocess
Robot to move members of a category into sub- or super-categories.
Specify the category title on the command line. The robot will pick up the page, look for all sub- and super-categories, and show them listed as possibilities to move the page into, each with an assigned number. It will ask you to type the number of the appropriate replacement, and perform the change robotically. It will then automatically loop over all pages in the category.
If you don’t want to move the member to a sub- or super-category, but to another category, you can use the ‘j’ (jump) command.
By typing ‘s’ you can leave the complete page unchanged.
By typing ‘m’ you can show more content of the current page, helping you to find out what the page is about and in which other categories it currently is.
- Parameters
cat_title – a title of the category to process.
cat_db (CategoryDatabase object) – a CategoryDatabase object.
namespaces (iterable of pywikibot.Namespace) – namespaces to focus on.
comment – a custom summary for edits.
move_to_category(member, original_cat, current_cat, article='[deprecated name of member]')[source]¶
Ask whether to move it to one of the sub- or super-categories.
Given a page in the original_cat category, ask the user whether to move it to one of original_cat’s sub- or super-categories. Recursively run through subcategories’ subcategories. NOTE: current_cat is only used for internal recursion. You should always use current_cat = original_cat.
- Parameters
member (pywikibot.Page) – a page to process.
original_cat (pywikibot.Category) – original category to replace.
current_cat (pywikibot.Category) – a category which is questioned.
class scripts.category.CategoryTreeRobot(catTitle, catDB, filename=None, maxDepth=10)[source]¶
Bases: object
Robot to create tree overviews of the category structure.
- Parameters
catTitle – The category which will be the tree's root.
catDB – A CategoryDatabase object.
maxDepth – The limit beyond which no subcategories will be listed. This also guarantees that loops in the category structure won't be a problem.
filename – The textfile where the tree should be saved; None to print the tree to stdout.
run() → None[source]¶
Handle the multi-line string generated by treeview.
After the string was generated by treeview it is either printed to the console or saved to a file.
treeview(cat, currentDepth=0, parent=None) → str[source]¶
Return a tree view of all subcategories of cat.
The multi-line string contains a tree view of all subcategories of cat, up to level maxDepth. Recursively calls itself.
- Parameters
cat – the Category of the node we're currently opening.
currentDepth – the current level in the tree.
parent – the Category of the category we're coming from.
category_redirect script¶
This bot will move pages out of redirected categories
The bot will look for categories that are marked with a category redirect template, take the first parameter of the template as the target of the redirect, and move all pages and subcategories of the category there. It also changes hard redirects into soft redirects, and fixes double redirects. A log is written under <userpage>/category_redirect_log. Only category pages that haven’t been edited for a certain cooldown period (currently 7 days) are taken into account.
The following parameters are supported:
-delay:# Set an amount of days. If the category is edited more recently
         than the given number of days, ignore it. Default is 7.
-tiny Only loops over Category:Non-empty_category_redirects and
moves all images, pages and categories in redirect categories
to the target category.
Usage:
python pwb.py category_redirect [options]
class scripts.category_redirect.CategoryRedirectBot(**kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot
Page category update bot.
check_hard_redirect()[source]¶
Check for hard-redirected categories.
Check categories that are not already marked with an appropriate softredirect template.
move_contents(oldCatTitle, newCatTitle, editSummary)[source]¶
The worker function that moves pages out of oldCat into newCat.
update_options: Dict[str, Any] = {'delay': 7, 'tiny': False}¶
change_pagelang script¶
This script changes the content language of pages
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-setlang What language the pages should be set to
-always If a language is already set for a page, always change it
to the one set in -setlang.
-never If a language is already set for a page, never change it to
the one set in -setlang (keep the current language).
class scripts.change_pagelang.ChangeLangBot(**kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot
Change page language bot.
changelang(page)[source]¶
Set page language.
- Parameters
page (pywikibot.page.BasePage) – The page to update and save
treat(page)[source]¶
Treat a page.
- Parameters
page (pywikibot.page.BasePage) – The page to treat
update_options: Dict[str, Any] = {'never': False, 'setlang': ''}¶
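A minimal sketch of driving the bot from Python, assuming a directory-mode install; 'setlang' mirrors the -setlang parameter above:

import pywikibot
from pywikibot import pagegenerators
from scripts.change_pagelang import ChangeLangBot

site = pywikibot.Site('en', 'wikipedia')
gen = pagegenerators.PreloadingGenerator(
    [pywikibot.Page(site, 'Wikipedia:Sandbox')])
bot = ChangeLangBot(generator=gen, setlang='de')
bot.run()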
checkimages script¶
Script to check recently uploaded files
This script checks if a file description is present and if there are other problems in the image’s description.
This script will have to be configured for each language. Please submit translations as additions to the Pywikibot framework.
Everything that needs customisation is indicated by comments.
This script understands the following command-line arguments:
-limit The number of images to check (default: 80)
-commons The Bot will check if an image on Commons has the same name
and if true it reports the image.
-duplicates[:#] Check if the image has duplicates (if an argument is
                given, set how many rollbacks to wait before reporting
                the image in the report instead of tagging it);
                default: 1 rollback.
-duplicatesreport Report the duplicates in a log *AND* put the template in
the images.
-maxusernotify Maximum notifications added to a user talk page in a single
               check, to avoid email spamming.
-sendemail Send an email after tagging.
-break To break the bot after the first check (default: recursive)
-sleep[:#] Time in seconds between repeat runs (default: 30)
-wait[:#] Wait x seconds before checking the images (default: 0)
-skip[:#] The bot skips the first [:#] images (default: 0)
-start[:#] Use allimages() as generator
(it starts already from File:[:#])
-cat[:#] Use a category as generator
-regex[:#] Use regex, must be used with -url or -page
-page[:#] Define the name of the wiki page where the images are
-url[:#] Define the URL where the images are
-nologerror If given, this option will disable the error that is raised
when the log is full.
Instructions for the real-time settings. For every new block you have to add:
<------- ------->
In this way the Bot can understand where the block starts in order to take the right parameter.
Name= Set the name of the block
Find= Search for this text in the image's description
Findonly= Search for exactly this text in the image's description
Summary= That's the summary that the bot will use when it notifies the
         problem.
Head= That's the incipit that the bot will use for the message.
Text= This is the template that the bot will use when it reports the
      image's problem.
exception scripts.checkimages.LogIsFull(arg: str)[source]¶
Bases: pywikibot.exceptions.Error
Log is full and the Bot cannot add other data to prevent Errors.
class scripts.checkimages.checkImagesBot(site, logFulNumber=25000, sendemailActive=False, duplicatesReport=False, logFullError=True, max_user_notify=None)[source]¶
Bases: object
A robot to check recently uploaded files.
Initializer, define some instance variables.
important_image(listGiven) → pywikibot.page.FilePage[source]¶
Get tuples of image and time, return the most used or oldest image.
- Parameters
listGiven (list) – a list of tuples which hold seconds and FilePage
- Returns
the most used or oldest image
miniTemplateCheck(template) → bool[source]¶
Check if template is in allowed licenses or in licenses to skip.
regexGenerator(regexp, textrun) → Generator[pywikibot.page.FilePage, None, None][source]¶
Find page to yield using regex to parse text.
report(newtext, image_to_report, notification=None, head=None, notification2=None, unver=True, commTalk=None, commImage=None) → None[source]¶
Function to make the reports easier.
report_image(image_to_report, rep_page=None, com=None, rep_text=None, addings=True) → bool[source]¶
Report the files to the report page when needed.
skipImages(skip_number, limit) → bool[source]¶
Given a number of files, skip the first -number- files.
smartDetection() → tuple[source]¶
Detect templates.
Instead of just checking whether there's a simple template in the image's description, the bot also checks whether that template is a license or something else. In this sense this type of check is smart.
templateInList() → None[source]¶
Check if template is in list.
The problem is that calls to the MediaWiki system can be pretty slow, while searching in a list of objects is really fast. So first of all let's see if we can find something in the info that we already have, then make a deeper check.
uploadBotChangeFunction(reportPageText, upBotArray) → str[source]¶
Detect the user that has uploaded the file through upload bot.
static wait(generator, wait_time) → Generator[pywikibot.page.FilePage, None, None][source]¶
Skip the images uploaded before x seconds.
Let the users fix the image's problems on their own during the first x seconds.
claimit script¶
A script that adds claims to Wikidata items based on a list of pages
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Usage:
python pwb.py claimit [pagegenerators] P1 Q2 P123 Q456
You can use any typical pagegenerator (like categories) to provide a list of pages. Then list the property->target pairs to add.
For geographic coordinates:
python pwb.py claimit [pagegenerators] P625 [lat-dec],[long-dec],[prec]
[lat-dec] and [long-dec] represent the latitude and longitude respectively, and [prec] represents the precision. All values are in decimal degrees, not DMS. If [prec] is omitted, the default precision is 0.0001 degrees.
Example
python pwb.py claimit [pagegenerators] P625 -23.3991,-52.0910,0.0001
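Behind this command, the target is a Wikibase coordinate value; a hedged sketch of the equivalent core API calls (the parameter names for pywikibot.Coordinate are my assumption):

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()
target = pywikibot.Coordinate(lat=-23.3991, lon=-52.0910,
                              precision=0.0001, site=repo)
claim = pywikibot.Claim(repo, 'P625')
claim.setTarget(target)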
By default, claimit.py does not add a claim if one with the same property already exists on the page. To override this behavior, use the ‘exists’ option:
python pwb.py claimit [pagegenerators] P246 "string example" -exists:p
Suppose the claim you want to add has the same property as an existing claim and the “-exists:p” argument is used. Now, claimit.py will not add the claim if it has the same target, source, and/or the existing claim has qualifiers. To override this behavior, add ‘t’ (target), ‘s’ (sources), or ‘q’ (qualifiers) to the ‘exists’ argument.
For instance, to add the claim to each page even if one with the same property and target and some qualifiers already exists:
python pwb.py claimit [pagegenerators] P246 "string example" -exists:ptq
Note that the ordering of the letters in the ‘exists’ argument does not matter, but ‘p’ must be included.
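What claimit does for each page, reduced to core Pywikibot calls; a sketch rather than the script's actual code path:

import pywikibot

site = pywikibot.Site('en', 'wikipedia')
repo = site.data_repository()
item = pywikibot.ItemPage(repo, 'Q4115189')  # the Wikidata sandbox item
claim = pywikibot.Claim(repo, 'P246')
claim.setTarget('string example')
item.addClaim(claim, summary='Adding claim P246')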
class scripts.claimit.ClaimRobot(claims, exists_arg='', **kwargs)[source]¶
Bases: pywikibot.bot.WikidataBot
A bot to add Wikidata claims.
- Parameters
claims (list) – A list of wikidata claims
exists_arg (str) – String specifying how to handle duplicate claims
treat_page_and_item(page, item) → None[source]¶
Treat each page.
- Parameters
page (pywikibot.page.BasePage) – The page to update and change
item (pywikibot.page.ItemPage) – The item to treat
use_from_page = None¶
clean_sandbox script¶
This bot resets a (user) sandbox with predefined text
This script understands the following command-line arguments:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-hours:# Use this parameter to make the script repeat itself
         after # hours. Hours can be defined as a decimal. 0.01
         hours are 36 seconds; 0.1 are 6 minutes.
-delay:# Use this parameter for a wait time after the last edit
was made. If no parameter is given it takes it from
hours and limits it between 5 and 15 minutes.
The minimum delay time is 5 minutes.
-text The text to substitute in the sandbox; you can use this
      when you haven't configured clean_sandbox for your wiki.
-summary Summary of the edit made by bot. Overrides the default
from i18n.
All local parameters can be given inside a scripts.ini file. Options passed to the script take priority over options read from the ini file. See: https://docs.python.org/3/library/configparser.html#supported-ini-file-structure
For example:
[clean_sandbox]
# the parameter section for clean_sandbox script
summary = Bot: Cleaning sandbox
text = {{subst:Clean Sandbox}}
hours: 0.5
delay: 7
class scripts.clean_sandbox.SandboxBot(**kwargs)[source]¶
Bases: pywikibot.bot.Bot, pywikibot.bot.ConfigParserBot
Sandbox reset bot.
available_options: Dict[str, Any] = {'delay': -1, 'delay_td': None, 'hours': 1.0, 'no_repeat': True, 'summary': '', 'text': ''}¶
commons_information script¶
Insert a language template into the description field
class scripts.commons_information.InformationBot(**kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot
Bot for the Information template.
Initializer.
comment = {'en': 'Bot: wrap the description parameter of Information in the appropriate language template'}¶
desc_params = ('Description', 'description')¶
lang_tmp_cat = 'Language templates'¶
commonscat script¶
With this tool you can add the template {{commonscat}} to categories
The tool works by following the interwiki links. If the template is present on another language page, the bot will use it.
You could probably use it on articles as well, but this isn't tested.
The following parameters are supported:
-always Don't prompt you for each replacement. The warning message
        does not have to be confirmed. ATTENTION: Use this with care!
-summary:XYZ Set the action summary message for the edit to XYZ,
otherwise it uses messages from add_text.py as default.
-checkcurrent Work on all category pages that use the primary commonscat
template.
This bot uses pagegenerators to get a list of pages. The following options are supported:
This script supports use of pywikibot.pagegenerators
arguments.
For example to go through all categories:
python pwb.py commonscat -start:Category:!
class scripts.commonscat.CommonscatBot(**kwargs: Any)[source]¶
Bases: pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot
Commons categorisation bot.
Only accept ‘generator’ and options defined in available_options.
- Parameters
kwargs – bot options
- Keyword Arguments
generator – a generator processed by run method
changeCommonscat(page=None, oldtemplate='', oldcat='', newtemplate='', newcat='', linktitle='', description=NotImplemented)[source]¶
Change the current commonscat template and target.
checkCommonscatLink(name='')[source]¶
Return the name of a valid commons category.
If the page is a redirect this function tries to follow it. If the page doesn't exist the function will return an empty string.
findCommonscatLink(page) → str[source]¶
Find CommonsCat template on interwiki pages.
- Returns
name of a valid commons category
find_commons_category(page) → str[source]¶
Find CommonsCat template on Wikibase repository.
Use Wikibase property to get the category if possible. Otherwise check all langlinks to find it.
- Returns
name of a valid commons category
static getCommonscatLink(page)[source]¶
Find CommonsCat template on page.
- Return type
tuple of (<templatename>, <target>, <linktext>, <note>)
treat_page()[source]¶
Add CommonsCat template to page.
Take a page and go through all its interwiki pages looking for a commonscat template. When all the interwiki links are checked and a proper category is found, add it to the page.
update_options: Dict[str, Any] = {'summary': None}¶
coordinate_import script¶
Coordinate importing script
Usage:
python pwb.py coordinate_import -site:wikipedia:en \
-cat:Category:Coordinates_not_on_Wikidata
This will work on all pages in the category “coordinates not on Wikidata” and will import the coordinates on these pages to Wikidata.
The data from the "GeoData" extension (https://www.mediawiki.org/wiki/Extension:GeoData) is used, so that extension has to be set up properly. You can look at the [[Special:Nearby]] page on your local wiki to see if it's populated.
You can use any typical pagegenerator to provide a list of pages:
python pwb.py coordinate_import -lang:it -family:wikipedia \
-namespace:0 -transcludes:Infobox_stazione_ferroviaria
You can also run over a set of items on the repo without coordinates and try to import them from any connected page. To do this, you have to explicitly provide the repo as the site using the -site argument.
Example:
python pwb.py coordinate_import -site:wikidata:wikidata \
    -namespace:0 -querypage:Deadendpages
The following command line parameters are supported:
-create Create items for pages without one.
This script supports use of pywikibot.pagegenerators
arguments.
class scripts.coordinate_import.CoordImportRobot(**kwargs)[source]¶
Bases: pywikibot.bot.WikidataBot
A bot to import coordinates to Wikidata.
has_coord_qualifier(claims) → Optional[str][source]¶
Check if self.prop is used as property for a qualifier.
- Parameters
claims (dict) – the Wikibase claims to check in
- Returns
the first property for which self.prop is used as qualifier, or None if there is none
item_has_coordinates(item) → bool[source]¶
Check if the item has coordinates.
- Returns
whether the item has coordinates
try_import_coordinates_from_page(page, item) → bool[source]¶
Try to import coordinates from the given page to the given item.
- Returns
whether any coordinates were found and the import was successful
use_from_page = None¶
cosmetic_changes script¶
This module can do slight modifications to tidy a wiki page’s source code
The changes are not supposed to change the look of the rendered wiki page.
The following parameters are supported:
-always Don't prompt you for each replacement. Warning (see below)
has not to be confirmed. ATTENTION: Use this with care!
-async Put page on queue to be saved to wiki asynchronously.
-summary:XYZ Set the summary message text for the edit to XYZ, bypassing
the predefined message texts with original and replacements
inserted.
-ignore: Ignores if an error occurred and either skips the page or
         only that method. It can be set to:
         all - does not ignore errors
         match - ignores ISBN related errors (default)
         method - ignores fixing method errors
         page - ignores page related errors
The following generators and filters are supported:
This script supports use of pywikibot.pagegenerators
arguments.
ATTENTION: You can run this script as a stand-alone for testing purposes. However, the changes that are made are only minor, and other users might get angry if you fill the version histories and watchlists with such irrelevant changes. Some wikis prohibit stand-alone running.
For further information see pywikibot/cosmetic_changes.py
class scripts.cosmetic_changes.CosmeticChangesBot(**kwargs: Any)[source]¶
Bases: pywikibot.bot.AutomaticTWSummaryBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot
Cosmetic changes bot.
Only accept ‘generator’ and options defined in available_options.
- Parameters
kwargs – bot options
- Keyword Arguments
generator – a generator processed by run method
summary_key = 'cosmetic_changes-standalone'¶
update_options: Dict[str, Any] = {'async': False, 'ignore': <CANCEL.MATCH: 3>, 'summary': ''}¶
delete script¶
This script can be used to delete and undelete pages en masse
Of course, you will need an admin account on the relevant wiki.
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-always Don't prompt to delete pages, just do it.
-summary:XYZ Set the summary message text for the edit to XYZ.
-undelete Actually undelete pages instead of deleting.
Obviously makes sense only with -page and -file.
-isorphan Alert if there are pages that link to the page to be
          deleted (check 'What links here').
          By default it is active and only the summary per namespace
          is given.
          If given as -isorphan:n, n pages per namespace will be shown.
          If given as -isorphan:0, only the summary per namespace will
          be shown.
          If given as -isorphan:n, with n < 0, the option is disabled.
          This option is disregarded if -always is set.
-orphansonly: Specified namespaces. Separate multiple namespace
numbers or names with commas.
Examples:
-orphansonly:0,2,4
-orphansonly:Help,MediaWiki
Note that Main ns can be indicated either with a 0 or a ',':
-orphansonly:0,1
-orphansonly:,Talk
Usage:
python pwb.py delete [-category categoryName]
Examples
Delete everything in the category “To delete” without prompting:
python pwb.py delete -cat:"To delete" -always
class scripts.delete.DeletionRobot(summary: str, **kwargs)[source]¶
Bases: pywikibot.bot.CurrentPageBot
This robot allows deletion of pages en masse.
- Parameters
summary – the reason for the (un)deletion
display_references() → None[source]¶
Display pages that link to the current page, sorted per namespace.
The number of pages to display per namespace is provided by self.opt.isorphan.
update_options: Dict[str, Any] = {'isorphan': 0, 'orphansonly': [], 'undelete': False}¶
class scripts.delete.PageWithRefs(source, title='', ns=0)[source]¶
Bases: pywikibot.page.Page
A subclass of Page with convenience methods for reference checking.
Supports the same interface as Page, with some added methods.
get_ref_table(*args, **kwargs) → collections.defaultdict[pywikibot.site._namespace.Namespace, pywikibot.page.Page][source]¶
Build mapping table with pages which link to the current page.
namespaces_with_ref_to_page(namespaces=None) → set[source]¶
Check if the current page has links from pages in namespaces.
If namespaces is None, all namespaces are checked. Returns a set with namespaces where a ref to page is present.
- Parameters
namespaces (iterable of Namespace objects) – Namespaces to check
property ref_table¶
Build link reference table lazily.
This property gives a default table without any parameter set for getReferences(), whereas self.get_ref_table() is able to accept parameters.
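A sketch of the convenience interface, assuming a directory-mode install; it behaves like a regular Page with the extra reference helpers documented above:

import pywikibot
from scripts.delete import PageWithRefs

site = pywikibot.Site('en', 'wikipedia')
page = PageWithRefs(site, 'Python (programming language)')
refs = page.ref_table                           # Namespace -> pages linking here
ns_with_refs = page.namespaces_with_ref_to_page()  # set of namespaces with refs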
djvutext script¶
This bot uploads text from djvu files onto pages in the “Page” namespace
It is intended to be used for Wikisource.
The following parameters are supported:
-index:... name of the index page (without the Index: prefix)
-djvu:... path to the djvu file; it shall be:
          - a path to a file name
          - a dir where a djvu file named as the index is located
          optional, by default the current dir '.'
-pages:<start>-<end>,...<start>-<end>,<start>-<end>
          Page range to upload;
          optional, start=1, end=number of images in the djvu file.
          Page ranges can be specified as:
          A-B -> pages A until B
          A-  -> pages A until number of images
          A   -> just page A
          -B  -> pages 1 until B
-summary: custom edit summary.
Use quotes if edit summary contains spaces.
-force overwrites existing text
optional, default False
-always do not bother asking to confirm any of the changes.
class scripts.djvutext.DjVuTextBot(djvu, index, pages: Optional[tuple] = None, **kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot
A bot that uploads text-layer from djvu files to Page:namespace.
Works only on sites with Proofread Page extension installed.
- Parameters
djvu (DjVuFile object) – djvu from where to fetch the text layer
index (Page object) – index page in the Index: namespace
pages – page interval to upload (start, end)
property generator¶
Generate pages from specified page interval.
update_options: Dict[str, Any] = {'force': False, 'summary': ''}¶
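A minimal sketch of constructing the bot from Python; the DjVuFile import path and the local file name are assumptions:

import pywikibot
from pywikibot.tools.djvu import DjVuFile  # assumed location of DjVuFile
from scripts.djvutext import DjVuTextBot

site = pywikibot.Site('en', 'wikisource')
djvu = DjVuFile('mybook.djvu')                 # hypothetical local file
index = pywikibot.Page(site, 'Index:mybook.djvu')
bot = DjVuTextBot(djvu, index, pages=(1, 5), force=False)
bot.run()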
download_dump script¶
This bot downloads dump from dumps.wikimedia.org
This script supports the following command line parameters:
-filename:# The name of the file (e.g. abstract.xml)
-storepath:# The stored file's path.
-dumpdate:# The dumpdate of the dump (defaults to `latest`),
            formatted as YYYYMMDD.
class scripts.download_dump.DownloadDumpBot(site: Optional[Any] = None, **kwargs: Any)[source]¶
Bases: pywikibot.bot.Bot
Download dump bot.
Create a Bot instance and initialize cached sites.
available_options: Dict[str, Any] = {'dumpdate': 'latest', 'filename': '', 'storepath': './', 'wikiname': ''}¶
fixing_redirects script¶
Correct all redirect links in featured pages or only one page of each wiki
Can be used with:
-featured Run over featured pages (for some Wikimedia wikis only)
-overwrite Usually only the link is changed ([[Foo]] -> [[Bar|Foo]]).
           This parameter sets the script to completely overwrite the
           link text ([[Foo]] -> [[Bar]]).
This script supports use of pywikibot.pagegenerators
arguments.
class scripts.fixing_redirects.FixingRedirectBot(site: Union[Any, bool] = True, **kwargs: Any)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot, pywikibot.bot.AutomaticTWSummaryBot
Run over pages and resolve redirect links.
Create a SingleSiteBot instance.
- Parameters
site – If True it’ll be set to the configured site using pywikibot.Site.
ignore_server_errors = True¶
summary_key = 'fixing_redirects-fixing'¶
update_options: Dict[str, Any] = {'overwrite': False}¶
harvest_template script¶
Template harvesting script
Usage (see below for explanations and examples):
python pwb.py harvest_template -transcludes:"..." \
[default optional arguments] \
template_parameter PID [local optional arguments] \
[template_parameter PID [local optional arguments]]
python pwb.py harvest_template [generators] -template:"..." \
[default optional arguments] \
template_parameter PID [local optional arguments] \
[template_parameter PID [local optional arguments]]
This will work on all pages that transclude the template in the article namespace
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
You can also use additional parameters:
-create Create missing items before importing.
The following command line parameters can be used to change the bot’s behavior. If you specify them before all parameters, they are global and are applied to all param-property pairs. If you specify them after a param-property pair, they are local and are only applied to this pair. If you specify the same argument as both local and global, the local argument overrides the global one (see also examples):
-islink Treat plain text values as links ("text" -> "[[text]]").
-exists If set to 'p', add a new value, even if the item already
has the imported property but not the imported value.
If set to 'pt', add a new value, even if the item already
has the imported property with the imported value and
some qualifiers.
-multi If set, try to match multiple values from parameter.
Examples
This will try to import existing images from “image” parameter of “Infobox person” on English Wikipedia as Wikidata property “P18” (image):
python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
-template:"Infobox person" image P18
This will behave the same as the previous example and also try to import [[links]] from “birth_place” parameter of the same template as Wikidata property “P19” (place of birth):
python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
-template:"Infobox person" image P18 birth_place P19
This will import both “birth_place” and “death_place” params with the -islink modifier, i.e. the bot will try to import values even if it doesn’t find a [[link]]:
python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
-template:"Infobox person" -islink birth_place P19 death_place P20
This will do the same but only “birth_place” can be imported without a link:
python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
-template:"Infobox person" birth_place P19 -islink death_place P20
This will import an occupation from the “occupation” parameter of “Infobox person” on English Wikipedia as Wikidata property “P106” (occupation). The page won’t be skipped if the item already has that property but does not yet have the new value:
python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
-template:"Infobox person" occupation P106 -exists:p
This will import band members from the “current_members” parameter of “Infobox musical artist” on English Wikipedia as Wikidata property “P527” (has part). This will only extract multiple band members if each is linked, and will not add duplicate claims for the same member:
python pwb.py harvest_template -lang:en -family:wikipedia -namespace:0 \
-template:"Infobox musical artist" current_members P527 -exists:p \
-multi
class scripts.harvest_template.HarvestRobot(template_title, fields, **kwargs)[source]¶
Bases: pywikibot.bot.WikidataBot
A bot to add Wikidata claims.
- Parameters
template_title (str) – The template to work on
fields (dict) – A dictionary of fields that are of use to us
- Keyword Arguments
islink – Whether non-linked values should be treated as links
create – Whether to create a new item if it’s missing
exists – pattern for merging existing claims with harvested values
multi – Whether multiple values should be extracted from a single parameter
getTemplateSynonyms(title) → list[source]¶
Fetch redirects of the title, so we can check against them.
update_options: Dict[str, Any] = {'always': True, 'create': False, 'exists': '', 'islink': False, 'multi': False}¶
class scripts.harvest_template.PropertyOptionHandler(**kwargs: Any)[source]¶
Bases: pywikibot.bot.OptionHandler
Class holding options for a param-property pair.
Only accept options defined in available_options.
- Parameters
kwargs – bot options
available_options: Dict[str, Any] = {'exists': '', 'islink': False, 'multi': False}¶
illustrate_wikidata script¶
Bot to add images to Wikidata items
The image is extracted from the page_props. For this to be available, the PageImages extension (https://www.mediawiki.org/wiki/Extension:PageImages) needs to be installed.
Usage:
python pwb.py illustrate_wikidata <some generator>
This script supports use of pywikibot.pagegenerators
arguments.
class scripts.illustrate_wikidata.IllustrateRobot(wdproperty='P18', **kwargs)[source]¶
Bases: pywikibot.bot.WikidataBot
A bot to add Wikidata image claims.
- Parameters
wdproperty (str) – The property to add. Should be of type commonsMedia
image script¶
This script can be used to change one image to another or remove an image
Syntax:
python pwb.py image image_name [new_image_name]
If only one command-line parameter is provided then that image will be removed; if two are provided, then the first image will be replaced by the second one on all pages.
Command line options:
-summary: Provide a custom edit summary. If the summary includes spaces,
surround it with single quotes, such as:
-summary:'My edit summary'
-always Don't prompt to make changes, just do them.
-loose Do loose replacements. This will replace all occurrences of the name
of the image (and not just explicit image syntax). This should work
to catch all instances of the image, including where it is used as a
template parameter or in image galleries. However, it can also make
more mistakes. This only works with image replacement, not image
removal.
Examples
The image “FlagrantCopyvio.jpg” is about to be deleted, so let’s first remove it from everything that displays it:
python pwb.py image FlagrantCopyvio.jpg
The image “Flag.svg” has been uploaded, making the old “Flag.jpg” obsolete:
python pwb.py image Flag.jpg Flag.svg
- class scripts.image.ImageRobot(generator, old_image: str, new_image: Optional[str] = None, **kwargs)[source]¶
Bases: scripts.replace.ReplaceRobot
This bot will replace or remove all occurrences of an old image.
- Parameters
generator (iterable) – the pages to work on
old_image – the title of the old image (without namespace)
new_image – the title of the new image (without namespace), or None if you want to remove the image
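Based on the signature above, a hypothetical sketch of replacing one image on a fixed set of pages; a plain list stands in here for a pagegenerators generator:

import pywikibot
from scripts.image import ImageRobot

site = pywikibot.Site('en', 'wikipedia')
pages = [pywikibot.Page(site, 'Example page')]
# Pass new_image=None instead to remove the image from the pages.
bot = ImageRobot(pages, 'Flag.jpg', 'Flag.svg')
bot.run()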
imagetransfer script¶
Script to copy images to Wikimedia Commons, or to another wiki
Syntax:
python pwb.py imagetransfer {<pagename>|<generator>} [<options>]
The following parameters are supported:
-interwiki Look for images in pages found through interwiki links.
-keepname Keep the filename and do not verify description while
replacing
-tolang:x Copy the image to the wiki in code x
-tofamily:y Copy the image to a wiki in the family y
-tosite:s Copy the image to the given site like wikipedia:test
-force_if_shared Upload the file to the target, even if it exists on that
wiki's shared repo
-file:z Upload many files from textfile: [[Image:x]]
[[Image:y]]
If pagename is an image description page, offers to copy the image to the target site. If it is a normal page, it will offer to copy any of the images used on that page, or if the -interwiki argument is used, any of the images used on a page reachable via interwiki links.
This script supports use of pywikibot.pagegenerators
arguments.
- class scripts.imagetransfer.ImageTransferBot(**kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot
Image transfer bot.
- Keyword Arguments
generator – the pages to work on
target_site – Site to send image to, default none
interwiki – Look for images in interwiki links, default false
keepname – Keep the filename and do not verify description while replacing, default false
force_if_shared – Upload the file even if it’s currently shared to the target site (e.g. when moving from Commons to another wiki)
- transfer_image(sourceImagePage)[source]¶
Download image and its description, and upload it to another site.
- Returns
the filename which was used to upload the image
- update_options: Dict[str, Any] = {'force_if_shared': False, 'ignore_warning': False, 'interwiki': False, 'keepname': False, 'target': None}¶
interwiki script¶
Script to check language links for general pages
Uses existing translations of a page, plus hints from the command line, to download the equivalent pages from other languages. All such pages are downloaded as well and checked recursively for interwiki links until no new links are encountered. A rationalization process then selects the right interwiki links, and if this is unambiguous, the interwiki links in the original page will be automatically updated and the modified page uploaded.
These command-line arguments can be used to specify which pages to work on:
-days: Like -years, but runs through all date pages. Stops at
Dec 31. If the argument is given in the form -days:X,
it will start at month no. X through Dec 31. If the
argument is simply given as -days, it will run from
Jan 1 through Dec 31. E.g. for -days:9 it will run
from Sep 1 through Dec 31.
-years: run on all year pages in numerical order. Stop at year 2050.
If the argument is given in the form -years:XYZ, it
will run from [[XYZ]] through [[2050]]. If XYZ is a
negative value, it is interpreted as a year BC. If the
argument is simply given as -years, it will run from 1
through 2050.
This implies -noredirect.
-new: Work on the 100 newest pages. If given as -new:x, will work
on the x newest pages.
When multiple -namespace parameters are given, x pages are
inspected, and only the ones in the selected namespaces are
processed. Use -namespace:all for all namespaces. Without
-namespace, only article pages are processed.
This implies -noredirect.
-restore: restore a set of "dumped" pages the bot was working on
when it terminated. The dump file will be subsequently
removed.
-restore:all restore a set of "dumped" pages of all dumpfiles to a given
family remaining in the "interwiki-dumps" directory. All
these dump files will be subsequently removed. If restoring
process interrupts again, it saves all unprocessed pages in
one new dump file of the given site.
-continue: like restore, but after having gone through the dumped
pages, continue alphabetically starting at the last of the
dumped pages. The dump file will be subsequently removed.
This script supports use of pywikibot.pagegenerators
arguments.
Additionally, these arguments can be used to restrict the bot to certain pages:
-namespace:n Number or name of namespace to process. The parameter can be
used multiple times. It works in combination with all other
parameters, except for the -start parameter. If you e.g.
want to iterate over all categories starting at M, use
-start:Category:M.
-number: used as -number:#, specifies that the bot should process
that amount of pages and then stop. This is only useful in
combination with -start. The default is not to stop.
-until: used as -until:title, specifies that the bot should
process pages in wiki default sort order up to, and
including, "title" and then stop. This is only useful in
combination with -start. The default is not to stop.
Note: do not specify a namespace, even if -start has one.
-bracket only work on pages that have (in the home language)
parenthesis in their title. All other pages are skipped.
(note: without ending colon)
-skipfile: used as -skipfile:filename, skip all links mentioned in
the given file. This does not work with -number!
-skipauto use to skip all pages that can be translated automatically,
like dates, centuries, months, etc.
(note: without ending colon)
-lack: used as -lack:xx with xx a language code: only work on pages
without links to language xx. You can also add a number nn
like -lack:xx:nn, so that the bot only works on pages with
at least nn interwiki links (the default value for nn is 1).
These arguments control miscellaneous bot behaviour:
-quiet Use this option to get less output
(note: without ending colon)
-async Put page on queue to be saved to wiki asynchronously. This
enables loading pages during save throttling and gives
better performance.
NOTE: For post-processing it always assumes that saving
the pages was successful.
(note: without ending colon)
-summary: Set an additional action summary message for the edit. This
could be used for further explanation of the bot action.
This will only be used in non-autonomous mode.
-hintsonly The bot does not ask for a page to work on, even if none of
the above page sources was specified. This will make the
first existing page of -hint or -hintfile slip in as start
page, determining properties like namespace, disambiguation
state, and so on. When no existing page is found in the
hints, the bot does nothing.
Hitting return without input on the "Which page to check:"
prompt has the same effect as using -hintsonly.
Options like -back, -same or -wiktionary are in effect only
after a page has been found to work on.
(note: without ending colon)
These arguments are useful to provide hints to the bot:
-hint: used as -hint:de:Anweisung to give the bot a hint
where to start looking for translations. If no text
is given after the second ':', the name of the page
itself is used as the title for the hint, unless the
-hintnobracket command line option (see there) is also
selected.
There are some special hints, trying a number of languages
at once:
* all: All languages with at least ca. 100 articles
* 10: The 10 largest languages (sites with most
articles). Analogous for any other natural
number
* arab: All languages using the Arabic alphabet
* cyril: All languages that use the Cyrillic alphabet
* chinese: All Chinese dialects
* latin: All languages using the Latin script
* scand: All Scandinavian languages
Names of families that forward their interlanguage links
to the wiki family being worked upon can be used, they are:
* commons: Interlanguage links of Wikimedia Commons
* incubator: Links in pages on the Wikimedia Incubator
* meta: Interlanguage links of named pages on Meta
* species: Interlanguage links of the Wikispecies wiki
* strategy: Links in pages on Wikimedia Strategy wiki
* test: Take interwiki links from Test Wikipedia
* wikimania: Interwiki links of Wikimania
Languages, groups and families having the same page title
can be combined, as -hint:5,scand,sr,pt,commons:New_York
-hintfile: similar to -hint, except that hints are taken from the given
file, enclosed in [[]] each, instead of the command line.
-askhints: for each page one or more hints are asked. See hint: above
for the format, one can for example give "en:something" or
"20:" as hint.
-repository Include data repository
-same looks over all 'serious' languages for the same title.
-same is equivalent to -hint:all:
(note: without ending colon)
-wiktionary: similar to -same, but will ONLY accept names that are
identical to the original. Also, if the title is not
capitalized, it will only go through other wikis without
automatic capitalization.
-untranslated: works normally on pages with at least one interlanguage
link; asks for hints for pages that have none.
-untranslatedonly: same as -untranslated, but pages which already have a
translation are skipped. Hint: do NOT use this in
combination with -start without a -number limit, because
you will go through the whole alphabet before any queries
are performed!
-showpage when asking for hints, show the first bit of the text
of the page always, rather than doing so only when being
asked for (by typing '?'). Only useful in combination
with a hint-asking option like -untranslated, -askhints
or -untranslatedonly.
(note: without ending colon)
-noauto Do not use the automatic translation feature for years and
dates, only use found links and hints.
(note: without ending colon)
-hintnobracket used to make the bot strip everything in last brackets,
and surrounding spaces from the page name, before it is
used in a -hint:xy: where the page name has been left out,
or -hint:all:, -hint:10:, etc. without a name, or
an -askhint reply, where only a language is given.
These arguments define how much user confirmation is required:
-autonomous, -auto run automatically, do not ask any questions. If a
question to an operator is needed, write the name of the page
to autonomous_problems.dat and continue on the next page.
(note: without ending colon)
-confirm ask for confirmation before any page is changed on the
live wiki. Without this argument, additions and
unambiguous modifications are made without confirmation.
(note: without ending colon)
-force do not ask permission to make "controversial" changes,
like removing a language because none of the found
alternatives actually exists.
(note: without ending colon)
-cleanup like -force but only removes interwiki links to non-existent
or empty pages.
-select ask for each link whether it should be included before
changing any page. This is useful if you want to remove
invalid interwiki links and if you do multiple hints of
which some might be correct and others incorrect. Combining
-select and -confirm is possible, but seems like overkill.
(note: without ending colon)
These arguments specify in which way the bot should follow interwiki links:
-noredirect do not follow redirects nor category redirects.
(note: without ending colon)
-initialredirect work on its target if a redirect or category redirect is
entered on the command line or by a generator (note: without
ending colon). It is recommended to use this option with the
-movelog pagegenerator.
-neverlink: used as -neverlink:xx where xx is a language code:
Disregard any links found to language xx. You can also
specify a list of languages to disregard, separated by
commas.
-ignore: used as -ignore:xx:aaa where xx is a language code, and
aaa is a page title to be ignored.
-ignorefile: similar to -ignore, except that the pages are taken from
the given file instead of the command line.
-localright do not follow interwiki links from other pages than the
starting page. (Warning! Should be used very sparingly,
only when you are sure you have first gotten the interwiki
links on the starting page exactly right).
(note: without ending colon)
-hintsareright do not follow interwiki links to sites for which hints
on existing pages are given. Note that, hints given
interactively, via the -askhint command line option,
are only effective once they have been entered, thus
interwiki links on the starting page are followed
regardless of hints given when prompted.
(Warning! Should be used with caution!)
(note: without ending colon)
-back only work on pages that have no backlink from any other
language; if a backlink is found, all work on the page
will be halted. (note: without ending colon)
The following arguments are only important for users who have accounts for multiple languages, and specify on which sites the bot should modify pages:
-localonly only work on the local wiki, not on other wikis in the
family I have a login at. (note: without ending colon)
-limittwo only update two pages - one in the local wiki (if logged-in)
and one in the top available one.
For example, if the local page has links to de and fr,
this option will make sure that only the local site and
the de: (larger) sites are updated. This option is useful
to quickly set two-way links without updating all of the
wiki family's sites.
(note: without ending colon)
-whenneeded works like limittwo, but other languages are changed in the
following cases:
* If there are no interwiki links at all on the page
* If an interwiki link must be removed
* If an interwiki link must be changed and there has been
a conflict for this page
Optionally, -whenneeded can be given an additional number
(for example -whenneeded:3), in which case other languages
will be changed if there are that number or more links to
change or add. (note: without ending colon)
The following arguments influence how many pages the bot works on at once:
-array: The number of pages the bot tries to work on at once.
If the number of pages loaded is lower than this number,
a new set of pages is loaded from the starting wiki. The
default is 100, but can be changed in the config variable
interwiki_min_subjects
-query: The maximum number of pages that the bot will load at once.
Default value is 50.
Some configuration options can be used to change the behaviour of this bot:
interwiki_min_subjects: the minimum amount of subjects that should be
processed at the same time.
interwiki_backlink: if set to True, all problems in foreign wikis will
be reported
interwiki_shownew: should interwiki.py display every new link it discovers?
interwiki_graph: output a graph PNG file on conflicts? You need pydot for
this: https://pypi.org/project/pydot/
interwiki_graph_format: the file format for interwiki graphs
without_interwiki: save file with local articles without interwikis
All these options can be changed through the user-config.py configuration file.
If interwiki.py is terminated before it is finished, it will write a dump file to the interwiki-dumps subdirectory. The program will read it if invoked with the “-restore” or “-continue” option, and finish all the subjects in that list. After finishing the dump file will be deleted. To run the interwiki-bot on all pages on a language, run it with option “-start:!”, and if it takes so long that you have to break it off, use “-continue” next time.
- exception scripts.interwiki.GiveUpOnPage(arg: str)[source]¶
Bases: pywikibot.exceptions.Error
User chose not to work on this page and its linked pages any more.
- class scripts.interwiki.InterwikiBot(conf=None)[source]¶
Bases: object
A class keeping track of a list of subjects.
It controls which pages are queried from which languages when.
- property dump_titles¶
Return generator of titles for dump file.
- generateMore(number)[source]¶
Generate more subjects.
This is called internally when the list of subjects becomes too small, but only if there is a PageGenerator.
- maxOpenSite()[source]¶
Return the site that has the most open queries plus the number.
If there is nothing left, return None. Only languages that are TODO for the first Subject are returned.
- class scripts.interwiki.InterwikiBotConfig[source]¶
Bases: object
Container class for interwikibot’s settings.
- always = False¶
- askhints = False¶
- asynchronous = False¶
- auto = True¶
- autonomous = False¶
- cleanup = False¶
- confirm = False¶
- followinterwiki = True¶
- followredirect = True¶
- force = False¶
- hintnobracket = False¶
- hints = []¶
- hintsareright = False¶
- ignore = []¶
- initialredirect = False¶
- lacklanguage = None¶
- limittwo = False¶
- localonly = False¶
- maxquerysize = 50¶
- minlinks = 0¶
- minsubjects = 100¶
- needlimit = 0¶
- neverlink = []¶
- nobackonly = False¶
- note(text)[source]¶
Output a notification message.
The text will be printed only if conf.quiet isn’t set.
- Parameters
text (str) – text to be shown
- parenthesesonly = False¶
- quiet = False¶
- rememberno = False¶
- remove = []¶
- repository = False¶
- restore_all = False¶
- same = False¶
- select = False¶
- showtextlink = 0¶
- showtextlinkadd = 300¶
- skip = {}¶
- skipauto = False¶
- strictlimittwo = False¶
- summary = ''¶
- untranslated = False¶
- untranslatedonly = False¶
- class scripts.interwiki.InterwikiDumps(**kwargs)[source]¶
Bases: pywikibot.bot.OptionHandler
Handle interwiki dumps.
- Keyword Arguments
do_continue – If true, continue alphabetically starting at the last of the dumped pages.
- FILE_PATTERN = '{site.family.name}-{site.code}.txt'¶
- available_options: Dict[str, Any] = {'do_continue': False, 'restore_all': False}¶
- property files¶
Return file generator depending on restore_all option.
- Return type
generator
- property next_namespace¶
Return next page namespace for continue option.
- property next_page¶
Return next page title string for continue option.
- exception scripts.interwiki.LinkMustBeRemoved(arg: str)[source]¶
Bases: scripts.interwiki.SaveError
An interwiki link has to be removed manually.
An interwiki link has to be removed, but this can’t be done because of user preferences or because the user chose not to change the page.
- class scripts.interwiki.PageTree[source]¶
Bases: pywikibot.tools.SizedKeyCollection
Structure to manipulate a set of pages.
Allows filtering efficiently by Site.
While using dict values would be faster for the remove() operation, keeping list values is important, because the order in which the pages were found matters: the earlier a page is found, the closer it is to the Subject.origin. Chances are that pages found within 2 interwiki distance from the origin are more related to the original topic than pages found later on, after 3, 4, 5 or more interwiki hops.
Keeping this order is hence important to display an ordered list of pages to the user when they are asked to resolve conflicts.
- Variables
data – dictionary with Site as keys and list of page as values. All pages found within Site are kept in self.data[site].
- exception scripts.interwiki.SaveError(arg: str)[source]¶
Bases: pywikibot.exceptions.Error
An attempt to save a page with changed interwiki has failed.
- class scripts.interwiki.Subject(origin=None, hints=None, conf=None)[source]¶
Bases: pywikibot.interwiki_graph.Subject
Class to follow the progress of a single ‘subject’.
(i.e. a page with all its translations)
Subject is a transitive closure of the binary relation on Page: “has_a_langlink_pointing_to”.
A formal way to compute that closure would be:
With P a set of pages, NL (‘NextLevel’) a function on sets defined as:
NL(P) = { target | ∃ source ∈ P, target ∈ source.langlinks() }
pseudocode:
todo <- [origin]
done <- []
while todo != []:
    pending <- todo
    todo <- NL(pending) / done
    done <- NL(pending) U done
return done
There is, however, one limitation that is induced by implementation: to compute efficiently NL(P), one has to load the page contents of pages in P. (Not only the langlinks have to be parsed from each Page, but we also want to know if the Page is a redirect, a disambiguation, etc…)
Because of this, the pages in pending have to be preloaded. However, because the pages in pending are likely to be in several sites we cannot “just” preload them as a batch.
Instead of doing “pending <- todo” at each iteration, we have to elect a Site, and we put in pending all the pages from todo that belong to that Site:
Code becomes:
todo <- {origin.site: [origin]}
done <- []
while todo != {}:
    site <- electSite()
    pending <- todo[site]
    preloadpages(site, pending)
    todo[site] <- NL(pending) / done
    done <- NL(pending) U done
return done
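The same site-batched closure can be rendered as a short, self-contained Python sketch; elect_site and preload_pages are stand-ins for the real pywikibot machinery and are assumptions here:

def elect_site(todo):
    """Stand-in: pick the site with the most pending pages."""
    return max(todo, key=lambda site: len(todo[site]))

def preload_pages(site, pages):
    """Stand-in for pywikibot's batched preloading of page contents."""

def interwiki_closure(origin):
    """Collect the transitive closure of language links, batched by site."""
    seen = {origin}                    # every page ever queued
    todo = {origin.site: [origin]}     # unloaded pages, grouped by site
    done = []                          # loaded and treated pages, in order
    while todo:
        site = elect_site(todo)        # pick one site to batch-load from
        pending = todo.pop(site)
        preload_pages(site, pending)   # load the whole batch in one request
        done.extend(pending)
        for page in pending:
            for target in page.langlinks():
                if target not in seen:
                    seen.add(target)
                    todo.setdefault(target.site, []).append(target)
    return done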
Subject objects only operate on pages that should have been preloaded before. In fact, at any time:
todo contains new Pages that have not been loaded yet
done contains Pages that have been loaded, and that have been treated.
If batch preloadings are successful, Page._get() is never called from this Object.
Takes as arguments the Page on the home wiki plus optionally a list of hints for translation
- addIfNew(page, counter, linkingPage)[source]¶
Add the pagelink given to the todo list, if it hasn’t been seen yet.
If it is added, update the counter accordingly.
Also remembers where we found the page, regardless of whether it had already been found before or not.
Returns True if the page is new.
- batchLoaded(counter)[source]¶
Notify that the promised batch of pages was loaded.
This is called by a worker to tell us that the promised batch of pages was loaded. In other words, all the pages in self.pending have already been preloaded.
The only argument is an instance of a counter class, that has methods minus() and plus() to keep counts of the total work todo.
- disambigMismatch(page, counter)[source]¶
Check whether the given page has a different disambiguation status.
Returns a tuple (skip, alternativePage).
skip is True if the pages have mismatching statuses and the bot is either in autonomous mode, or the user chose not to use the given page.
alternativePage is either None, or a page that the user has chosen to use instead of the given page.
- finish()[source]¶
Round up the subject, making any necessary changes.
This should be called exactly once after the todo list has gone empty.
- getFoundDisambig(site)[source]¶
Return the first disambiguation found.
If we found a disambiguation on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.
- getFoundInCorrectNamespace(site)[source]¶
Return the first page in the extended namespace.
If we found a page that has the expected namespace on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.
- getFoundNonDisambig(site)[source]¶
Return the first non-disambiguation found.
If we found a non-disambiguation on the given site while working on the subject, this method returns it. If several ones have been found, the first one will be returned. Otherwise, None will be returned.
- namespaceMismatch(linkingPage, linkedPage, counter)[source]¶
Check whether or not the given page has a different namespace.
Returns True if the namespaces are different and the user has selected not to follow the linked page.
- openSites()[source]¶
Iterator yielding (site, count) pairs:
* site is a site where we still have work to do
* count is the number of items in that Site that need work
- reportBacklinks(new, updatedSites)[source]¶
Report missing back links. This will be called from finish() if needed.
updatedSites is a list that contains all sites we changed, to avoid reporting of missing backlinks for pages we already fixed
- translate(hints=None, keephintedsites=False)[source]¶
Add the given translation hints to the todo list.
- scripts.interwiki.compareLanguages(old, new, insite, summary)[source]¶
Compare changes and setup i18n message.
- scripts.interwiki.main(*args: str) → None[source]¶
Process command line arguments and invoke bot.
If args is an empty list, sys.argv is used.
- Parameters
args – command line arguments
- scripts.interwiki.page_empty_check(page)[source]¶
Return True if page should be skipped as it is almost empty.
Pages in content namespaces are considered empty if they contain less than 50 characters, and other pages are considered empty if they are not category pages and contain less than 4 characters excluding interlanguage links and categories.
- Return type
bool
interwikidata script¶
Script to handle interwiki links based on Wikibase
This script connects pages to Wikibase items using language links on the page. If multiple language links are present, and they are connected to different items, the bot skips. After connecting the page to an item, language links can be removed from the page.
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-clean Clean pages.
-create Create items.
-merge Merge items.
-summary: Use your own edit summary for cleaning the page.
- class scripts.interwikidata.IWBot(**kwargs)[source]¶
Bases: pywikibot.bot.ExistingPageBot, pywikibot.bot.SingleSiteBot
The bot for interwiki.
Initialize the bot.
- create_item() → pywikibot.page.ItemPage[source]¶
Create item in repo for current_page.
- handle_complicated() → bool[source]¶
Handle pages when they have an interwiki conflict.
When this method returns True it means the conflict has been resolved and it’s okay to clean old interwiki links. This method should change self.current_item and fix conflicts. Change it in subclasses.
- try_to_add() → Optional[Union[pywikibot.page.ItemPage, bool]][source]¶
Add current page in repo.
- try_to_merge(item) → Optional[Union[pywikibot.page.ItemPage, bool]][source]¶
Merge two items.
- update_options: Dict[str, Any] = {'clean': False, 'create': False, 'ignore_ns': False, 'merge': False, 'summary': ''}¶
listpages script¶
Print a list of pages, as defined by page generator parameters
Optionally, it also prints page content to STDOUT or save it to a file in the current directory.
These parameters are supported to specify which page titles to print:
-format Defines the output format.
Can be a custom string according to python string.format() notation
or can be selected by a number from following list
(1 is default format):
1 - '{num:4d} {page.title}'
--> 10 PageTitle
2 - '{num:4d} [[{page.title}]]'
--> 10 [[PageTitle]]
3 - '{page.title}'
--> PageTitle
4 - '[[{page.title}]]'
--> [[PageTitle]]
5 - '{num:4d} \03{{lightred}}{page.loc_title:<40}\03{{default}}'
--> 10 localised_Namespace:PageTitle (colorised in lightred)
6 - '{num:4d} {page.loc_title:<40} {page.can_title:<40}'
--> 10 localised_Namespace:PageTitle
canonical_Namespace:PageTitle
7 - '{num:4d} {page.loc_title:<40} {page.trs_title:<40}'
--> 10 localised_Namespace:PageTitle
outputlang_Namespace:PageTitle
(*) requires "outputlang:lang" set.
num is the sequential number of the listed page.
An empty format is equal to -notitle and just shows the total
amount of pages.
-outputlang Language for translation of namespaces.
-notitle Page title is not printed.
-get Page content is printed.
-save Save Page content to a file named as page.title(as_filename=True).
Directory can be set with -save:dir_name
If no dir is specified, current directory will be used.
-encode File encoding can be specified with '-encode:name' (name must be
a valid python encoding: utf-8, etc.).
If not specified, it defaults to config.textfile_encoding.
-put: Save the list to the defined page of the wiki. By default it does
not overwrite an existing page.
-overwrite Overwrite the page if it exists. Can only be applied with -put.
-summary: The summary text when the page is written. If it's one word just
containing letters, dashes and underscores it uses that as a
translation key.
A custom format can be applied to the following items, extrapolated from a page object:
site: obtained from page._link._site.
title: obtained from page._link._title.
loc_title: obtained from page._link.canonical_title().
can_title: obtained from page._link.ns_title().
based either on the canonical namespace name or on the namespace name
in the language specified by the -trans param;
a default value '******' will be used if no ns is found.
onsite: obtained from pywikibot.Site(outputlang, self.site.family).
trs_title: obtained from page._link.ns_title(onsite=onsite).
If selected format requires trs_title, outputlang must be set.
This script supports use of pywikibot.pagegenerators
arguments.
- class scripts.listpages.Formatter(page, outputlang=None, default='******')[source]¶
Bases: object
Structure with Page attributes exposed for formatting from cmd line.
- Parameters
page (Page object.) – the page to be formatted.
outputlang (str or None) – language code in which the namespace before the title should be translated; use None if no translation is wanted.
Page ns will be searched in Site(outputlang, page.site.family) and, if found, its custom name will be used in page.title().
default – default string to be used if no corresponding namespace is found when outputlang is not None.
- fmt_need_lang = ['7']¶
- fmt_options = {'1': '{num:4d} {page.title}', '2': '{num:4d} [[{page.title}]]', '3': '{page.title}', '4': '[[{page.title}]]', '5': '{num:4d} \x03{{lightred}}{page.loc_title:<40}\x03{{default}}', '6': '{num:4d} {page.loc_title:<40} {page.can_title:<40}', '7': '{num:4d} {page.loc_title:<40} {page.trs_title:<40}'}¶
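A custom -format string ultimately amounts to Python str.format() applied once per listed page. A minimal illustration, where SimpleNamespace stands in for the Formatter's page wrapper and is an assumption here:

from types import SimpleNamespace

fmt = '{num:4d} [[{page.title}]]'      # equivalent to built-in format 2
page = SimpleNamespace(title='PageTitle')
print(fmt.format(num=10, page=page))   # -->   10 [[PageTitle]]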
misspelling script¶
This script fixes links that contain common spelling mistakes
This is only possible on wikis that have a template for these misspellings.
Command line options:
-always:XY instead of asking the user what to do, always perform the same
action. For example, XY can be "r0", "u" or "2". Be careful with
this option, and check the changes made by the bot. Note that
some choices for XY don't make sense and will result in a loop,
e.g. "l" or "m".
-main only check pages in the main namespace, not in the Talk,
Project, User, etc. namespaces.
-start:XY goes through all misspellings in the category on your wiki
that is defined (to the bot) as the category containing
misspelling pages, starting at XY. If the -start argument is not
given, it starts at the beginning.
- class scripts.misspelling.MisspellingRobot(*args, **kwargs)[source]¶
Bases: scripts.solve_disambiguation.DisambiguationRobot
Spelling bot.
- findAlternatives(page) → bool[source]¶
Append link target to a list of alternative links.
Overrides the BaseDisambigBot method.
- Returns
True if alternate link was appended
- property generator¶
Generator to retrieve misspelling pages or misspelling redirects.
- misspelling_categories = ('Q8644265', 'Q9195708')¶
- misspelling_templates = {'wikipedia:de': ('Falschschreibung', 'Obsolete Schreibung')}¶
- setSummaryMessage(page, *args, **kwargs) → None[source]¶
Setup the summary message.
Overrides the BaseDisambigBot method.
- update_options: Dict[str, Any] = {'start': None}¶
movepages script¶
This script can move pages
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-from and -to The page to move from and the page to move to.
-noredirect Leave no redirect behind.
-notalkpage Do not move this page's talk page (if it exists)
-prefix Move pages by adding a namespace prefix to the names of the
pages. (Will remove the old namespace prefix if any)
Argument can also be given as "-prefix:namespace:".
-always Don't prompt to make changes, just do them.
-skipredirects Skip redirect pages (Warning: increases server load)
-summary Prompt for a custom summary, bypassing the predefined message
texts. Argument can also be given as "-summary:XYZ".
-pairsfile Read pairs of file names from a file. The file must be in a
format [[frompage]] [[topage]] [[frompage]] [[topage]] ...
Argument can also be given as "-pairsfile:filename"
- class scripts.movepages.MovePagesBot(**kwargs)[source]¶
Bases: pywikibot.bot.CurrentPageBot
Page move bot.
- update_options: Dict[str, Any] = {'movetalkpage': True, 'noredirect': False, 'prefix': '', 'skipredirects': False, 'summary': ''}¶
newitem script¶
This script creates new items on Wikidata based on certain criteria
When was the (Wikipedia) page created?
When was the last edit on the page?
Does the page contain interwikis?
This script understands various command-line arguments:
-lastedit The minimum number of days that has passed since the page was
last edited.
-pageage The minimum number of days that has passed since the page was
created.
-touch Do a null edit on every page which has a Wikibase item.
Be careful, this option can trigger edit rate limits or
captchas if your account is not autoconfirmed.
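For example, a hypothetical invocation creating items for pages in a given category that are at least three weeks old and untouched for a week (the -cat generator follows the standard pagegenerators conventions and is an assumption here):

python pwb.py newitem -cat:Example -pageage:21 -lastedit:7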
- class scripts.newitem.NewItemRobot(**kwargs)[source]¶
Bases: pywikibot.bot.WikidataBot, pywikibot.bot.NoRedirectPageBot
A bot to create new items.
Only accepts options defined in available_options.
- get_skipping_templates(site) → set[source]¶
Get templates which lead the page to be skipped.
If the script is used for multiple sites, hold the skipping templates as attribute.
- skip_templates(page) → str[source]¶
Check whether the page is to be skipped due to a skipping template.
- Parameters
page (pywikibot.Page) – treated page
- Returns
the template which leads to skip
- treat_missing_item = True¶
- update_options: Dict[str, Any] = {'always': True, 'lastedit': 7, 'pageage': 21, 'touch': 'newly'}¶
noreferences script¶
This script adds a missing references section to pages
It goes over multiple pages, searches for pages where <references /> is missing although a <ref> tag is present, and in that case adds a new references section.
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-xml Retrieve information from a local XML dump (pages-articles
or pages-meta-current, see https://dumps.wikimedia.org).
Argument can also be given as "-xml:filename".
-always Don't prompt you for each replacement.
-quiet Use this option to get less output
If neither a page title nor a page generator is given, it takes all pages from the default maintenance category.
It is strongly recommended not to run this script over the entire article namespace (using the -start parameter), as that would consume too much bandwidth. Instead, use the -xml parameter, or use another way to generate a list of affected articles.
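For illustration, on an English-language wiki the added section would typically look like the following; the exact section title and placement are taken from the wiki's conventions, so they vary per language:

== References ==
<references />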
- class scripts.noreferences.NoReferencesBot(**kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot
References section bot.
- addReferences(oldText) → str[source]¶
Add a references tag into an existing section where it fits.
If there is no such section, creates a new section containing the references tag. Also repair malformed references tags. Set the edit summary accordingly.
- Parameters
oldText (str) – page text to be modified
- Returns
The modified pagetext
- createReferenceSection(oldText, index, ident='==') → str[source]¶
Create a reference section and insert it into the given text.
- Parameters
oldText (str) – page text that is going to be amended
index (int) – the index of oldText where the reference section should be inserted at
ident (str) – symbols to be inserted before and after reference section title
- Returns
the amended page text with reference section added
nowcommons script¶
Script to delete files that are also present on Wikimedia Commons
Do not run this script on Wikimedia Commons itself. It works based on a given array of templates defined below.
Files are downloaded and compared. If the files match, the file can be deleted on the source wiki. If multiple versions of the file exist, or if the SHA1 comparison does not match, the script will not delete.
Sysop rights on the local wiki are required if you want all features of this script to work properly.
This script understands various command-line arguments:
-always run automatically, do not ask any questions. All files
that qualify for deletion are deleted. Reduced screen
output.
-replace replace links if the files are equal and the file names
differ
-replacealways replace links if the files are equal and the file names
differ without asking for confirmation
-replaceloose Do loose replacements. This will replace all occurrences
of the name of the file (and not just explicit file
syntax). This should work to catch all instances of the
file, including where it is used as a template parameter
or in galleries. However, it can also make more mistakes.
-replaceonly Use this if you do not have local sysop rights, but do
wish to replace links from the NowCommons template.
Example
python pwb.py nowcommons -replaceonly -replaceloose -replacealways -replace
- class scripts.nowcommons.NowCommonsDeleteBot(**kwargs)[source]¶
Bases: pywikibot.bot.Bot
Bot to delete migrated files.
- property generator¶
Generator method.
- property nc_templates¶
A set of now commons template Page instances.
- update_options: Dict[str, Any] = {'replace': False, 'replacealways': False, 'replaceloose': False, 'replaceonly': False}¶
pagefromfile script¶
Bot to upload pages from a text file
This bot takes its input from a UTF-8 text file that contains a number of pages to be put on the wiki. The pages should all have the same beginning and ending text (which may not overlap). The beginning and ending text is not uploaded with the page content by default.
By default, the page title is taken from the first text block in the page content marked in bold (wrapped between ''' and '''). If you expect the page title not to be present in the text, or marked by different markers, use the -titlestart, -titleend, and -notitle parameters.
Specific arguments:
-file:xxx The filename we are getting our material from,
the default value is "dict.txt"
-begin:xxx The text that marks the beginning of a page,
the default value is "{{-start-}}"
-end:xxx The text that marks the end of the page,
the default value is "{{-stop-}}"
-include Include the beginning and end markers in the page
-textonly Text is given without markers. Only one page text is given.
-begin and -end options are ignored.
-titlestart:xxx The text used in place of ''' for identifying
the beginning of a page title
-titleend:xxx The text used in place of ''' for identifying
the end of the page title
-notitle Do not include the page title, including titlestart
and titleend, in the page. Can be used to specify a unique
page title above the page content
-title:xxx The page title is given directly. Ignores -titlestart,
-titleend and -notitle options
-nocontent:xxx If the existing page contains specified statement,
the page is skipped from editing
-noredirect Do not upload on redirect pages
-summary:xxx The text used as an edit summary for the upload.
If the page exists, standard messages for prepending,
appending, or replacement are appended after it
-autosummary Use MediaWiki's autosummary when creating a new page,
overrides -summary
-minor Set the minor edit flag on page edits
-showdiff Show difference between current page and page to upload,
also forces the bot to ask for confirmation
on every edit
If the page to be uploaded already exists, it is skipped by default. But you can override this behavior if you want to:
-appendtop Add the text to the top of the existing page
-appendbottom Add the text to the bottom of the existing page
-force Overwrite the existing page
It is possible to define a separator after the ‘append’ modes which is added between the existing and the new text. For example a parameter -appendtop:foo would add ‘foo’ between them. A new line can be added between them by specifying ‘\n’ as a value.
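Using the default markers, a minimal input file could look like the following; the page title ‘Example title’ is taken from the bolded text:

{{-start-}}
'''Example title'''
Some page text.
{{-stop-}}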
- class scripts.pagefromfile.PageFromFileReader(filename, **kwargs)[source]¶
Bases: pywikibot.bot.OptionHandler
Generator class, responsible for reading the file.
Check if the given filename exists. If not, ask for a new filename. User can quit.
- available_options: Dict[str, Any] = {'begin': '{{-start-}}', 'end': '{{-stop-}}', 'include': False, 'notitle': False, 'textonly': False, 'title': None, 'titleend': "'''", 'titlestart': "'''"}¶
- class scripts.pagefromfile.PageFromFileRobot(**kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.CurrentPageBot
Responsible for writing pages to the wiki.
Titles and contents are given by a PageFromFileReader.
- init_page(item) → pywikibot.page.Page[source]¶
Get the tuple and return the page object to be processed.
- update_options: Dict[str, Any] = {'append': None, 'autosummary': False, 'force': False, 'minor': False, 'nocontent': '', 'redirect': True, 'showdiff': False, 'summary': ''}¶
parser_function_count script¶
Used to find expensive templates that are candidates for conversion to Lua
It counts parser functions, orders templates by the number of these, and uploads the first n titles or, alternatively, templates having count > n.
Parameters:
-start Will start from the given title (it does not have to exist).
Parameter may be given as "-start" or "-start:title".
Defaults to '!'.
-first Returns the first n results in decreasing order of number
of hits (or without ordering if used with -nosort)
Parameter may be given as "-first" or "-first:n".
-atleast Returns templates with at least n hits.
Parameter may be given as "-atleast" or "-atleast:n".
-nosort Keeps the original order of templates. Default behaviour is
to sort them by decreasing order of count(parserfunctions).
-save Saves the results. The file is in the form you may upload it
to a wikipage. May be given as "-save:<filename>".
If it exists, titles will be appended.
-upload Specify a page in your wiki where results will be uploaded.
Parameter may be given as "-upload" or "-upload:title".
Any previous content of that page will be overwritten.
Precedence of evaluation: results are first sorted in decreasing order of parser function count, unless nosort is switched on. Then the first n templates are taken if -first is specified, and at last -atleast is evaluated. If nosort and first are used together, the program will stop at the nth hit without scanning the rest of the template namespace. This may be used to run it in multiple sessions (continue with -start next time).
First is strict. That means if results #90-120 have the same number of parser functions and you specify -first:100, only the first 100 will be listed (even if atleast is used as well).
Should you specify neither first nor atleast, all templates using parser functions will be listed.
- class scripts.parser_function_count.ParserFunctionCountBot(**kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot
Bot class used for obtaining Parser function Count.
- property generator¶
Generator.
- update_options: Dict[str, Any] = {'atleast': None, 'first': None, 'nosort': False, 'save': None, 'start': '!', 'upload': None}¶
patrol script¶
This bot is meant to mark edits as patrolled based on info obtained from a whitelist
This bot obtains a list of recent changes and newpages and marks the edits as patrolled based on a whitelist.
Whitelist Format¶
The whitelist is formatted as a number of list entries. Any links outside of lists are ignored and can be used for documentation. In a list, the first link must be to the username which should be whitelisted, and any other link following adds that page to the whitelist of that username. If the user edited a page on their whitelist, it gets patrolled. It will also patrol pages which start with the mentioned link (e.g. [[foo]] will also patrol [[foobar]]).
To avoid redlinks it’s possible to use Special:PrefixIndex as a prefix so that it will list all pages which will be patrolled. The page after the slash will be used then.
On Wikisource, it’ll also check if the page is in the Author namespace, in which case it’ll also patrol pages which are linked from that page.
An example can be found at https://en.wikisource.org/wiki/User:Wikisource-bot/patrol_whitelist
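A hypothetical whitelist entry consistent with the format described above is a list item whose first link names the user, followed by links to the pages (or prefixes) that user may edit:

* [[User:ExampleUser]] [[Foo]] [[Special:PrefixIndex/Help:]]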
Commandline parameters:
-namespace Filter the page generator to only yield pages in
specified namespaces
-ask If True, confirm each patrol action
-whitelist page title for whitelist (optional)
-autopatroluserns Takes user consent to automatically patrol
-versionchecktime Check versionchecktime lapse in sec
-repeat Repeat run after 60 seconds
-newpages Run on unpatrolled new pages
(default for Wikipedia Projects)
-recentchanges Run on complete unpatrolled recentchanges
(default for any project except Wikipedia Projects)
-usercontribs Filter generators above to the given user
- class scripts.patrol.LinkedPagesRule(page_title: str)[source]¶
Bases: object
Matches of page site title and linked pages title.
- Parameters
page_title – The page title for this rule
- class scripts.patrol.PatrolBot(site=None, **kwargs)[source]¶
Bases: pywikibot.bot.BaseBot
Bot marks the edits as patrolled based on info obtained by whitelist.
- Keyword Arguments
ask – If True, confirm each patrol action
whitelist – page title for whitelist (optional)
autopatroluserns – Takes user consent to automatically patrol
versionchecktime – Check versionchecktime lapse in sec
Patrol a single item.
- update_options: Dict[str, Any] = {'ask': False, 'autopatroluserns': False, 'versionchecktime': 300, 'whitelist': None}¶
- whitelist_subpage_name = {'en': 'patrol_whitelist'}¶
- scripts.patrol.api_feed_repeater(gen, delay=0, repeat=False, namespaces=None, user=None, recent_new_gen=True)[source]¶
Generator which loads page details to be processed.
- scripts.patrol.main(*args: str) → None[source]¶
Process command line arguments and invoke PatrolBot.
- scripts.patrol.removeprefix(self, prefix, /)¶
Return a str with the given prefix string removed if present.
If the string starts with the prefix string, return string[len(prefix):]. Otherwise, return a copy of the original string.
protect script¶
This script can be used to protect and unprotect pages en masse
Of course, you will need an admin account on the relevant wiki. These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-always Don't prompt to protect pages, just do it.
-summary: Supply a custom edit summary. Tries to generate a summary from
the page selector. If no summary is supplied, or none could be
determined from the selector, it'll ask for one.
-expiry: Supply a custom protection expiry, which defaults to
indefinite. Any string understandable by MediaWiki, including
relative and absolute, is acceptable. See:
https://www.mediawiki.org/wiki/API:Protect#Parameters
-unprotect Acts like "default:all"
-default: Sets the default protection level (default 'sysop'). If no
level is defined it doesn't change unspecified levels.
-[type]:[level] Set [type] protection level to [level]
Usual values for [level] are: sysop, autoconfirmed, all; further levels may be provided by some wikis.
For all protection types (edit, move, etc.) it chooses the default protection level. This is “sysop” or “all” if -unprotect was selected. If multiple parameters -unprotect or -default are used, only the last occurrence is applied.
Usage:
python pwb.py protect <OPTIONS>
Examples
Protect everything in the category ‘To protect’ prompting:
python pwb.py protect -cat:"To protect"
Unprotect all pages listed in text file ‘unprotect.txt’ without prompting:
python pwb.py protect -file:unprotect.txt -unprotect -always
- class scripts.protect.ProtectionRobot(protections, **kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.CurrentPageBot
This bot allows protection of pages en masse.
Create a new ProtectionRobot.
- Parameters
protections (dict) – protections as a dict with “type”: “level”
kwargs – additional arguments directly feed to super().__init__()
- treat_page()[source]¶
Run the bot’s action on each page.
treat_page treats every page given by the generator and applies the protections using this method.
- update_options: Dict[str, Any] = {'expiry': '', 'summary': ''}¶
redirect script¶
Script to resolve double redirects, and to delete broken redirects
Requires access to MediaWiki’s maintenance pages or to an XML dump file. The delete function requires adminship.
Syntax:
python pwb.py redirect action [-arguments ...]
where action can be one of these
- double
Shortcut: do. Fix redirects which point to other redirects.
- broken
Shortcut: br. Tries to fix redirects which point to nowhere by using the last moved target of the destination page. If this fails and the -delete option is set, it either deletes the page or marks it for deletion, depending on whether the account has admin rights. It will not mark the redirect for deletion if no speedy deletion template is available.
- both
Both of the above. Retrieves redirect pages from live wiki, not from a special page.
and arguments can be:
-xml Retrieve information from a local XML dump
(https://dumps.wikimedia.org). Argument can also be given as
"-xml:filename.xml". Cannot be used with -fullscan or -moves.
-fullscan Retrieve redirect pages from live wiki, not from a special page
Cannot be used with -xml.
-moves Use the page move log to find double-redirect candidates. Only
works with action "double", does not work with -xml.
NOTE: You may use only one of the options above.
If none of -xml, -fullscan or -moves is given, info will be
loaded from a special page of the live wiki.
-offset:n With -moves, the number of hours ago to start scanning moved
pages. With -xml, the number of the redirect to restart with
(see progress). Otherwise, ignored.
-start:title The starting page title in each namespace. The page need
not exist.
-until:title The possible last page title in each namespace. The page
need not exist.
-limit:n The maximum count of redirects to work upon. If omitted, there
is no limit.
-delete Prompt the user whether broken redirects should be deleted (or
marked for deletion if the account has no admin rights) instead
of just skipping them.
-sdtemplate:x Add the speedy deletion template string including brackets.
This enables overriding the default template via i18n or
to enable speedy deletion for projects other than Wikipedias.
-always Don't prompt you for each replacement.
Furthermore the following options are provided:
This script supports use of pywikibot.pagegenerators
arguments.
- class scripts.redirect.RedirectGenerator(action, **kwargs)[source]¶
Bases: pywikibot.bot.OptionHandler
Redirect generator.
- available_options: dict = {'fullscan': False, 'limit': None, 'moves': False, 'namespaces': {0}, 'offset': -1, 'start': None, 'until': None, 'xml': None}¶
- get_moved_pages_redirects() → Generator[pywikibot.page.Page, None, None][source]¶
Generate redirects to recently-moved pages.
- get_redirect_pages_via_api() → Generator[pywikibot.page.Page, None, None][source]¶
Yield Pages that are redirects.
- get_redirects_from_dump(alsoGetPageTitles=False) → tuple[source]¶
Extract redirects from dump.
Load a local XML dump file, look at all pages which have the redirect flag set, and find out where they’re pointing at. Return a dictionary where the redirect names are the keys and the redirect targets are the values.
- get_redirects_via_api(maxlen=8) → Generator[tuple, None, None][source]¶
Return a generator that yields tuples of data about redirect Pages.
The description of returned tuple items is as follows:
- [0]
page title of a redirect page
- [1]
type of redirect:
- None
start of a redirect chain of unknown length, or loop
- [0]
broken redirect, target page title missing
- [1]
normal redirect, target page exists and is not a redirect
- [2:maxlen]
start of a redirect chain of that many redirects (currently, the API seems not to return sufficient data to make these return values possible, but that may change)
- [maxlen+1]
start of an even longer chain, or a loop (currently, the API seems not to return sufficient data to allow these return values, but that may change)
- [2]
target page title of the redirect, or chain (may not exist)
- [3]
target page of the redirect, or end of chain, or page title where chain or loop detection was halted, or None if unknown
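For example, under this layout a normal redirect from ‘Foo’ to the existing page ‘Bar’ would be reported as a tuple like ('Foo', 1, 'Bar', 'Bar'); this is a hypothetical illustration of the item description above, not output captured from the API.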
- retrieve_broken_redirects() → Generator[Union[str, pywikibot.page.Page], None, None][source]¶
Retrieve broken redirects.
- retrieve_double_redirects() → Generator[Union[str, pywikibot.page.Page], None, None][source]¶
Retrieve double redirects.
- class scripts.redirect.RedirectRobot(action, **kwargs)[source]¶
Bases: pywikibot.bot.ExistingPageBot, pywikibot.bot.RedirectPageBot
Redirect bot.
- delete_redirect(page, summary_key) → None[source]¶
Delete the redirect page.
- Parameters
page (pywikibot.page.BasePage) – The page to delete
summary_key (str) – The message key for the deletion summary
- get_sd_template(site=None) → Optional[str][source]¶
Look for speedy deletion template and return it.
- Parameters
site (pywikibot.BaseSite) – site for which the template has to be given
- Returns
A valid speedy deletion template.
- init_page(item) → pywikibot.page.Page[source]¶
Ensure that we process page objects.
- property sdtemplate¶
Gives the speedy deletion template for the current_page.
- treat(page) → None[source]¶
Treat a page.
- Parameters
page (pywikibot.page.BasePage) – Page to be treated.
- update_options: dict = {'delete': False, 'limit': inf, 'sdtemplate': ''}¶
reflinks script¶
Fetch and add titles for bare links in references
This bot will search for references which are only made of a link without title (i.e. <ref>[https://www.google.fr/]</ref> or <ref>https://www.google.fr/</ref>) and will fetch the html title from the link to use it as the title of the wiki link in the reference, i.e. <ref>[https://www.google.fr/search?q=test test - Google Search]</ref>
Every 20 edits, the bot checks a special stop page. If the page has been edited, it stops.
Since this script uses noreferences.py, you need to configure noreferences.py for your wiki, or reflinks will not work.
pdfinfo is needed for parsing pdf titles.
The following parameters are supported:
-limit:n Stops after n edits
-xml:dump.xml Should be used instead of a simple page fetching method
from pagegenerators.py for performance and load issues
-xmlstart Page to start with when using an XML dump
-ignorepdf Do not handle PDF files (handy if you use Windows and
can't get pdfinfo)
-summary Use a custom edit summary. Otherwise it uses the
default one from translatewiki
The following generators and filters are supported:
This script supports use of pywikibot.pagegenerators
arguments.
- class scripts.reflinks.DuplicateReferences(site=None)[source]¶
Bases: object
Helper to de-duplicate references in text.
When some references are duplicated in an article, name the first, and remove the content of the others.
- class scripts.reflinks.IX(value)[source]¶
Bases: enum.IntEnum
Index class for references data.
- change_needed = 3¶
- name = 0¶
- quoted = 2¶
- reflist = 1¶
- class scripts.reflinks.RefLink(link, name, site=None)[source]¶
Bases: object
Container to handle a single bare reference.
- class scripts.reflinks.ReferencesRobot(**kwargs)[source]¶
Bases: pywikibot.bot.SingleSiteBot, pywikibot.bot.ExistingPageBot, pywikibot.bot.NoRedirectPageBot
References bot.
- update_options: Dict[str, Any] = {'ignorepdf': False, 'limit': 0, 'summary': ''}¶
- scripts.reflinks.main(*args: str) → None[source]¶
Process command line arguments and invoke bot.
If args is an empty list, sys.argv is used.
- Parameters
args – command line arguments
- scripts.reflinks.removeprefix(self, prefix, /)¶
Return a str with the given prefix string removed if present.
If the string starts with the prefix string, return string[len(prefix):]. Otherwise, return a copy of the original string.
replace script¶
This bot will make direct text replacements
It will retrieve information on which pages might need changes either from an XML dump or a text file, or only change a single page.
These command line parameters can be used to specify which pages to work on:
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-mysqlquery       Retrieve information from a local database mirror.
If no query is specified, the bot searches for pages with
the given replacements.
-xml Retrieve information from a local XML dump
(pages-articles or pages-meta-current, see
https://dumps.wikimedia.org). Argument can also
be given as "-xml:filename".
-regex Make replacements using regular expressions. If this argument
isn't given, the bot will make simple text replacements.
-nocase Use case insensitive regular expressions.
-dotall Make the dot match any character at all, including a newline.
Without this flag, '.' will match anything except a newline.
-multiline '^' and '$' will now match begin and end of each line.
-xmlstart (Only works with -xml) Skip all articles in the XML dump
before the one specified (may also be given as
-xmlstart:Article).
-addcat:cat_name Adds "cat_name" category to every altered page.
-excepttitle:XYZ Skip pages with titles that contain XYZ. If the -regex
argument is given, XYZ will be regarded as a regular
expression.
-requiretitle:XYZ Only do pages with titles that contain XYZ. If the -regex
argument is given, XYZ will be regarded as a regular
expression.
-excepttext:XYZ Skip pages which contain the text XYZ. If the -regex
argument is given, XYZ will be regarded as a regular
expression.
-exceptinside:XYZ Skip occurrences of the to-be-replaced text which lie
within XYZ. If the -regex argument is given, XYZ will be
regarded as a regular expression.
-exceptinsidetag:XYZ Skip occurrences of the to-be-replaced text which lie
within an XYZ tag.
-summary:XYZ Set the summary message text for the edit to XYZ, bypassing
the predefined message texts with original and replacements
inserted. To add the replacements to your summary use the
%(description)s placeholder, for example:
-summary:"Bot operated replacement: %(description)s"
Can't be used with -automaticsummary.
-automaticsummary Uses an automatic summary for all replacements which don't
have a summary defined. Can't be used with -summary.
-sleep:123        If you use -fix you can check multiple regexes on every
page at the same time. Checking every regex without any
pause can consume a lot of CPU, so this option makes the
bot sleep for the given time between one regex and the
next to limit the load.
-fix:XYZ Perform one of the predefined replacements tasks, which are
given in the dictionary 'fixes' defined inside the files
fixes.py and user-fixes.py.
The available fixes are listed in :py:mod:`pywikibot.fixes`.
-manualinput Request manual replacements via the command line input even
if replacements are already defined. If this option is set
(or no replacements are defined via -fix or the arguments)
it'll ask for additional replacements at start.
-pairsfile Lines from the given file name(s) will be read as replacement
arguments. i.e. a file containing lines "a" and "b", used as:
python pwb.py replace -page:X -pairsfile:file c d
will replace 'a' with 'b' and 'c' with 'd'.
-always Don't prompt you for each replacement
-recursive Recurse replacement as long as possible. Be careful, this
might lead to an infinite loop.
-allowoverlap When occurrences of the pattern overlap, replace all of them.
Be careful, this might lead to an infinite loop.
-fullsummary Use one large summary for all command line replacements.
- other: First argument is the old text, second argument is the new
text. If the -regex argument is given, the first argument will be regarded as a regular expression, and the second argument might contain expressions like \1 or \g<name>. It is possible to introduce more than one pair of old text and replacement.
Examples
If you want to change templates from the old syntax, e.g. {{msg:Stub}}, to the new syntax, e.g. {{Stub}}, download an XML dump file (pages-articles) from https://dumps.wikimedia.org, then use this command:
python pwb.py replace -xml -regex "{{msg:(.*?)}}" "{{\1}}"
If you have a dump called foobar.xml and want to fix typos in articles, e.g. Errror -> Error, use this:
python pwb.py replace -xml:foobar.xml "Errror" "Error" -namespace:0
If you want to do more than one replacement at a time, use this:
python pwb.py replace -xml:foobar.xml "Errror" "Error" "Faail" "Fail" \
-namespace:0
If you have a page called ‘John Doe’ and want to fix the format of ISBNs, use:
python pwb.py replace -page:John_Doe -fix:isbn
This command will change ‘referer’ to ‘referrer’, but not in pages which talk about HTTP, where the typo has become part of the standard:
python pwb.py replace referer referrer -file:typos.txt -excepttext:HTTP
Please type “python pwb.py replace -help | more” if you can’t read the top of the help.
-
class
scripts.replace.
ReplaceRobot
(generator, replacements, exceptions=None, acceptall='[deprecated name of always]', addedCat='[deprecated name of addcat]', **kwargs)[source]¶ Bases:
pywikibot.bot.SingleSiteBot
,pywikibot.bot.ExistingPageBot
A bot that can do text replacements.
- Parameters
generator (generator) – generator that yields Page objects
replacements (list) – a list of Replacement instances or sequences of length 2 with the original text (as a compiled regular expression) and replacement text (as a string).
exceptions (dict) –
a dictionary which defines when not to change an occurrence. This dictionary can have these keys:
- title
A list of regular expressions. All pages with titles that are matched by one of these regular expressions are skipped.
- text-contains
A list of regular expressions. All pages with text that contains a part which is matched by one of these regular expressions are skipped.
- inside
A list of regular expressions. All occurrences are skipped which lie within a text region which is matched by one of these regular expressions.
- inside-tags
A list of strings. These strings must be keys from the dictionary in textlib._create_default_regexes() or must be accepted by textlib._get_regexes().
allowoverlap (bool) – when matches overlap, all of them are replaced.
recursive (bool) – Recurse replacement as long as possible.
addcat (pywikibot.Category or str or None) – category to be added to every page touched
sleep (int) – slow down between processing multiple regexes
summary (str) – Set the summary message text bypassing the default
- Warning
Be careful, this might lead to an infinite loop.
- Keyword Arguments
always – the user won’t be prompted before changes are made
site – Site the bot is working on.
- Warning
site parameter should be passed to constructor. Otherwise the bot takes the current site and warns the operator about the missing site
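A hedged sketch of invoking this class directly from Python, using only the constructor arguments documented above (the page title is hypothetical):

import re

import pywikibot
from scripts.replace import ReplaceRobot

site = pywikibot.Site()
gen = (pywikibot.Page(site, title) for title in ['Project:Sandbox'])
# a "sequence of length 2": compiled pattern and replacement string
replacements = [(re.compile('Errror'), 'Error')]
bot = ReplaceRobot(gen, replacements, site=site,
                   summary='Bot: fixing a typo')
bot.run()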
-
apply_replacements
(original_text, applied, page=None)[source]¶ Apply all replacements to the given text.
- Return type
str, set
-
isTextExcepted
(original_text) → bool[source]¶ Return True iff one of the exceptions applies for the given text.
-
class
scripts.replace.
Replacement
(old, new, use_regex=None, exceptions=None, case_insensitive=None, edit_summary=None, default_summary=True)[source]¶ Bases:
scripts.replace.ReplacementBase
A single replacement with its own data.
Create a single replacement entry unrelated to a fix.
-
property
case_insensitive
¶ Return whether the search text is case insensitive.
-
classmethod
from_compiled
(old_regex, new, **kwargs)[source]¶ Create instance from already compiled regex.
-
property
use_regex
¶ Return whether the search text is using regex.
-
class
scripts.replace.
ReplacementBase
(old, new, edit_summary=None, default_summary=True)[source]¶ Bases:
object
The replacement instructions.
Create a basic replacement instance.
-
property
container
¶ Container object which contains this replacement.
A container object is an object that groups one or more replacements together and provides some properties that are common to all of them. For example, containers may define a common name for a group of replacements, or a common edit summary.
Container objects must have a “name” attribute.
-
property
description
¶ Description of the changes that this replacement applies.
This description is used as the default summary of the replacement. If you do not specify an edit summary on the command line or in some other way, whenever you apply this replacement to a page and submit the changes to the MediaWiki server, the edit summary includes the descriptions of each replacement that you applied to the page.
-
property
edit_summary
¶ Return the edit summary for this fix.
-
class
scripts.replace.
ReplacementList
(use_regex, exceptions, case_insensitive, edit_summary, name)[source]¶ Bases:
list
A list of replacements which all share some properties.
The shared properties are:
* use_regex
* exceptions
* case_insensitive
Each entry in this list should be a ReplacementListEntry. The exceptions are compiled only once.
Create a fix list which can contain multiple replacements.
-
class
scripts.replace.
ReplacementListEntry
(old, new, fix_set, edit_summary=None, default_summary=True)[source]¶ Bases:
scripts.replace.ReplacementBase
A replacement entry for ReplacementList.
Create a replacement entry inside a fix set.
-
property
case_insensitive
¶ Return whether the fix set is case insensitive.
-
property
container
¶ Container object which contains this replacement.
A container object is an object that groups one or more replacements together and provides some properties that are common to all of them. For example, containers may define a common name for a group of replacements, or a common edit summary.
Container objects must have a “name” attribute.
-
property
edit_summary
¶ Return this entry’s edit summary or the fix’s summary.
-
property
exceptions
¶ Return the exceptions of the fix set.
-
property
use_regex
¶ Return whether the fix set is using regex.
-
class
scripts.replace.
XmlDumpReplacePageGenerator
(xmlFilename, xmlStart, replacements, exceptions, site)[source]¶ Bases:
object
Iterator that will yield Pages that might contain text to replace.
These pages will be retrieved from a local XML dump file.
- Parameters
xmlFilename (str) – The dump’s path, either absolute or relative
xmlStart (str) – Skip all articles in the dump before this one
replacements (list of 2-tuples) – A list of 2-tuples of original text (as a compiled regular expression) and replacement text (as a string).
exceptions (dict) – A dictionary which defines when to ignore an occurrence. See docu of the ReplaceRobot initializer below.
-
scripts.replace.
main
(*args: str) → None[source]¶ Process command line arguments and invoke bot.
If args is an empty list, sys.argv is used.
- Parameters
args – command line arguments
replicate_wiki script¶
This bot replicates pages in a wiki to a second wiki within one family
Example
python pwb.py replicate_wiki [-r] -ns 10 -family:wikipedia -o nl li fy
or:
python pwb.py replicate_wiki [-r] -ns 10 -family:wikipedia -lang:nl li fy
to copy all templates from nlwiki to liwiki and fywiki. It will show which pages have to be changed if -r is not present, and will only actually write pages if -r is present.
You can add replicate_replace to your user-config.py, which has the following format:
replicate_replace = {
'wikipedia:li': {'Hoofdpagina': 'Veurblaad'}
}
to replace all occurrences of ‘Hoofdpagina’ with ‘Veurblaad’ when writing to liwiki. Note that this does not take the origin wiki into account.
The following parameters are supported:
-r, --replace actually replace pages (without this option
you will only get an overview page)
-o, --original original wiki (you may use -lang:<code> option
instead)
-ns, --namespace specify namespace
-dns, --dest-namespace destination namespace (if different)
destination_wiki destination wiki(s)
revertbot script¶
This script can be used for reverting certain edits
The following command line parameters are supported:
-username  Revert the edits of this user. Default is the bot's
           username (site.username())
-rollback  Rollback edits instead of reverting them.
           Note that rollback shows no diff.
-limit:num Check the last num contributions for edits to revert.
           Default is 500.
Users who want to customize the behaviour should subclass the BaseRevertBot
and override its callback
method. Here is a sample:
import re

import pywikibot

from scripts.revertbot import BaseRevertBot


class myRevertBot(BaseRevertBot):

    '''Example revert bot.'''

    def callback(self, item):
        '''Sample callback function for 'private' revert bot.

        :param item: an item from user contributions
        :type item: dict
        :rtype: bool
        '''
        # only consider top edits; revert when the page text contains
        # a namespaced link such as [[File:Example.jpg]]
        if 'top' in item:
            page = pywikibot.Page(self.site, item['title'])
            text = page.get(get_redirect=True)
            pattern = re.compile(r'\[\[.+?:.+?\..+?\]\]')
            return bool(pattern.search(text))
        return False
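A short sketch of running the subclass (the revert_contribs() entry point is an assumption about BaseRevertBot, not documented above):

site = pywikibot.Site()
bot = myRevertBot(site=site, limit=50)  # options from available_options
bot.revert_contribs()  # assumed entry point; verify against the script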
-
class
scripts.revertbot.
BaseRevertBot
(site=None, **kwargs)[source]¶ Bases:
pywikibot.bot.OptionHandler
Base revert bot.
Subclass this bot and override callback to get it to do something useful.
-
available_options
: Dict[str, Any] = {'comment': '', 'limit': 500, 'rollback': False}¶
-
-
scripts.revertbot.
main
(*args: str) → None[source]¶ Process command line arguments and invoke bot.
If args is an empty list, sys.argv is used.
- Parameters
args – command line arguments
-
scripts.revertbot.
myRevertBot
¶ alias of
scripts.revertbot.BaseRevertBot
shell script¶
Spawns an interactive Python shell and imports the pywikibot library
The following local option is supported:
-noimport Do not import the pywikibot library. All other arguments are
ignored in this case.
Usage:
python pwb.py shell [args]
If no arguments are given, the pywikibot library will not be loaded.
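For instance, a session inside the shell might look like this (illustrative; run `import pywikibot` first if the library was not loaded at startup):

import pywikibot

site = pywikibot.Site()
page = pywikibot.Page(site, 'Wikipedia:Sandbox')  # hypothetical page
print(page.exists())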
solve_disambiguation script¶
Script to help a human solve disambiguations by presenting a set of options
Specify the disambiguation page on the command line.
The program will pick up the page, and look for all alternative links, and show them with a number adjacent to them. It will then automatically loop over all pages referring to the disambiguation page, and show 30 characters of context on each side of the reference to help you make the decision between the alternatives. It will ask you to type the number of the appropriate replacement, and perform the change.
It is possible to choose to replace only the link (just type the number) or replace both link and link-text (type ‘r’ followed by the number).
Multiple references in one page will be scanned in order, but typing ‘n’ (next) on any one of them will leave the complete page unchanged. To leave only some reference unchanged, use the ‘s’ (skip) option.
Command line options:
-pos:XXXX adds XXXX as an alternative disambiguation
-just only use the alternatives given on the command line, do not
read the page for other possibilities
-dnskip Skip links already marked with a disambiguation-needed
template (e.g., {{dn}})
-primary "primary topic" disambiguation (Begriffsklärung nach Modell 2).
That's titles where one topic is much more important, the
disambiguation page is saved somewhere else, and the important
topic gets the nice name.
-primary:XY like the above, but use XY as the only alternative, instead of
searching for alternatives in [[Keyword (disambiguation)]].
Note: this is the same as -primary -just -pos:XY
-file:XYZ reads a list of pages from a text file. XYZ is the name of the
file from which the list is taken. If XYZ is not given, the
user is asked for a filename. Page titles should be inside
[[double brackets]]. The -pos parameter won't work if -file
is used.
-always:XY instead of asking the user what to do, always perform the same
action. For example, XY can be "r0", "u" or "2". Be careful with
this option, and check the changes made by the bot. Note that
some choices for XY don't make sense and will result in a loop,
e.g. "l" or "m".
-main only check pages in the main namespace, not in the Talk,
Project, User, etc. namespaces.
-first Uses only the first link of every line on the disambiguation
page that begins with an asterisk. Useful if the page is full
of irrelevant links that are not subject to disambiguation.
You won't get all of them as options, just the first on each
line. For a moderated example see
https://en.wikipedia.org/wiki/Szerdahely
A really exotic one is
https://hu.wikipedia.org/wiki/Brabant_(egyértelműsítő lap)
-start:XY goes through all disambiguation pages in the category on your
wiki that is defined (to the bot) as the category containing
disambiguation pages, starting at XY. If only '-start' or
'-start:' is given, it starts at the beginning.
-min:XX (XX being a number) only work on disambiguation pages for which
at least XX links are to be worked on.
To complete a move of a page, one can use:
python pwb.py solve_disambiguation -just -pos:New_Name Old_Name
-
class
scripts.solve_disambiguation.
AddAlternativeOption
(option: str, shortcut: str, output: pywikibot.bot_choice.OutputOption, **kwargs: Any)[source]¶ Bases:
pywikibot.bot_choice.OutputProxyOption
Add a new alternative.
Create a new option for the given sequence.
-
class
scripts.solve_disambiguation.
AliasOption
(option, shortcuts, stop=True)[source]¶ Bases:
pywikibot.bot_choice.StandardOption
An option allowing multiple aliases which also select it.
-
class
scripts.solve_disambiguation.
DisambiguationRobot
(*args, **kwargs)[source]¶ Bases:
pywikibot.bot.SingleSiteBot
Disambiguation Bot.
-
available_options
: Dict[str, Any] = {'always': None, 'dnskip': False, 'first': False, 'just': True, 'main': False, 'min': 0, 'pos': [], 'primary': False}¶
-
checkContents
(text: str) → Optional[str][source]¶ Check if the text matches any of the ignore regexes.
- Parameters
text – wikitext of a page
- Returns
None if none of the regular expressions given in the dictionary at the top of this class matches a substring of the text, otherwise the matched substring
-
disambig_options
= {'always': None, 'dnskip': False, 'first': False, 'just': True, 'main': False, 'min': 0, 'pos': [], 'primary': False}¶
-
findAlternatives
(page) → bool[source]¶ Extend self.opt.pos using correctcap of disambPage.linkedPages.
- Parameters
page (pywikibot.Page) – the disambiguation page
- Returns
True if everything goes fine, False otherwise
-
firstize
(page, links) → list[source]¶ Call firstlinks and remove extra links.
This removes a lot of redundant links from overdecorated disambiguation pages, leaving only the first link of each asterisked line. This must be done if -first is used on the command line.
-
static
firstlinks
(page) → Generator[str, None, None][source]¶ Return a list of first links of every line beginning with *.
When a disambiguation page is full of unnecessary links, this may be useful to sort out the relevant links. E.g. from the line
* [[Jim Smith (smith)|Jim Smith]] ([[1832]]-[[1932]]) [[English]]
it returns only ‘Jim Smith (smith)’. Lines without an asterisk at the beginning are disregarded. No check for page existence is done here; that has already been done.
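An illustrative sketch of that behaviour (not the script's implementation):

import re

WIKILINK = re.compile(r'\[\[([^|\]]+)(?:\|[^\]]*)?\]\]')


def first_links(text):
    """Yield the first wikilink target of every line starting with '*'."""
    for line in text.splitlines():
        if line.startswith('*'):
            match = WIKILINK.search(line)
            if match:
                yield match.group(1)

line = '* [[Jim Smith (smith)|Jim Smith]] ([[1832]]-[[1932]]) [[English]]'
print(list(first_links(line)))  # ['Jim Smith (smith)']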
-
ignore_contents
= {'de': ('{{[Ii]nuse}}', '{{[Ll]öschen}}'), 'fi': ('{{[Tt]yöstetään}}',), 'kk': ('{{[Ii]nuse}}', '{{[Pp]rocessing}}'), 'nl': ('{{wiu2}}', '{{nuweg}}'), 'ru': ('{{[Ii]nuse}}', '{{[Pp]rocessing}}')}¶
-
makeAlternativesUnique
() → None[source]¶ Remove duplicate items from self.opt.pos.
Preserve the order of alternatives.
-
primary_redir_template
= {'hu': 'Egyért-redir'}¶
-
setSummaryMessage
(page, new_targets=None, unlink_counter=0, dn=False) → None[source]¶ Setup i18n summary message.
-
treat_disamb_only
(ref_page, disamb_page) → str[source]¶ Resolve the links to disamb_page but don’t look for its redirects.
- Parameters
disamb_page (pywikibot.Page) – the disambiguation page or redirect we don’t want anything to link to
ref_page (pywikibot.Page) – a page linking to disamb_page
- Returns
“nextpage” if the user enters “n” to skip this page, “nochange” if the page needs no change, and “done” if the page is processed successfully
-
treat_links
(ref_page, disamb_page) → bool[source]¶ Resolve the links to disamb_page or its redirects.
- Parameters
disamb_page (pywikibot.Page) – the disambiguation page or redirect we don’t want anything to link to
ref_page (pywikibot.Page) – a page linking to disamb_page
- Returns
Return whether continue with next page (True) or next disambig (False)
-
-
class
scripts.solve_disambiguation.
EditOption
(option, shortcut, text, start, title)[source]¶ Bases:
pywikibot.bot_choice.StandardOption
Edit the text.
-
property
stop
¶ Return whether the user didn’t press cancel and changed the text.
-
class
scripts.solve_disambiguation.
PrimaryIgnoreManager
(disamb_page, enabled=False)[source]¶ Bases:
object
Primary ignore manager.
If run with the -primary argument, reads from a file which pages should not be worked on; these are the ones where the user pressed n last time. If run without the -primary argument, doesn’t ignore any pages.
- Return type
None
-
class
scripts.solve_disambiguation.
ReferringPageGeneratorWithIgnore
(page, primary=False, minimum=0, main_only=False)[source]¶ Bases:
object
Referring Page generator, with an ignore manager.
-
class
scripts.solve_disambiguation.
ShowPageOption
(option, shortcut, start, page)[source]¶ Bases:
pywikibot.bot_choice.StandardOption
Show the page’s contents in an editor.
-
scripts.solve_disambiguation.
correctcap
(link, text: str) → str[source]¶ Return the link capitalized/uncapitalized according to the text.
- Parameters
link (pywikibot.Page) – link page
text – the wikitext that is supposed to refer to the link
- Returns
uncapitalized title of the link if the text links to the link with an uncapitalized title, else capitalized
speedy_delete script¶
Help sysops to quickly check and/or delete pages listed for speedy deletion
This bot trawls through candidates for speedy deletion in a fast and semi-automated fashion. It displays the contents of each page one at a time and provides a prompt for the user to skip or delete the page. Of course, this will require a sysop account.
Upcoming options include the ability to untag a page as not eligible for speedy deletion, as well as the option to commute its sentence to Proposed Deletion (see [[en:WP:PROD]] for more details). Also, if the article text is long, it might be a good idea to truncate it to the first so many bytes to prevent terminal spamming.
WARNING: This tool shows the contents of the top revision only. It is possible that a vandal has replaced a perfectly good article with nonsense, which has subsequently been tagged by someone who didn’t realize it was previously a good article. The onus is on you to avoid making these mistakes.
NOTE: This script currently only works for the Wikipedia project.
-
class
scripts.speedy_delete.
SpeedyBot
(**kwargs)[source]¶ Bases:
pywikibot.bot.SingleSiteBot
,pywikibot.bot.ExistingPageBot
Bot to delete pages which are tagged as speedy deletion.
This bot will load a list of pages from the category of candidates for speedy deletion on the language’s wiki and give the user an interactive prompt to decide whether each should be deleted or not.
- Keyword Arguments
site – the site to work on
-
LINES
= 22¶
-
csd_cat_item
= 'Q5964'¶
-
csd_cat_title
= {'incubator': {'incubator': 'Category:Maintenance:Delete'}, 'wikibooks': {'en': 'Category:Candidates for speedy deletion'}, 'wikiversity': {'beta': 'Category:Candidates for speedy deletion'}}¶
-
delete_reasons
= {'wikipedia': {'de': {'asdf': 'Tastaturtest', 'egal': 'Eindeutig irrelevant', 'ka': 'Kein Artikel', 'mist': 'Unsinn', 'move': 'Redirectlöschung, um Platz für Verschiebung zu schaffen', 'nde': 'Nicht in deutscher Sprache verfasst', 'pfui': 'Beleidigung', 'redir': 'Unnötiger Redirect', 'spam': 'Spam', 'web': 'Nur ein Weblink', 'wg': 'Wiedergänger (wurde bereits zuvor gelöscht)'}, 'it': {'copyviol': 'Violazione di copyright', 'promo': 'Pagina promozionale', 'redirect': 'Redirect rotto o inutile', 'spam': 'Spam', 'test': 'Si tratta di un test', 'vandalismo': 'Caso di vandalismo'}, 'ja': {'ad': '[[WP:CSD]] 全般4 宣伝', 'auth': '[[WP:CSD]] 記事3 投稿者依頼or初版立項者による白紙化', 'commons': '[[WP:CSD]] マルチメディア7 コモンズの画像ページ', 'cont': '[[WP:CSD]] 全般1 意味不明な内容のページ', 'cp': '[[WP:CSD]] 全般6 コピペ移動or分割', 'ipu': '[[WP:CSD]] 利用者ページ3 IPユーザの利用者ページ', 'nc': '[[WP:CSD]] リダイレクト2 [[WP:NC]]違反', 'nd': '[[WP:CSD]] 記事1 定義なし', 'nr': '[[WP:CSD]] リダイレクト1 無意味なリダイレクト', 'nuu': '[[WP:CSD]] 利用者ページ2 利用者登録されていない利用者ページ', 'ren': '[[WP:CSD]] リダイレクト3 改名提案を経た曖昧回避括弧付きの移動の残骸', 'rep': '[[WP:CSD]] 全般5 削除されたページの改善なき再作成', 'sh': '[[WP:CSD]] 記事1 短すぎ', 'test': '[[WP:CSD]] 全般2 テスト投稿', 'tmp': '[[WP:CSD]] テンプレート1 初版投稿者依頼', 'uau': '[[WP:CSD]] 利用者ページ1 本人希望', 'vand': '[[WP:CSD]] 全般3 荒らしand/orいたずら'}, 'zh': {'ad': '[[WP:CSD#G11]]: 明顯的以廣告宣傳為目而建立的頁面', 'adc': '[[WP:CSD#G11]]: 只有條目名稱中的人物或團體之聯絡資訊', 'anou': '[[WP:CSD#O3]]: 匿名用戶的用戶討論頁,其中的內容不再有用', 'auth': '[[WP:CSD#G10]]: 原作者請求', 'bio': '[[WP:CSD#G12]]: 未列明來源及語調負面的生者傳記', 'cn': '[[WP:CSD#R2]]: 跨空間重定向', 'commons': '[[WP:CSD#F7]]: 此圖片已存在於[[:commons:|維基共享資源]]', 'cont': '[[WP:CSD#A1]]: 非常短,而且沒有定義或內容。', 'empty': '[[WP:CSD#G1]]: 沒有實際內容或歷史記錄的文章。', 'isol': '[[WP:CSD#G15]]: 孤立頁面', 'isol-f': '[[WP:CSD#G15]]: 孤立頁面-沒有對應檔案的檔案頁面', 'isol-sub': '[[WP:CSD#G15]]: 孤立頁面-沒有對應母頁面的子頁面', 'lssd': '[[WP:CSD#F3]]: 沒有版權或來源資訊,無法確認圖片是否符合方針要求', 'mactra': '[[WP:CSD#G13]]: 明顯的機器翻譯', 'move': '[[WP:CSD#G8]]: 依[[Wikipedia:移動請求|移動請求]]暫時刪除以進行移動或合併頁面之工作', 'nc': '[[WP:CSD#A3]]: 跨計劃內容', 'nls': '[[WP:CSD#F3]]: 沒有版權模板,無法確認版權資訊', 'nocont': '[[WP:CSD#A2]]: 內容只包括外部連接、參見、圖書參考、類別標籤、模板標籤、跨語言連接的條目', 'notrans': '[[WP:CSD#G14]]: 未翻譯的頁面', 'oprj': '[[WP:CSD#G7]]: 內容來自其他中文計劃', 'rep': '[[WP:CSD#G5]]: 經討論被刪除後又重新創建的內容', 'repa': '[[WP:CSD#G5]]: 重複的文章', 'repi': '[[WP:CSD#F1]]: 重複的檔案', 'slr': '[[WP:CSD#R5]]: 指向本身的重定向或循環的重定向', 'svg': '[[WP:CSD#F5]]: 被高解析度與SVG檔案取代的圖片', 'tempcp': '[[WP:CSD#G16]]: 臨時頁面依然侵權', 'test': '[[WP:CSD#G2]]: 測試頁', 'tmp': '[[WP:CSD]]: 臨時頁面', 'uc': '[[WP:CSD#O4]]: 空類別', 'ui': '[[WP:CSD#F6]]: 圖片未使用且不自由', 'urs': '[[WP:CSD#O1]]: 用戶請求刪除自己的用戶頁子頁面', 'vand': '[[WP:CSD#G3]]: 純粹破壞', 'wr': '[[WP:CSD#R3]]: 錯誤重定向'}}}¶
-
deletion_messages
= {'wikinews': {'en': {'_default': '[[WN:CSD]]'}, 'zh': {'_default': '[[WN:CSD]]'}}, 'wikipedia': {'ar': {'_default': 'حذف مرشح للحذف السريع حسب [[وProject:حذف سريع|معايير الحذف السريع]]'}, 'cs': {'_default': 'Bylo označeno k [[Wikipedie:Rychlé smazání|rychlému smazání]]'}, 'de': {'_default': 'Lösche Artikel nach [[Wikipedia:Schnelllöschantrag|Schnelllöschantrag]]'}, 'en': {'_default': 'Deleting candidate for speedy deletion per [[WP:CSD|CSD]]', 'db-attack': 'Deleting page per [[WP:CSD|CSD]] G10: Page that exists solely to attack its subject.', 'db-author': 'Deleting page per [[WP:CSD|CSD]] G7: Author requests deletion and is its only editor.', 'db-band': 'Deleting page per [[WP:CSD|CSD]] A7: Article about a non-notable band.', 'db-banned': 'Deleting page per [[WP:CSD|CSD]] G5: Page created by a banned user.', 'db-bio': 'Deleting page per [[WP:CSD|CSD]] A7: Article about a non-notable person.', 'db-catempty': 'Deleting page per [[WP:CSD|CSD]] C1: Empty category.', 'db-copyvio': 'Deleting page per [[WP:CSD|CSD]] G12: Page is a blatant copyright violation.', 'db-disparage': 'Deleting page per [[WP:CSD|CSD]] T1: Divisive or inflammatory template.', 'db-empty': 'Deleting page per [[WP:CSD|CSD]] A1: Empty article.', 'db-experiment': 'Deleting page per [[WP:CSD|CSD]] G2: Page was created as an experiment.', 'db-nocontext': 'Deleting page per [[WP:CSD|CSD]] A1: Short article that provides little or no context.', 'db-nonsense': 'Deleting page per [[WP:CSD|CSD]] G1: Page is patent nonsense or gibberish.', 'db-notenglish': "Deleting page per [[WP:CSD|CSD]] A2: Article isn't written in English.", 'db-r1': 'Deleting page per [[WP:CSD|CSD]] R1: Redirect to a deleted or non-existent page.', 'db-repost': 'Deleting page per [[WP:CSD|CSD]] G4: Recreation of previously deleted material.', 'db-spam': 'Deleting page per [[WP:CSD|CSD]] G11: Blatant advertising.', 'db-talk': 'Deleting page per [[WP:CSD|CSD]] G8: Talk page of a deleted or non-existent page.', 'db-test': 'Deleting page per [[WP:CSD|CSD]] G2: Test page.', 'db-vandalism': 'Deleting page per [[WP:CSD|CSD]] G3: Blatant vandalism.'}, 'fa': {'_default': 'حذف مرشَّح للحذف السريع حسب [[ويكيبيديا:حذف سريع|معايير الحذف السريع]]'}, 'he': {'_default': 'מחיקת מועמד למחיקה מהירה לפי [[ויקיפדיה:מדיניות המחיקה|מדיניות המחיקה]]', 'גם בוויקישיתוף': 'הקובץ זמין כעת בוויקישיתוף.'}, 'it': {'_default': 'Rimuovo pagina che rientra nei casi di [[Wikipedia:IMMEDIATA|cancellazione immediata]].'}, 'ja': {'_default': '[[WP:CSD|即時削除の方針]]に基づい削除'}, 'pl': {'_default': 'Usuwanie artykułu zgodnie z zasadami [[Wikipedia:Ekspresowe kasowanko|ekspresowego kasowania]]'}, 'pt': {'_default': 'Apagando página por [[Wikipedia:Páginas para eliminar|eliminação rápida]]'}, 'zh': {'_default': '[[WP:CSD]]', 'advert': 'ad', 'db-blanked': 'auth', 'db-rediruser': '[[WP:CSD#O1|CSD O6]] 沒有在使用的討論頁', 'db-spam': '[[WP:CSD#G11|CSD G11]]: 廣告、宣傳頁面', 'db-vandalism': 'vand', 'no license': '[[WP:CSD#I3|CSD I3]]: 沒有版權模板,無法確認版權資訊', 'no source': '[[WP:CSD#I3|CSD I3]]: 沒有來源連結,無法確認來源與版權資訊', 'notchinese': '[[WP:CSD#G7|CSD G7]]: 非中文條目且長時間未翻譯', 'notmandarin': 'oprj', 'nowcommons': 'commons', 'roughtranslation': 'mactra', 'temppage': '[[WP:CSD]]: 臨時頁面', 'unknown': '[[WP:CSD#I3|CSD I3]]: 沒有版權模板,無法確認版權資訊', '翻譯': 'oprj', '翻译': 'oprj'}}}¶
-
talk_deletion_msg
= {'wikinews': {'en': 'Orphaned talk page', 'zh': '[[WN:CSD#O1|CSD O1 O2 O6]] 沒有在使用的討論頁'}, 'wikipedia': {'ar': 'صفحة نقاش يتيمة', 'cs': 'Osiřelá diskusní stránka', 'de': 'Verwaiste Diskussionsseite', 'en': 'Orphaned talk page', 'fa': 'بحث یتیم', 'fr': 'Page de discussion orpheline', 'he': 'דף שיחה של ערך שנמחק', 'it': 'Rimuovo pagina di discussione di una pagina già cancellata', 'pl': 'Osierocona strona dyskusji', 'pt': 'Página de discussão órfã', 'zh': '[[WP:CSD#O1|CSD O1 O2 O6]] 沒有在使用的討論頁'}}¶
template script¶
Very simple script to replace a template with another one
It also converts the old MediaWiki boilerplate format to the new format.
Syntax:
python pwb.py template [-remove] [xml[:filename]] oldTemplate \
[newTemplate]
Specify the template on the command line. The program will pick up the template page, and look for all pages using it. It will then automatically loop over them, and replace the template.
Command line options:
-remove Remove every occurrence of the template from every article
-subst Resolves the template by putting its text directly into the
article. This is done by changing {{...}} or {{msg:...}} into
{{subst:...}}. If you want to use safesubst, you
can do -subst:safe. Substitution is not available inside
<ref>...</ref>, <gallery>...</gallery>, <poem>...</poem>
and <pagelist ... /> tags.
-assubst Replaces the first argument as old template with the second
argument as new template but substitutes it like -subst does.
Using both options -remove and -subst in the same command line has
the same effect.
-xml retrieve information from a local dump
(https://dumps.wikimedia.org). If this argument isn't given,
info will be loaded from the maintenance page of the live wiki.
argument can also be given as "-xml:filename.xml".
-onlyuser: Only process pages edited by a given user
-skipuser: Only process pages not edited by a given user
-timestamp: (With -onlyuser or -skipuser.) Only check edits by the given
user that are not older than the given timestamp. The
timestamp must be written in MediaWiki timestamp format,
which is "%Y%m%d%H%M%S". If this parameter is omitted, all
edits are checked, but this is restricted to the last 100 edits.
-summary: Lets you pick a custom edit summary. Use quotes if edit summary
contains spaces.
-always Don't bother asking to confirm any of the changes, Just Do It.
-addcat: Appends the given category to every page that is edited. This is
useful when a category is being broken out from a template
parameter or when templates are being upmerged but more
information must be preserved.
- other: First argument is the old template name, second one is the new
name. If you want to address a template which has spaces, put quotation marks around it, or use underscores.
Examples
If you have a template called [[Template:Cities in Washington]] and want to change it to [[Template:Cities in Washington state]], start:
python pwb.py template "Cities in Washington" "Cities in Washington state"
Move the page [[Template:Cities in Washington]] manually afterwards.
If you have a template called [[Template:test]] and want to substitute it only on pages in the User: and User talk: namespaces, do:
python pwb.py template test -subst -namespace:2 -namespace:3
Note that -namespace: is a global Pywikibot parameter
This next example substitutes the template lived with a supplied edit summary. It only performs substitutions in main article namespace and doesn’t prompt to start replacing. Note that -putthrottle: is a global Pywikibot parameter:
python pwb.py template -putthrottle:30 -namespace:0 lived -subst -always \
-summary:"BOT: Substituting {{lived}}, see [[WP:SUBST]]."
This next example removes the templates {{cfr}}, {{cfru}}, and {{cfr-speedy}} from five category pages as given:
python pwb.py template cfr cfru cfr-speedy -remove -always \
-page:"Category:Mountain monuments and memorials" \
-page:"Category:Indian family names" \
-page:"Category:Tennis tournaments in Belgium" \
-page:"Category:Tennis tournaments in Germany" \
-page:"Category:Episcopal cathedrals in the United States" \
-summary:"Removing Cfd templates from category pages that survived."
This next example substitutes templates test1, test2, and space test on all user talk pages (namespace #3):
python pwb.py template test1 test2 "space test" -subst -ns:3 -always
-
class
scripts.template.
TemplateRobot
(generator, templates: dict, **kwargs)[source]¶ Bases:
scripts.replace.ReplaceRobot
This bot will replace, remove or subst all occurrences of a template.
- Parameters
generator (iterable) – the pages to work on
templates – a dictionary which maps old template names to their replacements. If remove or subst is True, it maps the names of the templates that should be removed/resolved to None.
-
update_options
: Dict[str, Any] = {'addcat': None, 'remove': False, 'subst': False, 'summary': ''}¶
templatecount script¶
Display the list of pages transcluding a given list of templates
It can also be used to simply count the number of pages (rather than listing each individually).
Syntax:
python pwb.py templatecount options templates
Command line options:
-count Counts the number of times each template (passed in as an
argument) is transcluded.
-list Gives the list of all of the pages transcluding the templates
(rather than just counting them).
-namespace: Filters the search to a given namespace. If this is specified
multiple times it will search all given namespaces
Examples
Counts how many times {{ref}} and {{note}} are transcluded in articles:
python pwb.py templatecount -count -namespace:0 ref note
Lists all the category pages that transclude {{cfd}} and {{cfdu}}:
python pwb.py templatecount -list -namespace:14 cfd cfdu
-
class
scripts.templatecount.
TemplateCountRobot
[source]¶ Bases:
object
Template count bot.
-
classmethod
countTemplates
(templates, namespaces) → None[source]¶ Display number of transclusions for a list of templates.
Displays the number of transcluded pages in the given ‘namespaces’ for each template in the ‘templates’ list.
- Parameters
templates (list) – list of template names
namespaces (list) – list of namespace numbers
-
classmethod
listTemplates
(templates, namespaces) → None[source]¶ Display transcluded pages for a list of templates.
Displays each transcluded page in the given ‘namespaces’ for each template in the ‘templates’ list.
- Parameters
templates (list) – list of template names
namespaces (list) – list of namespace numbers
-
classmethod
template_dict
(templates, namespaces) → dict[source]¶ Create a dict of templates and their transcluded pages.
The names of the templates are the keys, and lists of pages transcluding templates in the given namespaces are the values.
- Parameters
templates (list) – list of template names
namespaces (list) – list of namespace numbers
-
static
template_dict_generator
(templates, namespaces) → Generator[tuple, None, None][source]¶ Yield transclusions of each template in ‘templates’.
For each template in ‘templates’, yield a tuple (template, transclusions), where ‘transclusions’ is a list of all pages in ‘namespaces’ where the template has been transcluded.
- Parameters
templates (list) – list of template names
namespaces (list) – list of namespace numbers
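A hedged usage sketch based on the signature documented above; the template names and the namespace number are only examples:

from scripts.templatecount import TemplateCountRobot

for template, transclusions in TemplateCountRobot.template_dict_generator(
        ['ref', 'note'], [0]):
    print('{}: {} transclusions'.format(template, len(transclusions)))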
-
touch script¶
This bot goes over multiple pages of a wiki, and edits them without changes
This is for example used to get category links in templates working.
Command-line arguments:
-purge Purge the page instead of touching it
Touch mode (default):
-botflag Force botflag in case of edits with changes.
Purge mode:
-converttitles Convert titles to other variants if necessary
-forcelinkupdate Update the links tables
-forcerecursivelinkupdate Update the links table, and update the links tables
for any page that uses this page as a template
-redirects Automatically resolve redirects
This script supports use of pywikibot.pagegenerators
arguments.
-
class
scripts.touch.
PurgeBot
(**kwargs: Any)[source]¶ Bases:
pywikibot.bot.MultipleSitesBot
Purge each page on the generator.
Only accept ‘generator’ and options defined in available_options.
- Parameters
kwargs – bot options
- Keyword Arguments
generator – a generator processed by run method
-
available_options
: Dict[str, Any] = {'converttitles': None, 'forcelinkupdate': None, 'forcerecursivelinkupdate': None, 'redirects': None}¶
-
class
scripts.touch.
TouchBot
(**kwargs: Any)[source]¶ Bases:
pywikibot.bot.MultipleSitesBot
Page touch bot.
Only accept ‘generator’ and options defined in available_options.
- Parameters
kwargs – bot options
- Keyword Arguments
generator – a generator processed by run method
-
update_options
: Dict[str, Any] = {'botflag': False}¶
transferbot script¶
This script transfers pages from a source wiki to a target wiki
It also copies edit history to a subpage.
The following parameters are supported:
-tolang: The target site code.
-tofamily: The target site family.
-prefix: Page prefix on the new site.
-overwrite: Existing pages are skipped by default. Use this option to
overwrite pages.
-target Use page generator of the target site
Internal links are not repaired!
Pages to work on can be specified using any of:
This script supports use of pywikibot.pagegenerators
arguments.
Examples
Transfer all pages in category “Query service” from the English Wikipedia to the Arabic Wiktionary, adding “Wiktionary:Import enwp/” as prefix:
python pwb.py transferbot -family:wikipedia -lang:en -cat:"Query service" \
-tofamily:wiktionary -tolang:ar -prefix:"Wiktionary:Import enwp/"
Copy the template “Query service” from the English Wikipedia to the Arabic Wiktionary:
python pwb.py transferbot -family:wikipedia -lang:en \
-tofamily:wiktionary -tolang:ar -page:"Template:Query service"
Copy 10 wanted templates of German Wikipedia from English Wikipedia to German Wikipedia:
python pwb.py transferbot -family:wikipedia -lang:en \
-tolang:de -wantedtemplates:10 -target
unusedfiles script¶
This bot appends some text to all unused images and notifies uploaders
Parameters:
-always Don't be asked every time.
-nouserwarning Do not warn uploader about orphaned file.
-filetemplate: Use a custom template on unused file pages.
-usertemplate: Use a custom template to warn the uploader.
-limit Specify number of pages to work on with "-limit:n" where
n is the maximum number of articles to work on.
If not used, all pages are used.
-
class
scripts.unusedfiles.
UnusedFilesBot
(**kwargs)[source]¶ Bases:
pywikibot.bot.SingleSiteBot
,pywikibot.bot.AutomaticTWSummaryBot
,pywikibot.bot.ExistingPageBot
Unused files bot.
-
summary_key
= 'unusedfiles-comment'¶
-
update_options
: Dict[str, Any] = {'filetemplate': '', 'nouserwarning': False, 'usertemplate': ''}¶
-
upload script¶
Script to upload images to Wikipedia
The following parameters are supported:
-keep Keep the filename as is
-filename: Target filename without the namespace prefix
-prefix: Add specified prefix to every filename.
-noverify Do not ask for verification of the upload description if one
is given
-abortonwarn: Abort upload on the specified warning type. If no warning type
is specified, aborts on any warning.
-ignorewarn: Ignores specified upload warnings. If no warning type is
specified, ignores all warnings. Use with caution
-chunked: Upload the file in chunks (more overhead, but restartable). If
no value is specified the chunk size is 1 MiB. The value must
be a number which can be followed by a suffix. The units are:
No suffix: Bytes
'k': Kilobytes (1000 B)
'M': Megabytes (1000000 B)
'Ki': Kibibytes (1024 B)
'Mi': Mebibytes (1024x1024 B)
The suffixes are case insensitive. (A parsing sketch appears
at the end of this section.)
-async Make potentially large file operations asynchronous on the
server side when possible.
-always Don't ask the user anything. This will imply -keep and
-noverify and require that either -abortonwarn or -ignorewarn
is defined for all. It will also require a valid file name and
description. It'll only overwrite files if -ignorewarn includes
the 'exists' warning.
-recursive When the filename is a directory it also uploads the files from
the subdirectories.
-summary: Pick a custom edit summary for the bot.
-descfile: Specify a filename where the description is stored
It is possible to combine -abortonwarn and -ignorewarn so that when a specific warning type is given, the more specific option applies instead of the general one. So to ignore specific warnings and abort on all others, define no warning type for -abortonwarn and list the specific warnings for -ignorewarn. The order does not matter. If both are unspecific or a warning type is specified by both, aborting is preferred.
If any other arguments are given, the first is either a URL, a filename or a directory to upload, and the rest form a proposed description to go with the upload. If none of these are given, the user is asked for the directory, file or URL to upload, and for a description.
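The parsing sketch promised under -chunked (illustrative only, not the script's code):

def parse_chunk_size(value):
    """Turn '1Mi', '500k', '1048576', ... into a number of bytes."""
    suffixes = {'ki': 2 ** 10, 'mi': 2 ** 20, 'k': 10 ** 3, 'm': 10 ** 6}
    lowered = value.lower()
    for suffix, factor in suffixes.items():  # longest suffixes first
        if lowered.endswith(suffix):
            return int(lowered[:-len(suffix)]) * factor
    return int(value)  # no suffix: plain bytes

print(parse_chunk_size('1Mi'))   # 1048576
print(parse_chunk_size('500k'))  # 500000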
version script¶
Script to determine the Pywikibot version (tag, revision and date)
watchlist script¶
Allows access to the bot account’s watchlist
The watchlist can be updated manually by running this script.
Syntax:
python pwb.py watchlist [-all | -count | -count:all | -new]
Command line options:
-all Reloads watchlists for all wikis where a watchlist is already
present
-count Count only the total number of pages on the watchlist of the
account the bot has access to
-count:all Count only the total number of pages on the watchlists of
all wikis that the bot is connected to.
-new Load watchlists for all wikis where accounts are set up in
user-config.py
-
scripts.watchlist.
count_watchlist
(site=None)[source]¶ Count only the total number of page(s) in watchlist for this wiki.
-
scripts.watchlist.
count_watchlist_all
()[source]¶ Count only the total number of page(s) in watchlist for all wikis.
-
scripts.watchlist.
main
(*args: str) → None[source]¶ Process command line arguments and invoke bot.
If args is an empty list, sys.argv is used.
- Parameters
args – command line arguments
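A hedged sketch using the helper functions documented above (assumes a working user-config.py):

import pywikibot
from scripts.watchlist import count_watchlist, count_watchlist_all

count_watchlist(pywikibot.Site())  # this wiki only
count_watchlist_all()              # every wiki the bot is connected to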
weblinkchecker script¶
This bot is used for checking external links found at the wiki
It checks several pages at once, with a limit set by the config variable max_external_links, which defaults to 50.
The bot won’t change any wiki pages; it will only report dead links so that people can fix or remove the links themselves.
The bot will store all links found dead in a .dat file in the deadlinks subdirectory. To avoid removing links which are only temporarily unavailable, the bot ONLY reports links which were found dead at least two times, with a time lag of at least one week. Such links will be logged to a .txt file in the deadlinks subdirectory.
The .txt file uses wiki markup and so it may be useful to post it on the wiki and then exclude that page from subsequent runs. For example if the page is named Broken Links, exclude it with ‘-titleregexnot:^Broken Links$’
After running the bot and waiting for at least one week, you can re-check those pages where dead links were found, using the -repeat parameter.
In addition to the logging step, it is possible to automatically report dead links to the talk page of the article where the link was found. To use this feature, set report_dead_links_on_talk = True in your user-config.py, or specify “-talk” on the command line. Adding “-notalk” switches this off irrespective of the configuration variable.
When a link is found alive, it will be removed from the .dat file.
These command line parameters can be used to specify which pages to work on:
-repeat    Work on all pages where dead links were found before. This is
useful to confirm that the links are dead after some time (at
least one week), which is required before the script will report
the problem.
-namespace Only process templates in the namespace with the given number or
name. This parameter may be used multiple times.
-xml Should be used instead of a simple page fetching method from
pagegenerators.py for performance and load issues
-xmlstart Page to start with when using an XML dump
-ignore    HTTP return codes to ignore. Can be provided several times:
-ignore:401 -ignore:500
This script supports use of pywikibot.pagegenerators
arguments.
Furthermore, the following command line parameters are supported:
-talk Overrides the report_dead_links_on_talk config variable, enabling
the feature.
-notalk Overrides the report_dead_links_on_talk config variable, disabling
the feature.
-day       Do not report a broken link if the link has only been there
for x days or fewer. If not set, the default is 7 days.
The following config variables are supported:
max_external_links The maximum number of web pages that should be
loaded simultaneously. You should change this
according to your Internet connection speed.
Be careful: if it is set too high, the script
might get socket errors because your network
is congested, and will then think that the page
is offline.
report_dead_links_on_talk If set to true, causes the script to report dead
links on the article's talk page if (and ONLY if)
the linked page has been unavailable at least two
times during a timespan of at least one week.
weblink_dead_days sets the timespan (default: one week) after which
a dead link will be reported
Examples
Loads all wiki pages in alphabetical order using the Special:Allpages feature:
python pwb.py weblinkchecker -start:!
Loads all wiki pages using the Special:Allpages feature, starting at “Example page”:
python pwb.py weblinkchecker -start:Example_page
Loads all wiki pages that link to www.example.org:
python pwb.py weblinkchecker -weblink:www.example.org
Only checks links found in the wiki page “Example page”:
python pwb.py weblinkchecker Example page
Loads all wiki pages where dead links were found during a prior run:
python pwb.py weblinkchecker -repeat
-
class
scripts.weblinkchecker.
DeadLinkReportThread
[source]¶ Bases:
threading.Thread
A Thread that is responsible for posting error reports on talk pages.
There is only one DeadLinkReportThread, and it is using a semaphore to make sure that two LinkCheckerThreads cannot access the queue at the same time.
-
class
scripts.weblinkchecker.
History
(reportThread, site=None)[source]¶ Bases:
object
Store previously found dead links.
The URLs are dictionary keys, and values are lists of tuples where each tuple represents one time the URL was found dead. Tuples have the form (title, date, error) where title is the wiki page where the URL was found, date is an instance of time, and error is a string with error code and message.
We assume that the first element in the list represents the first time we found this dead link, and the last element represents the last time.
Example:
dict = {
    'https://www.example.org/page': [
        ('WikiPageTitle', DATE, '404: File not found'),
        ('WikiPageName2', DATE, '404: File not found'),
    ]
}
-
log
(url, error, containingPage, archiveURL)[source]¶ Log an error report to a text file in the deadlinks subdirectory.
-
-
class
scripts.weblinkchecker.
LinkCheckThread
(page, url, history, HTTPignore, day)[source]¶ Bases:
threading.Thread
A thread responsible for checking one URL.
After checking the page, it will die.
-
exception
scripts.weblinkchecker.
NotAnURLError
[source]¶ Bases:
BaseException
The link is not a URL.
-
class
scripts.weblinkchecker.
WeblinkCheckerRobot
(HTTPignore=None, day=7, **kwargs)[source]¶ Bases:
pywikibot.bot.SingleSiteBot
,pywikibot.bot.ExistingPageBot
Bot which will search for dead weblinks.
It uses several LinkCheckThreads at once to process pages from generator.
-
scripts.weblinkchecker.
countLinkCheckThreads
() → int[source]¶ Count LinkCheckThread threads.
- Returns
number of LinkCheckThread threads
welcome script¶
Script to welcome new users
This script works out of the box for Wikis that have been defined in the script.
Ensure you have community support before running this bot!
Everything that needs customisation to support additional projects is indicated by comments.
Description of basic functionality
Request a list of new users every period (default: 3600 seconds). You can choose to stop the script after the first check (see arguments).
Check if a new user has passed a threshold for a number of edits (default: 1 edit)
Optional: check the username for bad words, or whether it consists solely of numbers; log this somewhere on the wiki (default: False). Update: a whitelist has been added (explanation below).
If user has made enough edits (it can be also 0), check if user has an empty talk page
If user has an empty talk page, add a welcome message.
Optional: Once the set number of users have been welcomed, add this to the configured log page, one for each day (default: True)
If no log page exists, create a header for the log page first.
This script uses two templates that need to be on the local wiki (by default they may not yet exist there):
{{WLE}}: contains mark up code for log entries (just copy it from Commons)
{{welcome}}: contains the information for new users
This script understands the following command-line arguments:
-edit[:#] Define how many edits a new user needs to be welcomed
(default: 1, max: 50)
-time[:#] Define how many seconds the bot sleeps before restart
(default: 3600)
-break          Use this if you don't want the bot to restart at the end
                of the run (default: False)
-nlog Use this parameter if you do not want the bot to log all
welcomed users (default: False)
-limit[:#]      Use this parameter to define how many users should be
                checked (default: 50)
-offset[:TIME] Skip the latest new users (those newer than TIME)
to give interactive users a chance to welcome the
new users (default: now)
Timezone is the server timezone, GMT for Wikimedia
TIME format: yyyymmddhhmmss or yyyymmdd
-timeoffset[:#] Skip the latest new users, accounts newer than
# minutes
-numberlog[:#] The number of users to welcome before refreshing the
welcome log (default: 4)
-filter Enable the username checks for bad names (default: False)
-ask Use this parameter if you want to confirm each possible
bad username (default: False)
-random Use a random signature, taking the signatures from a wiki
page (for instruction, see below).
-file[:#] Use a file instead of a wikipage to take the random sign.
If you use this parameter, you don't need to use -random.
-sign Use one signature from command line instead of the default
-savedata       This feature saves the index of the random signature so
                the bot can continue welcoming with the last signature used.
-sul Welcome the auto-created users (default: False)
-quiet          Do not display users without contributions
***************************** GUIDE *******************************
* Report, Bad and white list guide: *
Set in the code which pages the bot will use to load the badword list, the whitelist and the report.
On these pages you have to add a "tuple" with the names that you want to add to the two lists. For example: ('cat', 'mouse', 'dog'). You can also write other text on the page; it will work without problems.
What do the two pages do? The bot checks whether a badword is in the username and, if so, sets "warning" to True. Then the bot checks whether a word from the whitelist is in the username. If so, it removes that word and rechecks the badword list to see whether any other badword remains in the username. Example
dio is a badword
Claudio is a normal name
The username is “Claudio90 fuck!”
The Bot finds dio and sets “warning”
The Bot finds Claudio and sets “ok”
The Bot finds fuck at the end and sets “warning”
Result: The username is reported.
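A small sketch of the check walked through above (illustrative, not the script's code; the word lists are examples):

BADWORDS = {'dio', 'fuck'}
WHITELIST = {'claudio'}


def is_reportable(username):
    """Warn when a badword survives removal of whitelisted words."""
    name = username.lower()
    for good in WHITELIST:
        name = name.replace(good, '')  # a whitelist hit overrides a match
    return any(bad in name for bad in BADWORDS)

print(is_reportable('Claudio90 fuck!'))  # True: 'fuck' remains
print(is_reportable('Claudio90'))        # False: 'dio' only inside 'Claudio'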
When a user is reported, you have to check the account and then:
If the user is ok, add the {{welcome}} template
If not, block the user
You can decide whether or not to add a "you are blocked, choose another username" template.
Delete the username from the page.
IMPORTANT: The bot checks each user in this order
Check whether the user has a talk page (if yes, skip)
Check whether the user is blocked (if yes, skip)
Check whether the user is already on the report page (if yes, skip)
Otherwise, the user will be reported.
* Random signature guide: *
Some welcomed users will reply to the person who signed the welcome message. When you welcome many new users, you might be overwhelmed by such replies. Therefore you can define usernames of other users who are willing to receive some of these messages from newbies.
Set the page that the bot will load
Add the signatures in this way:
*<SPACE>SIGNATURE <NEW LINE>
Example of signatures:
<pre>
* [[User:Filnik|Filnik]]
* [[User:Rock|Rock]]
</pre>
- NOTE: The white space and <pre></pre> aren't required, but I suggest
you use them.
**************************** Badwords ******************************
The badword list in the code is open. If you think that a word is international and should be blocked on all projects, feel free to add it. Likewise, if you think a word isn't really international, feel free to delete it.
However, there is a dynamic wiki page from which to load the badwords of your project, or you can add them directly to the source code you are using without adding to or deleting from the shared list.
Some words, like "Administrator", "Dio" (God in Italian) or "Jimbo", aren't badwords at all but can be used in bad nicknames.
-
exception
scripts.welcome.
FilenameNotSet
(arg: str)[source]¶ Bases:
pywikibot.exceptions.Error
An exception indicating that a signature filename was not specified.
-
class
scripts.welcome.
Global
[source]¶ Bases:
object
Container class for global settings.
-
attachEditCount
= 1¶
-
confirm
= False¶
-
defaultSign
= '--~~~~'¶
-
dumpToLog
= 15¶
-
filtBadName
= False¶
-
makeWelcomeLog
= True¶
-
offset
= None¶
-
queryLimit
= 50¶
-
quiet
= False¶
-
randomSign
= False¶
-
recursive
= True¶
-
saveSignIndex
= False¶
-
signFileName
= None¶
-
timeRecur
= 3600¶
-
timeoffset
= 0¶
-
welcomeAuto
= False¶
-
-
class
scripts.welcome.
Msg
(value)[source]¶ Bases:
enum.Enum
Enum for show_status method providing message header and color.
-
DEFAULT
= ('MSG', 'lightpurple')¶
-
DONE
= ('Done', 'lightblue')¶
-
IGNORE
= ('NoAct', 'lightaqua')¶
-
MATCH
= ('Match', 'lightgreen')¶
-
MSG
= ('MSG', 'lightpurple')¶
-
SKIP
= ('Skip', 'lightyellow')¶
-
WARN
= ('Warn', 'lightred')¶
-
-
class
scripts.welcome.
WelcomeBot
(**kwargs)[source]¶ Bases:
pywikibot.bot.SingleSiteBot
Bot to add welcome messages on User pages.
-
property
generator
¶ Retrieve new users.
-
property