tools package
Miscellaneous helper functions (not wiki-dependent).
-
exception pywikibot.tools.CombinedError
Bases: KeyError, IndexError
An error that gets caught by both KeyError and IndexError.
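A minimal sketch of the idea, not the library's source: a class that inherits from both KeyError and IndexError is caught by a handler for either one. The names CombinedLookupError and caught_as are hypothetical.

```python
class CombinedLookupError(KeyError, IndexError):
    """Hypothetical stand-in: matches KeyError and IndexError handlers."""


def caught_as(exc_type):
    """Return True if CombinedLookupError is caught by an exc_type handler."""
    try:
        raise CombinedLookupError('missing')
    except exc_type:
        return True
```

This lets code that treats an object as a mapping and code that treats it as a sequence share one error path.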
-
class pywikibot.tools.ComparableMixin
Bases: object
Mixin class to allow comparing to other objects which are comparable.
-
class pywikibot.tools.DequeGenerator
Bases: collections.abc.Iterator, collections.deque
A generator that allows items to be added during generating.
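A sketch of how such a class could behave, assuming each step of the iteration pops the leftmost item. The class name DequeGen is hypothetical and this is not necessarily the library's implementation.

```python
from collections import deque
from collections.abc import Iterator


class DequeGen(Iterator, deque):
    """Hypothetical iterator that consumes a deque from the left."""

    def __next__(self):
        if self:
            return self.popleft()
        raise StopIteration


todo = DequeGen([1, 2])
seen = []
for n in todo:
    seen.append(n)
    if n < 4:
        todo.append(n + 2)  # new work is queued while iterating
```

Items appended inside the loop are picked up by the same loop, which suits work lists that grow while they are processed.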
-
class pywikibot.tools.EmptyDefault
Bases: str, collections.abc.Mapping
A default for a not existing siteinfo property.
It should be chosen if there is no better default known. It acts like an empty collection, so it can be iterated over safely if treated as a list, tuple, set or dictionary. It is also basically an empty string.
Accessing a value via __getitem__ will result in a combined KeyError and IndexError.
Initialise the default as an empty string.
-
class pywikibot.tools.MediaWikiVersion(version_str: str)
Bases: object
Version object to allow comparing ‘wmf’ versions with normal ones.
The version mainly consists of digits separated by periods. After that comes a suffix, which may only be ‘wmf<number>’, ‘alpha’, ‘beta<number>’ or ‘-rc.<number>’ (the - and . are optional). They are ordered from old to new in that order, and a version number without a suffix is considered the newest. This secondary difference is stored in an internal _dev_version attribute.
Two versions are equal if their normal version and dev version are equal. A version is greater if the normal version or dev version is greater. Example:
1.34 < 1.34.1 < 1.35wmf1 < 1.35alpha < 1.35beta1 < 1.35beta2 < 1.35-rc-1 < 1.35-rc.2 < 1.35
Any other suffixes are considered invalid.
- Parameters
version_str – version to parse
-
MEDIAWIKI_VERSION = re.compile('(\\d+(?:\\.\\d+)+)(-?wmf\\.?(\\d+)|alpha|beta(\\d+)|-?rc\\.?(\\d+)|.*)?$')
-
static from_generator(generator: str) → pywikibot.tools.MediaWikiVersion
Create instance from a site’s generator attribute.
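The suffix ordering described for MediaWikiVersion can be sketched as a sort key. This is an illustrative reconstruction built on the documented pattern, not the class's actual code. The helper name version_key and the numeric ranks 0 to 4 are assumptions.

```python
import re

# Pattern taken from the MEDIAWIKI_VERSION attribute, written as a raw string.
VERSION_RE = re.compile(
    r'(\d+(?:\.\d+)+)(-?wmf\.?(\d+)|alpha|beta(\d+)|-?rc\.?(\d+)|.*)?$')


def version_key(version_str):
    """Return a sortable (release, dev_version) tuple (hypothetical helper)."""
    match = VERSION_RE.match(version_str)
    if not match:
        raise ValueError('invalid version: {!r}'.format(version_str))
    release = tuple(int(part) for part in match.group(1).split('.'))
    suffix = match.group(2) or ''
    if match.group(3):        # wmf<number>
        dev = (0, int(match.group(3)))
    elif suffix == 'alpha':
        dev = (1,)
    elif match.group(4):      # beta<number>
        dev = (2, int(match.group(4)))
    elif match.group(5):      # rc<number>
        dev = (3, int(match.group(5)))
    elif suffix:              # anything else is treated as invalid
        raise ValueError('invalid suffix: {!r}'.format(suffix))
    else:                     # no suffix: the final release
        dev = (4,)
    return release, dev


versions = ['1.35', '1.35beta2', '1.34.1', '1.35-rc.2', '1.35wmf1',
            '1.34', '1.35alpha', '1.35beta1']
ordered = sorted(versions, key=version_key)
```

Comparing tuples gives the documented behaviour: the numeric part decides first, and the development rank breaks ties.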
-
class pywikibot.tools.RLock(*args, **kwargs)
Bases: object
Context manager which implements extended reentrant lock objects.
This RLock is implicitly derived from threading.RLock but provides a locked() method like in threading.Lock and a count attribute which gives the active recursion level of locks.
Usage:
>>> from pywikibot.tools import RLock
>>> lock = RLock()
>>> lock.acquire()
True
>>> with lock:
...     print(lock.count)  # nested lock
2
>>> lock.locked()
True
>>> lock.release()
>>> lock.locked()
False
New in version 6.2
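A rough sketch of how a recursion counter and a locked() method could be layered on threading.RLock. The class name CountingRLock is hypothetical, and the sketch assumes release() is only called by the thread that owns the lock. It is not the library's implementation.

```python
import threading


class CountingRLock:
    """Hypothetical wrapper around threading.RLock with a recursion counter."""

    def __init__(self):
        self._lock = threading.RLock()
        self._count = 0

    def acquire(self, blocking=True, timeout=-1):
        acquired = self._lock.acquire(blocking, timeout)
        if acquired:
            self._count += 1  # safe: this thread now owns the lock
        return acquired

    def release(self):
        # Assumption: the caller owns the lock, as threading.RLock requires.
        self._count -= 1
        self._lock.release()

    __enter__ = acquire

    def __exit__(self, *exc_info):
        self.release()

    @property
    def count(self):
        """Return the current recursion level."""
        return self._count

    def locked(self):
        """Return True while the lock is held at least once."""
        return self._count > 0


lock = CountingRLock()
lock.acquire()
with lock:
    nested = lock.count  # acquired twice at this point
lock.release()
```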
-
property count
Return number of acquired locks.
-
class pywikibot.tools.SelfCallDict
Bases: pywikibot.tools.SelfCallMixin, dict
Dict with SelfCallMixin.
-
class pywikibot.tools.SelfCallMixin
Bases: object
Return self when called.
When ‘_own_desc’ is defined it’ll also issue a deprecation warning using issue_deprecation_warning(‘Calling ‘ + _own_desc, ‘it directly’).
-
class pywikibot.tools.SelfCallString
Bases: pywikibot.tools.SelfCallMixin, str
String with SelfCallMixin.
-
class pywikibot.tools.SizedKeyCollection(keyattr: str)
Bases: collections.abc.Container, collections.abc.Iterable, collections.abc.Sized
Structure to hold values where the key is given by the value itself.
A structure like a defaultdict, but the key is given by the value itself and cannot be assigned directly. It returns the number of all items with len(), not the number of keys.
Samples:
>>> from pywikibot.tools import SizedKeyCollection
>>> data = SizedKeyCollection('title')
>>> data.append('foo')
>>> data.append('bar')
>>> data.append('Foo')
>>> list(data)
['foo', 'Foo', 'bar']
>>> len(data)
3
>>> 'Foo' in data
True
>>> 'foo' in data
False
>>> data['Foo']
['foo', 'Foo']
>>> list(data.keys())
['Foo', 'Bar']
>>> data.remove_key('Foo')
>>> list(data)
['bar']
>>> data.clear()
>>> list(data)
[]
New in version 6.1.
- Parameters
keyattr – an attribute or method of the values to be held in this collection, which will be used as the key.
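The keyed-by-value behaviour can be approximated with a defaultdict of lists. This is a simplified sketch under the assumption that keyattr names either a plain attribute or a zero-argument method. The class name KeyedCollection is hypothetical, and only part of the documented interface is covered.

```python
from collections import defaultdict


class KeyedCollection:
    """Hypothetical sketch: group values under a key derived from each value."""

    def __init__(self, keyattr):
        self.keyattr = keyattr
        self.data = defaultdict(list)
        self.size = 0

    def _key(self, value):
        key = getattr(value, self.keyattr)
        return key() if callable(key) else key  # method or plain attribute

    def append(self, value):
        self.data[self._key(value)].append(value)
        self.size += 1

    def __contains__(self, key):
        return key in self.data

    def __iter__(self):
        for values in self.data.values():
            yield from values

    def __len__(self):
        return self.size  # number of values, not number of keys


data = KeyedCollection('title')  # str.title() serves as the key
for word in ('foo', 'bar', 'Foo'):
    data.append(word)
```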
-
class pywikibot.tools.ThreadList(limit=128, wait_time=2, *args)
Bases: list
A simple threadpool class to limit the number of simultaneous threads.
Any threading.Thread object can be added to the pool using the append() method. If the maximum number of simultaneous threads has not been reached, the Thread object will be started immediately; if not, the append() call will block until the thread is able to start.
>>> pool = ThreadList(limit=10)
>>> def work():
...     time.sleep(1)
...
>>> for x in range(20):
...     pool.append(threading.Thread(target=work))
...
- Parameters
limit (int) – the number of simultaneous threads
wait_time (int or float) – how long to wait if the number of active threads exceeds the limit
-
class pywikibot.tools.ThreadedGenerator(group=None, target=None, name='GeneratorThread', args=(), kwargs=None, qsize=65536)
Bases: threading.Thread
Look-ahead generator class.
Runs a generator in a separate thread and queues the results; can be called like a regular generator.
Subclasses should override self.generator, not self.run
Important: the generator thread will stop itself if the generator’s internal queue is exhausted; but, if the calling program does not use all the generated values, it must call the generator’s stop() method to stop the background thread. Example usage:
>>> gen = ThreadedGenerator(target=range, args=(20,))
>>> try:
...     data = list(gen)
... finally:
...     gen.stop()
>>> data
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
Initializer. Takes same keyword arguments as threading.Thread.
target must be a generator function (or other callable that returns an iterable object).
- Parameters
qsize (int) – The size of the lookahead queue. The larger the qsize, the more values will be computed in advance of use (which can eat up memory and processor time).
-
class pywikibot.tools.Version(version)
Bases: pkg_resources.extern.packaging.version.Version
Version from pkg_resources vendor package.
This Version provides properties of vendor package 20.4 shipped with setuptools 49.4.0.
Add additional properties not provided by the base class.
-
class pywikibot.tools.classproperty(cls_method)
Bases: object
Descriptor class to access a class method as a property.
This class may be used as a decorator:
class Foo:
    _bar = 'baz'  # a class property

    @classproperty
    def bar(cls):  # a class property method
        return cls._bar
Foo.bar gives ‘baz’.
Hold the class method.
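A descriptor of this kind can be sketched in a few lines. The name classproperty_sketch is hypothetical, and the real class may do more (for example, copy the docstring).

```python
class classproperty_sketch:
    """Hypothetical descriptor that calls the wrapped method with the class."""

    def __init__(self, cls_method):
        self.method = cls_method  # hold the class method

    def __get__(self, instance, owner):
        # Ignore the instance and always pass the owning class.
        return self.method(owner)


class Foo:
    _bar = 'baz'

    @classproperty_sketch
    def bar(cls):
        return cls._bar
```

Because __get__ receives the owner class even when accessed on the class itself, Foo.bar works without creating an instance.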
-
pywikibot.tools.compute_file_hash(filename: str, sha='sha1', bytes_to_read=None)
Compute file hash.
Result is expressed as hexdigest().
- Parameters
filename – filename path
sha (str) – hashing function among the following in hashlib: md5(), sha1(), sha224(), sha256(), sha384(), and sha512(). The function name shall be passed as a string, e.g. ‘sha1’.
bytes_to_read (None or int) – only the first bytes_to_read will be considered; if file size is smaller, the whole file will be considered.
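The documented parameters map directly onto hashlib. The following is a sketch of equivalent behaviour rather than the function's source. The name file_hash_sketch is hypothetical, and the whole requested range is read in one call for brevity.

```python
import hashlib
import os
import tempfile


def file_hash_sketch(filename, sha='sha1', bytes_to_read=None):
    """Hash a file, or only its first bytes_to_read bytes, as a hex digest."""
    with open(filename, 'rb') as f:
        data = f.read() if bytes_to_read is None else f.read(bytes_to_read)
    return hashlib.new(sha, data).hexdigest()


# Demonstration on a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello world')
full = file_hash_sketch(tmp.name)
partial = file_hash_sketch(tmp.name, sha='md5', bytes_to_read=5)
os.remove(tmp.name)
```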
-
pywikibot.tools.file_mode_checker(filename: str, mode=384, quiet=False, create=False)
Check file mode and update it, if needed.
- Parameters
filename – filename path
mode (int) – requested file mode
quiet (bool) – warn about file mode change if False.
create (bool) – create the file if it does not exist already
- Raises
IOError – The file does not exist and create is False.
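The default mode=384 is the decimal form of the octal permission 0o600 (read and write for the owner only). A minimal sketch of the check-and-update step on a POSIX system follows. The helper name ensure_mode is hypothetical, and the quiet and create options are left out.

```python
import os
import stat
import tempfile


def ensure_mode(filename, mode=0o600):
    """Set the permission bits of filename to mode; return True if changed."""
    current = stat.S_IMODE(os.stat(filename).st_mode)
    if current == mode:
        return False
    os.chmod(filename, mode)
    return True


fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o644)          # start from a more permissive mode
changed = ensure_mode(path)    # tightened to 0o600
unchanged = ensure_mode(path)  # already correct, nothing to do
final_mode = stat.S_IMODE(os.stat(path).st_mode)
os.remove(path)
```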
-
pywikibot.tools.filter_unique(iterable, container=None, key=None, add=None)
Yield unique items from an iterable, omitting duplicates.
By default, to provide uniqueness, it puts the generated items into a set created as a local variable. It only yields items which are not already present in the local set.
For large collections, this is not memory efficient, as a strong reference to every item is kept in a local set which cannot be cleared.
Also, the local set can’t be re-used when chaining unique operations on multiple generators.
To avoid these issues, it is advisable for the caller to provide their own container and set the key parameter to be the function hash, or use a weakref as the key.
The container can be any object that supports __contains__. If the container is a set or dict, the method add or __setitem__ will be used automatically. Any other method may be provided explicitly using the add parameter.
Beware that key=id is only useful for cases where id() is not unique.
Note: This is not thread safe.
- Parameters
iterable (collections.abc.Iterable) – the source iterable
container (type) – storage of seen items
key (callable) – function to convert the item to a key
add (callable) – function to add an item to the container
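A simplified sketch of the container and key mechanics, assuming a set-like or dict-like container and leaving out the add parameter. The name unique_sketch is hypothetical.

```python
def unique_sketch(iterable, container=None, key=None):
    """Yield items whose key has not been recorded in container yet."""
    if container is None:
        container = set()
    for item in iterable:
        k = key(item) if key else item
        if k not in container:
            if hasattr(container, 'add'):
                container.add(k)     # set-like container
            else:
                container[k] = True  # dict-like container
            yield item


seen = set()
first = list(unique_sketch([1, 2, 1, 3], container=seen))
second = list(unique_sketch([3, 4, 1, 5], container=seen))
by_case = list(unique_sketch(['a', 'A', 'b'], key=str.lower))
```

Passing the same container to both calls is what allows uniqueness to carry over between generators.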
-
pywikibot.tools.first_lower(string: str) → str
Return a string with the first character uncapitalized.
Empty strings are supported. The original string is not changed.
-
pywikibot.tools.first_upper(string: str) → str
Return a string with the first character capitalized.
Empty strings are supported. The original string is not changed.
- Note
MediaWiki doesn’t capitalize some characters the same way as Python. This function tries to be close to MediaWiki’s capitalize function in title.php. See T179115 and T200357.
-
pywikibot.tools.has_module(module, version=None)
Check if a module can be imported.
New in version 3.0.
-
pywikibot.tools.intersect_generators(*iterables, allow_duplicates: bool = False)
Intersect generators listed in iterables.
Yield items only if they are yielded by all generators of iterables. Threads (via ThreadedGenerator) are used in order to run generators in parallel, so that items can be yielded before generators are exhausted.
Threads are stopped when they are either exhausted or Ctrl-C is pressed. Quitting before all generators are finished is attempted if there is no more chance of finding an item in all queues.
Sample:
>>> iterables = 'mississippi', 'missouri'
>>> list(intersect_generators(*iterables))
['m', 'i', 's']
>>> list(intersect_generators(*iterables, allow_duplicates=True))
['m', 'i', 's', 's', 'i']
- Parameters
iterables – page generators
allow_duplicates – optional keyword argument to allow duplicates if present in all generators
-
pywikibot.tools.is_ip_address(value: str) → bool
Check if a value is a valid IPv4 or IPv6 address.
- Parameters
value – value to check
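One way to express such a check with the standard library's ipaddress module, shown as a sketch (the name looks_like_ip is hypothetical):

```python
from ipaddress import ip_address


def looks_like_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ip_address(value)
    except ValueError:
        return False
    return True
```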
-
pywikibot.tools.islice_with_ellipsis(iterable, *args, marker='…')
Generator which yields the first n elements of the iterable.
If more elements are available and marker evaluates to True, an extra marker string is yielded as a continuation mark.
The function takes the same arguments as itertools.islice and the additional keyword marker.
- Parameters
iterable (iterable) – the iterable to work on
args – same args as:
- itertools.islice(iterable, stop)
- itertools.islice(iterable, start, stop[, step])
marker (str) – element to yield if iterable still contains elements after showing the required number. Default value: ‘…’
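The behaviour can be sketched on top of itertools.islice: slice a shared iterator, then probe it once to see whether anything is left. The name islice_marked is hypothetical, and this is an approximation rather than the library's code.

```python
import itertools


def islice_marked(iterable, *args, marker='…'):
    """Yield an islice of iterable, then marker if more items remain."""
    stop = slice(*args).stop       # same argument rules as islice
    iterator = iter(iterable)
    yield from itertools.islice(iterator, *args)
    if marker and stop is not None:
        try:
            next(iterator)         # probe for one more element
        except StopIteration:
            pass
        else:
            yield marker
```

For instance, `list(islice_marked(range(10), 3))` ends with the marker, while `list(islice_marked(range(3), 3))` does not.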
-
pywikibot.tools.itergroup(iterable, size: int)
Make an iterator that returns lists of (up to) size items from iterable.
Example:
>>> i = itergroup(range(25), 10)
>>> print(next(i))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> print(next(i))
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> print(next(i))
[20, 21, 22, 23, 24]
>>> print(next(i))
Traceback (most recent call last):
...
StopIteration
-
pywikibot.tools.merge_unique_dicts(*args, **kwargs)
Return a merged dict and make sure that the original dicts’ keys are unique.
The positional arguments are the dictionaries to be merged. It is also possible to define an additional dict using the keyword arguments.
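A sketch of the merge with a uniqueness check. The name merge_unique_sketch is hypothetical, and raising ValueError on a duplicate key is an assumption, since the exception type is not stated here.

```python
def merge_unique_sketch(*args, **kwargs):
    """Merge dicts, refusing keys that occur in more than one of them."""
    dicts = list(args)
    if kwargs:
        dicts.append(kwargs)  # keyword arguments form one more dict
    merged = {}
    for mapping in dicts:
        duplicates = merged.keys() & mapping.keys()
        if duplicates:
            # Assumed error type for this sketch.
            raise ValueError('duplicate keys: {!r}'.format(duplicates))
        merged.update(mapping)
    return merged


merged = merge_unique_sketch({'a': 1}, {'b': 2}, c=3)
try:
    merge_unique_sketch({'a': 1}, a=2)
except ValueError:
    conflict_detected = True
else:
    conflict_detected = False
```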
-
pywikibot.tools.open_archive(filename, mode='rb', use_extension=True)
Open a file and uncompress it if needed.
This function supports bzip2, gzip, 7zip, lzma, and xz as compression containers. It uses the packages available in the standard library for bzip2, gzip, lzma, and xz so they are always available. 7zip is only available when a 7za program is available and only supports reading from it.
The compression is either selected via the magic number or file ending.
- Parameters
filename (str) – The filename.
use_extension (bool) – Use the file extension instead of the magic number to determine the type of compression (default True). Must be True when writing or appending.
mode (str) – The mode in which the file should be opened. It may either be ‘r’, ‘rb’, ‘a’, ‘ab’, ‘w’ or ‘wb’. All modes open the file in binary mode. It defaults to ‘rb’.
- Raises
ValueError – When 7za is not available or the opening mode is unknown or it tries to write a 7z archive.
FileNotFoundError – When the filename doesn’t exist and it tries to read from it or it tries to determine the compression algorithm.
OSError – When it’s not a 7z archive but the file extension is 7z. It is also raised by bz2 when its content is invalid. gzip does not immediately raise that error but only on reading it.
lzma.LZMAError – When error occurs during compression or decompression or when initializing the state with lzma or xz.
ImportError – When file is compressed with bz2 but neither bz2 nor bz2file is importable, or when file is compressed with lzma or xz but lzma is not importable.
- Returns
A file-like object returning the uncompressed data in binary mode.
- Return type
file-like object
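Selecting the decompressor by magic number can be sketched as follows for the three formats the standard library reads directly. The helper name open_by_magic is hypothetical. 7zip, legacy lzma files, write modes and extension-based selection are left out.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Leading bytes ("magic numbers") of bzip2, gzip and xz files.
MAGIC_OPENERS = {
    b'BZh': bz2.open,
    b'\x1f\x8b': gzip.open,
    b'\xfd7zXZ\x00': lzma.open,
}


def open_by_magic(filename):
    """Open filename for binary reading, decompressing on a magic match."""
    with open(filename, 'rb') as f:
        head = f.read(6)
    for magic, opener in MAGIC_OPENERS.items():
        if head.startswith(magic):
            return opener(filename, 'rb')
    return open(filename, 'rb')  # no known magic number: plain file


fd, path = tempfile.mkstemp(suffix='.dat')  # the extension gives no hint
os.close(fd)
with gzip.open(path, 'wb') as f:
    f.write(b'payload')
with open_by_magic(path) as f:
    content = f.read()
os.remove(path)
```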
-
pywikibot.tools.roundrobin_generators(*iterables)
Yield simultaneously from each iterable.
Sample:
>>> tuple(roundrobin_generators('ABC', range(5)))
('A', 0, 'B', 1, 'C', 2, 3, 4)
New in version 3.0.
- Parameters
iterables (iterable) – any iterable to combine in roundrobin way
- Returns
the combined generator of iterables
- Return type
generator
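A compact way to get this interleaving is zip_longest with a private sentinel. This is a sketch with a hypothetical name, not necessarily how the function is written.

```python
from itertools import chain, zip_longest


def roundrobin_sketch(*iterables):
    """Interleave iterables, skipping those that are already exhausted."""
    sentinel = object()  # marks slots of exhausted iterables
    rounds = zip_longest(*iterables, fillvalue=sentinel)
    return (item for item in chain.from_iterable(rounds)
            if item is not sentinel)
```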
-
class pywikibot.tools.suppress_warnings(message='', category=<class 'Warning'>, filename='')
Bases: warnings.catch_warnings
A decorator/context manager that temporarily suppresses warnings.
Those suppressed warnings that do not match the parameters will be shown upon exit.
New in version 3.0.
Initialize the object.
The parameter semantics are similar to those of warnings.filterwarnings.
- Parameters
message (str) – A string containing a regular expression that the start of the warning message must match. (case-insensitive)
category (type) – A class (a subclass of Warning) of which the warning category must be a subclass in order to match.
filename (str) – A string containing a regular expression that the start of the path to the warning module must match. (case-sensitive)
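Since the parameters follow warnings.filterwarnings, the matching rules can be illustrated with the standard library alone. This sketch does not use suppress_warnings itself. It only shows how message and category select what is ignored.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')  # record everything not ignored below
    warnings.filterwarnings('ignore', message='old',
                            category=DeprecationWarning)
    warnings.warn('Old API call', DeprecationWarning)  # ignored
    warnings.warn('unrelated notice', UserWarning)     # recorded

remaining = [str(w.message) for w in caught]
```

The pattern 'old' matches 'Old API call' because the message is compared case-insensitively against the start of the warning text.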
tools.chars module
Character based helper functions (not wiki-dependent).
-
pywikibot.tools.chars.contains_invisible(text)
Return True if the text contains any of the invisible characters.
-
pywikibot.tools.chars.replace_invisible(text)
Replace invisible characters by ‘<codepoint>’.
-
pywikibot.tools.chars.string2html(string: str, encoding: str) → str
Convert unicode string to requested HTML encoding.
Attempt to encode the string into the desired format; if that works, return it unchanged. Otherwise encode the non-ASCII characters into HTML &#; entities.
- Parameters
string – String to update
encoding – Encoding to use
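The try-then-fall-back logic can be sketched with Python's built-in xmlcharrefreplace error handler, which produces numeric entities of the form &#N;. The name to_html_sketch is hypothetical, and the real function may build the entities differently.

```python
def to_html_sketch(string, encoding):
    """Return string unchanged if encodable, else with numeric entities."""
    try:
        string.encode(encoding)
    except UnicodeError:
        # Replace each non-ASCII character with an entity such as &#233;
        return string.encode('ascii', 'xmlcharrefreplace').decode('ascii')
    return string
```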
-
pywikibot.tools.chars.string_to_ascii_html(string: str) → str
Convert unicode chars of str to HTML entities if chars are not ASCII.
-
pywikibot.tools.chars.url2string(title: str, encodings: Union[str, list, tuple] = 'utf-8') → str
Convert URL-encoded text to unicode using several encodings.
Uses the first encoding that doesn’t cause an error.
- Parameters
title – URL-encoded character data to convert
encodings – Encodings to attempt to use during conversion.
- Raises
UnicodeError – Could not convert using any encoding.
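The first-encoding-that-works rule can be sketched with urllib.parse.unquote_to_bytes. The name url_to_text is hypothetical, and the details of the real function may differ.

```python
from urllib.parse import unquote_to_bytes


def url_to_text(title, encodings='utf-8'):
    """Decode percent-encoded text with the first encoding that works."""
    if isinstance(encodings, str):
        encodings = [encodings]
    raw = unquote_to_bytes(title)
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeError:
            continue  # try the next candidate encoding
    raise UnicodeError(
        'could not decode {!r} with {!r}'.format(title, encodings))
```

Order matters: a permissive encoding such as latin-1 accepts any byte sequence, so it is best placed last.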
tools.djvu module
Wrapper around djvulibre to access djvu files properties and content.
-
class pywikibot.tools.djvu.DjVuFile(file: str, file_djvu='[deprecated name of file]')
Bases: object
Wrapper around djvulibre to access djvu files properties and content.
Perform file existence checks.
Control characters in djvu text-layer are converted for convenience (see http://djvu.sourceforge.net/doc/man/djvused.html for control chars details).
- Parameters
file – filename (including path) to djvu file
tools.formatter module
Module containing various formatting related utilities.
-
class pywikibot.tools.formatter.SequenceOutputter(sequence)
Bases: object
A class formatting a list of items.
It is possible to customize the appearance by changing format_string, which is used by str.format with index, width and item. Each line is joined by the separator and the complete text is surrounded by the prefix and the suffix. All three are by default a new line. The index starts at 1, and the width is the number of decimal digits in the sequence’s length. So a length of 100 will result in a width of 3 and a length of 99 in a width of 2.
It iterates over self.sequence to generate the text. That sequence can be any iterator but the result is better when it has an order.
Create a new instance with a reference to the sequence.
-
format_string = ' {index:>{width}} - {item}'
-
property out
Create the text with one item on each line.
-
prefix = '\n'
-
separator = '\n'
-
suffix = '\n'
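How format_string, separator, prefix and suffix combine can be sketched as a plain function using the documented defaults. The name format_sequence is hypothetical.

```python
def format_sequence(sequence,
                    format_string=' {index:>{width}} - {item}',
                    separator='\n', prefix='\n', suffix='\n'):
    """Format items one per line with a right-aligned, 1-based index."""
    width = len(str(len(sequence)))  # digits needed for the largest index
    lines = (format_string.format(index=i, item=item, width=width)
             for i, item in enumerate(sequence, start=1))
    return prefix + separator.join(lines) + suffix
```

With ten items the width becomes 2, so single-digit indexes are padded with one extra space to line up with index 10.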
-
-
pywikibot.tools.formatter.color_format(text: str, *args, **kwargs) → str
Do str.format without having to worry about colors.
It automatically adds \03 in front of color fields so it’s unnecessary to add them manually. Any other \03 in the text is disallowed.
You may use a variant {color} by assigning a valid color to a named parameter color.
- Parameters
text – The format template string
- Returns
The formatted string
tools._logging module
Logging tools.
-
class pywikibot.tools._logging.LoggingFormatter(fmt=None, datefmt=None, style='%', validate=True)
Bases: logging.Formatter
Format LogRecords for output to file.
Initialize the formatter with specified format strings.
Initialize the formatter either with the specified format string, or a default as described above. Allow for specialized date formatting with the optional datefmt argument. If datefmt is omitted, you get an ISO8601-like (or RFC 3339-like) format.
Use a style parameter of ‘%’, ‘{’ or ‘$’ to specify that you want to use one of %-formatting, str.format() ({}) formatting or string.Template formatting in your format string.
Changed in version 3.2: Added the style parameter.