[twitter] add 'text-only' option (#570)

Mike Fährmann 2021-05-22 17:01:49 +02:00
parent 8fd8126117
commit 724ca61f36
GPG Key ID: 5680CA389D365A88
2 changed files with 18 additions and 3 deletions


@@ -1715,6 +1715,20 @@ Description
will be taken from the original Tweets, not the Retweets.
extractor.twitter.text-only
---------------------------
Type
``bool``
Default
``false``
Description
Produce metadata for Tweets without media content.
This only has an effect with a ``metadata`` (or ``exec``) post processor
with `"event": "post" <metadata.event_>`_
and appropriate `filename <metadata.filename_>`_.
extractor.twitter.twitpic
-------------------------
Type
@@ -2217,7 +2231,7 @@ Postprocessor Options
This section lists all options available inside
`Postprocessor Configuration`_ objects.
-Each option is titled as ``<name>.<option>``, meaning a post procesor
+Each option is titled as ``<name>.<option>``, meaning a post processor
of type ``<name>`` will look for an ``<option>`` field inside its "body".
For example an ``exec`` post processor will recognize
an `async <exec.async_>`__, `command <exec.command_>`__,
@@ -2406,7 +2420,7 @@ Description
The available events are:
``init``
-After post procesor initialization
+After post processor initialization
and before the first file download
``finalize``
On extractor shutdown, e.g. after all files were downloaded
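
The `extractor.twitter.text-only` description in the first hunk says the option only has an effect together with a `metadata` (or `exec`) post processor that uses `"event": "post"` and a suitable `filename`. A minimal sketch of such a configuration, built in Python and printed as JSON, might look like the following. The `{tweet_id}.json` filename template and the choice to attach the post processor directly under `extractor.twitter` are illustrative assumptions, not taken from this commit.

```python
import json

# Hypothetical configuration sketch: enable text-only Tweets and write
# one metadata file per Tweet via a "metadata" post processor.
config = {
    "extractor": {
        "twitter": {
            # Option added by this commit (defaults to false).
            "text-only": True,
            "postprocessors": [
                {
                    "name": "metadata",
                    # Run once per post instead of once per downloaded file.
                    "event": "post",
                    # Assumed filename template; adjust to taste.
                    "filename": "{tweet_id}.json",
                }
            ],
        }
    }
}

config_text = json.dumps(config, indent=4)
print(config_text)
```

The printed JSON could then be merged into an existing configuration file. Without the post processor, enabling `text-only` alone would produce no visible output for media-less Tweets, since there is no file to download.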


@@ -32,6 +32,7 @@ class TwitterExtractor(Extractor):
def __init__(self, match):
Extractor.__init__(self, match)
self.user = match.group(1)
self.textonly = self.config("text-only", False)
self.retweets = self.config("retweets", True)
self.replies = self.config("replies", True)
self.twitpic = self.config("twitpic", False)
@@ -64,7 +65,7 @@ class TwitterExtractor(Extractor):
self._extract_card(tweet, files)
if self.twitpic:
self._extract_twitpic(tweet, files)
-if not files:
+if not files and not self.textonly:
continue
tdata = self._transform_tweet(tweet)
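
The effect of the changed condition can be illustrated in isolation. The sketch below is not the extractor's code: `select_tweets` and the `"media"` key are hypothetical stand-ins for the real loop and its `files` list, used only to show that media-less Tweets are skipped unless the text-only flag is set.

```python
def select_tweets(tweets, textonly=False):
    """Yield (tweet, files) pairs, mirroring the patched skip condition."""
    for tweet in tweets:
        # Stand-in for the media/card/twitpic extraction steps.
        files = list(tweet.get("media", ()))
        # Same test as in the diff: skip only when there are no files
        # AND text-only mode is disabled.
        if not files and not textonly:
            continue
        yield tweet, files


tweets = [
    {"id": 1, "media": ["a.jpg"]},  # Tweet with media
    {"id": 2},                      # text-only Tweet
]

default_ids = [t["id"] for t, _ in select_tweets(tweets)]
textonly_ids = [t["id"] for t, _ in select_tweets(tweets, textonly=True)]
print(default_ids)   # [1]
print(textonly_ids)  # [1, 2]
```

With the default `False`, behavior matches the old `if not files:` check, so existing setups are unaffected.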