Mike Fährmann
569747a78d
implement extractor.wait()
2020-01-04 23:42:07 +01:00
Mike Fährmann
ce54b8c04c
let extractors opt-out of cookie option usage
...
useful to avoid sending unnecessary cookies when all authentication
is done through OAuth tokens
2020-01-01 21:12:37 +01:00
Mike Fährmann
48e42e73fb
[reddit] change default value for 'comments' to '0'
2019-12-20 16:54:59 +01:00
Mike Fährmann
9c0928457a
[reddit] fix errors with 't1_…' submissions
2019-12-20 16:49:44 +01:00
Mike Fährmann
df2b3c6888
restore OAuth2 authentication error messages
2019-10-13 22:48:01 +02:00
Mike Fährmann
6d0a533d68
[reddit] respect 'comments:0' for single submissions (#429)
2019-09-27 23:11:28 +02:00
Mike Fährmann
46ba173ded
[reddit] fix documentation inconsistencies (closes #429)
...
- Require 'reddit.comments' to be a number and convert it to an
integer to be extra sure
- Link to the README's OAuth section where appropriate
2019-09-27 17:34:10 +02:00
Mike Fährmann
913460240d
[reddit] fix 'extractor.blacklist()' arguments
...
The second argument must support 'append()'.
2019-09-24 23:01:12 +02:00
Mike Fährmann
946f2751e2
[reddit] add 'user' extractor (closes #350)
2019-09-22 22:18:17 +02:00
Mike Fährmann
c14abb9fb8
[reddit] improve URL parameter handling for subreddit links
2019-09-22 22:03:22 +02:00
Mike Fährmann
f4bc75e854
fix rate limit handling for OAuth APIs (#368)
2019-08-03 13:43:00 +02:00
Mike Fährmann
09f37fde39
[reddit] move date-min/-max handling into Extractor class
2019-07-16 22:54:39 +02:00
Mike Fährmann
fdec59f8e2
replace extractor.request() 'expect' argument
...
with
- 'fatal': allow 4xx status codes
- 'notfound': raise NotFoundError on 404
2019-07-05 00:42:16 +02:00
Mike Fährmann
a2af2d2965
adjust cache maxage values
2019-03-14 22:21:49 +01:00
Mike Fährmann
5530871b5a
change results of text.nameext_from_url()
...
Instead of getting a complete 'filename' from a URL and splitting that
into 'name' and 'extension', the new approach gets rid of the complete
version and renames 'name' to 'filename'. (Using anything other than
{extension} for a filename extension doesn't really work anyway)
Example: "https://example.org/path/filename.ext"
before:
- filename : filename.ext
- name : filename
- extension: ext
now:
- filename : filename
- extension: ext
2019-02-14 16:07:17 +01:00
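The before/after result of the text.nameext_from_url() change can be illustrated with a minimal sketch. The helper name split_name_ext and its handling of edge cases are assumptions for illustration only, not the library's actual implementation; only the 'filename' and 'extension' keys come from the commit message.

```python
from urllib.parse import urlsplit, unquote


def split_name_ext(url):
    """Hypothetical helper: return the new-style 'filename'/'extension' pair."""
    # Take the last path segment and undo percent-encoding.
    basename = unquote(urlsplit(url).path.rpartition("/")[2])
    # Split on the last dot; 'name' is the part before it.
    name, dot, ext = basename.rpartition(".")
    if dot and name:
        return {"filename": name, "extension": ext}
    # No usable dot: treat the whole segment as the filename.
    return {"filename": basename, "extension": ""}


result = split_name_ext("https://example.org/path/filename.ext")
print(result)
```

Under these assumptions, 'filename' no longer carries the extension, which matches the "now:" listing in the commit message.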
Mike Fährmann
2e516a1e3e
store the full original URL in Extractor.url
2019-02-12 18:46:48 +01:00
Mike Fährmann
4b1880fa5e
propagate 'match' to base extractor constructor
2019-02-11 13:31:10 +01:00
Mike Fährmann
abbd45d0f4
update handling of extractor URL patterns
...
When loading extractor classes during 'extractor.find(…)', their
'pattern' attribute will be replaced with a compiled version of itself.
2019-02-08 20:08:16 +01:00
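A rough sketch of what replacing a class's 'pattern' string with its compiled form could look like. The ExampleExtractor class, its pattern, and this find() function are hypothetical stand-ins; the real 'extractor.find(…)' may compile patterns at a different point and return a different value.

```python
import re


class ExampleExtractor:
    """Hypothetical extractor class with a single-string URL pattern."""
    pattern = r"(?:https?://)?example\.org/(\d+)"


def find(url, classes):
    """Return (class, match) for the first class whose pattern matches."""
    for cls in classes:
        # Compile once and store the result back on the class,
        # so later lookups reuse the compiled pattern.
        if isinstance(cls.pattern, str):
            cls.pattern = re.compile(cls.pattern)
        match = cls.pattern.match(url)
        if match:
            return cls, match
    return None


found = find("https://example.org/42", [ExampleExtractor])
print(found[1].group(1))
```

After the first call, ExampleExtractor.pattern is a compiled pattern object rather than a string, so repeated lookups skip the compile step.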
Mike Fährmann
6284731107
simplify extractor constants
...
- single strings for URL patterns
- tuples instead of lists for 'directory_fmt' and 'test'
- single-tuple tests where applicable
2019-02-08 13:45:40 +01:00
Mike Fährmann
6126615698
update URLs for supportedsites.rst
2019-01-30 16:18:22 +01:00
Mike Fährmann
4ab0960083
[reddit] add metadata to extracted URLs
2018-12-29 17:52:43 +01:00
Mike Fährmann
7471933d5f
use extractor.request for all other API calls
...
- deviantart
- pawoo
- pixiv
- reddit
2018-12-22 14:42:23 +01:00
Mike Fährmann
966a9ca3a0
update test results
2018-11-10 19:14:54 +01:00
Mike Fährmann
c9b8e6aefc
[reddit] fix submission-ID parsing (#104)
...
Uppercase characters caused a ValueError exception
2018-09-07 18:27:54 +02:00
Mike Fährmann
4313c95bc9
improve error message for OAuth2 authentication
2018-08-11 23:54:25 +02:00
Mike Fährmann
92fc199b07
[reddit] allow arbitrary subdomains
2018-05-13 11:23:23 +02:00
Mike Fährmann
3cec533c28
Merge branch 'archive'
2018-02-12 18:07:58 +01:00
Mike Fährmann
20af86b2ea
add more extractor tests
...
for mangastream, reddit and imgur
2018-02-12 17:07:18 +01:00
Mike Fährmann
34873dbd90
set 'archive_fmt' values
...
These are going to be used to create a unique id for each image.
2018-02-01 15:30:49 +01:00
Mike Fährmann
cc0c2cca57
[reddit] add extractor for reddit-hosted images (closes #68)
2018-01-14 18:55:42 +01:00
Mike Fährmann
676602056c
[reddit] unescape output URLs
2017-12-19 22:22:43 +01:00
Mike Fährmann
864a63ed33
fix typo
...
[skip ci]
2017-10-10 17:42:06 +02:00
Mike Fährmann
f3fbaa5c3e
[reddit] allow users to override the API User-Agent
...
Only overriding the Client-ID is not enough if you want to follow
Reddit's API access rules [1].
[1] https://github.com/reddit/reddit/wiki/API#rules
2017-10-10 17:29:46 +02:00
Mike Fährmann
0dedbe759c
enable '--chapter-filter'
...
The same filter infrastructure that can be applied to image URLs now
also works for manga chapters and other delegated URLs.
TODO: actually provide any metadata (currently only deviantart and
imagefap are supported).
2017-09-12 16:19:00 +02:00
Mike Fährmann
54c0715135
allow users to set their own API access_tokens/client_ids
2017-09-09 17:50:19 +02:00
Mike Fährmann
85696d0b3b
[reddit] fix issue with datetime errors
2017-07-02 08:19:45 +02:00
Mike Fährmann
80c2e03aaa
[reddit] allow 'date-min/max' to be human readable dates
...
If the date-min/max config value is a string, try parsing it using
datetime.strptime [1] with 'date-format' as format string [2]
(default: "%Y-%m-%dT%H:%M:%S")
Example: get all submissions posted in 2016
$ gallery-dl reddit.com/r/... \
-o date-format=%Y \
-o date-min=\"2016\" \
-o date-max=\"2017\"
[1] https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime
[2] https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
2017-07-01 18:46:38 +02:00
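The string handling for date-min/max described above can be sketched as follows. The function name parse_bound is hypothetical, and interpreting the parsed date as UTC is an assumption (based on the earlier note that these values should be UTC timestamps); only datetime.strptime and the default format string come from the commit message.

```python
from datetime import datetime, timezone

DEFAULT_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_bound(value, fmt=DEFAULT_FORMAT):
    """Hypothetical helper: turn a date-min/max value into a timestamp."""
    if isinstance(value, str):
        # Strings are parsed with the configured 'date-format'.
        parsed = datetime.strptime(value, fmt)
        # Assumption: the resulting naive datetime is treated as UTC.
        return int(parsed.replace(tzinfo=timezone.utc).timestamp())
    # Numbers are taken to be timestamps already.
    return int(value)


print(parse_bound("2016", "%Y"), parse_bound("2017", "%Y"))
```

With 'date-format' set to %Y, the two bounds from the command-line example resolve to the start of 2016 and the start of 2017.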
Mike Fährmann
f3d0373120
[reddit] add ability to filter by submission id
...
'extractor.reddit.id-min' and '….id-max' specify the lowest and
highest submission-/post-id to consider, similar to 'date-min' and
'date-max'
2017-06-29 17:39:22 +02:00
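One way to picture the id-min/id-max check, assuming submission IDs are compared as base-36 numbers. The function name and the None defaults are hypothetical and not taken from the extractor's code.

```python
def id_in_range(submission_id, id_min=None, id_max=None):
    """Hypothetical check: is the ID within [id_min, id_max]?"""
    # Assumption: IDs are base-36 strings and compare numerically.
    value = int(submission_id, 36)
    if id_min is not None and value < int(id_min, 36):
        return False
    if id_max is not None and value > int(id_max, 36):
        return False
    return True


print(id_in_range("6kabcd", id_min="6k0000", id_max="6kzzzz"))
```

Comparing numeric values rather than raw strings keeps IDs of different lengths in the expected order.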
Mike Fährmann
2993206c4b
smaller fixes and "security" measures
...
- move the OAuthSession class into util.py
- block special extractors for reddit and recursive
- ignore 'only matching' tests for testresults script
2017-06-16 21:01:40 +02:00
Mike Fährmann
56bec79e6a
[reddit] add ability to load more comments (#15)
...
The 'extractor.reddit.morecomments' option enables the use of
the '/api/morechildren' API endpoint (1) to load even more
comments than the usual submission-request provides.
Possible values are the booleans 'true' and 'false' (default).
Note: this feature comes at the cost of 1 extra API call towards
the rate limit for every 100 extra comments.
(1) https://www.reddit.com/dev/api/#GET_api_morechildren
2017-06-13 18:49:07 +02:00
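The rate-limit cost noted above works out to a simple calculation. This sketch assumes a partial batch still costs a full call (rounding up); the function name and batch_size parameter are hypothetical.

```python
def extra_api_calls(extra_comments, batch_size=100):
    """Estimate extra API calls needed for additional comments."""
    # Ceiling division: assumes any partial batch costs one call.
    return -(-extra_comments // batch_size)


for count in (0, 100, 250):
    print(count, extra_api_calls(count))
```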
Mike Fährmann
090e11b35d
[reddit] enable user authentication with OAuth2 (#15)
...
Call '$ gallery-dl oauth:reddit' to get a refresh_token
for your account.
2017-06-08 16:17:13 +02:00
Mike Fährmann
8456b84a12
fix tests and small stuff
2017-06-06 14:22:09 +02:00
Mike Fährmann
fbfc8d0f78
[reddit] ignore Authorization errors for subreddits
...
- also made the limit for retrieved comments customizable via
the 'extractor.reddit.comments' config value
- default is 500; 0 ignores comments completely
2017-06-05 18:43:08 +02:00
Mike Fährmann
5f05543f23
[reddit] support filtering by timestamp (#15)
...
- Added the 'extractor.reddit.date-min' and '….date-max'
config options. These values should be UTC timestamps.
- All submissions not posted in date-min <= T <= date-max
will be ignored.
- Fixed the limit parameter for submission comments by setting
it to its apparent max value (500).
2017-06-03 13:33:48 +02:00
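The date-min <= T <= date-max rule above amounts to an inclusive range filter. In this sketch the function name and default bounds are hypothetical, and submissions are represented as plain dicts carrying Reddit's 'created_utc' timestamp field.

```python
def filter_by_date(submissions, date_min=0, date_max=float("inf")):
    """Keep only submissions with date_min <= created_utc <= date_max."""
    return [
        sub for sub in submissions
        if date_min <= sub["created_utc"] <= date_max
    ]


sample = [
    {"id": "a", "created_utc": 1451606399},  # just before 2016 (UTC)
    {"id": "b", "created_utc": 1451606400},  # start of 2016 (UTC)
    {"id": "c", "created_utc": 1483228801},  # just after start of 2017
]
kept = filter_by_date(sample, 1451606400, 1483228800)
print([sub["id"] for sub in kept])
```

Both bounds are inclusive here, mirroring the "<=" comparisons in the commit message.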
Mike Fährmann
bce51e90a5
[reddit] support sorting options and sub-options (#15)
...
Example:
https://www.reddit.com/r/<subreddit>/top/?sort=top&t=month
(the 'sort=top' parameter is irrelevant and can be omitted)
2017-05-29 12:45:35 +02:00
Mike Fährmann
99b72130ee
[reddit] enable recursion (#15)
...
reddit extractors now recursively visit other submissions/posts
linked to in the initial set of submissions.
This behaviour can be configured via the 'extractor.reddit.recursion'
key in the configuration file or by `-o recursion=<value>`.
Example:
{"extractor": {
"reddit": {
"recursion": <value>
}}}
Possible values:
* -1 - infinite recursion (don't do this)
* 0 - recursion is disabled (default)
* 1 and higher - maximum recursion level
2017-05-26 17:01:27 +02:00
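The meaning of the three 'recursion' values can be sketched as a depth-limited traversal. Everything here is hypothetical: the function name, the 'links' mapping from a submission to the submissions it links to, and the 'seen' set, which this sketch adds so that -1 terminates on cyclic links (the commit message makes no such guarantee for the real extractor).

```python
def collect(start_ids, links, recursion=0):
    """Visit submissions level by level up to the given recursion depth."""
    seen = set()
    order = []
    current = list(start_ids)
    depth = 0
    while current:
        following = []
        for sid in current:
            if sid in seen:
                continue
            seen.add(sid)
            order.append(sid)
            # -1 means no depth limit; otherwise stop at 'recursion' levels.
            if recursion < 0 or depth < recursion:
                following.extend(links.get(sid, ()))
        current = following
        depth += 1
    return order


links = {"a": ["b"], "b": ["c"], "c": ["a"]}
for level in (0, 1, -1):
    print(level, collect(["a"], links, level))
```

With recursion at 0 only the initial submission is visited; each higher level follows links one step further.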
Mike Fährmann
e425243b1e
[reddit] some small fixes
...
- filter or complete some URLs
- remove the 'nofollow:' scheme before printing URLs
- (#15)
2017-05-23 11:48:00 +02:00
Mike Fährmann
a22892f494
[reddit] add subreddit- and submission-extractor
...
- these extractors scan submissions and their comments for
(external) URLs and defer them to other extractors
- (#15)
2017-05-23 09:38:50 +02:00