946 Commits

Author SHA1 Message Date
Mike Fährmann
6a07e38366
implement extractor.add() and .add_module()
... as a public and non-hacky way to add (external) extractors to
gallery-dl's pool and make them available for extractor.find()
2018-02-02 00:01:41 +01:00
Mike Fährmann
a34cebc253
[luscious] jump to first image if cover does not link to it 2018-01-30 22:39:01 +01:00
Mike Fährmann
915807dd77
log HTTP errors as warnings 2018-01-29 21:55:46 +01:00
Mike Fährmann
db7f04dd97
emit log messages on download failure
and when retrying with fallback URLs
2018-01-28 18:44:10 +01:00
Mike Fährmann
d951f13e37
add config option for unsupported-URL file
for consistency's sake
2018-01-28 18:42:10 +01:00
Mike Fährmann
619387cbb1
update extractor unittest results 2018-01-28 18:29:05 +01:00
Mike Fährmann
364e335440
smaller adjustments and improvements
- requests and urllib3 version on 1 line
- close input file after reading from it
- use expand_path for unsupported-urls file
- remove unnecessary logging from options.py
2018-01-27 01:05:17 +01:00
Mike Fährmann
c9a9664a65
change --write-log behaviour
- log files now get truncated when opening them
  (mode "w" instead of "a")
- log verbosity to file depends on -q/-v
  (same as logging to stderr)
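
The two points above can be sketched with the standard logging module. The helper name setup_logfile, the log format and the mapping of -q/-v to ERROR/DEBUG (with INFO as default) are assumptions for illustration, not taken from the commit.

```python
import logging
import os
import tempfile


def setup_logfile(path, quiet=False, verbose=False):
    """Create a file handler that truncates `path` and follows -q/-v."""
    # Assumed mapping: -q -> ERROR, -v -> DEBUG, otherwise INFO.
    if quiet:
        level = logging.ERROR
    elif verbose:
        level = logging.DEBUG
    else:
        level = logging.INFO
    # mode "w" truncates an existing file instead of appending to it.
    handler = logging.FileHandler(path, mode="w", encoding="utf-8")
    handler.setLevel(level)
    handler.setFormatter(
        logging.Formatter("[%(name)s][%(levelname)s] %(message)s"))
    return handler


# Demo: a stale log file is overwritten, and -q drops the info message.
path = os.path.join(tempfile.mkdtemp(), "gallery.log")
with open(path, "w", encoding="utf-8") as fp:
    fp.write("stale line from an earlier run\n")

log = logging.getLogger("demo")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the demo output out of stderr

handler = setup_logfile(path, quiet=True)
log.addHandler(handler)
log.info("not written, below ERROR")
log.error("download failed")
log.removeHandler(handler)
handler.close()

with open(path, encoding="utf-8") as fp:
    contents = fp.read()
print(contents, end="")
```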
2018-01-27 00:51:40 +01:00
Mike Fährmann
97f4f15ec0
add option to write logging output to a file
- '--write-log FILE' as cmdline argument
- 'output.logfile' as config file option
2018-01-26 18:51:51 +01:00
Mike Fährmann
f94e3706a8
use logging module for error messages during downloads 2018-01-26 18:11:13 +01:00
Mike Fährmann
db91cf871c
document message identifiers 2018-01-23 21:38:30 +01:00
Mike Fährmann
0dd48d644f
update test results
nothing broke, but things got updated or changed
2018-01-23 21:38:29 +01:00
Mike Fährmann
1e93955170
[batoto] remove module
Site officially shut down on 2018.01.18
2018-01-23 21:37:32 +01:00
Mike Fährmann
27fce6f600
fix UrlJob behavior 2018-01-23 15:42:26 +01:00
Mike Fährmann
76509a6d3c
[imgur] update test results 2018-01-20 18:49:29 +01:00
Mike Fährmann
9fccd7b783
[tumblr] provide fallback URLs (#64)
Each image now produces 3 URLs:
- amazonaws.com _raw (or _1280 for older images)
- amazonaws.com _500
- media.tumblr.com (URL returned by API)
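
A rough sketch of how an ordered list of fallback URLs might be consumed. The function names, the fake fetch callable and the example.org URLs are placeholders, not the real Tumblr URLs or gallery-dl's downloader.

```python
import logging

log = logging.getLogger("download")


def download_with_fallback(urls, fetch):
    """Try each URL in order; return (url, data) for the first success."""
    for index, url in enumerate(urls):
        try:
            return url, fetch(url)
        except OSError as exc:
            log.warning("download of %s failed: %s", url, exc)
            if index + 1 < len(urls):
                log.info("retrying with fallback URL")
    return None, None


def fake_fetch(url):
    """Placeholder fetcher: only the '_500' variant is available."""
    if not url.endswith("_500.jpg"):
        raise OSError("404 Not Found")
    return b"image-bytes"


# Placeholder URLs in priority order: raw, 500px, API-provided.
candidates = [
    "https://example.org/img_raw.jpg",
    "https://example.org/img_500.jpg",
    "https://example.org/img_api.jpg",
]
used_url, data = download_with_fallback(candidates, fake_fetch)
print(used_url)
```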
2018-01-19 23:12:15 +01:00
Mike Fährmann
b837420291
fix minor urllist issues 2018-01-19 22:54:15 +01:00
Mike Fährmann
9d69401391
initial support for multiple URLs per image 2018-01-17 22:08:19 +01:00
Mike Fährmann
6174a5c4ef
[download] adjust filename extension on filetype mismatch
(closes #63)
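
One possible way to detect such a mismatch is to compare the file's leading bytes against well-known signatures. This is a sketch under that assumption; the helper names and the short signature table are illustrative and not taken from the commit.

```python
import os.path

# Well-known file signatures (magic bytes) for three common image types.
SIGNATURES = (
    (b"\xff\xd8\xff", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF8", "gif"),
)


def detect_extension(head):
    """Return the extension implied by the first bytes, or None if unknown."""
    for magic, ext in SIGNATURES:
        if head.startswith(magic):
            return ext
    return None


def adjust_extension(filename, head):
    """Replace the filename extension if the content says otherwise."""
    detected = detect_extension(head)
    if detected is None:
        return filename  # unknown content: leave the name alone
    stem, current = os.path.splitext(filename)
    if current.lstrip(".").lower() == detected:
        return filename
    return stem + "." + detected


renamed = adjust_extension("image.jpg", b"\x89PNG\r\n\x1a\n\x00\x00")
print(renamed)
```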
2018-01-17 18:37:06 +01:00
Mike Fährmann
91ed147cef
[oauth] use custom key/secret values during oauth:… 2018-01-16 17:39:46 +01:00
Mike Fährmann
421a9740a3
[tumblr] add 'tumblr:' to force Tumblr extractor (#71) 2018-01-15 18:27:58 +01:00
Mike Fährmann
40d35c87bc
[paheal] add tag- and post-extractors (closes #69) 2018-01-15 16:39:05 +01:00
Mike Fährmann
cc0c2cca57
[reddit] add extractor for reddit-hosted images (closes #68) 2018-01-14 18:55:42 +01:00
Mike Fährmann
f10ffc0839
update extractor blacklist to also allow classes 2018-01-14 18:47:22 +01:00
Mike Fährmann
b6797032e3
release version 1.1.2 2018-01-12 15:09:18 +01:00
Mike Fährmann
35e09869d1
[mangapark] fix image URLs and use HTTPS 2018-01-12 14:59:49 +01:00
Mike Fährmann
9a049bdf51
[tumblr] add 'likes' extractor (#65) 2018-01-12 14:56:01 +01:00
Mike Fährmann
67d4462d26
[batoto] rudimentary Cloudflare bypass 2018-01-11 18:49:19 +01:00
Mike Fährmann
29d75fc3fa
[tumblr] add support for OAuth authentication (#65) 2018-01-11 14:11:37 +01:00
Mike Fährmann
4edb25346e
[slideshare] support mobile URLs (closes #67) 2018-01-10 14:15:00 +01:00
Mike Fährmann
e420a28bbc
fix cookie tests 2018-01-09 21:43:52 +01:00
Mike Fährmann
b33efc99a4
[idolcomplex] add support for idol.sankakucomplex.com 2018-01-09 17:54:37 +01:00
Mike Fährmann
75b2e84b6d
[tumblr] use s3.amazonaws.com for image URLs (#64) 2018-01-09 15:13:00 +01:00
Mike Fährmann
5b094328b5
[puremashiro] add chapter- and manga-extractor (closes #66)
Also adds support for region subtags in language codes (e.g. en-us)
2018-01-07 21:50:43 +01:00
Mike Fährmann
974e73bdbb
[booru] smaller code adjustments 2018-01-06 17:48:49 +01:00
Mike Fährmann
03b8a548cb
[tumblr] change reblogs default value to true (#61) 2018-01-06 15:52:08 +01:00
Mike Fährmann
d235f68f59
[tumblr] add option to filter reblogged posts (#61)
Reblogs are ignored by default, but can be included by setting
'extractor.tumblr.reblogs' to 'true'.
2018-01-05 13:05:57 +01:00
Mike Fährmann
a794fffc6d
[batoto] extend chapter-string regex (closes #60)
Non-numeric chapter indices exist after all ...
2018-01-05 12:53:50 +01:00
Mike Fährmann
1219ebb7f5
[danbooru] use alternate subdomains; support safebooru 2018-01-04 00:51:04 +01:00
Mike Fährmann
9e8a84ab6c
[booru] rewrite using Mixin classes (#59)
- improved code structure
- improved URL patterns
- better pagination to work around page limits on
  - Danbooru
  - e621
  - 3dbooru
2018-01-04 00:01:39 +01:00
Mike Fährmann
0876541e43
[seiga] update tests 2017-12-30 19:19:36 +01:00
Mike Fährmann
1a70857a12
update extractor-unittest capabilities
- "count" can now be a string defining a comparison in the form of
  '<operator> <value>', for example: '> 12' or '!= 1'. If its value
  is not a string, it is assumed to be a concrete integer as before.

- "keyword" can now be a dictionary defining tests for individual keys.
  These tests can either be a type, a concrete value or a regex
  starting with "re:". Dictionaries can be stacked inside each other.
  Optional keys can be indicated with a "?" before their names.

  For example:
      "keyword": {
          "image_id": int,
          "gallery_id": 123,
          "name": "re:pattern",
          "user": {
              "id": 321,
          },
          "?optional": None,
      }
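
  The rules above could be checked roughly as follows. This is only a sketch: the function names, the set of supported operators, the use of re.search, and comparing every other value (including None) by equality are assumptions, not the test suite's actual code.

```python
import operator
import re

# Assumed operator set; the commit message only shows '>' and '!='.
_OPERATORS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}


def check_count(expected, actual):
    """`expected` is an int or a '<operator> <value>' string."""
    if isinstance(expected, str):
        op, _, value = expected.partition(" ")
        return _OPERATORS[op](actual, int(value))
    return actual == expected


def check_keywords(tests, keywords):
    """Apply per-key tests: type, 're:' regex, nested dict or plain value."""
    for key, test in tests.items():
        if key.startswith("?"):
            key = key[1:]
            if key not in keywords:
                continue  # optional key is absent: nothing to check
        if key not in keywords:
            return False
        value = keywords[key]
        if isinstance(test, dict):
            if not isinstance(value, dict) or not check_keywords(test, value):
                return False
        elif isinstance(test, type):
            if not isinstance(value, test):
                return False
        elif isinstance(test, str) and test.startswith("re:"):
            if not isinstance(value, str) or not re.search(test[3:], value):
                return False
        elif value != test:
            return False
    return True


tests = {
    "image_id": int,
    "gallery_id": 123,
    "name": "re:pattern",
    "user": {"id": 321},
    "?optional": None,
}
keywords = {
    "image_id": 7,
    "gallery_id": 123,
    "name": "some pattern here",
    "user": {"id": 321},
}
print(check_keywords(tests, keywords), check_count("> 12", 13))
```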
2017-12-30 19:05:37 +01:00
Mike Fährmann
88bb0798fd
delay initialization of PathFormat objects
This allows the DeviantArt group-check to be moved inside the
Extractor.items() method which in turn allows for better exception
handling.

As a new general rule:
Never raise exceptions during extractor initialization.
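
A small sketch of that rule, with made-up class and error names: the constructor only stores its arguments, and the check that may fail runs inside items(), where a single caller can handle the exception.

```python
class NotFoundError(Exception):
    """Hypothetical error type for a missing resource."""


class GroupExtractor:
    """Made-up extractor; `known_groups` stands in for a remote lookup."""

    def __init__(self, name, known_groups):
        # Only store arguments here: no lookups, no exceptions.
        self.name = name
        self.known_groups = known_groups

    def items(self):
        # The check runs lazily, on the first iteration step.
        if self.name not in self.known_groups:
            raise NotFoundError("group '%s' not found" % self.name)
        for index in range(1, 3):
            yield "%s-%d" % (self.name, index)


def run(extractor):
    """One place that handles errors raised while iterating."""
    try:
        return list(extractor.items())
    except NotFoundError as exc:
        return "error: %s" % exc


good = run(GroupExtractor("art", {"art"}))
bad = run(GroupExtractor("missing", {"art"}))
print(good, bad)
```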
2017-12-29 22:15:57 +01:00
Mike Fährmann
c24e0e70a7
[pixiv] simplify main loop 2017-12-28 14:13:39 +01:00
Mike Fährmann
c1e331edbb
[mangapark] replace manga test 2017-12-28 13:58:32 +01:00
Mike Fährmann
5488643fac
add requests and urllib3 versions to debug output 2017-12-27 22:12:40 +01:00
Mike Fährmann
9d73ed4772
fix issue with using 'skip()' when a filter is present
calling skip() skips over unfiltered items and does not apply
the filter expression to them, which is not what should happen
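
To illustrate the difference, here is a sketch with hypothetical helper names (not the code changed by this commit): skipping before filtering drops items the filter would have rejected anyway, while a filter-aware skip only counts items that pass.

```python
import itertools


def skip_matching(items, predicate, num):
    """Skip the first `num` items that pass `predicate`; yield the rest that pass."""
    skipped = 0
    for item in items:
        if not predicate(item):
            continue
        if skipped < num:
            skipped += 1
            continue
        yield item


def is_even(number):
    return number % 2 == 0


# Filter-aware: the two skipped items are 0 and 2.
aware = list(skip_matching(range(10), is_even, 2))

# Naive: skips 0 and 1 regardless of the filter, then filters.
naive = [n for n in itertools.islice(range(10), 2, None) if is_even(n)]

print(aware, naive)
```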
2017-12-27 22:09:10 +01:00
Mike Fährmann
28cd78aae0
[kissmanga] extend chapter-string regex (closes #58) 2017-12-24 22:53:10 +01:00
Mike Fährmann
0ba618dd1a
release version 1.1.1 2017-12-22 17:01:04 +01:00
Mike Fährmann
a3e9b51bea
[imgbox] update test results
Image URLs of older galleries have been updated to the new format.

https://i.imgbox.com/qHhw7lpG.png
 -->
https://images3.imgbox.com/6d/9a/qHhw7lpG_o.png
2017-12-22 16:09:14 +01:00