946 Commits

Author SHA1 Message Date
Mike Fährmann
0ba93650e0
[8chan] replace unit test URL
the other thread is no longer accessible
2018-07-14 09:53:16 +02:00
Mike Fährmann
269dc2bbd5
[sankaku] add 'tags' option (#94) 2018-07-14 09:53:01 +02:00
Mike Fährmann
173add6935
[nijie] fix artist_id extraction
view_popup.php pages for older images or dojins either have the
artist_id value at a different place or not at all.
2018-07-10 12:30:53 +02:00
Mike Fährmann
6996f5c118
[mangahere] fix and improve chapter extraction 2018-07-09 20:07:40 +02:00
Mike Fährmann
1d43cbbf52
[gelbooru] tag-splitting for non-api mode 2018-07-06 15:24:19 +02:00
Mike Fährmann
2eefaa99a3
[mangapark] support .net and .com mirrors 2018-07-05 14:45:05 +02:00
Mike Fährmann
c20c0a4820
[safebooru] add pool extractor 2018-07-04 12:24:57 +02:00
Mike Fährmann
f916279ae6
[rule34] add pool extractor 2018-07-04 12:24:01 +02:00
Mike Fährmann
3dbc7c5f8d
[gelbooru] restore pool functionality 2018-07-04 12:21:41 +02:00
Mike Fährmann
a2c74bc6f0
[gelbooru] inherit from BooruExtractor class
Breaks pool functionality when using API calls (for now),
but reduces code clutter and enables the `tags` option.
2018-07-04 12:21:41 +02:00
Mike Fährmann
4a57509392
generalize tag-splitting option (#92)
- extend functionality to other booru sites:
  - http://behoimi.org/
  - https://konachan.com/
  - https://e621.net/
  - https://rule34.xxx/
  - https://safebooru.org/
  - https://yande.re/
2018-07-04 12:21:16 +02:00
Mike Fährmann
188e956c4e
[imagefap] use HTTPS + update test results 2018-06-30 19:40:46 +02:00
Mike Fährmann
87853538b4
[yandere] add option to split tags by type (#92) 2018-06-29 19:38:53 +02:00
Mike Fährmann
a699787d01
[deviantart] update URL patterns to new format
DeviantArt changed its URL format from
https://<name>.deviantart.com/...
to
https://www.deviantart.com/<name>/...

With this change both formats will be supported.
2018-06-28 20:21:59 +02:00
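A minimal sketch of how one pattern could accept both DeviantArt URL formats named in commit a699787d01. The expression and the `extract_user` helper are hypothetical illustrations, not the project's actual pattern.

```python
import re

# Hypothetical pattern (an assumption, not the project's real expression).
PATTERN = re.compile(
    r"https?://(?:"
    # new format: https://www.deviantart.com/<name>/...
    r"www\.deviantart\.com/(?P<new>[\w-]+)"
    # old format: https://<name>.deviantart.com/...
    r"|(?!www\.)(?P<old>[\w-]+)\.deviantart\.com"
    r")(?P<path>/.*)?$"
)


def extract_user(url):
    """Return the user name from either URL format, or None if no match."""
    match = PATTERN.match(url)
    if match is None:
        return None
    return match.group("new") or match.group("old")
```

The negative lookahead keeps `www` from being read as a user name when an old-style match is attempted.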
Mike Fährmann
9e3415886c
[senmanga] fix/update tests 2018-06-27 20:05:22 +02:00
Mike Fährmann
b8c97d2295
use 'extractor.request()' for more HTTP requests 2018-06-25 23:40:59 +02:00
Mike Fährmann
150a6b9064
[xvideos] fix metadata extraction 2018-06-22 16:32:04 +02:00
Mike Fährmann
7a98cc9798
[smugmug] update tests
My test account expired and all uploaded images got deleted.
2018-06-22 15:04:31 +02:00
Mike Fährmann
91340d9d27
[pixiv] fix ugoira test 2018-06-18 19:22:54 +02:00
Mike Fährmann
eb7a1f3b98
[pixiv] rework ugoira handling
Frame information now gets attached to the ZIP file's keyword dict
instead of being written to a separate text file.
2018-06-18 17:57:57 +02:00
Mike Fährmann
017188d268
improve extractor.request()
Replace the 'fatal' parameter with 'expect', which is a list/range
of HTTP status codes >= 400 that should also be accepted.
2018-06-18 16:29:56 +02:00
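The 'expect' behaviour described in commit 017188d268 can be sketched roughly as follows. `validate_status` and `UnexpectedStatus` are hypothetical names chosen for illustration; the actual `extractor.request()` code is not shown in this log and may differ.

```python
class UnexpectedStatus(Exception):
    """Raised for an HTTP error status that the caller did not expect."""


def validate_status(status_code, expect=()):
    """Accept any status below 400, plus the error codes listed in 'expect'.

    'expect' may be any container supporting 'in', e.g. a list or a range.
    """
    if status_code < 400 or status_code in expect:
        return status_code
    raise UnexpectedStatus("unexpected HTTP status {}".format(status_code))
```

Under this sketch, a caller that wants to handle a missing page itself would pass `expect=(404,)`, while `expect=range(400, 500)` would accept every client error.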
Mike Fährmann
f10bd5cdbe
[4chan] unescape filenames 2018-06-12 23:19:38 +02:00
Mike Fährmann
2d1a104739
[mangadex] unescape manga names and chapter titles
pretty sure I previously tested if unescaping strings from the
embedded JSON object was necessary ... maybe they changed it
2018-06-11 17:53:21 +02:00
Mike Fährmann
6ac403c5d3
add postprocessor config example 2018-06-08 18:31:59 +02:00
Mike Fährmann
a47c6136cd
[simplyhentai] avoid redirects for all-pages.json (#89) 2018-06-01 22:06:34 +02:00
Mike Fährmann
ad14de19c6
[imgur] support "unmuted" URLs 2018-05-30 16:19:01 +02:00
Mike Fährmann
72e66f0aac
[simplyhentai] improve URL pattern
[ci skip]
2018-05-30 11:44:43 +02:00
Mike Fährmann
cdcc3427a0
[simplyhentai] add video extractor (#89)
All videos hosted on their own servers seem to be dead,
but myhentai.tv embeds, which are most of the videos, work fine.
2018-05-30 11:25:23 +02:00
Mike Fährmann
f9a6a19658
[simplyhentai] add image extractor (#89) 2018-05-30 10:58:48 +02:00
Mike Fährmann
ebf596b399
[pawoo] restore metadata fields + smaller improvements 2018-05-29 11:02:14 +02:00
Mike Fährmann
f7e7306e5a
[komikcast] update URL pattern and unescape image URLs 2018-05-29 10:35:08 +02:00
Mike Fährmann
70f3617d88
[mangafox] fix URL extraction 2018-05-29 10:34:04 +02:00
Mike Fährmann
a62bd81e9b
[pixiv] fix filter for 'type=all' 2018-05-29 10:30:41 +02:00
Mike Fährmann
55b0913412
[simplyhentai] add gallery extractor (#89) 2018-05-27 15:25:04 +02:00
Mike Fährmann
15cce22d82
[mangadex] fix parsing of unusual chapter strings 2018-05-23 18:40:39 +02:00
Mike Fährmann
ecdc3475b8
[pixhost] support .to TLDs 2018-05-23 18:32:34 +02:00
Mike Fährmann
f3d770d4e2
Merge branch '1.4-dev' 2018-05-22 17:24:57 +02:00
Mike Fährmann
1ff626db97
[pixiv] improve bookmark extraction
- combine 'favorite' and 'bookmark' extractors
  - it is now one extractor class, but its subcategory still
    distinguishes between your own bookmarks ('bookmark') and other
    users' bookmarks ('favorite') like before
- allow filtering by bookmark tags and public/private bookmarks
- fix pagination for bookmark results
2018-05-18 17:04:59 +02:00
Mike Fährmann
0a1863fce3
[pixiv] respect more query parameters for user URLs
The API endpoint responsible for user illustrations does not
provide sufficient filter capabilities* to match the actual
website, so we are spinning our own filters.

Respected parameters are
    'type': illust, manga, ugoira
    'tag' : any image tag (this was already supported)
    'p'   : the page to start on

*
- API can filter for illustrations and manga, but not for ugoira.
- 'offset' is applied before filtering
- no 'tag' filter
2018-05-18 15:36:30 +02:00
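A rough sketch of the client-side filtering described in commit 0a1863fce3, where the page offset is applied only after the type and tag filters. The work dictionaries, their `type` and `tags` keys, the `per_page` size, and the 1-based page numbering are assumptions made for illustration; they are not taken from the Pixiv API or the project's code.

```python
from itertools import islice


def filter_works(works, work_type=None, tag=None, page=1, per_page=20):
    """Filter works by type and tag, then skip to the requested start page.

    'works' is an iterable of dicts with hypothetical 'type' and 'tags' keys.
    """
    matching = (
        work for work in works
        if (work_type is None or work["type"] == work_type)
        and (tag is None or tag in work["tags"])
    )
    # Skip whole pages of *filtered* results, unlike an API-side offset.
    skip = (page - 1) * per_page
    return islice(matching, skip, None)
```

Filtering before skipping means page 2 starts after the first `per_page` matching works, rather than after the first `per_page` works of any kind.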
Mike Fährmann
f43d446692
[mangahere] extract chapter titles 2018-05-16 16:22:05 +02:00
Mike Fährmann
b8e53b8c6b
[pixiv] move query parsing out of constructor
better exception handling, among other things
2018-05-15 13:28:08 +02:00
Mike Fährmann
909d105ae6
[pixiv] add extractor for illusts from followed users 2018-05-15 13:05:15 +02:00
Mike Fährmann
7f899bd5d8
Merge branch 'master' into 1.4-dev 2018-05-14 14:50:02 +02:00
Mike Fährmann
fe69d01083
[pixiv] add extractor for search results 2018-05-14 14:46:05 +02:00
Mike Fährmann
247f785af1
[pixiv] use App API
Transitioning to the App API breaks favorites archive IDs (there is
no longer any bookmark ID information), but the favorites API endpoint
of the public API was gone anyways ...
2018-05-14 10:56:37 +02:00
Mike Fährmann
92fc199b07
[reddit] allow arbitrary subdomains 2018-05-13 11:23:23 +02:00
Mike Fährmann
4cea886177
[imgur] allow longer album hashes 2018-05-13 11:21:51 +02:00
Mike Fährmann
e1e23165a0
[pinterest] catch JSON decode errors 2018-05-11 17:37:27 +02:00
Mike Fährmann
789608c107
[imagebam] fix extraction for certain galleries 2018-05-11 17:11:52 +02:00
Mike Fährmann
7a58151566
fix util.parse_bytes invocations
(should be text.parse_bytes)
2018-05-10 22:07:55 +02:00