872 Commits

Author SHA1 Message Date
Mike Fährmann
68a0a7579c
fix/improve some regular expressions 2017-10-09 22:37:50 +02:00
Mike Fährmann
832b8b76ac
[util] extend global namespace for filter expressions 2017-10-09 22:12:58 +02:00
Mike Fährmann
393755ee94
[tumblr] update tests 2017-10-09 00:10:37 +02:00
Mike Fährmann
75d3a1f72f
[deviantart] always download original images
Deviation-objects returned by the DeviantArt API don't always contain
the URL and metadata of the original image [1]. Getting this
information requires an additional API call [2], which is indicated by
the 'is_downloadable' and 'download_filesize' metadata within a
deviation-object.

[1] https://myria-moon.deviantart.com/art/Aime-Moi-part-en-vadrouille-261986576
[2] https://www.deviantart.com/developers/http/v1/20160316/deviation_download/bed6982b88949bdb08b52cd6763fcafd
2017-10-07 13:07:34 +02:00
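The decision described in the [deviantart] commit above can be sketched as follows. This is an illustration only, not the extractor's actual code: `fetch_download` is a hypothetical callable standing in for the additional API call [2], and the `deviationid`, `content` and `src` fields of the sample objects are assumptions made for the example.

```python
def original_url(deviation, fetch_download):
    """Return the best available image URL for a deviation-object.

    `fetch_download` is a hypothetical callable that performs the extra
    download request and returns a dict with a 'src' key (assumed shape).
    """
    # The commit message names these two fields as the indicator that the
    # original file is available through the extra API call.
    if deviation.get("is_downloadable") and deviation.get("download_filesize"):
        return fetch_download(deviation["deviationid"])["src"]
    # Otherwise fall back to the URL the object already carries (assumed field).
    return deviation["content"]["src"]
```

Passing the request function in as a parameter keeps the sketch free of network access and makes the fallback path easy to check in isolation.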
Mike Fährmann
8e6a767109
[util] restructure formatter for better exception propagation 2017-10-06 17:10:35 +02:00
Mike Fährmann
0386503c80
fix (sub)category-transfer for DownloadJob instances (#41)
... and extend "parent" parameters to TestJob- and DataJob-classes
as well.
2017-10-06 15:38:35 +02:00
Mike Fährmann
a1c8b21cfd
[senmanga] improve metadata 2017-10-04 18:54:39 +02:00
Mike Fährmann
8df023e144
[util:filter] re-enable builtins
Trying to restrict access to Python's builtin functions (exec,
print, __import__, ...) can easily be circumvented and is
therefore completely pointless.

This also adds 'safe_int()' and the 'datetime' module to the global
namespace used when evaluating filter expressions.
2017-10-04 16:00:12 +02:00
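A short sketch of why the [util:filter] commit above calls the restriction pointless. It is a generic Python demonstration, not code from the project; the `namespace` dict at the end is an illustrative stand-in for the real global namespace, which per the commit also contains 'safe_int()'.

```python
import datetime

# Attempted "sandbox": evaluate expressions with an empty builtins mapping.
restricted = {"__builtins__": {}}

# Builtin names such as print are no longer reachable directly ...
try:
    eval("print", restricted)
    blocked = False
except NameError:
    blocked = True

# ... but attribute access on any literal still reaches every loaded class,
# which is the usual starting point for recovering the builtins.
classes = eval("().__class__.__base__.__subclasses__()", restricted)

# Without the restriction, the namespace only has to supply extra helpers.
namespace = {"datetime": datetime}
recent = eval("datetime.date(2017, 10, 4).year >= 2017", namespace)
```

Here `blocked` ends up true while `classes` is still a non-empty list, so the empty builtins mapping hides names without actually isolating the expression.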
Mike Fährmann
994b2fc1e7
[deviantart] replace 'author[urlname]' keyword
author[urlname] has only ever been the lowercase version of
author[username], which can now be directly converted to lowercase
using the 'l' conversion: '{author[username]!l}'
2017-10-04 15:59:05 +02:00
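The 'l' conversion mentioned in the commit above can be approximated by overriding `convert_field()` of the standard library's `string.Formatter`. This is a minimal sketch under that assumption, not the project's formatter class; the name `LowerFormatter` and the sample username are made up for the example.

```python
import string


class LowerFormatter(string.Formatter):
    """string.Formatter with an additional 'l' (lowercase) conversion."""

    def convert_field(self, value, conversion):
        # Handle the custom conversion first ...
        if conversion == "l":
            return str(value).lower()
        # ... and defer to the standard ones (!s, !r, !a) otherwise.
        return super().convert_field(value, conversion)


fmt = LowerFormatter()
result = fmt.format("{author[username]!l}", author={"username": "ExampleUser"})
```

With this formatter, `result` is the lowercase form of the username, which is what the removed 'author[urlname]' keyword used to provide.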
Mike Fährmann
633b376f35
improve/adjust default filename formats for manga sites 2017-10-02 19:06:24 +02:00
Mike Fährmann
41adb99e9c
[pawoo] fix extraction
- changed access_token
- use account-search instead of general search
2017-10-02 18:33:52 +02:00
Mike Fährmann
b319f4bab3
smaller code and text changes 2017-10-01 18:23:40 +02:00
Mike Fährmann
ad4580800c
[pixiv] add support for more URL patterns
- https://www.pixiv.net/mypage.php#id=USERID
- https://www.pixiv.net/#id=USERID
2017-09-30 18:07:20 +02:00
Mike Fährmann
82ea6c0cd3
adjust format strings with optional titles
... except for anything manga/comic related
2017-09-28 18:00:19 +02:00
Mike Fährmann
c1f0afe4c6
add custom string formatter class 2017-09-28 17:12:39 +02:00
Mike Fährmann
85a2b2ae59
[khinsider] fix extraction 2017-09-28 11:47:26 +02:00
Mike Fährmann
26a866e7d8
implement (sub)category-transfer between extractors (#41)
ImageFap- and all Manga-Extractors will transfer their (sub)category
values to other extractors instantiated by them, which will in turn
allow those to use options set for their parents.

Example:
ImagefapGalleryExtractors will use options set under
extractor.imagefap.user, if (and only if) they have been instantiated by
an ImagefapUserExtractor, and options from extractor.imagefap.gallery
otherwise.
2017-09-26 21:05:11 +02:00
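The (sub)category-transfer from the commit above can be illustrated with a small model. Everything below is a hypothetical simplification: the `CONFIG` layout mirrors the 'extractor.imagefap.user' path from the example, while the `Extractor` base class, its `parent` argument and the `option()` method are assumptions, not the project's real interfaces.

```python
# Hypothetical configuration, keyed as extractor.<category>.<subcategory>.
CONFIG = {
    "extractor": {
        "imagefap": {
            "user": {"filename": "user-style"},
            "gallery": {"filename": "gallery-style"},
        }
    }
}


class Extractor:
    category = ""
    subcategory = ""

    def __init__(self, parent=None):
        # A child adopts its parent's (sub)category for option lookups.
        if parent is not None:
            self.category = parent.category
            self.subcategory = parent.subcategory

    def option(self, key, default=None):
        section = CONFIG["extractor"].get(self.category, {})
        return section.get(self.subcategory, {}).get(key, default)


class ImagefapUserExtractor(Extractor):
    category, subcategory = "imagefap", "user"


class ImagefapGalleryExtractor(Extractor):
    category, subcategory = "imagefap", "gallery"


standalone = ImagefapGalleryExtractor()
spawned = ImagefapGalleryExtractor(parent=ImagefapUserExtractor())
```

In this model `standalone.option("filename")` reads from the gallery section, while `spawned.option("filename")` reads from the user section because it was instantiated by a user extractor.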
Mike Fährmann
1ab4c7986f
[mangahere] fix extraction
would switch to HTTPS, but there seem to be certificate issues
2017-09-26 21:05:11 +02:00
Mike Fährmann
8e14714c2b
[imgspice] fix extraction 2017-09-26 21:04:48 +02:00
Mike Fährmann
9c138dfc1f
[common] detect empty HTTP response bodies 2017-09-26 16:49:58 +02:00
Mike Fährmann
c51616f8d8
[foolslide] fix minor chapter number 2017-09-26 12:49:50 +02:00
H R X N
77bf923c56
Update imgur.py to include 'title' of single image (#40)
Add the {title} keyword.
Images on Imgur don't necessarily have a title, but I think most of them do, and adding it should not break anything else.
2017-09-26 12:48:48 +02:00
Mike Fährmann
a85f06d2d1
[foolslide] restructure; convert suitable values to int 2017-09-24 16:57:47 +02:00
Mike Fährmann
deb2e803ba
simplify MangaExtractor class 2017-09-24 16:05:43 +02:00
Mike Fährmann
9fc1d0c901
implement and use 'util.safe_int()'
same as Python's 'int()', except it doesn't raise any exceptions and
accepts a default value
2017-09-24 15:59:25 +02:00
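A helper matching the description of 'util.safe_int()' in the commit above could look roughly like this. It is a sketch written from the commit message alone; the default value of 0 and the exact set of exceptions caught are assumptions.

```python
def safe_int(value, default=0):
    """Like int(), but return `default` instead of raising an exception."""
    try:
        return int(value)
    except (ValueError, TypeError, OverflowError):
        # ValueError: non-numeric strings; TypeError: None and other
        # unsupported types; OverflowError: float infinity.
        return default
```

This is convenient when converting scraped values such as chapter numbers, where a missing or malformed field should not abort the whole extraction.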
Mike Fährmann
8a97bd0433
rename '--images' and '--chapters'
... to '--range' and '--chapter-range' to be consistent with
'--filter' and '--chapter-filter'
2017-09-23 17:31:40 +02:00
Mike Fährmann
8963da8fd8
[spectrumnexus] extract manga metadata 2017-09-23 16:49:33 +02:00
Mike Fährmann
a3e40734d1
[mangareader] extract manga metadata 2017-09-23 15:42:50 +02:00
Mike Fährmann
9196005a4d
[mangazuki] extract manga metadata 2017-09-22 20:53:43 +02:00
Mike Fährmann
543ba245eb
[deviantart] update test results
thumbnail URLs changed from //tXX.… to //t00.…
2017-09-22 17:53:59 +02:00
Mike Fährmann
b7a54a51d0
[mangapark] extract manga metadata + code improvements 2017-09-22 17:53:32 +02:00
Mike Fährmann
d39b8779af
[mangahere] extract manga metadata 2017-09-22 14:55:37 +02:00
Mike Fährmann
c265cc074a
[hbrowse] fix syntax for Python 3.3 and 3.4 2017-09-20 16:41:39 +02:00
Mike Fährmann
a9e7145651
[hbrowse] extract hmanga metadata & general maintenance 2017-09-20 16:25:25 +02:00
Mike Fährmann
92c8a6cb01
[hentai2read] extract hmanga metadata 2017-09-20 13:28:57 +02:00
Mike Fährmann
de174b40d6
[hentaihere] extract hmanga metadata 2017-09-20 13:13:14 +02:00
Mike Fährmann
04cc1ffe34
[kissmanga] extract manga metadata 2017-09-19 16:25:04 +02:00
Mike Fährmann
885bd4cbe2
[readcomiconline] extract comic metadata 2017-09-18 19:18:24 +02:00
Mike Fährmann
cebf800a7f
[foolfuuka] add support for more sites (#18)
- https://arch.b4k.co
- https://archive.whatisthisimnotgoodwithcomputers.com
- https://archive.yeet.net

Notes:
- The name "whatisthisimnotgoodwithcomputers" is way too long ...
- archive.yeet.net is out of date and also blocked by 4chan servers
  - newest threads are 2 weeks old
  - using "https://archive.yeet.net" as Referer header results in
    "403 Forbidden" when accessing 4chan
2017-09-16 21:36:16 +02:00
Mike Fährmann
84d4450410
[fallenangels] extract manga metadata 2017-09-15 20:51:40 +02:00
Mike Fährmann
f32b1a0292
[imgyt] fix extraction 2017-09-14 15:04:32 +02:00
Mike Fährmann
4ad903b797
[warosu] fix extraction 2017-09-14 14:57:40 +02:00
Mike Fährmann
b84f48dfa5
[batoto] extract manga metadata 2017-09-14 14:55:57 +02:00
Mike Fährmann
4ceb176c6b
[foolslide] extract manga metadata
enables chapter filtering for
- https://kobato.hologfx.com/
- https://jaiminisbox.com/
- https://reader.kireicake.com/
- https://powermanga.org/
- https://reader.seaotterscans.com/
- http://sensescans.com/
- http://www.slide.world-three.org/
2017-09-12 16:44:38 +02:00
Mike Fährmann
24e5f154a4
[deviantart] update test results
API responses now contain proper https:// URLs and their image download
server is now "orig00.deviantart.net" for all images.
2017-09-12 16:38:57 +02:00
Mike Fährmann
0dedbe759c
enable '--chapter-filter'
The same filter infrastructure that can be applied to image URLs now
also works for manga chapters and other delegated URLs.

TODO: actually provide metadata (currently only deviantart and
imagefap are supported).
2017-09-12 16:19:00 +02:00
Mike Fährmann
31cd5b1c1d
[luscious] detect high-load responses 2017-09-12 15:46:21 +02:00
Mike Fährmann
470bbe9d8c
fix smaller stuff
- change filename option in example config file
- adapt default filename format for mangafox
- remove unnecessary newline

[skip ci]
2017-09-11 17:07:29 +02:00
Mike Fährmann
6f30cf4c64
change keyword names to valid Python identifiers
This commit mostly replaces all minus-signs ('-') in keyword names with
underscores ('_') to allow them to be used in filter-expressions. For
example 'gallery-id' got renamed to 'gallery_id'.

(It is theoretically possible to access any variable, regardless of its
name, with 'locals()["NAME"]', but that seems a bit too convoluted if
just 'NAME' could be enough)
2017-09-10 22:20:47 +02:00
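The reason for the rename in the commit above shows up as soon as keywords are used as names in an expression. The following is a generic Python sketch with made-up keyword values; it does not reproduce the project's filter implementation.

```python
keywords = {"gallery_id": 12345, "title": "example"}

# With underscore names, the keyword dict can serve directly as the
# namespace of eval(). A copy is passed because eval() inserts a
# "__builtins__" entry into the mapping it receives.
matches = eval("gallery_id > 10000", dict(keywords))

# A hyphenated name is parsed as the subtraction 'gallery - id', so the
# lookup of 'gallery' fails ...
old_style = {"gallery-id": 12345}
try:
    eval("gallery-id > 10000", dict(old_style))
    hyphen_ok = True
except NameError:
    hyphen_ok = False

# ... and the value is only reachable through an explicit string lookup,
# comparable to the locals()["NAME"] workaround the commit mentions.
workaround = eval('kw["gallery-id"] > 10000', {"kw": old_style})
```

Both forms can be made to work, but only the underscore version lets a filter expression stay as short as 'gallery_id > 10000'.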
Mike Fährmann
81877bb5f6
add '-K' as shortcut for '--list-keywords' 2017-09-09 18:48:28 +02:00