1223 Commits

Author SHA1 Message Date
Mike Fährmann
3bcce77f6d
release version 1.4.0 2018-06-08 22:21:35 +02:00
Mike Fährmann
6ac403c5d3
add postprocessor config example 2018-06-08 18:31:59 +02:00
Mike Fährmann
2403c405e3
Merge branch 'postprocessor' 2018-06-08 17:43:11 +02:00
Mike Fährmann
baccf8a958
improve postprocessor handling
- add pathfmt argument for __init__()
- add finalization step
- add option to keep or delete zipped files
2018-06-08 17:39:02 +02:00
Mike Fährmann
2628911ba0
[pp:exec] add 'async' option 2018-06-07 23:35:18 +02:00
Mike Fährmann
7646bdbcfd
improve postprocessor initialization code 2018-06-07 22:29:54 +02:00
Mike Fährmann
b344f2290f
fix downloader tests 2018-06-07 22:27:36 +02:00
Mike Fährmann
37d97ff02c
[pp:classify] use temppath 2018-06-06 21:11:20 +02:00
Mike Fährmann
97189e50cd
[pp:zip] use temppath; add options 2018-06-06 20:49:52 +02:00
Mike Fährmann
821535b458
adjust PathFormat class 2018-06-06 20:17:17 +02:00
Mike Fährmann
a47c6136cd
[simplyhentai] avoid redirects for all-pages.json (#89) 2018-06-01 22:06:34 +02:00
Mike Fährmann
ad14de19c6
[imgur] support "unmuted" URLs 2018-05-30 16:19:01 +02:00
Mike Fährmann
72e66f0aac
[simplyhentai] improve URL pattern
[ci skip]
2018-05-30 11:44:43 +02:00
Mike Fährmann
cdcc3427a0
[simplyhentai] add video extractor (#89)
All videos hosted on their own servers seem to be dead,
but myhentai.tv embeds, which are most of the videos, work fine.
2018-05-30 11:25:23 +02:00
Mike Fährmann
f9a6a19658
[simplyhentai] add image extractor (#89) 2018-05-30 10:58:48 +02:00
Mike Fährmann
ebf596b399
[pawoo] restore metadata fields + smaller improvements 2018-05-29 11:02:14 +02:00
Mike Fährmann
f7e7306e5a
[komikcast] update URL pattern and unescape image URLs 2018-05-29 10:35:08 +02:00
Mike Fährmann
70f3617d88
[mangafox] fix URL extraction 2018-05-29 10:34:04 +02:00
Mike Fährmann
a62bd81e9b
[pixiv] fix filter for 'type=all' 2018-05-29 10:30:41 +02:00
Mike Fährmann
12797e3b1f
update configuration.rst
... again

- some more 'Path' references
- fixed some inconsistencies and errors
- added note about logging config for files
2018-05-28 22:14:38 +02:00
Mike Fährmann
c43f02245f
update configuration.rst
- fix default values for 'log' and 'unsupportedfile'

[ci skip]
2018-05-27 17:12:57 +02:00
Mike Fährmann
dacda69c9e
update configuration.rst
- document logging options
- add a section for "custom types"

[ci skip]
2018-05-27 16:50:35 +02:00
Mike Fährmann
55b0913412
[simplyhentai] add gallery extractor (#89) 2018-05-27 15:25:04 +02:00
Mike Fährmann
ae9a37a528
implement text.split_html() 2018-05-27 15:00:41 +02:00
Mike Fährmann
53f36176fd
update configuration.rst
- update the API Tokens & IDs section
  - mention redirect URIs for deviantart
  - include api-secret for tumblr
  - add instructions for smugmug
- [ci skip]
2018-05-26 11:26:50 +02:00
Mike Fährmann
b08d95ebe4
add an 'encoding' option for logging files (default 'utf-8') 2018-05-25 16:29:45 +02:00
Mike Fährmann
513d807632
explicitly open config files as utf-8 2018-05-25 16:29:46 +02:00
Mike Fährmann
2df1a15fb8
add '-s/--simulate' to run data extraction without download
Useful for quick testing (even though -g and -j kind of do the same)
and to fill a download archive without actually downloading the files.

-s does the same as the default behaviour, except downloading stuff.
Maybe it should get a more fitting name, as it does actually write to
disk (cache, archive)?
2018-05-25 16:07:18 +02:00
Mike Fährmann
15cce22d82
[mangadex] fix parsing of unusual chapter strings 2018-05-23 18:40:39 +02:00
Mike Fährmann
ecdc3475b8
[pixhost] support .to TLDs 2018-05-23 18:32:34 +02:00
Elvis Yu-Jing Lin
aab2391c7b
Fix UnicodeDecodeError during installation (#86)
* fix UnicodeDecodeError during installation

* simplify opening with utf-8 encoding
2018-05-23 17:46:00 +02:00
Mike Fährmann
f3d770d4e2
Merge branch '1.4-dev' 2018-05-22 17:24:57 +02:00
Mike Fährmann
d0ae3ed52c
[postprocessor] add 'zip' to write files to a ZIP archive (#85)
2018-05-22 16:54:17 +02:00
Mike Fährmann
ca4008e1c1
[postprocessor] add 'classify' to sort downloads by fileext 2018-05-22 16:21:17 +02:00
Mike Fährmann
d378c0a323
[postprocessor] add 'exec' to execute user-defined processes 2018-05-22 15:00:31 +02:00
Mike Fährmann
76c32d58e5
[postprocessor] initial code 2018-05-22 14:59:22 +02:00
Mike Fährmann
1ff626db97
[pixiv] improve bookmark extraction
- combine 'favorite' and 'bookmark' extractors
  - it is now one extractor class, but its subcategory still
    distinguishes between your own bookmarks ('bookmark') and other
    users' bookmarks ('favorite') like before
- allow filtering by bookmark tags and public/private bookmarks
- fix pagination for bookmark results
2018-05-18 17:04:59 +02:00
Mike Fährmann
0a1863fce3
[pixiv] respect more query parameters for user URLs
The API endpoint responsible for user illustrations does not
provide sufficient filter capabilities* to match the actual
website, so we are spinning our own filters.

Respected parameters are
    'type': illust, manga, ugoira
    'tag' : any image tag (this was already supported)
    'p'   : the page to start on

*
- API can filter for illustrations and manga, but not for ugoira.
- 'offset' is applied before filtering
- no 'tag' filter
2018-05-18 15:36:30 +02:00
Mike Fährmann
f43d446692
[mangahere] extract chapter titles 2018-05-16 16:22:05 +02:00
Mike Fährmann
b8e53b8c6b
[pixiv] move query parsing out of constructor
better exception handling, among other things
2018-05-15 13:28:08 +02:00
Mike Fährmann
909d105ae6
[pixiv] add extractor for illusts from followed users 2018-05-15 13:05:15 +02:00
Mike Fährmann
7f899bd5d8
Merge branch 'master' into 1.4-dev 2018-05-14 14:50:02 +02:00
Mike Fährmann
fe69d01083
[pixiv] add extractor for search results 2018-05-14 14:46:05 +02:00
Mike Fährmann
247f785af1
[pixiv] use App API
Transitioning to the App API breaks favorites archive IDs (there is
no longer any bookmark ID information), but the favorites API endpoint
of the public API was gone anyways ...
2018-05-14 10:56:37 +02:00
Mike Fährmann
92fc199b07
[reddit] allow arbitrary subdomains 2018-05-13 11:23:23 +02:00
Mike Fährmann
4cea886177
[imgur] allow longer album hashes 2018-05-13 11:21:51 +02:00
Mike Fährmann
e1e23165a0
[pinterest] catch JSON decode errors 2018-05-11 17:37:27 +02:00
Mike Fährmann
789608c107
[imagebam] fix extraction for certain galleries 2018-05-11 17:11:52 +02:00
Mike Fährmann
7a58151566
fix util.parse_bytes invocations
(should be text.parse_bytes)
2018-05-10 22:07:55 +02:00
Mike Fährmann
1c1e086d01
use common base class for OAuth1.0 based API interfaces 2018-05-10 21:57:45 +02:00