81 Commits

Author SHA1 Message Date
Mike Fährmann
12797e3b1f
update configuration.rst
... again

- some more 'Path' references
- fixed some inconsistencies and errors
- added note about logging config for files
2018-05-28 22:14:38 +02:00
Mike Fährmann
b08d95ebe4
add an 'encoding' option for logging files (default 'utf-8') 2018-05-25 16:29:45 +02:00
Mike Fährmann
2df1a15fb8
add '-s/--simulate' to run data extraction without download
Useful for quick testing (even though -g and -j kind of do the same)
and to fill a download archive without actually downloading the files.

-s does the same as the default behaviour, except downloading stuff.
Maybe it should get a more fitting name, as it does actually write to
disk (cache, archive)?
2018-05-25 16:07:18 +02:00
Mike Fährmann
8bf3cdd82b
implement logging options
Standard logging to stderr, logfiles, and unsupported URL files (which
are now handled through the logging module) can now be configured by
setting their respective option keys (log, logfile, unsupportedfile)
to a dict and specifying the following options:

- format:
    format string for logging messages
    available keys: see [1]
    default: "[{name}][{levelname}] {message}"
- format-date:
    format string for {asctime} fields in logging messages
    available keys: see [2]
    default: "%Y-%m-%d %H:%M:%S"
- level:
    the lowercase name of the lowest level that should be logged;
    available levels are debug, info, warning, error, exception
    default: "info"
- path:
    path of the file to be written to
- mode:
    'mode' argument when opening the specified file
    can be either "w" to truncate the file or "a" to append to it (see [3])

If 'output.log', '.logfile', or '.unsupportedfile' is a string, it is
still interpreted as before: as the file path
(or as the format string for .log).

[1] https://docs.python.org/3/library/logging.html#logrecord-attributes
[2] https://docs.python.org/3/library/time.html#time.strftime
[3] https://docs.python.org/3/library/functions.html#open
2018-05-01 17:54:52 +02:00
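The option keys listed in commit 8bf3cdd82b map fairly directly onto Python's standard logging module. The following is only a minimal sketch of that mapping, not the project's actual code: `build_handler`, the `DEFAULTS` dict, and the fallback to mode "w" are assumptions made for illustration, and only level names that exist as constants in `logging` are handled.

```python
import io
import logging

# Defaults quoted from the commit message.
DEFAULTS = {
    "format": "[{name}][{levelname}] {message}",
    "format-date": "%Y-%m-%d %H:%M:%S",
    "level": "info",
}


def build_handler(options, stream=None):
    """Build a logging handler from an options dict (hypothetical helper)."""
    opts = {**DEFAULTS, **options}
    if "path" in opts:
        # 'mode' is "w" (truncate) or "a" (append); defaulting to "w"
        # is an assumption of this sketch.
        handler = logging.FileHandler(opts["path"], mode=opts.get("mode", "w"))
    else:
        handler = logging.StreamHandler(stream)  # stderr when stream is None
    # style="{" selects str.format-style fields such as {levelname}.
    handler.setFormatter(
        logging.Formatter(opts["format"], opts["format-date"], style="{"))
    # Works only for names that are constants in the logging module
    # (debug, info, warning, error).
    handler.setLevel(getattr(logging, opts["level"].upper()))
    return handler


buffer = io.StringIO()
logger = logging.getLogger("demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(build_handler({"level": "warning"}, buffer))
logger.info("dropped: below the handler's level")
logger.warning("kept")
print(buffer.getvalue(), end="")  # [demo][WARNING] kept
```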
Mike Fährmann
0381ae5318
replace error handlers for stdout and co.
Python 3.5 and lower throw a UnicodeEncodeError when trying to print
unencodable characters when not using 'utf-8' as encoding.
Setting their error handlers to 'replace' should help.
2018-04-04 17:30:42 +02:00
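One way to get the effect described in commit 0381ae5318 is to re-wrap a text stream with the 'replace' error handler. The sketch below illustrates the idea on an in-memory stream; `with_replace_errors` is a hypothetical helper, and the commit itself may implement this differently.

```python
import io


def with_replace_errors(stream):
    """Re-wrap a text stream so unencodable characters are written as '?'."""
    return io.TextIOWrapper(
        stream.buffer,
        encoding=stream.encoding,
        errors="replace",
        line_buffering=stream.line_buffering,
    )


raw = io.BytesIO()
# Stand-in for a stdout whose encoding is not 'utf-8'.
ascii_out = io.TextIOWrapper(raw, encoding="ascii")
safe_out = with_replace_errors(ascii_out)
safe_out.write("Fährmann")  # would raise UnicodeEncodeError on ascii_out
safe_out.flush()
written = raw.getvalue()
```

Applied to the real streams, the same helper would be used as `sys.stdout = with_replace_errors(sys.stdout)`.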
Mike Fährmann
b50bdbf3d7
change config specifiers in input file format
Instead of a dictionary/object, input file options are now specified
by a 'key=value' pair starting with '-' for options only applying to
the next URL or '-G' for Global options applying to all following URLs.

See the docstring of parse_inputfile() for details.

Example option specifiers:

- filename = "{id}.{extension}"
- extractor.pixiv.user.directory = ["Pixiv Users", "{user[id]}"]
-spaces="are_optional"
-G keywords = {"global": "option"}
2018-02-16 03:10:41 +01:00
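The option specifiers from commit b50bdbf3d7 can be split into a scope, a key, and a value with a few lines of Python. This is a sketch under assumptions, not the real parse_inputfile(): `parse_option_line` is a hypothetical name, values are assumed to be JSON, and any line starting with '-G' is treated as global.

```python
import json


def parse_option_line(line):
    """Return (is_global, key, value) for an option line, else None."""
    line = line.strip()
    if not line.startswith("-"):
        return None  # a URL, a comment, or an empty line
    # Assumption: every line starting with '-G' is a global option.
    if line.startswith("-G"):
        is_global, rest = True, line[2:]
    else:
        is_global, rest = False, line[1:]
    key, sep, value = rest.partition("=")
    if not sep:
        raise ValueError("expected 'key=value' in: " + line)
    # Assumption: the value is encoded as JSON.
    return is_global, key.strip(), json.loads(value.strip())


examples = [
    '- filename = "{id}.{extension}"',
    '- extractor.pixiv.user.directory = ["Pixiv Users", "{user[id]}"]',
    '-spaces="are_optional"',
    '-G keywords = {"global": "option"}',
]
parsed = [parse_option_line(line) for line in examples]
```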
Mike Fährmann
7f7c16ae37
add option to specify additional key-value pairs 2018-02-08 23:10:58 +01:00
Mike Fährmann
057668e17e
extend input-file format with per-URL config and comments
- see docstring of parse_inputfile() for details
- TODO: unittests, recursion (currently setting for example
  {"extractor": {"key": "value"}} will override the whole "extractor"
  branch instead of merging {"key": "value"} into the already existing
  dictionary)
2018-02-07 21:47:27 +01:00
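The TODO in commit 057668e17e describes the difference between overriding a config branch and merging into it. A recursive merge along the following lines would keep existing sibling keys; `merge_config` and the sample keys are hypothetical and only illustrate the idea.

```python
def merge_config(base, update):
    """Merge 'update' into 'base' in place, descending into nested dicts."""
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge_config(base[key], value)  # merge branch instead of replacing it
        else:
            base[key] = value
    return base


config = {"extractor": {"existing": 1}}

# A plain assignment would drop "existing":
overridden = dict(config)
overridden["extractor"] = {"key": "value"}

# The recursive merge keeps it:
merged = merge_config({"extractor": {"existing": 1}}, {"extractor": {"key": "value"}})
```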
Mike Fährmann
d951f13e37
add config option for unsupported-URL file
for consistency's sake
2018-01-28 18:42:10 +01:00
Mike Fährmann
364e335440
smaller adjustments and improvements
- requests and urllib3 version on 1 line
- close input file after reading from it
- use expand_path for unsupported-urls file
- remove unnecessary logging from options.py
2018-01-27 01:05:17 +01:00
Mike Fährmann
c9a9664a65
change --write-log behaviour
- log files now get truncated when opening them
  (mode "w" instead of "a")
- log verbosity to file depends on -q/-v
  (same as logging to stderr)
2018-01-27 00:51:40 +01:00
Mike Fährmann
97f4f15ec0
add option to write logging output to a file
- '--write-log FILE' as cmdline argument
- 'output.logfile' as config file option
2018-01-26 18:51:51 +01:00
Mike Fährmann
5488643fac
add requests and urllib3 versions to debug output 2017-12-27 22:12:40 +01:00
Mike Fährmann
0e5057b15d
remove deprecated options 2017-12-02 15:31:57 +01:00
Mike Fährmann
8a97bd0433
rename '--images' and '--chapters'
... to '--range' and '--chapter-range' to be consistent with
'--filter' and '--chapter-filter'
2017-09-23 17:31:40 +02:00
Mike Fährmann
0dedbe759c
enable '--chapter-filter'
The same filter infrastructure that can be applied to image URLs now
also works for manga chapters and other delegated URLs.

TODO: actually provide metadata (currently only deviantart and
imagefap are supported).
2017-09-12 16:19:00 +02:00
Mike Fährmann
470bbe9d8c
fix smaller stuff
- change filename option in example config file
- adapt default filename format for mangafox
- remove unnecessary newline

[skip ci]
2017-09-11 17:07:29 +02:00
Mike Fährmann
9b21d3f13c
add '--filter' command-line option
This allows for image filtering via Python expressions by the same
metadata that is also used to build filenames (--list-keywords).

The usually shunned eval() function is used to evaluate
filter-expressions, but it seemed quite appropriate in this case and
shouldn't introduce any new security issues, as any attacker that could do
> gallery-dl --filter "delete-everything()" ...
could as well do
> python -c "delete-everything()"
2017-09-08 17:52:00 +02:00
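Commit 9b21d3f13c evaluates filter expressions with eval() against the metadata of each image. A minimal sketch of that technique follows; `build_filter` and the metadata keys `width` and `extension` are assumptions made for the example, not names confirmed by the commit.

```python
def build_filter(expression):
    """Compile a Python expression once and return a metadata predicate."""
    code = compile(expression, "<filter>", "eval")

    def check(metadata):
        # A copy is passed as globals because eval() adds '__builtins__' to it.
        return bool(eval(code, dict(metadata)))

    return check


is_large_png = build_filter("width >= 1000 and extension == 'png'")
images = [
    {"width": 1200, "extension": "png"},
    {"width": 640, "extension": "png"},
    {"width": 1920, "extension": "jpg"},
]
selected = [image for image in images if is_large_png(image)]
```

As the commit message notes, the expression runs with the full privileges of the user who supplied it, so this only suits input from the command line or the user's own config.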
Mike Fährmann
f7de048980
add additional debug output 2017-08-13 20:35:44 +02:00
Mike Fährmann
06c4cae05b
extend the output of '--list-extractors'
It now includes category and subcategory values for
each extractor class.
2017-06-28 18:51:47 +02:00
Mike Fährmann
d5a70f2580
add simple progress indicator for multiple URLs (#19)
The output can be configured via the 'output.progress'
config value.

Possible values:
    - true:     Show the default progress indicator
                "[{current}/{total}] {url}" (default)
    - false:    Never show the progress indicator
    - <string>: Show the progress indicator using this
                as a custom format string(1).
                Possible replacement keys are:
                - current: current URL index
                - total  : total number of URLs
                - url    : current URL

(1) https://docs.python.org/3/library/string.html#formatstrings
2017-06-09 20:12:15 +02:00
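The three kinds of values for 'output.progress' in commit d5a70f2580 can be handled with a single str.format() call. The function name `progress_line` and the example URL are hypothetical; the default format string and replacement keys are taken from the commit message.

```python
DEFAULT_PROGRESS = "[{current}/{total}] {url}"


def progress_line(setting, current, total, url):
    """Return the progress text for one URL, or None if disabled."""
    if setting is False:
        return None
    fmt = DEFAULT_PROGRESS if setting is True else setting
    return fmt.format(current=current, total=total, url=url)


default_line = progress_line(True, 2, 5, "https://example.org/gallery")
custom_line = progress_line("{current} of {total}", 2, 5, "https://example.org/gallery")
disabled = progress_line(False, 2, 5, "https://example.org/gallery")
```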
Mike Fährmann
25bcdc8aa9
add --write-unsupported option (#15) 2017-05-27 16:16:57 +02:00
Mike Fährmann
701c016b97
add '-q/--quiet' option 2017-04-26 11:33:19 +02:00
Mike Fährmann
f0aa35ac84
add '--ignore-config' option 2017-04-25 17:09:10 +02:00
Mike Fährmann
5af35ea150
add -v/--verbose option and reduce error verbosity
(#12)
2017-04-18 11:38:48 +02:00
Mike Fährmann
b43cd88101
add '-j/--dump-json' option
this outputs the extractor-results in JSON format rather than
downloading files
2017-04-12 18:43:41 +02:00
Mike Fährmann
e4b3077168
improve config module
- speed improvements, especially in the 'interpolate' function
- 'interpolate' now prioritizes base-level values if they exist
  - "username" is chosen before "extractor.<category>.username"
  - -u/--username & co can now override config-file values
2017-03-27 11:59:27 +02:00
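The lookup order described in commit e4b3077168 (base-level values before "extractor.<category>.<key>") can be sketched as follows. The signature of `interpolate` here is a guess for illustration and may not match the function in the config module.

```python
def interpolate(config, path, key, default=None):
    """Look up 'key' at the base level first, then under 'path'."""
    if key in config:
        return config[key]  # base-level value wins, e.g. one set by -u/--username
    node = config
    for part in path:
        node = node.get(part)
        if not isinstance(node, dict):
            return default
    return node.get(key, default)


config = {"extractor": {"pixiv": {"username": "from-config-file"}}}
before = interpolate(config, ("extractor", "pixiv"), "username")

config["username"] = "from-command-line"  # what a command-line override might set
after = interpolate(config, ("extractor", "pixiv"), "username")
```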
Mike Fährmann
11d5c6f717
move option parsing to separate module 2017-03-23 16:29:40 +01:00
Mike Fährmann
abfe7456d6
add '-R/--retries' and '--http-timeout' options
(#10)
2017-03-16 04:28:40 +01:00
Mike Fährmann
80df2b3527
add custom argparse action 2017-03-16 03:47:08 +01:00
Mike Fährmann
27ae152f57
use logging to report errors 2017-03-11 01:47:57 +01:00
Mike Fährmann
0cfe51dc78
add '--config-yaml' option
(#8)
2017-03-08 16:57:42 +01:00
Mike Fährmann
f782282f97
add logger objects to extractors 2017-03-07 23:50:19 +01:00
Mike Fährmann
24f41e13b3
move some exception handling code 2017-02-25 23:53:31 +01:00
Mike Fährmann
6208d9dd79
implement '--images' and '--chapters' options
- the former '--items' has been renamed to '--chapters'
- #6
2017-02-23 21:51:29 +01:00
Mike Fährmann
2a32b12043
add '--items' option
this allows specifying which manga-chapters/comic-issues to download
when using gallery-dl on a manga/comic URL
2017-02-20 22:02:49 +01:00
Mike Fährmann
3bca866185
rework the '-g' cmdline option
the number of times the -g option is given now determines the level up
to which URLs are resolved.

example:

$ gallery-dl -g http://kissmanga.com/Manga/Dropout
http://kissmanga.com/Manga/Dropout/Ch-000---Oneshot-?id=145847

- when applied to a manga-extractor, specifying the -g option once will
  now print a list of all chapter URLs

$ gallery-dl -gg http://kissmanga.com/Manga/Dropout
http://2.bp.blogspot.com/.../000.png
http://2.bp.blogspot.com/.../001.png
...

- specifying it twice (or even more often) will go a level deeper and
  print the image URLs found in those chapters
2017-02-17 22:18:16 +01:00
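Counting how often a flag such as -g is given, as in commit 3bca866185, is what argparse's "count" action does. The sketch below assumes a destination name `resolve_level` and a helper `describe`, both hypothetical; it only shows how the count could select the behaviour.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# Each occurrence of -g increases the counter by one; "-gg" counts as two.
parser.add_argument("-g", dest="resolve_level", action="count", default=0)


def describe(argv):
    """Map the number of -g flags to the behaviour described in the commit."""
    level = parser.parse_args(argv).resolve_level
    if level == 0:
        return "download"
    if level == 1:
        return "print chapter URLs"
    return "print image URLs"  # twice or more often


results = [describe([]), describe(["-g"]), describe(["-gg"]), describe(["-g", "-g", "-g"])]
```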
Mike Fährmann
4f123b8513
code adjustments according to pep8 2017-01-30 19:40:15 +01:00
Mike Fährmann
07ffab04c3
add -i/--input-file option 2016-12-04 16:11:54 +01:00
Mike Fährmann
f434a0711b
put centralized version string in 'version.py' 2016-10-08 11:37:47 +02:00
Mike Fährmann
0f96eb180e
add Python2 compatible version check 2016-10-04 14:33:50 +02:00
Mike Fährmann
0a3fb198f3
[batoto] raise exception if chapter is unavailable (#4) 2016-09-24 13:26:19 +02:00
Mike Fährmann
2418bfe91b
replace JSONDecodeError with ValueError 2016-09-24 12:06:17 +02:00
Mike Fährmann
813317045e
bump version 2016-09-23 08:41:03 +02:00
Mike Fährmann
ba86bbfbdb
add '--list-extractors' argument 2016-09-14 09:51:01 +02:00
Mike Fährmann
effa1084f2
[pixiv] raise NotFoundError instead of failing 2016-08-28 16:21:51 +02:00
Mike Fährmann
143bd9de11
add '--version' 2016-08-24 14:51:15 +02:00
Mike Fährmann
57a616a36f
update README and bump version 2016-08-22 12:21:31 +02:00
Mike Fährmann
f17e49dcf2
write error messages to stderr 2016-08-06 13:40:49 +02:00
Mike Fährmann
b0ea9021dc
handle broken pipes 2016-08-05 10:25:31 +02:00