Merge 0.0.7 (#94)
* Preliminary 0.0.7 changes
  Moved to a new album/track data parser using demjson. Slimit and Ply are no longer required. Some basic spelling corrections and consistency changes. Function annotations, return types, and docstrings added.
* Initial commit for the Issue Template
* Fleshed out the issue template
* Switched to rst (oops), reformatted accordingly
* Update ISSUE_TEMPLATE.rst
* Moved CONTRIBUTING to the hidden .github directory
* No longer trips up on unavailable tracks in an album
* Much more robust file integrity checking, session file support.
  Multi-step process for making sure files are downloaded and encoded properly. Bandcamp-dl will now search for a not.finished file and, if one is found, load that session's arguments and resume from where it left off.
* Improve download status/progress messages
  Made the download progress and status messages neater: no more multiple progress bars and lines of status messages.
* Final 0.0.7 changes
  Set up imports for distribution again. Reformatted docstrings. Clarified choices in partial download dialog. Updated changelog. Updated manifest. Updated readme.
master
parent 0176249237
commit abf0dd261b
|
@ -14,13 +14,15 @@ Workflow
|
||||||
Please submit as many fixes for typos and grammar bloopers as you can!
|
Please submit as many fixes for typos and grammar bloopers as you can!
|
||||||
- Try to limit each pull request to *one* change only.
|
- Try to limit each pull request to *one* change only.
|
||||||
- Once you've addressed review feedback, make sure to bump the pull request with a short note.
|
- Once you've addressed review feedback, make sure to bump the pull request with a short note.
|
||||||
Maintainers don’t receive notifications when you push new commits.
|
|
||||||
|
|
||||||
|
|
||||||
Code
|
Code
|
||||||
----
|
----
|
||||||
|
|
||||||
- Try to adhere to PEP8 as best you can (Yes some lines will simply be too long, its ok.)
|
- Try to adhere to PEP8 as best you can.
|
||||||
|
- Annotate functions
|
||||||
|
- Specify return types
|
||||||
|
- Add docstrings
|
||||||
|
|
||||||
*****
|
*****
|
||||||
|
|
|
@ -0,0 +1,12 @@
|
||||||
|
**Python version:**
|
||||||
|
|
||||||
|
**Bandcamp-dl version:**
|
||||||
|
|
||||||
|
**Bandcamp-dl options:**
|
||||||
|
|
||||||
|
**url:**
|
||||||
|
|
||||||
|
**options:**
|
||||||
|
|
||||||
|
**Describe the issue:**
|
||||||
|
-------------------------
|
|
@ -38,3 +38,4 @@ nosetests.xml
|
||||||
.pydevproject
|
.pydevproject
|
||||||
*.iml
|
*.iml
|
||||||
*.xml
|
*.xml
|
||||||
|
bandcamp_dl/asyncdownloader.py
|
||||||
|
|
|
@ -17,3 +17,14 @@ Version 0.0.6
|
||||||
- [Enhancement] Individual track downloads work now.
|
- [Enhancement] Individual track downloads work now.
|
||||||
- [Bugfix] Fixed imports, now working when installed via pip.
|
- [Bugfix] Fixed imports, now working when installed via pip.
|
||||||
- [Note] Last version to officially support Python 2.7.x
|
- [Note] Last version to officially support Python 2.7.x
|
||||||
|
|
||||||
|
Version 0.0.7
|
||||||
|
-------------
|
||||||
|
- [Enhancement] Will now resume if it finds a valid ``not.finished`` file.
|
||||||
|
- [Enhancement] Interrupting downloads is safe, they will resume on next run.
|
||||||
|
- [Enhancement] Interrupting encoding is safe, it will finish on next run.
|
||||||
|
- [Enhancement] CLI output is now much neater.
|
||||||
|
- [Bugfix] Partial albums (some previews disabled) will now download properly.
|
||||||
|
- [Dependency] Slimit is no longer required.
|
||||||
|
- [Dependency] Ply is no longer required.
|
||||||
|
- [Dependency] demjson is now required.
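
The resume behaviour listed in the changelog above can be sketched roughly as follows. This is a minimal illustration and not the project's actual module: the helper names ``save_session``, ``load_session`` and ``resume_demo`` are hypothetical, and the argument dict is an invented stand-in for what docopt would return. The idea it shows is the one used in this commit: the parsed arguments are written to a ``not.finished`` file as a single line, and read back with ``ast.literal_eval`` on the next run.

```python
import ast
import os
import tempfile


def save_session(path, arguments):
    """Write the argument dict to the session file as a single line."""
    with open(path, "w") as f:
        # Strip newlines so the whole dict can be read back with readline()
        f.write("".join(str(arguments).split("\n")))


def load_session(path):
    """Return the saved argument dict, or None if no session file exists."""
    if not os.path.isfile(path):
        return None
    with open(path, "r") as f:
        # literal_eval only accepts Python literals; it does not run code
        return ast.literal_eval(f.readline())


def resume_demo():
    # Hypothetical arguments, shaped like a docopt result
    args = {
        "<url>": "https://example.bandcamp.com/album/demo",
        "--overwrite": False,
        "--base-dir": None,
    }
    with tempfile.TemporaryDirectory() as basedir:
        session_file = os.path.join(basedir, "not.finished")
        first = load_session(session_file)      # nothing to resume yet
        save_session(session_file, args)        # start of an interrupted run
        resumed = load_session(session_file)    # next run picks it up
        os.remove(session_file)                 # download finished
        after_cleanup = load_session(session_file)
    return first, resumed, after_cleanup
```

Storing the dict as a Python literal keeps the file readable by hand, at the cost of only supporting values that ``ast.literal_eval`` understands (strings, numbers, booleans, ``None`` and containers of those).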
|
||||||
|
|
|
@ -1,5 +1,5 @@
|
||||||
include README.rst AUTHORS.rst CHANGELOG.rst LICENSE
|
include README.rst AUTHORS.rst CHANGELOG.rst LICENSE
|
||||||
exclude *.mp3 .gitignore yacctab.py lextab.py .travis.yml
|
exclude *.mp3 .gitignore .travis.yml setup.cfg
|
||||||
|
|
||||||
global-exclude *.pyc
|
global-exclude *.pyc
|
||||||
global-exclude *.DS_STORE
|
global-exclude *.DS_STORE
|
||||||
|
|
60
README.rst
|
@ -11,7 +11,14 @@ Installation
|
||||||
From PyPI
|
From PyPI
|
||||||
---------
|
---------
|
||||||
|
|
||||||
pip install bandcamp-downloader
|
``pip install bandcamp-downloader``
|
||||||
|
|
||||||
|
From Wheel
|
||||||
|
----------
|
||||||
|
|
||||||
|
1. Download the wheel (``.whl``) from PyPI or the Releases page
|
||||||
|
2. ``cd`` to the directory containing the ``.whl`` file
|
||||||
|
3. ``pip install <filename>.whl``
|
||||||
|
|
||||||
From Source
|
From Source
|
||||||
-----------
|
-----------
|
||||||
|
@ -24,7 +31,7 @@ Description
|
||||||
===========
|
===========
|
||||||
|
|
||||||
bandcamp-dl is a small command-line app to download audio from
|
bandcamp-dl is a small command-line app to download audio from
|
||||||
BandCamp.com. It requires the Python interpreter, version 2.7.12+ - 3.5.2+ and is
|
BandCamp.com. It requires the Python interpreter, version 3.5+ and is
|
||||||
not platform specific. It is released to the public domain, which means
|
not platform specific. It is released to the public domain, which means
|
||||||
you can modify it, redistribute it or use it however you like.
|
you can modify it, redistribute it or use it however you like.
|
||||||
|
|
||||||
|
@ -77,9 +84,9 @@ The default template is: ``%{artist}/%{album}/%{track} - %{title}``.
|
||||||
Bugs
|
Bugs
|
||||||
====
|
====
|
||||||
|
|
||||||
Bugs should be reported `here <https://github.com/iheanyi/bandcamp-dl/issues>`_. Please include
|
Bugs should be reported `here <https://github.com/iheanyi/bandcamp-dl/issues>`_.
|
||||||
the full output of the command when run with ``--verbose``. The output
|
Please include the full output of the command when run with ``--verbose``.
|
||||||
(including the first lines) contains important debugging information.
|
The output (including the first lines) contains important debugging information.
|
||||||
Issues without the full output are often not reproducible and therefore
|
Issues without the full output are often not reproducible and therefore
|
||||||
do not get solved in short order, if ever.
|
do not get solved in short order, if ever.
|
||||||
|
|
||||||
|
@ -88,38 +95,6 @@ For discussions, join us in `Discord <https://discord.gg/nwdT4MP>`_.
|
||||||
When you submit a request, please re-read it once to avoid a couple of
|
When you submit a request, please re-read it once to avoid a couple of
|
||||||
mistakes (you can and should use this as a checklist):
|
mistakes (you can and should use this as a checklist):
|
||||||
|
|
||||||
Is the description of the issue itself sufficient?
|
|
||||||
==================================================
|
|
||||||
|
|
||||||
We often get issue reports that we cannot really decipher. While in most
|
|
||||||
cases we eventually get the required information after asking back
|
|
||||||
multiple times, this poses an unnecessary drain on our resources. Many
|
|
||||||
contributors, including myself, are also not native speakers, so we may
|
|
||||||
misread some parts.
|
|
||||||
|
|
||||||
So please elaborate on what feature you are requesting, or what bug you
|
|
||||||
want to be fixed. Make sure that it's obvious
|
|
||||||
|
|
||||||
- What the problem is
|
|
||||||
- How it could be fixed
|
|
||||||
- What your proposed solution would look like
|
|
||||||
|
|
||||||
If your report is shorter than two lines, it is almost certainly missing
|
|
||||||
some of these, which makes it hard for us to respond to it. We're often
|
|
||||||
too polite to close the issue outright, but the missing info makes
|
|
||||||
misinterpretation likely. As a committer myself, I often get frustrated
|
|
||||||
by these issues, since the only possible way for me to move forward on
|
|
||||||
them is to ask for clarification over and over.
|
|
||||||
|
|
||||||
For bug reports, this means that your report should contain the
|
|
||||||
*complete* output of bandcamp-dl when called with the ``-v`` flag. The
|
|
||||||
error message you get for (most) bugs even says so, but you would not
|
|
||||||
believe how many of our bug reports do not contain this information.
|
|
||||||
|
|
||||||
Site support requests **must contain an example URL**. An example URL is
|
|
||||||
a URL you might want to download, like
|
|
||||||
``lifeformed.bandcamp.com/album/fastfall``.
|
|
||||||
|
|
||||||
Are you using the latest version?
|
Are you using the latest version?
|
||||||
=================================
|
=================================
|
||||||
|
|
||||||
|
@ -209,14 +184,11 @@ related to bandcamp-dl, by all means, go ahead and report the bug.
|
||||||
Dependencies
|
Dependencies
|
||||||
============
|
============
|
||||||
|
|
||||||
- `BeautifulSoup <https://pypi.python.org/pypi/beautifulsoup4>`_ -
|
- `BeautifulSoup <https://pypi.python.org/pypi/beautifulsoup4>`_ - HTML Parsing
|
||||||
HTML Parsing
|
- `Demjson <https://pypi.python.org/pypi/demjson>`_ - JavaScript dict to JSON conversion
|
||||||
- `Mutagen <https://pypi.python.org/pypi/mutagen>`_ - ID3 Encoding
|
- `Mutagen <https://pypi.python.org/pypi/mutagen>`_ - ID3 Encoding
|
||||||
- `Requests <https://pypi.python.org/pypi/requests>`_ - for retrieving
|
- `Requests <https://pypi.python.org/pypi/requests>`_ - for retrieving the HTML
|
||||||
the HTML
|
- `Unicode-Slugify <https://pypi.python.org/pypi/unicode-slugify>`_ - A slug generator that turns strings into unicode slugs.
|
||||||
- `Slimit <https://pypi.python.org/pypi/slimit>`_ - Javascript parsing
|
|
||||||
- `Unicode-Slugify <https://pypi.python.org/pypi/unicode-slugify>`_ -
|
|
||||||
A slug generator that turns strings into unicode slugs.
|
|
||||||
|
|
||||||
Copyright
|
Copyright
|
||||||
=========
|
=========
|
||||||
|
|
|
@ -1,119 +1,118 @@
|
||||||
|
from .bandcampjson import BandcampJSON
|
||||||
from bs4 import BeautifulSoup
|
from bs4 import BeautifulSoup
|
||||||
|
from bs4 import FeatureNotFound
|
||||||
import requests
|
import requests
|
||||||
from .jsobj import read_js_object
|
import json
|
||||||
|
|
||||||
|
|
||||||
class Bandcamp:
|
class Bandcamp:
|
||||||
def parse(self, url, no_art=True):
|
def parse(self, url: str, art: bool=True) -> dict or None:
|
||||||
|
"""Requests the page, cherry picks album info
|
||||||
|
|
||||||
|
:param url: album/track url
|
||||||
|
:param art: if True download album art
|
||||||
|
:return: album metadata
|
||||||
|
"""
|
||||||
try:
|
try:
|
||||||
r = requests.get(url)
|
r = requests.get(url)
|
||||||
except requests.exceptions.MissingSchema:
|
except requests.exceptions.MissingSchema:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
self.no_art = no_art
|
|
||||||
|
|
||||||
if r.status_code is not 200:
|
|
||||||
return None
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
self.soup = BeautifulSoup(r.text, "lxml")
|
self.soup = BeautifulSoup(r.text, "lxml")
|
||||||
except:
|
except FeatureNotFound:
|
||||||
self.soup = BeautifulSoup(r.text, "html.parser")
|
self.soup = BeautifulSoup(r.text, "html.parser")
|
||||||
|
|
||||||
|
self.generate_album_json()
|
||||||
|
self.tracks = self.tralbum_data_json['trackinfo']
|
||||||
|
|
||||||
album = {
|
album = {
|
||||||
"tracks": [],
|
"tracks": [],
|
||||||
"title": "",
|
"title": self.embed_data_json['album_title'],
|
||||||
"artist": "",
|
"artist": self.embed_data_json['artist'],
|
||||||
"full": False,
|
"full": False,
|
||||||
"art": "",
|
"art": "",
|
||||||
"date": ""
|
"date": self.tralbum_data_json['album_release_date']
|
||||||
}
|
}
|
||||||
|
|
||||||
album_meta = self.extract_album_meta_data(r)
|
for track in self.tracks:
|
||||||
|
if track['file'] is not None:
|
||||||
|
track = self.get_track_metadata(track)
|
||||||
|
album['tracks'].append(track)
|
||||||
|
|
||||||
album['artist'] = album_meta['artist']
|
album['full'] = self.all_tracks_available()
|
||||||
album['title'] = album_meta['title']
|
if art:
|
||||||
album['date'] = album_meta['date']
|
|
||||||
|
|
||||||
for track in album_meta['tracks']:
|
|
||||||
track = self.get_track_meta_data(track)
|
|
||||||
album['tracks'].append(track)
|
|
||||||
|
|
||||||
album['full'] = self.all_tracks_available(album)
|
|
||||||
if self.no_art:
|
|
||||||
album['art'] = self.get_album_art()
|
album['art'] = self.get_album_art()
|
||||||
|
|
||||||
return album
|
return album
|
||||||
|
|
||||||
def all_tracks_available(self, album):
|
# Possibly redundant now, we skip unavailable tracks.
|
||||||
for track in album['tracks']:
|
def all_tracks_available(self) -> bool:
|
||||||
if track['url'] is None:
|
"""Verify that all tracks have a url
|
||||||
return False
|
|
||||||
|
|
||||||
|
:return: True if all urls accounted for
|
||||||
|
"""
|
||||||
|
for track in self.tracks:
|
||||||
|
if track['file'] is None:
|
||||||
|
return False
|
||||||
return True
|
return True
|
||||||
|
|
||||||
def is_basestring(self, obj):
|
@staticmethod
|
||||||
if isinstance(obj, str) or isinstance(obj, bytes) or isinstance(obj, bytearray):
|
def get_track_metadata(track: dict or None) -> dict:
|
||||||
return True
|
"""Extract individual track metadata
|
||||||
return False
|
|
||||||
|
|
||||||
def get_track_meta_data(self, track):
|
:param track: track dict
|
||||||
new_track = {}
|
:return: track metadata dict
|
||||||
if not self.is_basestring(track['file']):
|
"""
|
||||||
if 'mp3-128' in track['file']:
|
track_metadata = {
|
||||||
new_track['url'] = track['file']['mp3-128']
|
"duration": track['duration'],
|
||||||
|
"track": str(track['track_num']),
|
||||||
|
"title": track['title'],
|
||||||
|
"url": None
|
||||||
|
}
|
||||||
|
|
||||||
|
if 'mp3-128' in track['file']:
|
||||||
|
track_metadata['url'] = "http:" + track['file']['mp3-128']
|
||||||
else:
|
else:
|
||||||
new_track['url'] = None
|
track_metadata['url'] = None
|
||||||
|
return track_metadata
|
||||||
|
|
||||||
new_track['duration'] = track['duration']
|
def generate_album_json(self):
|
||||||
new_track['track'] = track['track_num']
|
"""Retrieve JavaScript dictionaries from page and generate JSON
|
||||||
new_track['title'] = track['title']
|
|
||||||
|
|
||||||
return new_track
|
:return: True if successful
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
embed = BandcampJSON(self.soup, "EmbedData")
|
||||||
|
tralbum = BandcampJSON(self.soup, "TralbumData")
|
||||||
|
|
||||||
def extract_album_meta_data(self, request):
|
embed_data = embed.js_to_json()
|
||||||
album = {}
|
tralbum_data = tralbum.js_to_json()
|
||||||
|
|
||||||
embedData = self.get_embed_string_block(request)
|
self.embed_data_json = json.loads(embed_data)
|
||||||
|
self.tralbum_data_json = json.loads(tralbum_data)
|
||||||
block = request.text.split("var TralbumData = ")
|
except Exception as e:
|
||||||
|
print(e)
|
||||||
stringBlock = block[1]
|
return None
|
||||||
|
return True
|
||||||
stringBlock = stringBlock.split("};")[0] + "};"
|
|
||||||
stringBlock = read_js_object(u"var TralbumData = {}".format(stringBlock))
|
|
||||||
|
|
||||||
if 'album_title' not in embedData['EmbedData']:
|
|
||||||
album['title'] = "Unknown Album"
|
|
||||||
else:
|
|
||||||
album['title'] = embedData['EmbedData']['album_title']
|
|
||||||
|
|
||||||
album['artist'] = stringBlock['TralbumData']['artist']
|
|
||||||
album['tracks'] = stringBlock['TralbumData']['trackinfo']
|
|
||||||
|
|
||||||
if stringBlock['TralbumData']['album_release_date'] == "null":
|
|
||||||
album['date'] = ""
|
|
||||||
else:
|
|
||||||
album['date'] = stringBlock['TralbumData']['album_release_date'].split()[2]
|
|
||||||
|
|
||||||
return album
|
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def generate_album_url(artist, album):
|
def generate_album_url(artist: str, album: str) -> str:
|
||||||
|
"""Generate an album url based on the artist and album name
|
||||||
|
|
||||||
|
:param artist: artist name
|
||||||
|
:param album: album name
|
||||||
|
:return: album url as str
|
||||||
|
"""
|
||||||
return "http://{0}.bandcamp.com/album/{1}".format(artist, album)
|
return "http://{0}.bandcamp.com/album/{1}".format(artist, album)
|
||||||
|
|
||||||
def get_album_art(self):
|
def get_album_art(self) -> str:
|
||||||
|
"""Find and retrieve album art url from page
|
||||||
|
|
||||||
|
:return: url as str
|
||||||
|
"""
|
||||||
try:
|
try:
|
||||||
url = self.soup.find(id='tralbumArt').find_all('img')[0]['src']
|
url = self.soup.find(id='tralbumArt').find_all('img')[0]['src']
|
||||||
return url
|
return url
|
||||||
except:
|
except (AttributeError, IndexError, KeyError):
|
||||||
pass
|
pass
|
||||||
|
|
||||||
def get_embed_string_block(self, request):
|
|
||||||
embedBlock = request.text.split("var EmbedData = ")
|
|
||||||
|
|
||||||
embedStringBlock = embedBlock[1]
|
|
||||||
embedStringBlock = embedStringBlock.split("};")[0] + "};"
|
|
||||||
embedStringBlock = read_js_object(u"var EmbedData = {}".format(embedStringBlock))
|
|
||||||
|
|
||||||
return embedStringBlock
|
|
||||||
|
|
|
@ -1,27 +1,28 @@
|
||||||
"""bandcamp-dl
|
"""bandcamp-dl
|
||||||
|
|
||||||
Usage:
|
Usage:
|
||||||
bandcamp-dl.py <url>
|
bandcamp-dl [url]
|
||||||
bandcamp-dl.py [--template=<template>] [--base-dir=<dir>]
|
bandcamp-dl [--template=<template>] [--base-dir=<dir>]
|
||||||
[--full-album]
|
[--full-album]
|
||||||
(<url> | --artist=<artist> --album=<album>)
|
(<url> | --artist=<artist> --album=<album>)
|
||||||
[--overwrite]
|
[--overwrite]
|
||||||
[--no-art]
|
[--no-art]
|
||||||
bandcamp-dl.py (-h | --help)
|
bandcamp-dl (-h | --help)
|
||||||
bandcamp-dl.py (--version)
|
bandcamp-dl (--version)
|
||||||
|
|
||||||
Options:
|
Options:
|
||||||
-h --help Show this screen.
|
-h --help Show this screen.
|
||||||
-v --version Show version.
|
-v --version Show version.
|
||||||
-a --artist=<artist> The artist's slug (from the URL)
|
-a --artist=<artist> The artist's slug (from the URL)
|
||||||
-b --album=<album> The album's slug (from the URL)
|
-b --album=<album> The album's slug (from the URL)
|
||||||
-t --template=<template> Output filename template.
|
-t --template=<template> Output filename template.
|
||||||
[default: %{artist}/%{album}/%{track} - %{title}]
|
[default: %{artist}/%{album}/%{track} - %{title}]
|
||||||
-d --base-dir=<dir> Base location to which all files are downloaded.
|
-d --base-dir=<dir> Base location to which all files are downloaded.
|
||||||
-f --full-album Download only if all tracks are available.
|
-f --full-album Download only if all tracks are available.
|
||||||
-o --overwrite Overwrite tracks that already exist. Default is False.
|
-o --overwrite Overwrite tracks that already exist. Default is False.
|
||||||
-n --no-art Skip grabbing album art
|
-n --no-art Skip grabbing album art
|
||||||
|
"""
|
||||||
|
"""
|
||||||
Coded by:
|
Coded by:
|
||||||
|
|
||||||
Iheanyi Ekechukwu
|
Iheanyi Ekechukwu
|
||||||
|
@ -39,19 +40,32 @@ Anthony Forsberg:
|
||||||
|
|
||||||
Iheanyi:
|
Iheanyi:
|
||||||
Feel free to use this in any way you wish. I made this just for fun.
|
Feel free to use this in any way you wish. I made this just for fun.
|
||||||
Shout out to darkf for writing a helper function for parsing the JavaScript!
|
Shout out to darkf for writing the previous helper function for parsing the JavaScript!
|
||||||
"""
|
"""
|
||||||
|
|
||||||
import os
|
import os
|
||||||
|
import ast
|
||||||
from docopt import docopt
|
from docopt import docopt
|
||||||
from .bandcamp import Bandcamp
|
from .bandcamp import Bandcamp
|
||||||
from .bandcampdownloader import BandcampDownloader
|
from .bandcampdownloader import BandcampDownloader
|
||||||
|
|
||||||
|
|
||||||
def main():
|
def main():
|
||||||
arguments = docopt(__doc__, version='bandcamp-dl 0.0.6-03')
|
arguments = docopt(__doc__, version='bandcamp-dl 0.0.7')
|
||||||
bandcamp = Bandcamp()
|
bandcamp = Bandcamp()
|
||||||
|
|
||||||
|
basedir = arguments['--base-dir'] or os.getcwd()
|
||||||
|
session_file = basedir + "/not.finished"
|
||||||
|
|
||||||
|
if os.path.isfile(session_file):
|
||||||
|
with open(session_file, "r") as f:
|
||||||
|
arguments = ast.literal_eval(f.readline())
|
||||||
|
elif arguments['<url>'] is None:
|
||||||
|
print(__doc__)
|
||||||
|
else:
|
||||||
|
with open(session_file, "w") as f:
|
||||||
|
f.write("".join(str(arguments).split('\n')))
|
||||||
|
|
||||||
if arguments['--artist'] and arguments['--album']:
|
if arguments['--artist'] and arguments['--album']:
|
||||||
url = Bandcamp.generate_album_url(arguments['--artist'], arguments['--album'])
|
url = Bandcamp.generate_album_url(arguments['--artist'], arguments['--album'])
|
||||||
else:
|
else:
|
||||||
|
@ -61,7 +75,6 @@ def main():
|
||||||
album = bandcamp.parse(url, False)
|
album = bandcamp.parse(url, False)
|
||||||
else:
|
else:
|
||||||
album = bandcamp.parse(url)
|
album = bandcamp.parse(url)
|
||||||
basedir = arguments['--base-dir'] or os.getcwd()
|
|
||||||
|
|
||||||
if not album:
|
if not album:
|
||||||
print("The url {} is not a valid bandcamp page.".format(url))
|
print("The url {} is not a valid bandcamp page.".format(url))
|
||||||
|
|
|
@ -9,6 +9,13 @@ from slugify import slugify
|
||||||
|
|
||||||
class BandcampDownloader:
|
class BandcampDownloader:
|
||||||
def __init__(self, urls=None, template=None, directory=None, overwrite=False):
|
def __init__(self, urls=None, template=None, directory=None, overwrite=False):
|
||||||
|
"""Initialize variables we will need throughout the Class
|
||||||
|
|
||||||
|
:param urls: list of urls
|
||||||
|
:param template: filename template
|
||||||
|
:param directory: download location
|
||||||
|
:param overwrite: if True overwrite existing files
|
||||||
|
"""
|
||||||
if type(urls) is str:
|
if type(urls) is str:
|
||||||
self.urls = [urls]
|
self.urls = [urls]
|
||||||
|
|
||||||
|
@ -17,11 +24,28 @@ class BandcampDownloader:
|
||||||
self.directory = directory
|
self.directory = directory
|
||||||
self.overwrite = overwrite
|
self.overwrite = overwrite
|
||||||
|
|
||||||
def start(self, album):
|
def start(self, album: dict):
|
||||||
print("Starting download process.")
|
"""Start album download process
|
||||||
self.download_album(album)
|
|
||||||
|
|
||||||
def template_to_path(self, track):
|
:param album: album dict
|
||||||
|
"""
|
||||||
|
if album['full'] is not True:
|
||||||
|
choice = input("Track list incomplete, some tracks may be private, download anyway? (yes/no): ").lower()
|
||||||
|
if choice == "yes" or choice == "y":
|
||||||
|
print("Starting download process.")
|
||||||
|
self.download_album(album)
|
||||||
|
else:
|
||||||
|
print("Cancelling download process.")
|
||||||
|
return None
|
||||||
|
else:
|
||||||
|
self.download_album(album)
|
||||||
|
|
||||||
|
def template_to_path(self, track: dict) -> str:
|
||||||
|
"""Create valid filepath based on template
|
||||||
|
|
||||||
|
:param track: track metadata
|
||||||
|
:return: filepath
|
||||||
|
"""
|
||||||
path = self.template
|
path = self.template
|
||||||
path = path.replace("%{artist}", slugify(track['artist']))
|
path = path.replace("%{artist}", slugify(track['artist']))
|
||||||
path = path.replace("%{album}", slugify(track['album']))
|
path = path.replace("%{album}", slugify(track['album']))
|
||||||
|
@ -32,14 +56,24 @@ class BandcampDownloader:
|
||||||
return path.encode('utf-8')
|
return path.encode('utf-8')
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def create_directory(filename):
|
def create_directory(filename: str) -> str:
|
||||||
|
"""Create directory based on filename if it doesn't exist
|
||||||
|
|
||||||
|
:param filename: full filename
|
||||||
|
:return: directory path
|
||||||
|
"""
|
||||||
directory = os.path.dirname(filename)
|
directory = os.path.dirname(filename)
|
||||||
if not os.path.exists(directory):
|
if not os.path.exists(directory):
|
||||||
os.makedirs(directory)
|
os.makedirs(directory)
|
||||||
|
|
||||||
return directory
|
return directory
|
||||||
|
|
||||||
def download_album(self, album):
|
def download_album(self, album: dict) -> bool:
|
||||||
|
"""Download all MP3 files in the album
|
||||||
|
|
||||||
|
:param album: album dict
|
||||||
|
:return: True if successful
|
||||||
|
"""
|
||||||
for track_index, track in enumerate(album['tracks']):
|
for track_index, track in enumerate(album['tracks']):
|
||||||
track_meta = {
|
track_meta = {
|
||||||
"artist": album['artist'],
|
"artist": album['artist'],
|
||||||
|
@ -49,53 +83,60 @@ class BandcampDownloader:
|
||||||
"date": album['date']
|
"date": album['date']
|
||||||
}
|
}
|
||||||
|
|
||||||
print("Accessing track " + str(track_index + 1) + " of " + str(len(album['tracks'])))
|
self.num_tracks = len(album['tracks'])
|
||||||
|
self.track_num = track_index + 1
|
||||||
|
|
||||||
filename = self.template_to_path(track_meta).decode()
|
filepath = self.template_to_path(track_meta).decode() + ".tmp"
|
||||||
dirname = self.create_directory(filename)
|
filename = filepath.rsplit('/', 1)[1]
|
||||||
|
dirname = self.create_directory(filepath)
|
||||||
|
|
||||||
if not track.get('url'):
|
attempts = 0
|
||||||
print("Skipping track {0} - {1} as it is not available"
|
skip = False
|
||||||
.format(track['track'], track['title']))
|
|
||||||
continue
|
|
||||||
|
|
||||||
try:
|
while True:
|
||||||
track_url = track['url']
|
try:
|
||||||
# Check and see if HTTP is in the track_url
|
r = requests.get(track['url'], stream=True)
|
||||||
if 'http' not in track_url:
|
file_length = int(r.headers['content-length'])
|
||||||
track_url = 'http:{}'.format(track_url)
|
total = int(file_length/100)
|
||||||
|
# If file exists and is still a tmp file skip downloading and encode
|
||||||
r = requests.get(track_url, stream=True)
|
if os.path.exists(filepath):
|
||||||
file_length = r.headers.get('content-length')
|
self.write_id3_tags(filepath, track_meta)
|
||||||
|
# Set skip to True so that we don't try encoding again
|
||||||
if not self.overwrite and os.path.isfile(filename):
|
skip = True
|
||||||
file_size = os.path.getsize(filename) - 128
|
# break out of the try/except and move on to the next file
|
||||||
if int(file_size) != int(file_length):
|
break
|
||||||
print("{} is incomplete, redownloading.".format(filename))
|
elif os.path.exists(filepath[:-4]) and self.overwrite is not True:
|
||||||
os.remove(filename)
|
print("File: {} already exists and is complete, skipping..".format(filename[:-4]))
|
||||||
else:
|
skip = True
|
||||||
print("Skipping track {0} - {1} as it's already downloaded, use --overwrite to overwrite existing files"
|
break
|
||||||
.format(track['track'], track['title']))
|
with open(filepath, "wb") as f:
|
||||||
continue
|
|
||||||
|
|
||||||
with open(filename, "wb") as f:
|
|
||||||
print("Downloading: {}".format(filename[:-4]))
|
|
||||||
if file_length is None:
|
|
||||||
f.write(r.content)
|
|
||||||
else:
|
|
||||||
dl = 0
|
dl = 0
|
||||||
total_length = int(file_length)
|
for data in r.iter_content(chunk_size=total):
|
||||||
for data in r.iter_content(chunk_size=int(total_length/100)):
|
|
||||||
dl += len(data)
|
dl += len(data)
|
||||||
f.write(data)
|
f.write(data)
|
||||||
done = int(50 * dl / total_length)
|
done = int(50 * dl / file_length)
|
||||||
sys.stdout.write("\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
|
sys.stdout.write("\r({}/{}) [{}{}] :: Downloading: {}".format(self.track_num, self.num_tracks, "=" * done, " " * (50 - done), filename[:-8]))
|
||||||
sys.stdout.flush()
|
sys.stdout.flush()
|
||||||
self.write_id3_tags(filename, track_meta)
|
local_size = os.path.getsize(filepath)
|
||||||
except Exception as e:
|
# if the local filesize before encoding doesn't match the remote filesize redownload
|
||||||
print(e)
|
if local_size != file_length and attempts != 3:
|
||||||
print("Downloading failed..")
|
attempts += 1
print("{} is incomplete, retrying..".format(filename))
|
||||||
return False
|
continue
|
||||||
|
# if the maximum number of retry attempts is reached give up and move on
|
||||||
|
elif attempts == 3:
|
||||||
|
print("Maximum retries reached.. skipping.")
|
||||||
|
# Clean up incomplete file
|
||||||
|
os.remove(filepath)
|
||||||
|
break
|
||||||
|
# if all is well continue the download process for the rest of the tracks
|
||||||
|
else:
|
||||||
|
break
|
||||||
|
except Exception as e:
|
||||||
|
print(e)
|
||||||
|
print("Downloading failed..")
|
||||||
|
return False
|
||||||
|
if skip is not True:
|
||||||
|
self.write_id3_tags(filepath, track_meta)
|
||||||
if album['art']:
|
if album['art']:
|
||||||
try:
|
try:
|
||||||
with open("{}/cover.jpg".format(dirname), "wb") as f:
|
with open("{}/cover.jpg".format(dirname), "wb") as f:
|
||||||
|
@ -105,17 +146,26 @@ class BandcampDownloader:
|
||||||
print(e)
|
print(e)
|
||||||
print("Couldn't download album art.")
|
print("Couldn't download album art.")
|
||||||
|
|
||||||
|
if os.path.isfile("not.finished"):
|
||||||
|
os.remove("not.finished")
|
||||||
return True
|
return True
|
||||||
|
|
||||||
@staticmethod
|
def write_id3_tags(self, filepath: str, meta: dict):
|
||||||
def write_id3_tags(filename, meta):
|
"""Write metadata to the MP3 file
|
||||||
print("\nEncoding . . .")
|
|
||||||
|
|
||||||
audio = MP3(filename)
|
:param filepath: name of mp3 file
|
||||||
|
:param meta: dict of track metadata
|
||||||
|
"""
|
||||||
|
filename = filepath.rsplit('/', 1)[1][:-8]
|
||||||
|
|
||||||
|
sys.stdout.flush()
|
||||||
|
sys.stdout.write("\r({}/{}) [{}] :: Encoding: {}".format(self.track_num, self.num_tracks, "=" * 50, filename))
|
||||||
|
|
||||||
|
audio = MP3(filepath)
|
||||||
audio["TIT2"] = TIT2(encoding=3, text=["title"])
|
audio["TIT2"] = TIT2(encoding=3, text=["title"])
|
||||||
audio.save(filename=None, v1=2)
|
audio.save(filename=None, v1=2)
|
||||||
|
|
||||||
audio = EasyID3(filename)
|
audio = EasyID3(filepath)
|
||||||
audio["tracknumber"] = meta['track']
|
audio["tracknumber"] = meta['track']
|
||||||
audio["title"] = meta['title']
|
audio["title"] = meta['title']
|
||||||
audio["artist"] = meta['artist']
|
audio["artist"] = meta['artist']
|
||||||
|
@ -123,5 +173,7 @@ class BandcampDownloader:
|
||||||
audio["date"] = meta['date']
|
audio["date"] = meta['date']
|
||||||
audio.save()
|
audio.save()
|
||||||
|
|
||||||
audio.save(filename)
|
audio.save(filepath[:-4])
|
||||||
print("Done encoding . . .")
|
os.remove(filepath)
|
||||||
|
|
||||||
|
sys.stdout.write("\r({}/{}) [{}] :: Finished: {}".format(self.track_num, self.num_tracks, "=" * 50, filename))
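
The integrity check in ``download_album`` above (download to a ``.tmp`` file, compare its size with the expected length, retry a bounded number of times, then clean up) can be sketched in isolation. This is a simplified illustration under stated assumptions: ``download_with_retries`` and ``retry_demo`` are hypothetical names, ``fetch`` is an invented callable standing in for the streamed HTTP request, and the ID3 tagging step that the real code runs before removing the ``.tmp`` file is left out.

```python
import os
import tempfile


def download_with_retries(fetch, expected_size, dest, max_attempts=3):
    """Fetch into dest + '.tmp' and keep the file only if the size matches.

    fetch is a hypothetical callable returning the payload as bytes.
    Returns the attempt number that succeeded, or None if all failed.
    """
    tmp_path = dest + ".tmp"
    for attempt in range(1, max_attempts + 1):
        with open(tmp_path, "wb") as f:
            f.write(fetch())
        if os.path.getsize(tmp_path) == expected_size:
            os.replace(tmp_path, dest)  # promote the temporary file
            return attempt
    os.remove(tmp_path)  # clean up the incomplete file
    return None


def retry_demo():
    payload = b"x" * 1000
    # The first two responses are truncated, the third is complete
    responses = iter([payload[:10], payload[:500], payload])
    with tempfile.TemporaryDirectory() as d:
        dest = os.path.join(d, "track.mp3")
        attempts = download_with_retries(lambda: next(responses), len(payload), dest)
        ok = os.path.getsize(dest) == len(payload)
        # A source that is always truncated exhausts the retries
        bad = os.path.join(d, "bad.mp3")
        failed = download_with_retries(lambda: b"short", 1000, bad)
        leftover = os.path.exists(bad + ".tmp")
    return attempts, ok, failed, leftover
```

Counting attempts with a ``for`` loop rather than a ``while True`` loop makes the retry limit hard to miss: the loop cannot run more than ``max_attempts`` times even if the counter update is forgotten.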
|
||||||
|
|
|
@ -0,0 +1,39 @@
|
||||||
|
import demjson
|
||||||
|
import re
|
||||||
|
|
||||||
|
|
||||||
|
class BandcampJSON:
|
||||||
|
def __init__(self, body, var_name: str, js_data=None):
|
||||||
|
self.body = body
|
||||||
|
self.var_name = var_name
|
||||||
|
self.js_data = js_data
|
||||||
|
|
||||||
|
def get_js(self) -> str:
|
||||||
|
"""Get <script> element containing the data we need and return the raw JS
|
||||||
|
|
||||||
|
:return js_data: Raw JS as str
|
||||||
|
"""
|
||||||
|
self.js_data = self.body.find("script", {"src": False}, text=re.compile(self.var_name)).string
|
||||||
|
return self.js_data
|
||||||
|
|
||||||
|
def extract_data(self, js: str) -> str:
|
||||||
|
"""Extract values from JS dictionary
|
||||||
|
|
||||||
|
:param js: Raw JS
|
||||||
|
:return: Contents of dictionary as str
|
||||||
|
"""
|
||||||
|
self.js_data = re.search(r"(?<=var\s" + self.var_name + r"\s=\s)[^;]*", js).group().replace('" + "', '')
|
||||||
|
return self.js_data
|
||||||
|
|
||||||
|
def js_to_json(self) -> str:
|
||||||
|
"""Convert JavaScript dictionary to JSON
|
||||||
|
|
||||||
|
:return: JSON as str
|
||||||
|
"""
|
||||||
|
js = self.get_js()
|
||||||
|
data = self.extract_data(js)
|
||||||
|
# Decode with demjson first to reformat keys and lists
|
||||||
|
js_data = demjson.decode(data)
|
||||||
|
# Encode to make valid JSON
|
||||||
|
js_data = demjson.encode(js_data)
|
||||||
|
return js_data
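
The extraction step in ``BandcampJSON.extract_data`` above can be tried on its own with only the standard library. The sketch below is an assumption-laden illustration: ``extract_js_object`` is a hypothetical free function, and ``SAMPLE_JS`` is an invented script body whose object literal already happens to be valid JSON, so ``json.loads`` can stand in for the demjson decode/encode round trip that real pages (with unquoted keys) would need.

```python
import json
import re

# Invented sample; real page scripts are larger and use unquoted keys
SAMPLE_JS = (
    'var EmbedData = {"album_title": "Demo", "artist": "Some" + "one"};\n'
    "var other = 1;"
)


def extract_js_object(js, var_name):
    """Return the text assigned to `var <var_name> = ...;`, or None."""
    # The lookbehind anchors the match right after the assignment;
    # [^;]* then consumes everything up to the first semicolon.
    pattern = r"(?<=var\s" + re.escape(var_name) + r"\s=\s)[^;]*"
    match = re.search(pattern, js)
    if match is None:
        return None
    # Join string literals that the script splits with '" + "'
    return match.group().replace('" + "', "")


def parse_sample():
    raw = extract_js_object(SAMPLE_JS, "EmbedData")
    return raw, json.loads(raw)
```

One limitation worth keeping in mind: because the pattern stops at the first ``;``, a semicolon inside a string value would truncate the match. Returning ``None`` for a missing variable, instead of calling ``.group()`` directly, also avoids an ``AttributeError`` when the page layout changes.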
|
|
@ -1,7 +1,6 @@
|
||||||
beautifulsoup4==4.5.1
|
beautifulsoup4==4.5.1
|
||||||
|
demjson==2.2.4
|
||||||
docopt==0.6.2
|
docopt==0.6.2
|
||||||
mutagen==1.35.1
|
mutagen==1.35.1
|
||||||
ply==3.9
|
|
||||||
requests==2.12.4
|
requests==2.12.4
|
||||||
slimit==0.8.1
|
|
||||||
unicode-slugify==0.1.3
|
unicode-slugify==0.1.3
|
||||||
|
|
|
@ -1,81 +0,0 @@
|
||||||
"""
|
|
||||||
Simple JavaScript/ECMAScript object literal reader
|
|
||||||
Only supports object literals wrapped in `var x = ...;` statements, so you
|
|
||||||
might want to do read_js_object('var x = %s;' % literal) if it's in another format.
|
|
||||||
|
|
||||||
Requires the slimit <https://github.com/rspivak/slimit> library for parsing.
|
|
||||||
|
|
||||||
Basic constant folding on strings and numbers is done, e.g. "hi " + "there!" reduces to "hi there!",
|
|
||||||
and 1+1 reduces to 2.
|
|
||||||
|
|
||||||
Copyright (c) 2013 darkf
|
|
||||||
Licensed under the terms of the WTFPL:
|
|
||||||
|
|
||||||
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
|
|
||||||
Version 2, December 2004
|
|
||||||
|
|
||||||
Everyone is permitted to copy and distribute verbatim or modified
|
|
||||||
copies of this license document, and changing it is allowed as long
|
|
||||||
as the name is changed.
|
|
||||||
|
|
||||||
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
|
|
||||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
|
||||||
|
|
||||||
0. You just DO WHAT THE FUCK YOU WANT TO.
|
|
||||||
"""
|
|
||||||
|
|
||||||
from slimit.parser import Parser
|
|
||||||
import slimit.ast as ast
|
|
||||||
|
|
||||||
|
|
||||||
def read_js_object(code):
|
|
||||||
parser = Parser()
|
|
||||||
|
|
||||||
def visit(node):
|
|
||||||
if isinstance(node, ast.Program):
|
|
||||||
d = {}
|
|
||||||
for child in node:
|
|
||||||
if not isinstance(child, ast.VarStatement):
|
|
||||||
raise ValueError("All statements should be var statements")
|
|
||||||
key, val = visit(child)
|
|
||||||
d[key] = val
|
|
||||||
return d
|
|
||||||
elif isinstance(node, ast.VarStatement):
|
|
||||||
return visit(node.children()[0])
|
|
||||||
elif isinstance(node, ast.VarDecl):
|
|
||||||
return visit(node.identifier), visit(node.initializer)
|
|
||||||
elif isinstance(node, ast.Object):
|
|
||||||
d = {}
|
|
||||||
for property in node:
|
|
||||||
key = visit(property.left)
|
|
||||||
value = visit(property.right)
|
|
||||||
d[key] = value
|
|
||||||
return d
|
|
||||||
elif isinstance(node, ast.BinOp):
|
|
||||||
# simple constant folding
|
|
||||||
if node.op == '+':
|
|
||||||
if isinstance(node.left, ast.String) and isinstance(node.right, ast.String):
|
|
||||||
return visit(node.left) + visit(node.right)
|
|
||||||
elif isinstance(node.left, ast.Number) and isinstance(node.right, ast.Number):
|
|
||||||
return visit(node.left) + visit(node.right)
|
|
||||||
else:
|
|
||||||
raise ValueError("Cannot + on anything other than two literals")
|
|
||||||
else:
|
|
||||||
raise ValueError("Cannot do operator '{}'".format(node.op))
|
|
||||||
|
|
||||||
elif isinstance(node, ast.String):
|
|
||||||
return node.value.strip('"').strip("'")
|
|
||||||
elif isinstance(node, ast.Array):
|
|
||||||
return [visit(x) for x in node]
|
|
||||||
elif isinstance(node, ast.Number) or isinstance(node, ast.Identifier)\
|
|
||||||
or isinstance(node, ast.Boolean) or isinstance(node, ast.Null):
|
|
||||||
return node.value
|
|
||||||
else:
|
|
||||||
raise Exception("Unhandled node: {}".format(node))
|
|
||||||
|
|
||||||
return visit(parser.parse(code))
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
print(read_js_object("""var foo = {x: 10, y: "hi " + "there!"};
|
|
||||||
var bar = {derp: ["herp", "it", "up", "forever"]};"""))
|
|
|
@ -1,9 +1,8 @@
|
||||||
--index-url https://pypi.python.org/simple/
|
--index-url https://pypi.python.org/simple/
|
||||||
|
|
||||||
beautifulsoup4==4.5.1
|
beautifulsoup4==4.5.1
|
||||||
|
demjson==2.2.4
|
||||||
docopt==0.6.2
|
docopt==0.6.2
|
||||||
mutagen==1.35.1
|
mutagen==1.35.1
|
||||||
ply==3.9
|
|
||||||
requests==2.12.4
|
requests==2.12.4
|
||||||
slimit==0.8.1
|
unicode-slugify==0.1.3
|
||||||
unicode-slugify==0.1.3
|
|
||||||
|
|
6
setup.py
|
@ -6,7 +6,7 @@ here = path.abspath(path.dirname(__file__))
|
||||||
|
|
||||||
setup(
|
setup(
|
||||||
name='bandcamp-downloader',
|
name='bandcamp-downloader',
|
||||||
version='0.0.6-03',
|
version='0.0.7',
|
||||||
description='bandcamp-dl downloads albums and tracks from Bandcamp for you',
|
description='bandcamp-dl downloads albums and tracks from Bandcamp for you',
|
||||||
long_description=open('README.rst').read(),
|
long_description=open('README.rst').read(),
|
||||||
url='https://github.com/iheanyi/bandcamp-dl',
|
url='https://github.com/iheanyi/bandcamp-dl',
|
||||||
|
@ -18,18 +18,16 @@ setup(
|
||||||
'Intended Audience :: End Users/Desktop',
|
'Intended Audience :: End Users/Desktop',
|
||||||
'Topic :: Multimedia :: Sound/Audio',
|
'Topic :: Multimedia :: Sound/Audio',
|
||||||
'License :: Public Domain',
|
'License :: Public Domain',
|
||||||
'Programming Language :: Python :: 2.7',
|
|
||||||
'Programming Language :: Python :: 3.5',
|
'Programming Language :: Python :: 3.5',
|
||||||
],
|
],
|
||||||
keywords=['bandcamp', 'downloader', 'music', 'cli', 'albums', 'dl'],
|
keywords=['bandcamp', 'downloader', 'music', 'cli', 'albums', 'dl'],
|
||||||
packages=find_packages(),
|
packages=find_packages(),
|
||||||
install_requires=[
|
install_requires=[
|
||||||
'beautifulsoup4',
|
'beautifulsoup4',
|
||||||
|
'demjson',
|
||||||
'docopt',
|
'docopt',
|
||||||
'mutagen',
|
'mutagen',
|
||||||
'ply',
|
|
||||||
'requests',
|
'requests',
|
||||||
'slimit',
|
|
||||||
'unicode-slugify',
|
'unicode-slugify',
|
||||||
],
|
],
|
||||||
entry_points={
|
entry_points={
|
||||||
|
|