Merge 0.0.7 (#94)
* Preliminary 0.0.7 changes
  Moved to a new album/track data parser using demjson. Slimit and Ply are no longer required. Some basic spelling corrections and consistency changes. Function annotations, return types, and docstrings added.
* Initial commit for the Issue Template
* Fleshed out the issue template
* Switched to rst (oops), reformatted accordingly
* Update ISSUE_TEMPLATE.rst
* Moved CONTRIBUTING to the hidden .github directory
* No longer trips up on unavailable tracks in an album
* Much more robust file integrity checking, session file support
  Multi-step process in making sure files are downloaded and encoded properly. Bandcamp-dl will now search for a not.finished file and, if one is found, load that session's arguments and resume operation from where it left off.
* Improve download status/progress messages
  Made the download progress and status messages neater: no more multiple progress bars and lines of status messages.
* Final 0.0.7 changes
  Setup imports for distribution again. Reformatted docstrings. Clarified choices in partial download dialog. Updated changelog. Updated manifest. Updated readme.
master
parent
0176249237
commit
abf0dd261b
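The session-file behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: the helper names `save_session`, `load_session`, and `clear_session` are hypothetical. Only the `not.finished` file name and the use of `ast.literal_eval` to read the stored arguments back come from the diff.

```python
import ast
import os
import tempfile

SESSION_NAME = "not.finished"  # file name used by this commit


def save_session(base_dir: str, arguments: dict) -> str:
    """Store the parsed CLI arguments as a Python literal (hypothetical helper)."""
    path = os.path.join(base_dir, SESSION_NAME)
    with open(path, "w") as f:
        f.write(repr(arguments))
    return path


def load_session(base_dir: str):
    """Return the saved arguments, or None if no session file exists."""
    path = os.path.join(base_dir, SESSION_NAME)
    if not os.path.isfile(path):
        return None
    with open(path, "r") as f:
        # literal_eval only accepts literals, so it is safer than eval()
        return ast.literal_eval(f.read())


def clear_session(base_dir: str) -> None:
    """Remove the session file once every track has been processed."""
    path = os.path.join(base_dir, SESSION_NAME)
    if os.path.isfile(path):
        os.remove(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        save_session(base, {"<url>": "https://example.invalid/album/demo"})
        print(load_session(base))
        clear_session(base)
```

Storing the arguments as a literal keeps the file human-readable, at the cost of only supporting values that `ast.literal_eval` can parse (strings, numbers, booleans, `None`, and containers of those).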
|
@ -14,13 +14,15 @@ Workflow
|
|||
Please submit as many fixes for typos and grammar bloopers as you can!
|
||||
- Try to limit each pull request to *one* change only.
|
||||
- Once you've addressed review feedback, make sure to bump the pull request with a short note.
|
||||
Maintainers don’t receive notifications when you push new commits.
|
||||
|
||||
|
||||
Code
|
||||
----
|
||||
|
||||
- Try to adhere to PEP8 as best you can (Yes some lines will simply be too long, its ok.)
|
||||
- Try to adhere to PEP8 as best you can.
|
||||
- Annotate functions
|
||||
- Specify return types
|
||||
- Add docstrings
|
||||
|
||||
*****
|
||||
|
|
@ -0,0 +1,12 @@
|
|||
**Python version:**
|
||||
|
||||
**Bandcamp-dl version:**
|
||||
|
||||
**Bandcamp-dl options:**
|
||||
|
||||
**url:**
|
||||
|
||||
**options:**
|
||||
|
||||
**Describe the issue:**
|
||||
-------------------------
|
|
@ -38,3 +38,4 @@ nosetests.xml
|
|||
.pydevproject
|
||||
*.iml
|
||||
*.xml
|
||||
bandcamp_dl/asyncdownloader.py
|
||||
|
|
|
@ -17,3 +17,14 @@ Version 0.0.6
|
|||
- [Enhancement] Individual track downloads work now.
|
||||
- [Bugfix] Fixed imports, now working when installed via pip.
|
||||
- [Note] Last version to officially support Python 2.7.x
|
||||
|
||||
Version 0.0.7
|
||||
-------------
|
||||
- [Enhancement] Will now resume if it finds a valid ``not.finished`` file.
|
||||
- [Enhancement] Interrupting downloads is safe, they will resume on next run.
|
||||
- [Enhancement] Interrupting encoding is safe, it will finish on next run.
|
||||
- [Enhancement] CLI output is now much neater.
|
||||
- [Bugfix] Partial albums (some previews disabled) will now download properly.
|
||||
- [Dependency] Slimit is no longer required.
|
||||
- [Dependency] Ply is no longer required.
|
||||
- [Dependency] demjson is now required.
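The "interrupting downloads is safe" entries depend on writing each track to a temporary file and only producing the final file once the size check passes. A simplified sketch of that idea is below. The function name `write_then_finalize` is hypothetical, and the sketch renames the temporary file on success, whereas the code in this commit saves the tagged audio under the final name and then removes the `.tmp` file.

```python
import os


def write_then_finalize(chunks, final_path: str, expected_size: int) -> bool:
    """Write chunks to '<final_path>.tmp'; promote to final_path only if complete.

    Hypothetical helper: returns True when the file was finalized, False when
    the size check failed and the partial file was discarded.
    """
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)

    # Compare the local size with the size the server reported.
    if os.path.getsize(tmp_path) != expected_size:
        os.remove(tmp_path)  # clean up the incomplete file
        return False

    os.replace(tmp_path, final_path)  # replaces any existing final file
    return True
```

Because the final name only appears after a successful check, a leftover `.tmp` file on the next run is a reliable sign that the previous run was interrupted.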
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
include README.rst AUTHORS.rst CHANGELOG.rst LICENSE
|
||||
exclude *.mp3 .gitignore yacctab.py lextab.py .travis.yml
|
||||
exclude *.mp3 .gitignore .travis.yml setup.cfg
|
||||
|
||||
global-exclude *.pyc
|
||||
global-exclude *.DS_STORE
|
||||
|
|
60
README.rst
|
@ -11,7 +11,14 @@ Installation
|
|||
From PyPI
|
||||
---------
|
||||
|
||||
pip install bandcamp-downloader
|
||||
``pip install bandcamp-downloader``
|
||||
|
||||
From Wheel
|
||||
----------
|
||||
|
||||
1. Download the wheel (``.whl``) from PyPI or the Releases page
|
||||
2. ``cd`` to the directory containing the ``.whl`` file
|
||||
3. ``pip install <filename>.whl``
|
||||
|
||||
From Source
|
||||
-----------
|
||||
|
@ -24,7 +31,7 @@ Description
|
|||
===========
|
||||
|
||||
bandcamp-dl is a small command-line app to download audio from
|
||||
BandCamp.com. It requires the Python interpreter, version 2.7.12+ - 3.5.2+ and is
|
||||
BandCamp.com. It requires the Python interpreter, version 3.5+ and is
|
||||
not platform specific. It is released to the public domain, which means
|
||||
you can modify it, redistribute it or use it however you like.
|
||||
|
||||
|
@ -77,9 +84,9 @@ The default template is: ``%{artist}/%{album}/%{track} - %{title}``.
|
|||
Bugs
|
||||
====
|
||||
|
||||
Bugs should be reported `here <https://github.com/iheanyi/bandcamp-dl/issues>`_. Please include
|
||||
the full output of the command when run with ``--verbose``. The output
|
||||
(including the first lines) contain important debugging information.
|
||||
Bugs should be reported `here <https://github.com/iheanyi/bandcamp-dl/issues>`_.
|
||||
Please include the full output of the command when run with ``--verbose``.
|
||||
The output (including the first lines) contains important debugging information.
|
||||
Issues without the full output are often not reproducible and therefore
|
||||
do not get solved in short order, if ever.
|
||||
|
||||
|
@ -88,38 +95,6 @@ For discussions, join us in `Discord <https://discord.gg/nwdT4MP>`_.
|
|||
When you submit a request, please re-read it once to avoid a couple of
|
||||
mistakes (you can and should use this as a checklist):
|
||||
|
||||
Is the description of the issue itself sufficient?
|
||||
==================================================
|
||||
|
||||
We often get issue reports that we cannot really decipher. While in most
|
||||
cases we eventually get the required information after asking back
|
||||
multiple times, this poses an unnecessary drain on our resources. Many
|
||||
contributors, including myself, are also not native speakers, so we may
|
||||
misread some parts.
|
||||
|
||||
So please elaborate on what feature you are requesting, or what bug you
|
||||
want to be fixed. Make sure that it's obvious
|
||||
|
||||
- What the problem is
|
||||
- How it could be fixed
|
||||
- How your proposed solution would look like
|
||||
|
||||
If your report is shorter than two lines, it is almost certainly missing
|
||||
some of these, which makes it hard for us to respond to it. We're often
|
||||
too polite to close the issue outright, but the missing info makes
|
||||
misinterpretation likely. As a commiter myself, I often get frustrated
|
||||
by these issues, since the only possible way for me to move forward on
|
||||
them is to ask for clarification over and over.
|
||||
|
||||
For bug reports, this means that your report should contain the
|
||||
*complete* output of bandcamp-dl when called with the ``-v`` flag. The
|
||||
error message you get for (most) bugs even says so, but you would not
|
||||
believe how many of our bug reports do not contain this information.
|
||||
|
||||
Site support requests **must contain an example URL**. An example URL is
|
||||
a URL you might want to download, like
|
||||
``lifeformed.bandcamp.com/album/fastfall``.
|
||||
|
||||
Are you using the latest version?
|
||||
=================================
|
||||
|
||||
|
@ -209,14 +184,11 @@ related to bandcamp-dl, by all means, go ahead and report the bug.
|
|||
Dependencies
|
||||
============
|
||||
|
||||
- `BeautifulSoup <https://pypi.python.org/pypi/beautifulsoup4>`_ -
|
||||
HTML Parsing
|
||||
- `BeautifulSoup <https://pypi.python.org/pypi/beautifulsoup4>`_ - HTML Parsing
|
||||
- `Demjson <https://pypi.python.org/pypi/demjson>`_ - JavaScript dict to JSON conversion
|
||||
- `Mutagen <https://pypi.python.org/pypi/mutagen>`_ - ID3 Encoding
|
||||
- `Requests <https://pypi.python.org/pypi/requests>`_ - for retriving
|
||||
the HTML
|
||||
- `Slimit <https://pypi.python.org/pypi/slimit>`_ - Javascript parsing
|
||||
- `Unicode-Slugify <https://pypi.python.org/pypi/unicode-slugify>`_ -
|
||||
A slug generator that turns strings into unicode slugs.
|
||||
- `Requests <https://pypi.python.org/pypi/requests>`_ - for retrieving the HTML
|
||||
- `Unicode-Slugify <https://pypi.python.org/pypi/unicode-slugify>`_ - A slug generator that turns strings into unicode slugs.
|
||||
|
||||
Copyright
|
||||
=========
|
||||
|
|
|
@ -1,119 +1,118 @@
|
|||
from .bandcampjson import BandcampJSON
|
||||
from bs4 import BeautifulSoup
|
||||
from bs4 import FeatureNotFound
|
||||
import requests
|
||||
from .jsobj import read_js_object
|
||||
import json
|
||||
|
||||
|
||||
class Bandcamp:
|
||||
def parse(self, url, no_art=True):
|
||||
def parse(self, url: str, art: bool=True) -> dict or None:
|
||||
"""Requests the page, cherry picks album info
|
||||
|
||||
:param url: album/track url
|
||||
:param art: if True download album art
|
||||
:return: album metadata
|
||||
"""
|
||||
try:
|
||||
r = requests.get(url)
|
||||
except requests.exceptions.MissingSchema:
|
||||
return None
|
||||
|
||||
self.no_art = no_art
|
||||
|
||||
if r.status_code != 200:
|
||||
return None
|
||||
|
||||
try:
|
||||
self.soup = BeautifulSoup(r.text, "lxml")
|
||||
except:
|
||||
except FeatureNotFound:
|
||||
self.soup = BeautifulSoup(r.text, "html.parser")
|
||||
|
||||
self.generate_album_json()
|
||||
self.tracks = self.tralbum_data_json['trackinfo']
|
||||
|
||||
album = {
|
||||
"tracks": [],
|
||||
"title": "",
|
||||
"artist": "",
|
||||
"title": self.embed_data_json['album_title'],
|
||||
"artist": self.embed_data_json['artist'],
|
||||
"full": False,
|
||||
"art": "",
|
||||
"date": ""
|
||||
"date": self.tralbum_data_json['album_release_date']
|
||||
}
|
||||
|
||||
album_meta = self.extract_album_meta_data(r)
|
||||
for track in self.tracks:
|
||||
if track['file'] is not None:
|
||||
track = self.get_track_metadata(track)
|
||||
album['tracks'].append(track)
|
||||
|
||||
album['artist'] = album_meta['artist']
|
||||
album['title'] = album_meta['title']
|
||||
album['date'] = album_meta['date']
|
||||
|
||||
for track in album_meta['tracks']:
|
||||
track = self.get_track_meta_data(track)
|
||||
album['tracks'].append(track)
|
||||
|
||||
album['full'] = self.all_tracks_available(album)
|
||||
if self.no_art:
|
||||
album['full'] = self.all_tracks_available()
|
||||
if art:
|
||||
album['art'] = self.get_album_art()
|
||||
|
||||
return album
|
||||
|
||||
def all_tracks_available(self, album):
|
||||
for track in album['tracks']:
|
||||
if track['url'] is None:
|
||||
return False
|
||||
# Possibly redundant now, we skip unavailable tracks.
|
||||
def all_tracks_available(self) -> bool:
|
||||
"""Verify that all tracks have a url
|
||||
|
||||
:return: True if all urls accounted for
|
||||
"""
|
||||
for track in self.tracks:
|
||||
if track['file'] is None:
|
||||
return False
|
||||
return True
|
||||
|
||||
def is_basestring(self, obj):
|
||||
if isinstance(obj, str) or isinstance(obj, bytes) or isinstance(obj, bytearray):
|
||||
return True
|
||||
return False
|
||||
@staticmethod
|
||||
def get_track_metadata(track: dict or None) -> dict:
|
||||
"""Extract individual track metadata
|
||||
|
||||
def get_track_meta_data(self, track):
|
||||
new_track = {}
|
||||
if not self.is_basestring(track['file']):
|
||||
if 'mp3-128' in track['file']:
|
||||
new_track['url'] = track['file']['mp3-128']
|
||||
:param track: track dict
|
||||
:return: track metadata dict
|
||||
"""
|
||||
track_metadata = {
|
||||
"duration": track['duration'],
|
||||
"track": str(track['track_num']),
|
||||
"title": track['title'],
|
||||
"url": None
|
||||
}
|
||||
|
||||
if 'mp3-128' in track['file']:
|
||||
track_metadata['url'] = "http:" + track['file']['mp3-128']
|
||||
else:
|
||||
new_track['url'] = None
|
||||
track_metadata['url'] = None
|
||||
return track_metadata
|
||||
|
||||
new_track['duration'] = track['duration']
|
||||
new_track['track'] = track['track_num']
|
||||
new_track['title'] = track['title']
|
||||
def generate_album_json(self):
|
||||
"""Retrieve JavaScript dictionaries from page and generate JSON
|
||||
|
||||
return new_track
|
||||
:return: True if successful
|
||||
"""
|
||||
try:
|
||||
embed = BandcampJSON(self.soup, "EmbedData")
|
||||
tralbum = BandcampJSON(self.soup, "TralbumData")
|
||||
|
||||
def extract_album_meta_data(self, request):
|
||||
album = {}
|
||||
embed_data = embed.js_to_json()
|
||||
tralbum_data = tralbum.js_to_json()
|
||||
|
||||
embedData = self.get_embed_string_block(request)
|
||||
|
||||
block = request.text.split("var TralbumData = ")
|
||||
|
||||
stringBlock = block[1]
|
||||
|
||||
stringBlock = stringBlock.split("};")[0] + "};"
|
||||
stringBlock = read_js_object(u"var TralbumData = {}".format(stringBlock))
|
||||
|
||||
if 'album_title' not in embedData['EmbedData']:
|
||||
album['title'] = "Unknown Album"
|
||||
else:
|
||||
album['title'] = embedData['EmbedData']['album_title']
|
||||
|
||||
album['artist'] = stringBlock['TralbumData']['artist']
|
||||
album['tracks'] = stringBlock['TralbumData']['trackinfo']
|
||||
|
||||
if stringBlock['TralbumData']['album_release_date'] == "null":
|
||||
album['date'] = ""
|
||||
else:
|
||||
album['date'] = stringBlock['TralbumData']['album_release_date'].split()[2]
|
||||
|
||||
return album
|
||||
self.embed_data_json = json.loads(embed_data)
|
||||
self.tralbum_data_json = json.loads(tralbum_data)
|
||||
except Exception as e:
|
||||
print(e)
|
||||
return None
|
||||
return True
|
||||
|
||||
@staticmethod
|
||||
def generate_album_url(artist, album):
|
||||
def generate_album_url(artist: str, album: str) -> str:
|
||||
"""Generate an album url based on the artist and album name
|
||||
|
||||
:param artist: artist name
|
||||
:param album: album name
|
||||
:return: album url as str
|
||||
"""
|
||||
return "http://{0}.bandcamp.com/album/{1}".format(artist, album)
|
||||
|
||||
def get_album_art(self):
|
||||
def get_album_art(self) -> str:
|
||||
"""Find and retrieve album art url from page
|
||||
|
||||
:return: url as str
|
||||
"""
|
||||
try:
|
||||
url = self.soup.find(id='tralbumArt').find_all('img')[0]['src']
|
||||
return url
|
||||
except:
|
||||
except (AttributeError, IndexError, KeyError):
|
||||
pass
|
||||
|
||||
def get_embed_string_block(self, request):
|
||||
embedBlock = request.text.split("var EmbedData = ")
|
||||
|
||||
embedStringBlock = embedBlock[1]
|
||||
embedStringBlock = embedStringBlock.split("};")[0] + "};"
|
||||
embedStringBlock = read_js_object(u"var EmbedData = {}".format(embedStringBlock))
|
||||
|
||||
return embedStringBlock
|
||||
|
|
|
@ -1,27 +1,28 @@
|
|||
"""bandcamp-dl
|
||||
|
||||
Usage:
|
||||
bandcamp-dl.py <url>
|
||||
bandcamp-dl.py [--template=<template>] [--base-dir=<dir>]
|
||||
[--full-album]
|
||||
(<url> | --artist=<artist> --album=<album>)
|
||||
[--overwrite]
|
||||
[--no-art]
|
||||
bandcamp-dl.py (-h | --help)
|
||||
bandcamp-dl.py (--version)
|
||||
bandcamp-dl [url]
|
||||
bandcamp-dl [--template=<template>] [--base-dir=<dir>]
|
||||
[--full-album]
|
||||
(<url> | --artist=<artist> --album=<album>)
|
||||
[--overwrite]
|
||||
[--no-art]
|
||||
bandcamp-dl (-h | --help)
|
||||
bandcamp-dl (--version)
|
||||
|
||||
Options:
|
||||
-h --help Show this screen.
|
||||
-v --version Show version.
|
||||
-a --artist=<artist> The artist's slug (from the URL)
|
||||
-b --album=<album> The album's slug (from the URL)
|
||||
-t --template=<template> Output filename template.
|
||||
[default: %{artist}/%{album}/%{track} - %{title}]
|
||||
-d --base-dir=<dir> Base location of which all files are downloaded.
|
||||
-f --full-album Download only if all tracks are available.
|
||||
-o --overwrite Overwrite tracks that already exist. Default is False.
|
||||
-n --no-art Skip grabbing album art
|
||||
|
||||
-h --help Show this screen.
|
||||
-v --version Show version.
|
||||
-a --artist=<artist> The artist's slug (from the URL)
|
||||
-b --album=<album> The album's slug (from the URL)
|
||||
-t --template=<template> Output filename template.
|
||||
[default: %{artist}/%{album}/%{track} - %{title}]
|
||||
-d --base-dir=<dir> Base location of which all files are downloaded.
|
||||
-f --full-album Download only if all tracks are available.
|
||||
-o --overwrite Overwrite tracks that already exist. Default is False.
|
||||
-n --no-art Skip grabbing album art
|
||||
"""
|
||||
"""
|
||||
Coded by:
|
||||
|
||||
Iheanyi Ekechukwu
|
||||
|
@ -39,19 +40,32 @@ Anthony Forsberg:
|
|||
|
||||
Iheanyi:
|
||||
Feel free to use this in any way you wish. I made this just for fun.
|
||||
Shout out to darkf for writing a helper function for parsing the JavaScript!
|
||||
Shout out to darkf for writing the previous helper function for parsing the JavaScript!
|
||||
"""
|
||||
|
||||
import os
|
||||
import ast
|
||||
from docopt import docopt
|
||||
from .bandcamp import Bandcamp
|
||||
from .bandcampdownloader import BandcampDownloader
|
||||
|
||||
|
||||
def main():
|
||||
arguments = docopt(__doc__, version='bandcamp-dl 0.0.6-03')
|
||||
arguments = docopt(__doc__, version='bandcamp-dl 0.0.7')
|
||||
bandcamp = Bandcamp()
|
||||
|
||||
basedir = arguments['--base-dir'] or os.getcwd()
|
||||
session_file = basedir + "/not.finished"
|
||||
|
||||
if os.path.isfile(session_file):
|
||||
with open(session_file, "r") as f:
|
||||
arguments = ast.literal_eval(f.readline())
|
||||
elif arguments['<url>'] is None:
|
||||
print(__doc__)
|
||||
else:
|
||||
with open(session_file, "w") as f:
|
||||
f.write("".join(str(arguments).split('\n')))
|
||||
|
||||
if arguments['--artist'] and arguments['--album']:
|
||||
url = Bandcamp.generate_album_url(arguments['--artist'], arguments['--album'])
|
||||
else:
|
||||
|
@ -61,7 +75,6 @@ def main():
|
|||
album = bandcamp.parse(url, False)
|
||||
else:
|
||||
album = bandcamp.parse(url)
|
||||
basedir = arguments['--base-dir'] or os.getcwd()
|
||||
|
||||
if not album:
|
||||
print("The url {} is not a valid bandcamp page.".format(url))
|
||||
|
|
|
@ -9,6 +9,13 @@ from slugify import slugify
|
|||
|
||||
class BandcampDownloader:
|
||||
def __init__(self, urls=None, template=None, directory=None, overwrite=False):
|
||||
"""Initialize variables we will need throughout the Class
|
||||
|
||||
:param urls: list of urls
|
||||
:param template: filename template
|
||||
:param directory: download location
|
||||
:param overwrite: if True overwrite existing files
|
||||
"""
|
||||
if type(urls) is str:
|
||||
self.urls = [urls]
|
||||
|
||||
|
@ -17,11 +24,28 @@ class BandcampDownloader:
|
|||
self.directory = directory
|
||||
self.overwrite = overwrite
|
||||
|
||||
def start(self, album):
|
||||
print("Starting download process.")
|
||||
self.download_album(album)
|
||||
def start(self, album: dict):
|
||||
"""Start album download process
|
||||
|
||||
def template_to_path(self, track):
|
||||
:param album: album dict
|
||||
"""
|
||||
if album['full'] is not True:
|
||||
choice = input("Track list incomplete, some tracks may be private, download anyway? (yes/no): ").lower()
|
||||
if choice == "yes" or choice == "y":
|
||||
print("Starting download process.")
|
||||
self.download_album(album)
|
||||
else:
|
||||
print("Cancelling download process.")
|
||||
return None
|
||||
else:
|
||||
self.download_album(album)
|
||||
|
||||
def template_to_path(self, track: dict) -> str:
|
||||
"""Create valid filepath based on template
|
||||
|
||||
:param track: track metadata
|
||||
:return: filepath
|
||||
"""
|
||||
path = self.template
|
||||
path = path.replace("%{artist}", slugify(track['artist']))
|
||||
path = path.replace("%{album}", slugify(track['album']))
|
||||
|
@ -32,14 +56,24 @@ class BandcampDownloader:
|
|||
return path.encode('utf-8')
|
||||
|
||||
@staticmethod
|
||||
def create_directory(filename):
|
||||
def create_directory(filename: str) -> str:
|
||||
"""Create directory based on filename if it doesn't exist
|
||||
|
||||
:param filename: full filename
|
||||
:return: directory path
|
||||
"""
|
||||
directory = os.path.dirname(filename)
|
||||
if not os.path.exists(directory):
|
||||
os.makedirs(directory)
|
||||
|
||||
return directory
|
||||
|
||||
def download_album(self, album):
|
||||
def download_album(self, album: dict) -> bool:
|
||||
"""Download all MP3 files in the album
|
||||
|
||||
:param album: album dict
|
||||
:return: True if successful
|
||||
"""
|
||||
for track_index, track in enumerate(album['tracks']):
|
||||
track_meta = {
|
||||
"artist": album['artist'],
|
||||
|
@ -49,53 +83,60 @@ class BandcampDownloader:
|
|||
"date": album['date']
|
||||
}
|
||||
|
||||
print("Accessing track " + str(track_index + 1) + " of " + str(len(album['tracks'])))
|
||||
self.num_tracks = len(album['tracks'])
|
||||
self.track_num = track_index + 1
|
||||
|
||||
filename = self.template_to_path(track_meta).decode()
|
||||
dirname = self.create_directory(filename)
|
||||
filepath = self.template_to_path(track_meta) + ".tmp"
|
||||
filename = filepath.rsplit('/', 1)[1]
|
||||
dirname = self.create_directory(filepath)
|
||||
|
||||
if not track.get('url'):
|
||||
print("Skipping track {0} - {1} as it is not available"
|
||||
.format(track['track'], track['title']))
|
||||
continue
|
||||
attempts = 0
|
||||
skip = False
|
||||
|
||||
try:
|
||||
track_url = track['url']
|
||||
# Check and see if HTTP is in the track_url
|
||||
if 'http' not in track_url:
|
||||
track_url = 'http:{}'.format(track_url)
|
||||
|
||||
r = requests.get(track_url, stream=True)
|
||||
file_length = r.headers.get('content-length')
|
||||
|
||||
if not self.overwrite and os.path.isfile(filename):
|
||||
file_size = os.path.getsize(filename) - 128
|
||||
if int(file_size) != int(file_length):
|
||||
print("{} is incomplete, redownloading.".format(filename))
|
||||
os.remove(filename)
|
||||
else:
|
||||
print("Skipping track {0} - {1} as it's already downloaded, use --overwrite to overwrite existing files"
|
||||
.format(track['track'], track['title']))
|
||||
continue
|
||||
|
||||
with open(filename, "wb") as f:
|
||||
print("Downloading: {}".format(filename[:-4]))
|
||||
if file_length is None:
|
||||
f.write(r.content)
|
||||
else:
|
||||
while True:
|
||||
try:
|
||||
r = requests.get(track['url'], stream=True)
|
||||
file_length = int(r.headers['content-length'])
|
||||
total = int(file_length/100)
|
||||
# If file exists and is still a tmp file skip downloading and encode
|
||||
if os.path.exists(filepath):
|
||||
self.write_id3_tags(filepath, track_meta)
|
||||
# Set skip to True so that we don't try encoding again
|
||||
skip = True
|
||||
# break out of the try/except and move on to the next file
|
||||
break
|
||||
elif os.path.exists(filepath[:-4]) and self.overwrite is not True:
|
||||
print("File: {} already exists and is complete, skipping..".format(filename[:-4]))
|
||||
skip = True
|
||||
break
|
||||
with open(filepath, "wb") as f:
|
||||
dl = 0
|
||||
total_length = int(file_length)
|
||||
for data in r.iter_content(chunk_size=int(total_length/100)):
|
||||
for data in r.iter_content(chunk_size=total):
|
||||
dl += len(data)
|
||||
f.write(data)
|
||||
done = int(50 * dl / total_length)
|
||||
sys.stdout.write("\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
|
||||
done = int(50 * dl / file_length)
|
||||
sys.stdout.write("\r({}/{}) [{}{}] :: Downloading: {}".format(self.track_num, self.num_tracks, "=" * done, " " * (50 - done), filename[:-8]))
|
||||
sys.stdout.flush()
|
||||
self.write_id3_tags(filename, track_meta)
|
||||
except Exception as e:
|
||||
print(e)
|
||||
print("Downloading failed..")
|
||||
return False
|
||||
local_size = os.path.getsize(filepath)
|
||||
# if the local filesize before encoding doesn't match the remote filesize redownload
|
||||
if local_size != file_length and attempts != 3:
|
||||
print("{} is incomplete, retrying..".format(filename))
|
||||
continue
|
||||
# if the maximum number of retry attempts is reached give up and move on
|
||||
elif attempts == 3:
|
||||
print("Maximum retries reached.. skipping.")
|
||||
# Clean up incomplete file
|
||||
os.remove(filepath)
|
||||
break
|
||||
# if all is well continue the download process for the rest of the tracks
|
||||
else:
|
||||
break
|
||||
except Exception as e:
|
||||
print(e)
|
||||
print("Downloading failed..")
|
||||
return False
|
||||
if skip is not True:
|
||||
self.write_id3_tags(filepath, track_meta)
|
||||
if album['art']:
|
||||
try:
|
||||
with open("{}/cover.jpg".format(dirname), "wb") as f:
|
||||
|
@ -105,17 +146,26 @@ class BandcampDownloader:
|
|||
print(e)
|
||||
print("Couldn't download album art.")
|
||||
|
||||
if os.path.isfile("not.finished"):
|
||||
os.remove("not.finished")
|
||||
return True
|
||||
|
||||
@staticmethod
|
||||
def write_id3_tags(filename, meta):
|
||||
print("\nEncoding . . .")
|
||||
def write_id3_tags(self, filepath: str, meta: dict):
|
||||
"""Write metadata to the MP3 file
|
||||
|
||||
audio = MP3(filename)
|
||||
:param filepath: name of mp3 file
|
||||
:param meta: dict of track metadata
|
||||
"""
|
||||
filename = filepath.rsplit('/', 1)[1][:-8]
|
||||
|
||||
sys.stdout.flush()
|
||||
sys.stdout.write("\r({}/{}) [{}] :: Encoding: {}".format(self.track_num, self.num_tracks, "=" * 50, filename))
|
||||
|
||||
audio = MP3(filepath)
|
||||
audio["TIT2"] = TIT2(encoding=3, text=["title"])
|
||||
audio.save(filename=None, v1=2)
|
||||
|
||||
audio = EasyID3(filename)
|
||||
audio = EasyID3(filepath)
|
||||
audio["tracknumber"] = meta['track']
|
||||
audio["title"] = meta['title']
|
||||
audio["artist"] = meta['artist']
|
||||
|
@ -123,5 +173,7 @@ class BandcampDownloader:
|
|||
audio["date"] = meta['date']
|
||||
audio.save()
|
||||
|
||||
audio.save(filename)
|
||||
print("Done encoding . . .")
|
||||
audio.save(filepath[:-4])
|
||||
os.remove(filepath)
|
||||
|
||||
sys.stdout.write("\r({}/{}) [{}] :: Finished: {}".format(self.track_num, self.num_tracks, "=" * 50, filename))
|
||||
|
|
|
@ -0,0 +1,39 @@
|
|||
import demjson
|
||||
import re
|
||||
|
||||
|
||||
class BandcampJSON:
|
||||
def __init__(self, body, var_name: str, js_data=None):
|
||||
self.body = body
|
||||
self.var_name = var_name
|
||||
self.js_data = js_data
|
||||
|
||||
def get_js(self) -> str:
|
||||
"""Get <script> element containing the data we need and return the raw JS
|
||||
|
||||
:return js_data: Raw JS as str
|
||||
"""
|
||||
self.js_data = self.body.find("script", {"src": False}, text=re.compile(self.var_name)).string
|
||||
return self.js_data
|
||||
|
||||
def extract_data(self, js: str) -> str:
|
||||
"""Extract values from JS dictionary
|
||||
|
||||
:param js: Raw JS
|
||||
:return: Contents of dictionary as str
|
||||
"""
|
||||
self.js_data = re.search(r"(?<=var\s" + self.var_name + r"\s=\s)[^;]*", js).group().replace('" + "', '')
|
||||
return self.js_data
|
||||
|
||||
def js_to_json(self) -> str:
|
||||
"""Convert JavaScript dictionary to JSON
|
||||
|
||||
:return: JSON as str
|
||||
"""
|
||||
js = self.get_js()
|
||||
data = self.extract_data(js)
|
||||
# Decode with demjson first to reformat keys and lists
|
||||
js_data = demjson.decode(data)
|
||||
# Encode to make valid JSON
|
||||
js_data = demjson.encode(js_data)
|
||||
return js_data
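The extraction step in `extract_data` can be tried in isolation with the standard library. The sketch below is an illustration under assumptions: `extract_js_object` is a hypothetical standalone function, and it uses `json.loads` in place of demjson, so it only works when the JavaScript literal already happens to be valid JSON after the string concatenations are folded.

```python
import json
import re


def extract_js_object(js: str, var_name: str) -> str:
    """Return the text assigned to `var <var_name> = ...` up to the first ';'.

    Hypothetical helper mirroring the regex in the diff. Note that `[^;]*`
    stops at the first semicolon, even one inside a string value.
    """
    pattern = r"(?<=var\s" + re.escape(var_name) + r"\s=\s)[^;]*"
    match = re.search(pattern, js)
    if match is None:
        raise ValueError("variable {!r} not found".format(var_name))
    # Fold '"a" + "b"' style concatenations into a single string literal.
    return match.group().replace('" + "', '')


if __name__ == "__main__":
    script = 'var EmbedData = {"album_title": "Fast" + "fall", "tracks": 2};'
    print(json.loads(extract_js_object(script, "EmbedData")))
```

demjson is still needed for real pages because JavaScript object literals may use unquoted keys and other syntax that strict JSON parsers reject.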
|
|
@ -1,7 +1,6 @@
|
|||
beautifulsoup4==4.5.1
|
||||
demjson==2.2.4
|
||||
docopt==0.6.2
|
||||
mutagen==1.35.1
|
||||
ply==3.9
|
||||
requests==2.12.4
|
||||
slimit==0.8.1
|
||||
unicode-slugify==0.1.3
|
||||
|
|
|
@ -1,81 +0,0 @@
|
|||
"""
|
||||
Simple JavaScript/ECMAScript object literal reader
|
||||
Only supports object literals wrapped in `var x = ...;` statements, so you
|
||||
might want to do read_js_object('var x = %s;' % literal) if it's in another format.
|
||||
|
||||
Requires the slimit <https://github.com/rspivak/slimit> library for parsing.
|
||||
|
||||
Basic constand folding on strings and numbers is done, e.g. "hi " + "there!" reduces to "hi there!",
|
||||
and 1+1 reduces to 2.
|
||||
|
||||
Copyright (c) 2013 darkf
|
||||
Licensed under the terms of the WTFPL:
|
||||
|
||||
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
|
||||
Version 2, December 2004
|
||||
|
||||
Everyone is permitted to copy and distribute verbatim or modified
|
||||
copies of this license document, and changing it is allowed as long
|
||||
as the name is changed.
|
||||
|
||||
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
|
||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||
|
||||
0. You just DO WHAT THE FUCK YOU WANT TO.
|
||||
"""
|
||||
|
||||
from slimit.parser import Parser
|
||||
import slimit.ast as ast
|
||||
|
||||
|
||||
def read_js_object(code):
|
||||
parser = Parser()
|
||||
|
||||
def visit(node):
|
||||
if isinstance(node, ast.Program):
|
||||
d = {}
|
||||
for child in node:
|
||||
if not isinstance(child, ast.VarStatement):
|
||||
raise ValueError("All statements should be var statements")
|
||||
key, val = visit(child)
|
||||
d[key] = val
|
||||
return d
|
||||
elif isinstance(node, ast.VarStatement):
|
||||
return visit(node.children()[0])
|
||||
elif isinstance(node, ast.VarDecl):
|
||||
return visit(node.identifier), visit(node.initializer)
|
||||
elif isinstance(node, ast.Object):
|
||||
d = {}
|
||||
for property in node:
|
||||
key = visit(property.left)
|
||||
value = visit(property.right)
|
||||
d[key] = value
|
||||
return d
|
||||
elif isinstance(node, ast.BinOp):
|
||||
# simple constant folding
|
||||
if node.op == '+':
|
||||
if isinstance(node.left, ast.String) and isinstance(node.right, ast.String):
|
||||
return visit(node.left) + visit(node.right)
|
||||
elif isinstance(node.left, ast.Number) and isinstance(node.right, ast.Number):
|
||||
return visit(node.left) + visit(node.right)
|
||||
else:
|
||||
raise ValueError("Cannot + on anything other than two literals")
|
||||
else:
|
||||
raise ValueError("Cannot do operator '{}'".format(node.op))
|
||||
|
||||
elif isinstance(node, ast.String):
|
||||
return node.value.strip('"').strip("'")
|
||||
elif isinstance(node, ast.Array):
|
||||
return [visit(x) for x in node]
|
||||
elif isinstance(node, ast.Number) or isinstance(node, ast.Identifier)\
|
||||
or isinstance(node, ast.Boolean) or isinstance(node, ast.Null):
|
||||
return node.value
|
||||
else:
|
||||
raise Exception("Unhandled node: {}".format(node))
|
||||
|
||||
return visit(parser.parse(code))
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
print(read_js_object("""var foo = {x: 10, y: "hi " + "there!"};
|
||||
var bar = {derp: ["herp", "it", "up", "forever"]};"""))
|
|
@ -1,9 +1,8 @@
|
|||
--index-url https://pypi.python.org/simple/
|
||||
|
||||
beautifulsoup4==4.5.1
|
||||
demjson==2.2.4
|
||||
docopt==0.6.2
|
||||
mutagen==1.35.1
|
||||
ply==3.9
|
||||
requests==2.12.4
|
||||
slimit==0.8.1
|
||||
unicode-slugify==0.1.3
|
||||
unicode-slugify==0.1.3
|
||||
|
|
6
setup.py
|
@ -6,7 +6,7 @@ here = path.abspath(path.dirname(__file__))
|
|||
|
||||
setup(
|
||||
name='bandcamp-downloader',
|
||||
version='0.0.6-03',
|
||||
version='0.0.7',
|
||||
description='bandcamp-dl downloads albums and tracks from Bandcamp for you',
|
||||
long_description=open('README.rst').read(),
|
||||
url='https://github.com/iheanyi/bandcamp-dl',
|
||||
|
@ -18,18 +18,16 @@ setup(
|
|||
'Intended Audience :: End Users/Desktop',
|
||||
'Topic :: Multimedia :: Sound/Audio',
|
||||
'License :: Public Domain',
|
||||
'Programming Language :: Python :: 2.7',
|
||||
'Programming Language :: Python :: 3.5',
|
||||
],
|
||||
keywords=['bandcamp', 'downloader', 'music', 'cli', 'albums', 'dl'],
|
||||
packages=find_packages(),
|
||||
install_requires=[
|
||||
'beautifulsoup4',
|
||||
'demjson',
|
||||
'docopt',
|
||||
'mutagen',
|
||||
'ply',
|
||||
'requests',
|
||||
'slimit',
|
||||
'unicode-slugify',
|
||||
],
|
||||
entry_points={
|
||||
|
|