anime-downloader/anime_downloader/extractors/mp4upload.py

import logging
import re

from anime_downloader.extractors.base_extractor import BaseExtractor
from anime_downloader.sites import helpers

logger = logging.getLogger(__name__)


class MP4Upload(BaseExtractor):
    '''Extracts the video URL from mp4upload embed pages.

    A second request to the non-embed mp4upload page fetches the video
    title, albeit imperfectly, because mp4upload does not show the full
    title on a video's main page.
    '''

    def _get_data(self):
        # Extract the domain, video ID and port-like number from the embed
        # page with a regex. Inspired by the mp4upload-direct program by
        # GitHub user py7hon. The third group is named `protocol` here but
        # ends up in the port position of the stream URL.
        source_parts_re = re.compile(
            r'.*?100\|(.*?)\|.*?\|video\|(.*?)\|(\d+)\|.*?',
            re.DOTALL)

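        mp4u_embed = helpers.get(self.url).text
        match = source_parts_re.match(mp4u_embed)
        if match is None:
            # Fail with a clear message instead of an AttributeError on None
            raise ValueError(
                'Could not extract source parts from %s' % self.url)
        domain, video_id, protocol = match.groups()

        logger.debug('Domain: %s, Video ID: %s, Protocol: %s',
                     domain, video_id, protocol)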

        url = self.url.replace('embed-', '')
        # Return to non-embed page to collect title
        mp4u_page = helpers.soupify(helpers.get(url).text)

        title = mp4u_page.find('span', {'class': 'dfilename'}).text
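        # Both rfind() calls run on the original title, so when both
        # characters are present the title is cut at whichever of the last
        # '_' or the last '.' comes first.
        title = title[:title.rfind('_')][:title.rfind('.')].replace(' ', '_')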

        logger.debug('Title is %s', title)

        # Create the stream url
        stream_url = 'https://{}.mp4upload.com:{}/d/{}/{}.mp4'
        stream_url = stream_url.format(domain, protocol, video_id, title)

        logger.debug('Stream URL: %s', stream_url)

        return {
            'stream_url': stream_url,
            'meta': {
                'title': title,
                'thumbnail': ''
            }
        }
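
The regex step and URL assembly in `_get_data` can be tried without any network access. The sketch below reuses the same pattern and URL template. The sample string and the `build_stream_url` helper are made up for illustration and are not part of anime_downloader. A real embed page is far longer, and its layout may differ.

```python
import re

# Same pattern as in MP4Upload._get_data.
SOURCE_PARTS_RE = re.compile(
    r'.*?100\|(.*?)\|.*?\|video\|(.*?)\|(\d+)\|.*?',
    re.DOTALL)

# Hypothetical fragment shaped like the pipe-separated word list the
# pattern expects. Every value here is invented.
SAMPLE_EMBED = 'junk|100|www4|foo|video|abc123xyz|282|src|more'


def build_stream_url(embed_text, title):
    """Illustrative helper: extract the parts and format the stream URL."""
    match = SOURCE_PARTS_RE.match(embed_text)
    if match is None:
        raise ValueError('could not find source parts in embed text')
    domain, video_id, protocol = match.groups()
    return 'https://{}.mp4upload.com:{}/d/{}/{}.mp4'.format(
        domain, protocol, video_id, title)


if __name__ == '__main__':
    print(build_stream_url(SAMPLE_EMBED, 'Some_Title'))
```

Keeping the parsing in a pure function like this makes it possible to test the pattern against saved page text when the mp4upload layout changes.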