youtube-dl/devscripts/make_lazy_extractors.py

from __future__ import unicode_literals, print_function

from inspect import getsource
import io
import os
from os.path import dirname as dirn
import re
import sys

print('WARNING: Lazy loading extractors is an experimental feature that may not always work', file=sys.stderr)

sys.path.insert(0, dirn(dirn(os.path.abspath(__file__))))

lazy_extractors_filename = sys.argv[1]
if os.path.exists(lazy_extractors_filename):
    os.remove(lazy_extractors_filename)
# Py2: may be confused by leftover lazy_extractors.pyc
try:
    os.remove(lazy_extractors_filename + 'c')
except OSError:
    pass

from youtube_dl.compat import compat_register_utf8

compat_register_utf8()

from youtube_dl.extractor import _ALL_CLASSES
from youtube_dl.extractor.common import InfoExtractor, SearchInfoExtractor

with open('devscripts/lazy_load_template.py', 'rt') as f:
    module_template = f.read()


def get_source(m):
    """Return the source of m with comment-only lines stripped."""
    return re.sub(r'(?m)^\s*#.*\n', '', getsource(m))


module_contents = [
    module_template,
    get_source(InfoExtractor.suitable),
    get_source(InfoExtractor._match_valid_url) + '\n',
    'class LazyLoadSearchExtractor(LazyLoadExtractor):\n    pass\n',
    # needed for suitable() methods of Youtube extractor (see #28780)
    'from youtube_dl.utils import parse_qs, variadic\n',
]

ie_template = '''
class {name}({bases}):
    _VALID_URL = {valid_url!r}
    _module = '{module}'
'''

make_valid_template = '''
    @classmethod
    def _make_valid_url(cls):
        return {valid_url!r}
'''


def get_base_name(base):
    """Map the extractor base classes to their lazy-loading stand-ins."""
    if base is InfoExtractor:
        return 'LazyLoadExtractor'
    elif base is SearchInfoExtractor:
        return 'LazyLoadSearchExtractor'
    else:
        return base.__name__


def build_lazy_ie(ie, name):
    """Return the source of the lazy stand-in class for extractor ie."""
    valid_url = getattr(ie, '_VALID_URL', None)
    s = ie_template.format(
        name=name,
        bases=', '.join(map(get_base_name, ie.__bases__)),
        valid_url=valid_url,
        module=ie.__module__)
    if ie.suitable.__func__ is not InfoExtractor.suitable.__func__:
        s += '\n' + get_source(ie.suitable)
    if hasattr(ie, '_make_valid_url'):
        # search extractors
        s += make_valid_template.format(valid_url=ie._make_valid_url())
    return s


# order the classes so that every base class is defined before its
# subclasses, adding any required base classes missing from _ALL_CLASSES
classes = _ALL_CLASSES[:-1]
ordered_cls = []
while classes:
    for c in classes[:]:
        bases = set(c.__bases__) - set((object, InfoExtractor, SearchInfoExtractor))
        stop = False
        for b in bases:
            if b not in classes and b not in ordered_cls:
                if b.__name__ == 'GenericIE':
                    sys.exit()
                classes.insert(0, b)
                stop = True
        if stop:
            break
        if all(b in ordered_cls for b in bases):
            ordered_cls.append(c)
            classes.remove(c)
            break
ordered_cls.append(_ALL_CLASSES[-1])

names = []
for ie in ordered_cls:
    name = ie.__name__
    src = build_lazy_ie(ie, name)
    module_contents.append(src)
    if ie in _ALL_CLASSES:
        names.append(name)

module_contents.append(
    '_ALL_CLASSES = [{0}]'.format(', '.join(names)))

module_src = '\n'.join(module_contents) + '\n'

with io.open(lazy_extractors_filename, 'wt', encoding='utf-8') as f:
    f.write(module_src)

# work around JVM byte code module limit in Jython
if sys.platform.startswith('java') and sys.version_info[:2] == (2, 7):
    import subprocess
    from youtube_dl.compat import compat_subprocess_get_DEVNULL
    # if Python 2.7 is available, use it to compile the module for Jython
    try:
        subprocess.check_call(['python2.7', '-m', 'py_compile', lazy_extractors_filename], stdout=compat_subprocess_get_DEVNULL())
    except Exception:
        pass