Git URL Parser - libvcs.parse.git#

For git, aka git(1).

This module is an all-in-one parser and validator for Git URLs.

libvcs.parse.git.DEFAULT_MATCHERS#

Core regular expressions. These are patterns understood by git(1).

See also: https://git-scm.com/docs/git-clone#URLS

class libvcs.parse.git.GitBaseURL(url)#

Git repository location. Parses URLs on initialization.

Examples

>>> GitBaseURL(url='https://github.com/vcs-python/libvcs.git')
GitBaseURL(url=https://github.com/vcs-python/libvcs.git,
        scheme=https,
        hostname=github.com,
        path=vcs-python/libvcs,
        suffix=.git,
        matcher=core-git-https)
>>> myrepo = GitBaseURL(url='https://github.com/myproject/myrepo.git')
>>> myrepo.hostname
'github.com'
>>> myrepo.path
'myproject/myrepo'
>>> GitBaseURL(url='git@github.com:vcs-python/libvcs.git')
GitBaseURL(url=git@github.com:vcs-python/libvcs.git,
        user=git,
        hostname=github.com,
        path=vcs-python/libvcs,
        suffix=.git,
        matcher=core-git-scp)
Parameters:

url (str) –

matcher#

name of the Matcher

Type:

str

hostname#
classmethod is_valid(url, is_explicit=None)#

Whether URL is compatible with VCS or not.

Examples

>>> GitBaseURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
True
>>> GitBaseURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
True
>>> GitBaseURL.is_valid(url='notaurl')
False

Unambiguous VCS detection

Sometimes you may want to match a VCS exclusively, without any chance of ambiguity, e.g. in order to detect outright which VCS is being used.

>>> GitBaseURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
False

In this case, see the examples for GitPipURL.is_valid() or GitURL.is_valid().

Parameters:
  • url (str) –

  • is_explicit (Optional[bool]) –

Return type:

bool
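
The effect of is_explicit can be pictured with a small stand-alone sketch. Everything in it (SketchMatcher, SKETCH_MATCHERS, the two simplified patterns, this is_valid function) is hypothetical and written for illustration. It is not the libvcs implementation, only one plausible reading of the behavior in the examples above: when is_explicit is given, only matchers carrying that flag are consulted.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SketchMatcher:
    """Hypothetical stand-in for a matcher entry (not the libvcs class)."""

    label: str
    pattern: re.Pattern
    is_explicit: bool = False


# Simplified, illustrative patterns. Neither identifies git unambiguously,
# so both keep the default is_explicit=False.
SKETCH_MATCHERS = (
    SketchMatcher("sketch-https", re.compile(r"^https?://[^/:]+/\w[^:.]*(\.git)?$")),
    SketchMatcher("sketch-scp", re.compile(r"^(\w+@)?[^/:]+:\w[^:.]+(\.git)?$")),
)


def is_valid(
    url: str,
    matchers: Sequence[SketchMatcher] = SKETCH_MATCHERS,
    is_explicit: Optional[bool] = None,
) -> bool:
    """Return True if any eligible matcher's pattern matches the URL."""
    # With is_explicit=None every matcher is eligible; otherwise only the
    # matchers whose flag equals the requested value are tried.
    candidates = (
        m for m in matchers if is_explicit is None or m.is_explicit == is_explicit
    )
    return any(m.pattern.search(url) for m in candidates)
```

Under these assumptions an scp-style URL is valid in the default mode but not with is_explicit=True, because no explicit matcher is present to vouch for it.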

matcher#
matchers#
path#
port#
scheme#
suffix#
to_url()#

Return a git(1)-compatible URL. Can be used with git clone.

Examples

>>> git_url = GitBaseURL(url='git@github.com:vcs-python/libvcs.git')
>>> git_url
GitBaseURL(url=git@github.com:vcs-python/libvcs.git,
        user=git,
        hostname=github.com,
        path=vcs-python/libvcs,
        suffix=.git,
        matcher=core-git-scp)

Switch repo libvcs -> vcspull:

>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'git@github.com:vcs-python/vcspull.git'

Switch them to gitlab:

>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'git@gitlab.com:vcs-python/vcspull.git'
Return type:

str
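
The edit-then-export workflow can be sketched without libvcs. The SketchURL class below is hypothetical, and its formatting rules (scheme-style versus scp-style output) are an assumption made for illustration, not a description of how GitBaseURL.to_url() is written.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchURL:
    """Hypothetical container for already-parsed URL parts."""

    hostname: str
    path: str
    user: Optional[str] = None
    scheme: Optional[str] = None
    suffix: Optional[str] = None

    def to_url(self) -> str:
        """Reassemble the parts into a URL string."""
        suffix = self.suffix or ""
        if self.scheme is not None:
            # Scheme-style URL, e.g. https://host/path.git
            return f"{self.scheme}://{self.hostname}/{self.path}{suffix}"
        # scp-style URL, e.g. user@host:path.git
        user = f"{self.user}@" if self.user else ""
        return f"{user}{self.hostname}:{self.path}{suffix}"


repo = SketchURL(hostname="github.com", path="vcs-python/libvcs", user="git", suffix=".git")
repo.path = "vcs-python/vcspull"  # switch repository
repo.hostname = "gitlab.com"      # switch host
exported = repo.to_url()
```

Because the parts are stored separately, changing one attribute and exporting again is enough to retarget the URL.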

url#
user#
class libvcs.parse.git.GitPipURL(url)#

Supports pip git URLs.

Parameters:

url (str) –

hostname#
classmethod is_valid(url, is_explicit=None)#

Whether URL is compatible with Pip Git’s VCS URL pattern or not.

Examples

Will not match normal git(1) URLs; use GitURL.is_valid() for that.

>>> GitPipURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
False
>>> GitPipURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
False

Pip-style URLs:

>>> GitPipURL.is_valid(url='git+https://github.com/vcs-python/libvcs.git')
True
>>> GitPipURL.is_valid(url='git+ssh://git@github.com:vcs-python/libvcs.git')
True
>>> GitPipURL.is_valid(url='notaurl')
False

Explicit VCS detection

Pip-style URLs are prefixed with the VCS name, so their matchers can unambiguously narrow down the type of VCS:

>>> GitPipURL.is_valid(
...     url='git+ssh://git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True
Parameters:
  • url (str) –

  • is_explicit (Optional[bool]) –

Return type:

bool
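
Why the pip prefix makes detection unambiguous can be shown with the documented RE_PIP_SCHEME pattern alone. This is a minimal sketch: the pattern text is taken from the RE_PIP_SCHEME constant on this page, while PIP_SCHEME and has_pip_git_scheme are hypothetical names, and a scheme check by itself is weaker than what GitPipURL.is_valid() performs.

```python
import re

# Pattern text as documented for libvcs.parse.git.RE_PIP_SCHEME.
RE_PIP_SCHEME = r"""
    (?P<scheme>
      (
        git\+ssh|
        git\+https|
        git\+http|
        git\+file
      )
    )
"""

# re.VERBOSE ignores the layout whitespace inside the pattern.
PIP_SCHEME = re.compile(rf"^{RE_PIP_SCHEME}://", re.VERBOSE)


def has_pip_git_scheme(url: str) -> bool:
    """Return True if the URL starts with a git+<transport>:// prefix."""
    return PIP_SCHEME.match(url) is not None
```

A plain https or scp-style URL carries no such prefix, which is why it cannot be attributed to git on the scheme alone.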

matcher#
matchers#
path#
port#
rev#
scheme#
suffix#
to_url()#

Exports a pip-compliant URL.

Examples

>>> git_url = GitPipURL(
...     url='git+ssh://git@bitbucket.example.com:7999/PROJ/repo.git'
... )
>>> git_url
GitPipURL(url=git+ssh://git@bitbucket.example.com:7999/PROJ/repo.git,
        scheme=git+ssh,
        user=git,
        hostname=bitbucket.example.com,
        port=7999,
        path=PROJ/repo,
        suffix=.git,
        matcher=pip-url)
>>> git_url.path = 'libvcs/vcspull'
>>> git_url.to_url()
'git+ssh://bitbucket.example.com/libvcs/vcspull.git'

It also accepts revisions, e.g. branch, tag, ref:

>>> git_url = GitPipURL(
...     url='git+https://github.com/vcs-python/libvcs.git@v0.10.0'
... )
>>> git_url
GitPipURL(url=git+https://github.com/vcs-python/libvcs.git@v0.10.0,
        scheme=git+https,
        hostname=github.com,
        path=vcs-python/libvcs,
        suffix=.git,
        matcher=pip-url,
        rev=v0.10.0)
>>> git_url.path = 'libvcs/vcspull'
>>> git_url.to_url()
'git+https://github.com/libvcs/vcspull.git@v0.10.0'
Return type:

str
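
How a pip URL splits into scheme, host, path, suffix and revision can be sketched by combining the regex constants documented on this page. The pattern texts come from RE_PIP_SCHEME, RE_PATH, RE_SUFFIX and RE_PIP_REV. The way they are joined here, and the names SKETCH_PIP_URL and parse_pip_url, are assumptions for illustration and may differ from the registered pip-url matcher.

```python
import re
from typing import Dict

RE_PIP_SCHEME = r"""
    (?P<scheme>
      (
        git\+ssh|
        git\+https|
        git\+http|
        git\+file
      )
    )
"""
RE_PATH = r"""
    ((?P<user>\w+)@)?
    (?P<hostname>([^/:]+))
    (:(?P<port>\d{1,5}))?
    (?P<separator>[:,/])?
    (?P<path>
      (\w[^:.@]*)  # cut the path at . to negate .git, @ from pip
    )?
"""
RE_SUFFIX = r"(?P<suffix>\.git)"
RE_PIP_REV = r"""
    (@(?P<rev>.*))
"""

# Assumed composition: scheme, "://", location, optional suffix, optional rev.
SKETCH_PIP_URL = re.compile(
    rf"""
    ^{RE_PIP_SCHEME}
    ://
    {RE_PATH}
    {RE_SUFFIX}?
    {RE_PIP_REV}?
    $
    """,
    re.VERBOSE,
)


def parse_pip_url(url: str) -> Dict[str, str]:
    """Return the named groups that matched, dropping the unset ones."""
    match = SKETCH_PIP_URL.match(url)
    if match is None:
        raise ValueError(f"not a pip-style git URL: {url!r}")
    return {key: value for key, value in match.groupdict().items() if value is not None}


parts = parse_pip_url("git+https://github.com/vcs-python/libvcs.git@v0.10.0")
```

Because the path group stops at "." and "@", the .git suffix and the trailing revision land in their own groups instead of being swallowed by the path.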

url#
user#
class libvcs.parse.git.GitURL(url)#

Batteries included URL Parser. Supports git(1) and pip URLs.

Ancestors (MRO)

This URL parser inherits methods and attributes from the following parsers:

  • GitPipURL

  • GitBaseURL

Parameters:

url (str) –

hostname#
classmethod is_valid(url, is_explicit=None)#

Whether URL is compatible with the included Git URL matchers or not.

Examples

Will match normal git(1) URLs:

>>> GitURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
True

Pip-style URLs:

>>> GitURL.is_valid(url='git+https://github.com/vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='git+ssh://git@github.com:vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='notaurl')
False

Explicit VCS detection

Pip-style URLs are prefixed with the VCS name, so their matchers can unambiguously narrow down the type of VCS:

>>> GitURL.is_valid(
...     url='git+ssh://git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True

Below, although the host is GitHub, that does not necessarily mean the URL itself is conclusively a git URL (i.e. the pattern is too lax):

>>> GitURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
False

You could create a GitHub matcher that considers github.com hostnames to be exclusively git:

>>> GitHubMatcher = Matcher(
...     # Since github.com exclusively serves git repos, make explicit
...     label='gh-matcher',
...     description='Matches github.com https URLs, exact VCS match',
...     pattern=re.compile(
...         rf'''
...         ^(?P<scheme>ssh)?
...         ((?P<user>\w+)@)?
...         (?P<hostname>(github\.com)+):
...         (?P<path>(\w[^:]+))
...         {RE_SUFFIX}?
...         ''',
...         re.VERBOSE,
...     ),
...     is_explicit=True,
...     pattern_defaults={
...         'hostname': 'github.com'
...     }
... )
>>> GitURL.matchers.register(GitHubMatcher)
>>> GitURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True
>>> GitURL(url='git@github.com:vcs-python/libvcs.git').matcher
'gh-matcher'

This is just us cleaning up:

>>> GitURL.matchers.unregister('gh-matcher')
>>> GitURL(url='git@github.com:vcs-python/libvcs.git').matcher
'core-git-scp'
Parameters:
  • url (str) –

  • is_explicit (Optional[bool]) –

Return type:

bool
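
The register and unregister round trip above can be pictured with a tiny label-keyed registry. SketchMatcher, SketchRegistry, labels_matching and both patterns are hypothetical names written for this sketch. The real matchers object may store, order and prioritize entries differently.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class SketchMatcher(NamedTuple):
    """Hypothetical matcher record: label, compiled pattern, explicit flag."""

    label: str
    pattern: re.Pattern
    is_explicit: bool = False


class SketchRegistry:
    """Hypothetical registry keyed by matcher label."""

    def __init__(self) -> None:
        self._matchers: Dict[str, SketchMatcher] = {}

    def register(self, matcher: SketchMatcher) -> None:
        if matcher.label in self._matchers:
            raise ValueError(f"label already registered: {matcher.label}")
        self._matchers[matcher.label] = matcher

    def unregister(self, label: str) -> None:
        # Removing an unknown label is treated as a no-op in this sketch.
        self._matchers.pop(label, None)

    def labels_matching(self, url: str, is_explicit: Optional[bool] = None) -> List[str]:
        """Labels of eligible matchers whose pattern matches the URL."""
        return [
            m.label
            for m in self._matchers.values()
            if (is_explicit is None or m.is_explicit == is_explicit)
            and m.pattern.search(url)
        ]


registry = SketchRegistry()
# A lax scp-style pattern: it matches, but cannot prove the URL is git.
registry.register(
    SketchMatcher("sketch-scp", re.compile(r"^(\w+@)?[^/:]+:\w[^:.]+(\.git)?$"))
)
# A host-specific pattern that is marked explicit.
GITHUB = SketchMatcher(
    "sketch-github",
    re.compile(r"^(\w+@)?github\.com:\w[^:.]+(\.git)?$"),
    is_explicit=True,
)

url = "git@github.com:vcs-python/libvcs.git"
before = registry.labels_matching(url, is_explicit=True)  # no explicit matcher yet
registry.register(GITHUB)
after = registry.labels_matching(url, is_explicit=True)   # the new matcher vouches for it
registry.unregister("sketch-github")                      # clean up again
```

Keying the registry by label is what makes the clean-up step a one-liner: the caller only needs the label, not the matcher object.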

matcher#
matchers#
path#
port#
rev#
scheme#
suffix#
to_url()#

Return a git(1)-compatible URL. Can be used with git clone.

Examples

SSH style URL:

>>> git_url = GitURL(url='git@github.com:vcs-python/libvcs')
>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'git@github.com:vcs-python/vcspull'

HTTPS URL:

>>> git_url = GitURL(url='https://github.com/vcs-python/libvcs.git')
>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'https://github.com/vcs-python/vcspull.git'

Switch them to gitlab:

>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'https://gitlab.com/vcs-python/vcspull.git'

Pip style URL, thanks to this class implementing GitPipURL:

>>> git_url = GitURL(url='git+ssh://git@github.com/vcs-python/libvcs')
>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'git+ssh://gitlab.com/vcs-python/libvcs'
Return type:

str

url#
user#
libvcs.parse.git.NPM_DEFAULT_MATCHERS = []#

NPM-style git URLs.

Git URL pattern (from docs.npmjs.com):

<protocol>://[<user>[:<password>]@]<hostname>[:<port>][:][/]<path>[#<commit-ish> | #semver:<semver>]

Examples of NPM-style git URLs (from docs.npmjs.com):

ssh://git@github.com:npm/cli.git#v1.0.27
git+ssh://git@github.com:npm/cli#semver:^5.0
git+https://isaacs@github.com/npm/cli.git
git://github.com/npm/cli.git#v1.0.27
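
The trailing fragment in the NPM pattern above can be separated with plain string operations. This is a minimal sketch: split_npm_fragment is a hypothetical helper, not part of libvcs (NPM_DEFAULT_MATCHERS is documented as an empty list), and it only distinguishes the two fragment forms named in the pattern.

```python
from typing import Optional, Tuple


def split_npm_fragment(url: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split an NPM-style git URL into (base_url, commit_ish, semver_range)."""
    base, sep, fragment = url.partition("#")
    if not sep:
        # No fragment at all: neither a commit-ish nor a semver range.
        return base, None, None
    if fragment.startswith("semver:"):
        return base, None, fragment[len("semver:"):]
    return base, fragment, None
```

For instance, the second example URL above yields a semver range of ^5.0 and no commit-ish, while the first yields the commit-ish v1.0.27.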

libvcs.parse.git.PIP_DEFAULT_MATCHERS#

pip-style git URLs.

Examples of PIP-style git URLs (via pip.pypa.io):

MyProject @ git+ssh://git.example.com/MyProject
MyProject @ git+file:///home/user/projects/MyProject
MyProject @ git+https://git.example.com/MyProject

Refs (via pip.pypa.io):

MyProject @ git+https://git.example.com/MyProject.git@master
MyProject @ git+https://git.example.com/MyProject.git@v1.0
MyProject @ git+https://git.example.com/MyProject.git@da39a3ee5e6b4b0d3255bfef95601890afd80709
MyProject @ git+https://git.example.com/MyProject.git@refs/pull/123/head
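
The requirement lines above bundle three things: a project name, a VCS URL, and an optional ref. A rough way to take them apart with string partitioning is sketched below. split_requirement is a hypothetical helper, not a libvcs or pip function, and it assumes the ref is whatever follows the first "@" after the host portion.

```python
from typing import Optional, Tuple


def split_requirement(requirement: str) -> Tuple[Optional[str], str, Optional[str]]:
    """Split 'Name @ url@ref' into (name, url_without_ref, ref)."""
    name, sep, url = requirement.partition(" @ ")
    if not sep:
        # No "Name @ " prefix: treat the whole string as the URL.
        name, url = None, requirement
    scheme, _, rest = url.partition("://")
    # Split host from path first so a "user@host" login is never
    # mistaken for a ref.
    host, slash, path = rest.partition("/")
    path, at, ref = path.partition("@")
    return name, f"{scheme}://{host}{slash}{path}", ref if at else None
```

Splitting off the host before looking for "@" matters for URLs such as git+ssh://git@host/project, where the first "@" belongs to the login rather than to a ref.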

libvcs.parse.git.RE_PATH#

    ((?P<user>\w+)@)?
    (?P<hostname>([^/:]+))
    (:(?P<port>\d{1,5}))?
    (?P<separator>[:,/])?
    (?P<path>
      (\w[^:.@]*)  # cut the path at . to negate .git, @ from pip
    )?

libvcs.parse.git.RE_PIP_REV#

    (@(?P<rev>.*))

libvcs.parse.git.RE_PIP_SCHEME#

    (?P<scheme>
      (
        git\+ssh|
        git\+https|
        git\+http|
        git\+file
      )
    )

libvcs.parse.git.RE_PIP_SCP_SCHEME#

    (?P<scheme>
      (
        git\+ssh|
        git\+file
      )
    )

libvcs.parse.git.RE_SCHEME#

    (?P<scheme>
      (
        http|https
      )
    )

libvcs.parse.git.RE_SUFFIX = (?P<suffix>\.git)#

libvcs.parse.git.SCP_REGEX#

    # Optional user, e.g. 'git@'
    ((?P<user>\w+)@)?
    # Server, e.g. 'github.com'.
    (?P<hostname>([^/:]+)):
    # The server-side path. e.g. 'user/project.git'. Must start with an
    # alphanumeric character so as not to be confusable with Windows paths
    # like 'C:/foo/bar' or 'C:\foo\bar'.
    (?P<path>(\w[^:.]+))
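
These constants are written in verbose-regex style, so they need re.VERBOSE when compiled. The sketch below adapts SCP_REGEX (with shortened comments) and RE_SUFFIX from this page. The ^...$ anchoring and the name scp_pattern are assumptions made for illustration, not the exact pattern libvcs registers.

```python
import re

# Adapted from SCP_REGEX above. Under re.VERBOSE the layout whitespace and
# the "#" comments inside the pattern are ignored by the regex engine.
SCP_REGEX = r"""
    # Optional user, e.g. 'git@'
    ((?P<user>\w+)@)?
    # Server, e.g. 'github.com'.
    (?P<hostname>([^/:]+)):
    # The server-side path; must start with a word character.
    (?P<path>(\w[^:.]+))
    """
RE_SUFFIX = r"(?P<suffix>\.git)"

scp_pattern = re.compile(rf"^{SCP_REGEX}{RE_SUFFIX}?$", re.VERBOSE)

match = scp_pattern.match("git@github.com:vcs-python/libvcs.git")
parts = match.groupdict() if match else {}

# Windows-style paths are rejected because "/" and "\" are not word characters.
windows_match = scp_pattern.match("C:/foo/bar")
```

The requirement that the path start with a word character is what keeps a drive letter followed by a colon from being read as a hostname.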