VCS Detection - libvcs.url.registry¶
Detect which VCS a URL belongs to (git, Mercurial, or Subversion) before
you shell out to any binary. The module-level registry checks a URL against
every parser (GitURL, HgURL, SvnURL) and returns each hit as a ParserMatch.
Most readers only need match(); registering rules of your own is the rarer
case covered further down.
Matching URLs¶
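Call registry.match() with a URL to get a list of ParserMatch hits, one per
parser that recognizes it. Pass is_explicit to narrow matching: True keeps
only rules where the URL names its VCS outright, like the git+ssh://
pip-style scheme, and False keeps only pattern-inference rules: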
>>> from libvcs.url.registry import registry, ParserMatch
>>> from libvcs.url.git import GitURL
>>> registry.match('git@invent.kde.org:plasma/plasma-sdk.git')
[ParserMatch(vcs='git', match=GitURL(...))]
>>> registry.match('git@invent.kde.org:plasma/plasma-sdk.git', is_explicit=True)
[ParserMatch(vcs='git', match=GitURL(...))]
>>> registry.match('git+ssh://git@invent.kde.org:plasma/plasma-sdk.git')
[ParserMatch(vcs='git', match=GitURL(...))]
>>> registry.match('git+ssh://git@invent.kde.org:plasma/plasma-sdk.git', is_explicit=False)
[]
>>> registry.match('git+ssh://git@invent.kde.org:plasma/plasma-sdk.git', is_explicit=True)
[ParserMatch(vcs='git', match=GitURL(...))]
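The three-way behavior of is_explicit (omitted, True, False) can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: ToyRule, TOY_RULES, and toy_match are hypothetical names, the two rules are invented for the example, and none of this mirrors libvcs's real rule set or internals.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ToyRule:
    """Hypothetical stand-in for a URL rule; not libvcs's Rule class."""

    label: str
    pattern: str
    is_explicit: bool  # True when the URL itself names the VCS


# Invented rules: an explicit pip-style scheme and an inferred ".git" suffix.
TOY_RULES = [
    ToyRule("pip-ssh", r"^git\+ssh://", is_explicit=True),
    ToyRule("https-suffix", r"^https?://.+\.git$", is_explicit=False),
]


def toy_match(url: str, is_explicit: Optional[bool] = None) -> List[str]:
    """Return labels of matching rules, optionally filtered by explicitness."""
    hits = []
    for rule in TOY_RULES:
        # None means "no filter"; True or False keeps only that kind of rule.
        if is_explicit is not None and rule.is_explicit != is_explicit:
            continue
        if re.search(rule.pattern, url):
            hits.append(rule.label)
    return hits
```

In this model, a git+ssh:// URL is found with is_explicit omitted or True and yields an empty list with is_explicit=False, because the only rule that recognizes it is marked explicit.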
Adding your own rules¶
For the rarer cases — organization shorthands, self-hosted forges — teach a
parser new URL shapes: subclass Rule for the
pattern, attach it to your own GitURL subclass
through RuleMap, and hand that parser to a fresh
VCSRegistry. Subclassing keeps the rules
local — registering on GitURL.rule_map directly would mutate the shared
class-level map and change GitURL for every caller in the process.
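The hazard with class-level maps is plain Python behavior and can be shown without libvcs. In the sketch below, BaseParser, LocalParser, and SharedParser are hypothetical classes, and a plain dict stands in for the rule map; it is a simplified illustration, not libvcs's RuleMap.

```python
class BaseParser:
    # Class attribute: a single dict object shared by every user of the class.
    rule_map = {"core": r"^git\+ssh://"}


class LocalParser(BaseParser):
    # A new dict built from the parent's entries; additions stay on this class.
    rule_map = {**BaseParser.rule_map, "shorthand": r"^github:"}


class SharedParser(BaseParser):
    # No override: rule_map resolves to the very same dict as BaseParser's.
    pass


# The subclass with its own map leaves the parent untouched.
local_only = (
    "shorthand" in LocalParser.rule_map and "shorthand" not in BaseParser.rule_map
)

# Mutating the inherited map in place changes the parent as well.
SharedParser.rule_map["leaky"] = r"^leak:"
leaked = "leaky" in BaseParser.rule_map
```

Both local_only and leaked end up True: the copied map isolates the new rule, while the in-place mutation is visible to everything that shares the parent class.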
The example below builds a registry that understands github:org/repo and
converts matches to cloneable URLs, leaving the module-level registry
untouched. An ambiguous SSH URL still matches every VCS in the new registry;
narrow it with is_explicit=True:
>>> import dataclasses
>>> from libvcs.url.base import Rule, RuleMap
>>> from libvcs.url.registry import ParserMatch, VCSRegistry, registry
>>> from libvcs.url.git import GitURL
>>> class GitHubPrefix(Rule):
...     label = 'gh-prefix'
...     description = 'Matches prefixes like github:org/repo'
...     pattern = r'^github:(?P<path>.*)$'
...     defaults = {
...         'hostname': 'github.com',
...         'scheme': 'https'
...     }
...     is_explicit = True  # We know it's git, not any other VCS
...     weight = 100
>>> @dataclasses.dataclass(repr=False)
... class MyGitURLParser(GitURL):
...     rule_map = RuleMap(
...         _rule_map={
...             **GitURL.rule_map._rule_map,
...             'github_prefix': GitHubPrefix,
...         }
...     )
>>> my_parsers: "ParserLazyMap" = {
...     "git": MyGitURLParser,
...     "hg": "libvcs.url.hg.HgURL",
...     "svn": "libvcs.url.svn.SvnURL",
... }
>>> vcs_matcher = VCSRegistry(parsers=my_parsers)
>>> registry.match('git@invent.kde.org:plasma/plasma-sdk.git')
[ParserMatch(vcs='git', match=GitURL(...))]
>>> vcs_matcher.match('git@invent.kde.org:plasma/plasma-sdk.git')
[ParserMatch(vcs='git', match=MyGitURLParser(...)),
 ParserMatch(vcs='hg', match=HgURL(...)),
 ParserMatch(vcs='svn', match=SvnURL(...))]
>>> vcs_matcher.match('git+ssh://git@invent.kde.org:plasma/plasma-sdk.git', is_explicit=True)
[ParserMatch(vcs='git', match=MyGitURLParser(...))]
>>> vcs_matcher.match('github:webpack/webpack', is_explicit=True)
[ParserMatch(vcs='git',
             match=MyGitURLParser(url=github:webpack/webpack,
                                  scheme=https,
                                  hostname=github.com,
                                  path=webpack/webpack,
                                  rule=gh-prefix))]
>>> git_match = vcs_matcher.match('github:webpack/webpack', is_explicit=True)[0].match
>>> git_match.to_url()
'https://github.com/webpack/webpack'
>>> git_match.scheme = None
>>> git_match.to_url()
'git@github.com:webpack/webpack'
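One way to picture how a shorthand becomes a cloneable URL is as a merge of the rule's defaults with the regex's named groups, followed by a choice of output form based on scheme. The sketch below is a simplified model, not libvcs's implementation: toy_parse and toy_to_url are hypothetical helpers, and both the rule that captured groups override defaults and the fallback to a git user for the scp-style form are assumptions.

```python
import re
from typing import Dict, Optional

# Pattern and defaults copied from the GitHubPrefix rule above.
PATTERN = re.compile(r"^github:(?P<path>.*)$")
DEFAULTS: Dict[str, Optional[str]] = {"hostname": "github.com", "scheme": "https"}


def toy_parse(url: str) -> Optional[Dict[str, Optional[str]]]:
    """Merge rule defaults with named groups; None if the URL does not match."""
    found = PATTERN.match(url)
    if found is None:
        return None
    fields = dict(DEFAULTS)
    # Assumption: values captured from the URL take precedence over defaults.
    fields.update({k: v for k, v in found.groupdict().items() if v is not None})
    return fields


def toy_to_url(fields: Dict[str, Optional[str]], user: str = "git") -> str:
    """Render scheme://host/path, or scp-style user@host:path without a scheme."""
    if fields.get("scheme") is not None:
        return f"{fields['scheme']}://{fields['hostname']}/{fields['path']}"
    return f"{user}@{fields['hostname']}:{fields['path']}"
```

With this model, parsing github:webpack/webpack fills in hostname and scheme from the defaults and path from the URL, and clearing scheme switches the rendered form from an HTTPS URL to the scp-style one.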
The same pattern handles infrastructure shorthands like KDE’s
kde:group/repository convention:
>>> import dataclasses
>>> from libvcs.url.base import Rule, RuleMap
>>> from libvcs.url.registry import ParserMatch, VCSRegistry
>>> from libvcs.url.git import GitURL
>>> class KDEPrefix(Rule):  # https://community.kde.org/Infrastructure/Git
...     label = 'kde-prefix'
...     description = 'Matches prefixes like kde:org/repo'
...     pattern = r'^kde:(?P<path>\w[^:]+)$'
...     defaults = {
...         'hostname': 'invent.kde.org',
...         'scheme': 'https'
...     }
...     is_explicit = True
...     weight = 100
>>> @dataclasses.dataclass(repr=False)
... class MyKDEURLParser(GitURL):
...     rule_map = RuleMap(
...         _rule_map={
...             **GitURL.rule_map._rule_map,
...             'kde_prefix': KDEPrefix,
...         }
...     )
>>> vcs_matcher = VCSRegistry(parsers={
...     "git": MyKDEURLParser,
...     "hg": "libvcs.url.hg.HgURL",
...     "svn": "libvcs.url.svn.SvnURL",
... })
>>> vcs_matcher.match('kde:frameworks/kirigami', is_explicit=True)
[ParserMatch(vcs='git',
             match=MyKDEURLParser(url=kde:frameworks/kirigami,
                                  scheme=https,
                                  hostname=invent.kde.org,
                                  path=frameworks/kirigami,
                                  rule=kde-prefix))]
>>> kde_match = vcs_matcher.match('kde:frameworks/kirigami', is_explicit=True)[0].match
>>> kde_match.to_url()
'https://invent.kde.org/frameworks/kirigami'
>>> kde_match.scheme = None
>>> kde_match.to_url()
'git@invent.kde.org:frameworks/kirigami'
API Reference¶
Registry of VCS URL Parsers for libvcs.
class libvcs.url.registry.ParserMatch¶
Bases: NamedTuple
Match or hit that suggests or identifies a VCS by URL Pattern.
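Because a match is a named tuple, callers can read its fields by name or unpack it positionally. The sketch below uses a stand-in type to show both styles; ToyMatch, vcs_names, and is_ambiguous are hypothetical, and the two field names are taken from the reprs shown in the examples above rather than from the class definition.

```python
from typing import List, NamedTuple


class ToyMatch(NamedTuple):
    """Stand-in with the two fields visible in ParserMatch's repr."""

    vcs: str
    match: object


def vcs_names(matches: List[ToyMatch]) -> List[str]:
    """Collect VCS names by positional unpacking of each hit."""
    return [vcs for vcs, _ in matches]


def is_ambiguous(matches: List[ToyMatch]) -> bool:
    """True when the hits point at more than one VCS (attribute access)."""
    return len({hit.vcs for hit in matches}) > 1
```

A caller could use a check like is_ambiguous to decide whether to retry a lookup with is_explicit=True before choosing a binary to run.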