URL Parser - libvcs.url¶
We all love urllib.parse, but what about VCS systems?
Also, with completions and typings in demand, what of all these factories? They make for good Python code, but how do we get editor support and the nice satisfaction of types snapping together?
If there were a type-friendly structure, like our own abstract base class or a dataclass from dataclasses, that was also extensible to patterns and groupings, maybe we could strike a perfect balance.
If we could make it ready-to-go out of the box, but also have framework-like extensibility, it could satisfy the niche.
Validate and detect VCS URLs¶
libvcs.url.git.GitURL.is_valid()
>>> from libvcs.url.git import GitURL
>>> GitURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
True
>>> from libvcs.url.git import GitURL
>>> GitURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
True
libvcs.url.hg.HgURL.is_valid()
>>> from libvcs.url.hg import HgURL
>>> HgURL.is_valid(url='https://hg.mozilla.org/mozilla-central/mozilla-central')
True
>>> from libvcs.url.hg import HgURL
>>> HgURL.is_valid(url='[email protected]:MyProject/project')
True
libvcs.url.svn.SvnURL.is_valid()
>>> from libvcs.url.svn import SvnURL
>>> SvnURL.is_valid(
... url='https://svn.project.org/project-central/project-central')
True
>>> from libvcs.url.svn import SvnURL
>>> SvnURL.is_valid(url='[email protected]:MyProject/project')
True
Parse VCS URLs¶
Compare to urllib.parse.ParseResult
>>> from libvcs.url.git import GitURL
>>> GitURL(url='git@github.com:vcs-python/libvcs.git')
GitURL(url=git@github.com:vcs-python/libvcs.git,
user=git,
hostname=github.com,
path=vcs-python/libvcs,
suffix=.git,
rule=core-git-scp)
>>> from libvcs.url.hg import HgURL
>>> HgURL(
... url="http://hugin.hg.sourceforge.net:8000/hgroot/hugin/hugin")
HgURL(url=http://hugin.hg.sourceforge.net:8000/hgroot/hugin/hugin,
scheme=http,
hostname=hugin.hg.sourceforge.net,
port=8000,
path=hgroot/hugin/hugin,
rule=core-hg)
>>> from libvcs.url.svn import SvnURL
>>> SvnURL(
... url='svn+ssh://svn.debian.org/svn/aliothproj/path/in/project/repository')
SvnURL(url=svn+ssh://svn.debian.org/svn/aliothproj/path/in/project/repository,
scheme=svn+ssh,
hostname=svn.debian.org,
path=svn/aliothproj/path/in/project/repository,
rule=pip-url)
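For comparison, here is a short sketch of what the standard library does with the same kinds of input. It uses only urllib.parse; the scp-style URL is the one from the GitURL example above. Because that URL has no `scheme://` prefix, urlparse cannot pick out a user or hostname and leaves the whole string in the path.

```python
from urllib.parse import urlparse

# A regular URL with a scheme parses into the familiar components.
https = urlparse("https://github.com/vcs-python/libvcs.git")
print(https.scheme, https.hostname, https.path)

# An scp-style git URL has no "scheme://" prefix, so urlparse finds
# no scheme and no hostname; the whole string ends up in .path.
scp = urlparse("git@github.com:vcs-python/libvcs.git")
print(repr(scp.scheme), scp.hostname, scp.path)
```

This gap is what the VCS-specific parsers are meant to cover.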
Export usable URLs¶
pip knows what a certain URL string means, but git clone won’t.
For example, pip install git+https://github.com/django/django.git@3.2 works great with pip:
$ pip install git+https://github.com/django/django.git@3.2
...
Successfully installed Django-3.2
but git clone can’t use that:
$ git clone git+https://github.com/django/django.git@3.2 # Fail
...
Cloning into 'django.git@3.2'...
git: 'remote-git+https' is not a git command. See 'git --help'.
It needs something like this:
$ git clone https://github.com/django/django.git --branch 3.2
But before we get there, we don’t know if we want a URL yet. We return a structure, e.g. GitURL.
- Common result primitives across VCS, e.g. GitURL. Compare to a urllib.parse.ParseResult in urlparse. This is where fun can happen, or you can just parse a URL.
- Allow mutating or replacing parts of a parsed VCS URL (e.g. just the hostname)
- Support common cases with popular VCS systems
- Support extending parsing for users needing to do so
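The idea of returning a structure first and exporting a usable URL later can be sketched with the standard library alone. Everything named here (`ParsedGitURL`, `PIP_GIT_RE`, `from_pip_url`, `to_git_clone_args`) is hypothetical and is not libvcs's API; the regex covers only simple pip-style git URLs of the kind shown above.

```python
import dataclasses
import re
from typing import List, Optional

# Hypothetical pattern: git+<scheme>://<hostname>/<path>[@<rev>]
PIP_GIT_RE = re.compile(
    r"^git\+(?P<scheme>https?|ssh)://(?P<hostname>[^/]+)/"
    r"(?P<path>[^@]+)(?:@(?P<rev>.+))?$"
)


@dataclasses.dataclass(frozen=True)
class ParsedGitURL:
    """Illustrative result structure, not the real GitURL."""

    scheme: str
    hostname: str
    path: str
    rev: Optional[str] = None

    @classmethod
    def from_pip_url(cls, url: str) -> "ParsedGitURL":
        match = PIP_GIT_RE.match(url)
        if match is None:
            raise ValueError(f"not a pip-style git URL: {url!r}")
        return cls(**match.groupdict())

    def to_git_clone_args(self) -> List[str]:
        # git clone wants a plain URL plus --branch, not "git+...@rev".
        args = ["git", "clone", f"{self.scheme}://{self.hostname}/{self.path}"]
        if self.rev is not None:
            args += ["--branch", self.rev]
        return args


parsed = ParsedGitURL.from_pip_url("git+https://github.com/django/django.git@3.2")
print(parsed.to_git_clone_args())

# Replace just the hostname while keeping every other field.
mirror = dataclasses.replace(parsed, hostname="git.example.com")
print(mirror.to_git_clone_args())
```

Keeping the parsed result as a frozen dataclass means a caller can swap one field with `dataclasses.replace` and only decide at the end which command-line form to export.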
Scope¶
Out of the box¶
The ambition for this is to build extendable parsers for package-like URLs, e.g.
Extendability¶
Patterns can be registered. Similar behavior exists in urlparse (undocumented).
Any formats not covered by the stock
Custom URLs
For orgs on a hosting site, e.g.:
- python:mypy -> git@github.com:python/mypy.git
- inkscape:inkscape -> git@gitlab.com:inkscape/inkscape.git
For out of domain trackers, e.g.
Direct to site:
- cb:vcs-python/libvcs -> https://codeberg.org/vcs-python/libvcs
- kde:plasma/plasma-sdk -> git@invent.kde.org:plasma/plasma-sdk.git
Aside: Note KDE’s git docs use of url.<base>.insteadOf and url.<base>.pushInsteadOf
Direct to site + org / group:
- gnome:gedit -> git@gitlab.gnome.org:GNOME/gedit.git
- openstack:openstack -> https://opendev.org/openstack/openstack.git
- mozilla:central -> https://hg.mozilla.org/mozilla-central/
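The shorthand mappings listed above can be sketched as a prefix table. This is only an illustration of the idea, not how libvcs registers patterns: `SHORTHAND_TEMPLATES` and `expand_shorthand` are hypothetical names, and the templates are copied from the examples in this section.

```python
# Hypothetical prefix table; each template comes from the examples above.
SHORTHAND_TEMPLATES = {
    "python": "git@github.com:python/{path}.git",
    "inkscape": "git@gitlab.com:inkscape/{path}.git",
    "kde": "git@invent.kde.org:{path}.git",
    "gnome": "git@gitlab.gnome.org:GNOME/{path}.git",
    "openstack": "https://opendev.org/openstack/{path}.git",
}


def expand_shorthand(url: str) -> str:
    """Expand 'prefix:path' using the table; return other input unchanged."""
    prefix, sep, path = url.partition(":")
    template = SHORTHAND_TEMPLATES.get(prefix)
    if not sep or template is None:
        return url
    return template.format(path=path)


print(expand_shorthand("python:mypy"))
print(expand_shorthand("kde:plasma/plasma-sdk"))
print(expand_shorthand("https://opendev.org/openstack/openstack.git"))
```

Full URLs pass through untouched because their text before the first colon (for example `https`) is not a registered prefix.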
From there, GitURL can be used downstream directly by other projects.
In our case, libvcs’s own Commands - libvcs.cmd and Sync - libvcs.sync, as well as a vcspull configuration, will be able to detect and accept various URL patterns.
Matchers: Defaults¶
When a match occurs, its defaults will fill in non-matched groups.
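A minimal sketch of that behavior, using plain `re` rather than libvcs's own matcher classes: optional named groups that did not take part in the match come back as `None` and are then filled from a defaults mapping. `RULE`, `DEFAULTS`, and `match_with_defaults` are hypothetical names chosen for this illustration.

```python
import re
from typing import Dict, Optional

# Hypothetical rule: the port and the ".git" suffix are optional groups.
RULE = re.compile(
    r"^(?P<scheme>https?)://(?P<hostname>[^/:]+)(?::(?P<port>\d+))?/"
    r"(?P<path>.+?)(?P<suffix>\.git)?$"
)

# Values used for groups that did not participate in the match.
DEFAULTS = {"suffix": ".git"}


def match_with_defaults(url: str) -> Optional[Dict[str, Optional[str]]]:
    match = RULE.match(url)
    if match is None:
        return None
    groups = match.groupdict()
    # Unmatched optional groups are None; fall back to the default if any.
    return {
        name: value if value is not None else DEFAULTS.get(name)
        for name, value in groups.items()
    }


print(match_with_defaults("https://github.com/vcs-python/libvcs"))
print(match_with_defaults("https://example.com:8443/a/b.git"))
```

Groups with no entry in the defaults mapping, such as the port here, simply stay `None`.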
Matchers: First wins¶
When registering new matchers, higher weights are checked first. The first one whose regex groups match the URL is picked.
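As a rough sketch of weight-ordered matching (an illustration with hypothetical names, not libvcs's registry), rules can be sorted by descending weight and tried in turn, returning as soon as one matches. `Rule`, `RULES`, and `first_match` are assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """Illustrative rule: a label, a compiled regex, and a weight."""

    label: str
    pattern: re.Pattern
    weight: int = 0


def first_match(url: str, rules: List[Rule]) -> Optional[Tuple[str, Dict[str, str]]]:
    # Higher weights are tried first; the first matching rule wins.
    for rule in sorted(rules, key=lambda r: r.weight, reverse=True):
        match = rule.pattern.match(url)
        if match is not None:
            return rule.label, match.groupdict()
    return None


RULES = [
    Rule(
        "generic-scp",
        re.compile(r"^(?P<user>[^@]+)@(?P<hostname>[^:]+):(?P<path>.+)$"),
        weight=0,
    ),
    Rule(
        "github-scp",
        re.compile(r"^git@github\.com:(?P<path>.+?)(?:\.git)?$"),
        weight=10,
    ),
]

print(first_match("git@github.com:vcs-python/libvcs.git", RULES))
print(first_match("git@gitlab.com:inkscape/inkscape.git", RULES))
```

Both rules match the GitHub URL, but the more specific one carries the higher weight and is therefore checked, and picked, first.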
Explore¶
- Git URL Parser - libvcs.url.git
- SVN URL Parser - libvcs.url.svn
- Mercurial URL Parser - libvcs.url.hg
- Framework: Add and extend URL parsers - libvcs.url.base
- VCS Detection - libvcs.url.registry
- Constants - libvcs.url.constants