Astral's first paid offering announced - pyx, a private package registry and PyPI frontend
Posted by tomster10010@reddit | Python | View on Reddit | 28 comments
https://astral.sh/pyx
https://x.com/charliermarsh/status/1955695947716985241
Looks like this is how they're going to try to make a profit? Seems pretty not evil, though I haven't had the problems they're solving.
Czerwona@reddit
I feel like most of these problems are already solved by pixi, which uses uv under the hood for dependencies that aren't pure Python
Trick_Brain7050@reddit
Making a wheel is easy, making a conda recipe sucks asssss
emaniac0@reddit
I was thinking the same thing reading this, I don't regularly have the issues they listed.
When I did more ML stuff I remember hearing conda was better for packages that expected different CUDA versions, so maybe pyx would solve that problem too? I'm interested to hear from others that do have these problems.
nonamenomonet@reddit
So PyPI can only handle files that are in Python and Cython, where Conda can work with executables in other languages (OpenJDK and CUDA, for example).
ThatsALovelyShirt@reddit
I've definitely seen non-Python precompiled runtime libraries (cuDNN, cuBLAS, MKL, etc.) in wheels served by PyPI. I have also seen (and personally made) wheels which contain TypeScript/JavaScript, image files, and all sorts of other things.
Pretty sure you can put whatever you want into a wheel file.
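That's roughly right: a wheel is just a zip archive with a `.dist-info` directory, and pip extracts whatever files it finds. A minimal sketch, assuming a hypothetical package named `demo` (a real, installable wheel would also need a `RECORD` file, omitted here):

```python
import zipfile

# Build a wheel-shaped zip containing non-Python payloads.
# "demo", the JS snippet, and the ELF stub are all made up for illustration.
with zipfile.ZipFile("demo-0.1.0-py3-none-any.whl", "w") as whl:
    whl.writestr("demo/__init__.py", "")
    whl.writestr("demo/bundle.js", "console.log('not python');")  # JavaScript
    whl.writestr("demo/libfoo.so", b"\x7fELF stub")               # binary blob
    whl.writestr(
        "demo-0.1.0.dist-info/METADATA",
        "Metadata-Version: 2.1\nName: demo\nVersion: 0.1.0\n",
    )
    whl.writestr(
        "demo-0.1.0.dist-info/WHEEL",
        "Wheel-Version: 1.0\nRoot-Is-Purelib: true\n",
    )

# The archive happily holds the .js and .so files alongside the Python code.
with zipfile.ZipFile("demo-0.1.0-py3-none-any.whl") as whl:
    print(whl.namelist())
```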
nonamenomonet@reddit
Sorry, I may have misspoken. Pip cannot work with executables like the JVM.
ThatsALovelyShirt@reddit
Doesn't pip just pull and extract whl files to the active environment's libs/bin folder?
I never really liked Conda because of its mess of a package repository. Like five different flavors of the same package all named basically the same, some in conda-forge, some not, some supporting the Python version you need, others not, some completely abandoned, and so on.
At least with PyPI there's just a single repository.
nonamenomonet@reddit
I think you’re correct, but I’m not an expert on Python packages or dependency architecture.
james_pic@reddit
I feel like they might have shot themselves in the foot a bit, since uv fixes much of the brokenness of pip's multi-repo support, which is often a key reason organisations end up with complex repo setups.
Fearless-Elephant-81@reddit
People who train large code models may benefit extremely from this.
ichunddu9@reddit
How? Installation is not the problem on a cluster for competent teams
Fearless-Elephant-81@reddit
You would be surprised how difficult it is to get versions properly running for all the nightly builds at once across different hardware.
But my motive was more along the lines of faster install speeds from PyPI. Downloading and installing repos for evals, and potentially even in the training loop, could see faster times, I guess, if I read the description correctly.
ijkxyz@reddit
I don't get it, are people installing the full environment from scratch, on every single machine, every single time they want to run something?
LightShadow@reddit
Yes.
Not everything is brought up all at the same time and new nodes need to reach parity with their computing brothers. Things come and go in the cluster and it's a nightmare keeping them all up to date.
Fearless-Elephant-81@reddit
Generally, the eval procedure for SWE-bench involves cloning a repo (at a particular commit) and running all the tests. So you have to clone and install for literally each datapoint.
ijkxyz@reddit
Apparently the SWE-bench dataset contains just under 2,300 issues from 12 repos. Couldn't you, in theory, pre-build a Docker image for each of the test repos that has it already cloned, along with a pre-populated uv cache, since all of the ~192 relevant commit IDs are known ahead of time? You could then reuse this image until the dataset changes.
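A minimal sketch of that idea, assuming the repo is a uv project with a lockfile; the repo URL and commit SHA below are placeholders, and in practice you'd bake one image per eval repo:

```dockerfile
# Hypothetical pre-built eval image: repo cloned at a known commit,
# dependencies resolved once at build time so every eval run starts warm.
FROM python:3.12-slim
RUN apt-get update && apt-get install -y git && pip install uv
# Placeholder repo and commit; substitute one per SWE-bench repo.
RUN git clone https://github.com/example/project /work \
 && cd /work && git checkout <commit-sha>
WORKDIR /work
# Resolve and install once; the environment and uv cache ship in the image.
RUN uv sync
```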
Fearless-Elephant-81@reddit
Spot on! But I imagine the scale is far, far higher during training and in what massive companies do internally. That's where the challenge comes in. You can't (I imagine) pre-warm in the millions.
ijkxyz@reddit
Thanks! I think I get it. So basically, the benefit of pyx here is that it provides a fairly easy and flexible way to speed up a process like this (by simply speeding up the installations), without the need for more specialized optimizations (like the example with pre-built images).
Fearless-Elephant-81@reddit
I would say when you cannot pre-build the image, or rather don't have the luxury to. Pre-building will always be faster, because there's no build haha.
Rodot@reddit
You'd be surprised when you need all matching cuda versions and compilers across 10 packages and everything needs to be arm64 because you're running on a GH cluster with shitty module scripts
Spent all day yesterday with a national lab research consultant and an Nvidia developer trying to get our environment setup and working
suedepaid@reddit
I love this — great monetization approach and definitely solves enterprise pain-points.
Longjumpingfish0403@reddit
If Pyx is geared towards solving issues with Python package management, it could appeal to teams dealing with diverse dependency setups and looking for speed boosts in their CI/CD pipelines. It might be worth exploring how Pyx could integrate with Docker or other containerization tools for smoother deployment workflows.
betazoid_one@reddit
Larger startups may use this. This will basically replace Cloudsmith, RIP
Jmc_da_boss@reddit
I mean, tbf if this can replace cloudsmith then the company was not really large enough to be using cloudsmith anyways. Their value prop is supporting ALL the registries
tecedu@reddit
Keep it Python-only, add CVE monitoring and proper RBAC user access, and you've got a customer in me.
It's so hard to find an enterprise version which isn't set up with bonkers licensing or useless features
revonrat@reddit
Absolutely. Or some home-grown abomination maintained by a team that just got RIF'ed last quarter.
tecedu@reddit
There are already so many abominations; all of the enterprise package registries are fucked because they want to target everything rather than just one
tomster10010@reddit (OP)
I also think it's crazy that they want it to be pronounced as the acronym rather than as "pix" or "pikes"