Pyrefly: Python type checker and language server in Rust
They all lack certain features vs basedpyright (what I was using) such as auto imports (Ty has experimental support), showing signature/doc when selecting autocomplete options (I think Ty does have this one) and some other features that I can't remember.
One feature that always existed in Jedi (and now also Zuban) is "goto declaration" in Python. It allowed me to goto the "import" instead of the original definition of a function/class which I'm surprised either isn't supported at all (pyright?) or would just do the same thing as "goto definition" (Ty), seems like an obvious oversight imho.
Edit: Also, I wish all these new tools gave as much importance to such IDE features as they do to type checking.
New typecheckers don't need to be perfect. They need to be good enough, easy to integrate and have low false positives. Sure, they will improve with time, but if it feels like a pain then no one will pick it up. Python users are famously averse to tools that slow down their dev cycles, even if it means better long term stability.
BasedPyright is popular because it comes built-in with Cursor and disappears into the background. I have a positive bias towards Astral figuring it out given their track record. But, none of these tools have reached the point of effortlessness just yet.
New typecheckers don't need to be perfect. They need to be good enough, easy to integrate and have low false positives. Sure, they will improve with time, but if it feels like a pain then no one will pick it up. Python users are famously averse to tools that slow down their dev cycles, even if it means better long term stability.
I really don't agree. Sure, they don't need to be perfect but also keep in mind many codebases have already standardized on something like (based)pyright or mypy. So there's a migration cost. If your analysis has a lot of false positives or misses a lot of what those type checkers catch then there's little incentive to migrate. Sure, ty and pyrefly are much faster, but at the end of the day speed is only one consideration for a type checker.
x = [3]
Should the type checker guess that the type of x is list[Any], or list[int], or list[Literal[3]], or something else? Libraries you depend on will pose more difficult versions of this question.
So it's not possible to expect the tool to be "perfect", whatever that means - usually people think of it as meaning it allows all code that they think is idiomatic and reasonable, and disallows all code that could raise a TypeError at runtime. You can only expect the tool to be reasonably good and perhaps to have an opinionated design that matches the way you would like to write Python.
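To make the ambiguity concrete, here is a minimal sketch of the three candidate types written as explicit annotations. The variable names are made up for illustration; the only point is that the runtime value is identical and the annotation alone decides what later code is allowed to do with the list.

```python
from typing import Any, Literal

# The same runtime value under three different declared types.
x_any: list[Any] = [3]         # anything may be appended later
x_int: list[int] = [3]         # only ints may be appended later
x_lit: list[Literal[3]] = [3]  # only the literal 3 may be appended later

# At runtime the annotations make no difference to the value.
same_value = x_any == x_int == x_lit == [3]

# A checker that inferred list[Literal[3]] for a bare `x = [3]` would reject
# an append like this one; a checker that inferred list[int] would accept it.
x_int.append(4)
```

Whichever type a checker picks for the unannotated form, some reasonable-looking program ends up rejected or some questionable one ends up accepted.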
Ignoring the question of why there's some random assignment to x in the first place, where are its type hints? Those were added starting with Python 3.5 via PEP 484 just over 10 years ago and have been added to since then. If our goal is maintainable code at scale, the first thing I'd expect of a type checker is for it to communicate with the user, stating that a) the variable in file foo.py on line 43 is missing a type hint and couldn't be guessed with confidence, and that it's just gonna guess it's a list[int]. For bonus points, the tool could run in interactive mode and ask the user directly. But even without a hint, assuming the variable actually is used later, how does it get used?
It seems like you'd just start with the most strict interpretation, list[Literal[3]], and relax it as it gets used. If it gets fed into a function that doesn't have hints and can't be introspected for whatever reason, relax it to list[int] and then further relax it to list[Any] as necessary. Then depending on the mode the user configured the tool to run in, print nothing, or a warning, or error out or ask the user what to do if running interactively. Some of the config options could be strict, tolerant, and loose. Or maybe enforcing, permissive, and loose. Whatever color we want this bike shed to be.
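A toy model of that "start strict, relax on demand" idea, just to pin down what is being proposed. The three levels and the `widen` helper are hypothetical and do not correspond to any real checker's API.

```python
# Candidate types for `x = [3]`, ordered from narrowest to widest.
# Both the ordering and the widen() helper are illustrative assumptions.
LEVELS = ["list[Literal[3]]", "list[int]", "list[Any]"]


def widen(current: str, required: str) -> str:
    """Return the wider of the current inference and what a use site needs."""
    return LEVELS[max(LEVELS.index(current), LEVELS.index(required))]


inferred = LEVELS[0]                     # x = [3] starts as list[Literal[3]]
inferred = widen(inferred, "list[int]")  # e.g. x.append(4) forces list[int]
after_append = inferred
inferred = widen(inferred, "list[Any]")  # e.g. passed to an unannotated function
after_untyped_call = inferred
```

A real checker would also have to decide how far a later use may reach back to change an earlier inference, which is one of the places where tools make different choices.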
More advanced tools let the user configure individual issues and their corresponding levels, but that may be considered too many options to give the user, overwhelming the user who then doesn't use the tool.
As far as perfect typing goes, I mean, yeah, ideally, after satisfying the type checker in strict mode, which means no types need to be guessed at, as all variables and other locations without type annotations that need them were reported, or even annotated interactively, or if we're real fancy, automatically with a comment. IMO programs should feel free to edit code directly (as a non-default option). That would mean the program could not throw an uncaught TypeError unexpectedly. That's a lot to ask of a dynamically, strongly typed language, but I'd settle for it never being utterly wrong.
What that means is if the type checker says the function is returning a string, but of the 200 calls to that function, the type checker throws an error and says ten of those callsites expect it to return a dict, all I'm saying is this hypothetical type checker had better be right about that. Nothing worse than losing hours rooting around in the code looking for something that isn't actually there (or an errant semicolon, but this isn't C)
Ignoring the question of why there's some random assignment to x in the first place, where are its type hints? Those were added starting with Python 3.5 via PEP 484 just over 10 years ago and have been added to since then.
The code is in a third party library which doesn't have comprehensive type hints, and perhaps has type expectations so complicated that they can't be expressed in the Python type system.
Even if you enforce type hints on every line in your internal code, you're going to be relying at some point on libraries which are reliable but poorly type hinted. If you're not using that vast ecosystem of libraries, Python probably wasn't the best choice.
With enough time, ty and pyrefly will approach perfection. If they're easy enough to use today, group 1 should be able to adopt them without any extra pain (some typechecker is better than no typechecker). This gives them momentum. In a couple of years, ty/pyrefly may finally be better than mypy/pyright. Then, Group 2 can start their ports.
This way, no one misses out. Group 2 still gets their perfect typechecker, just not immediately. But in that time, Group 1 is getting familiar with using typecheckers and their sheer size helps build institutional momentum towards typecheckers as an essential piece of any python dev flow.
If A. 'certain class of python problems are permanently solved by typecheckers' and B. 'every python user has some typechecker' become true, then that opens a lot of doors. Today, B is a harder problem than A. I'm guessing that compiled/JIT python will be the next frontier once python typing is solved. Wide typechecker adoption is a blocking requirement for that door to be opened.
If you're looking for a car analogy, I would suggest comparing Python type checking to installing speed cameras on the factory floor.
At this point you either did check the brakes, or did not. If not, you are out of luck if the brakes did in fact not work.
We're dedicated to providing a great IDE experience, though it does take some time. Please bother us on github / our discord if you have features you want - bug reports / community asks are our biggest priority.
- auto import is implemented in Pyrefly: it uses your pyrefly.toml project structure or falls back to your VSCode workspace (up to the first 2500 files). we're happy to fix it for your situation if you want to provide a reproduction
- signature/doc when selecting autocomplete options is a known bug [here](https://github.com/facebook/pyrefly/issues/1090)
- go to declaration: I've created an issue for that [here](https://github.com/facebook/pyrefly/issues/1291), it should be quick.
- speed: by far the biggest issue. your problem is likely related to [this](https://github.com/facebook/pyrefly/issues/360) but we need more information to speed it up. we're happy to work with you to improve this if you're willing to provide a project structure
def fn(x: str):
    if x is None:
        x = "123"  # pyright flags that as unreachable code, pyrefly does not
Autocomplete for modules also doesn't work for me yet:
import os
os.  # I'll get `ABC`, `Any`, `AnyStr`, `AnyStr_co`, `BinaryIO`, ...
Looking forward to having a fast language server for Python though, pyright is way too slow for large projects.
Any nullable type has to be unwrapped before accessing the value, or again, won't compile.
There's no way to get an actual null reference, afaik. Variables always have some value, possibly None.
(not sure what happens if you set a reference to null from C - a crash, probably?)
We're planning on adding unreachable code diagnostics soon (github issue [here](https://github.com/facebook/pyrefly/issues/1292)). These come for free with Pyright so we don't want to regress features.
I'm happy to help diagnose/fix your autocomplete issue: it should work on modules. If you want to provide details here, on [discord](https://discord.gg/Cf7mFQtW7W), or as a Github issue (github/discord preferred) we'll fix it for you + anyone else with the problem.
Looks like none of these new type checkers are ready yet.
I used to find this kind of tooling explosion exhausting (and back then with JS it truly was), but generally speaking it's a good sign that the community is hungry to push things forward. Choice is good, as long as we keep the Unix spirit alive: small, composable tools that play nicely together and can be interchanged.
Anything whatever the FE team feels like using, and the less I know about it, the better.
* uv: project management, formatting
* Pyright: type checking
* Pylint: linting (this is probably optional though I would strongly recommend it). Ruff is an option but I don't think it is quite as comprehensive yet.
There are alternatives for those tools but they are pretty clearly the best options at the moment. There's nothing in uv's league, and the only alternative in Pyright's league is BasedPyright. Hopefully Ty and Pyrefly will be good options in future but I don't think they're ready for production use just yet.
So which is it that you want, to just reach for one tool or have tools that have specific design goals and then have to reach for many tools in a full workflow?
FWIW Astral's long term goal appears to be to just reach for one tool, hence why you can now do "uv format".
The problem with the python tooling is no one can get it right. There aren't clear winners for a lot of the tooling.
There is a reason us old timers mostly wait on the sidelines until the dust settles.
We have seen this movie already too many times.
- zuban
- ty (from ruff team)
- pyrefly
One year ago, we had none of them, only slow options.
https://github.com/microsoft/pyright/issues/1739
I think EAFP is a very unfortunate and ill-advised practice.
They want you to not write the idiomatic Python:
try:
    foo = bar["baz"]["qux"]
    ...
except KeyError:
    ...
…and instead write the non-idiomatic version:
if "baz" in bar and "qux" in bar["baz"]:
    foo = bar["baz"]["qux"]
    ...
else:
    ...
If this were a linter then I would accept that it is going to be opinionated. But this is not a linter, it’s a type checker. Their opinions about EAFP are irrelevant. That’s idiomatic Python.
They don't always choose the same options, so some Python code may type check in one type checker and not in another.
Yes this is a dumb situation but that's how it is. So Pyright has to make a choice here, and they chose the most sensible option.
You're free to disagree of course.
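For reference, a minimal runnable version of the two styles being argued about. The `bar` value and the `None` fallback are assumptions made up for illustration. With a plain `dict` both functions behave the same at runtime, so the disagreement is about which form a type checker can reason about, not about what the code does.

```python
from typing import Any

# Hypothetical nested data, standing in for `bar` in the snippets above.
bar: dict[str, Any] = {"baz": {"qux": 42}}


def eafp(data: dict[str, Any]) -> Any:
    """Easier to ask forgiveness: try the lookup, handle the failure."""
    try:
        return data["baz"]["qux"]
    except KeyError:
        return None


def lbyl(data: dict[str, Any]) -> Any:
    """Look before you leap: check membership before each lookup."""
    if "baz" in data and "qux" in data["baz"]:
        return data["baz"]["qux"]
    return None
```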
All I know is it is much more strict about stuff than pylance was.
Also a me problem!
It is interesting that nobody was writing these tools in C or in C++. There are obvious ergonomic reasons, but perhaps also it matters that Rust cares a lot more about types than either of those languages.
The author of Zuban started writing it back in 2020 or 2021, so it took him more than 4 years to complete it. And he is the author of Jedi, so he had prior experience already.
Zuban seems to have a bunch of scary "I'm not sure if this is correct" unsafe blocks, which to me would be a red flag. I mean, it's better that there's a comment expressing the doubt, but my experience is that if you're not sure whether it's correct, it's probably not correct.
For anyone reaching for unsafe, in many cases there is an existing safe API (split_at_mut comes to mind). For others, using zerocopy or bytemuck instead of unsafe is a good idea too.
None of that is to say "never write unsafe", unsafe existing is pretty much one of the reasons for Rust to be :)
For example in crates/zuban_python/src/file/diagnostics.rs:
TODO this unsafe feels very wrong, because a bit lower we might modify the complex/ points.
or crates/zuban_python/src/database.rs:
Points are guarded by specific logic and if they are overwritten by something that shouldn't it should not be that tragic.
I saw nothing where I was like "ZOMG this is definitely busted" but I definitely did not get the robust "Oh, I see now why this is correct" that I like from a good unsafe rationale comment, and these aren't tiny things like the small unsafe bit twiddling transmutes which are probably either actually correct or in any case will do what you expected at compile time and so any surprises are priced in without a rationale text.
This one's easier to explain. People interested in tooling for a specific language probably want to write that tooling in that language (hence pip, poetry, mypy, jedi, etc). Normally that would be the end of it, if Python wasn't 10-100x slower than a natively compiled language. And going from Python to Rust is an order of magnitude easier than going from Python to C or Python to C++, because the compiler is so good at identifying silly mistakes. Rust is just a friendlier language.
All of them had some big issue that prevented it from getting mainstream. Either it was slow, or didn’t work with existing workflow, or had complex configuration, or something that prevented gradual adoption.
uv is universally praised as the second coming of Christ in the Python world (and for a good reason). So no, I doubt there will be something else. Not only do you need to be better than uv, you also need to have community momentum.
Pydantic is probably the problem here, but it is what it is.
Already there is some support in Pydantic for native dataclasses: https://docs.pydantic.dev/latest/concepts/dataclasses/
[1] https://docs.python.org/3/library/dataclasses.html#dataclass...
[2] https://docs.python.org/3/library/dataclasses.html#dataclass...
What I'm trying to point out is that these features exist in core Python and yet the type system they built can't express it. By contrast, TypeScript is designed in such a way that you can implement everything yourself without having to write "custom typescript checker plugins".
[1] https://docs.python.org/3/library/typing.html#typing.datacla...
Aside from basic inheritance and complex nested types, the pydantic ‘TypeAdapter’ is awesome for simply validating native dataclasses. It’s a little slow, but everything pydantic is =)
foo = eval(result)
It can’t know what you’re going to load until it actually does it.
Things which lean heavily into metaprogramming, typically ORMs or things like Pydantic, fall into that category. I can’t hold that against the type system.
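A small sketch of what that looks like in practice. It swaps in `ast.literal_eval` for `eval` so it is safe to run, and the input string is invented for illustration. Statically the loaded value is just `Any`; a runtime check is what gives the following code a concrete type to work with.

```python
import ast
from typing import Any

# Hypothetical input; in real code this would arrive at runtime.
result = "{'name': 'pyrefly', 'stars': 3}"

# The checker cannot see inside the string, so this is Any as far as it knows.
loaded: Any = ast.literal_eval(result)

# An isinstance check narrows Any to dict for the branch that follows.
if isinstance(loaded, dict):
    keys = sorted(loaded)
else:
    keys = []
```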
I can’t hold that against the type system.
I think we should. Dataclasses have existed in Python for an extremely long time, and yet the type system doesn't support defining your own similar classes. Kwargs have also existed forever, but they forgot to support that and had to add TypedDict much later. And it still doesn't properly support optional fields. There's a lot of stuff like this in the language which is unbelievably frustrating, because for some reason they implemented the syntax before implementing a typechecker. Everything has been hacked in ever since. I consider python's type system to be a lost cause, just hoping for someone to make the Typescript equivalent for Python.
I don't think you understand what Pydantic brings to the table or why people use it. It has lots more to do with serialization, complex validation and data mapping.
Nowadays I'm finding myself using Zed a lot more, so maybe the story will be that all these nice Rust based tools become baked in giving it super powers for Python.
They are designed for code that is more or less fully typed, as opposed to PyCharm which cobbles together a bunch of heuristics to try to make sense out of untyped code. An admirable quest, but not one I'm personally interested in.
And their insistence on only supporting this approach drove my entire team away from using PyCharm.
(From shallowly observing notifications on the 20+ typehint related issues I'm subscribed to, they seem to have kinda turned around and are finally working toward fully supporting the python type system - possibly by integrating with one of the third-party type-checkers)
pyrefly, ty, pyright, and basedpyright
All of them will complain 2-4x more about your code than PyCharm. I had more than 300 typing errors when I first opened my 20k LOC project in pyright that I wrote in Pycharm.
PyCharm works great when your code is not annotated. It infers types very well. But it won't complain in a lot of cases when your code is annotated.
Related reddit post https://old.reddit.com/r/Python/comments/1ajnikt/to_pycharm_...
You can take a look yourself if you think some of them are wrong: https://github.com/python/typeshed/tree/main/stdlib/asyncio
The advantage is type hints can be fixed without needing to release a new version of Python. The disadvantage is there's a lot of code in the standard library that doesn't really consider how to represent it with type hints, and it can be really tricky and not always possible.
I'm surprised to see so many people moving to pyrefly, ty, and zuban so quickly. I was going to wait until some time in 2026 to see which has matured to the point I find useful, I guess some users really find existing solutions actually unworkable.
You can take a look yourself if you think some of them are wrong: https://github.com/python/typeshed/tree/main/stdlib/asyncio
Hmm. Presumably mypy and pyrefly use the same ones, but then I don't understand why pyrefly is complaining and mypy isn't:
ERROR Argument `Future[list[BaseException | Any]]` is not assignable to parameter `future` with type `Awaitable[tuple[Any]]` in function `asyncio.events.AbstractEventLoop.run_until_complete` [bad-argument-type]
   --> xxx/xxx.py:513:33
    |
513 |         loop.run_until_complete(tasks.gather(*x, return_exceptions=True))
    |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The definition in typeshed is this:
def run_until_complete(self, future: _AwaitableLike[_T]) -> _T: ...
…where is it even pulling "tuple[Any]" from…
(tbh this is rather insignificant compared to the noise from external packages without type annotations, or with incomplete ones… pyrefly's inferences about the existence of attributes and their types are extremely conservative…)
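A minimal sketch of the call pattern in that error, using made-up coroutines `ok` and `boom`. It only demonstrates the runtime behavior: called with `*args` and `return_exceptions=True`, `asyncio.gather` resolves to a list that mixes results and exception objects. One guess, not confirmed anywhere in this thread, is that the `tuple[Any]` in the message comes from typeshed's fixed-arity `gather` overloads, which are declared to return tuples.

```python
from __future__ import annotations

import asyncio


async def ok() -> int:
    return 1


async def boom() -> int:
    raise ValueError("boom")


async def main() -> list[int | BaseException]:
    coros = [ok(), boom()]
    # With return_exceptions=True, raised exceptions come back as values
    # in the result instead of propagating out of gather().
    return await asyncio.gather(*coros, return_exceptions=True)


results = asyncio.run(main())
```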
Hmm. Presumably mypy and pyrefly use the same ones, but then I don't understand why pyrefly is complaining and mypy isn't: …where is it even pulling "tuple[Any]" from…
Perhaps it's a bug in pyrefly, perhaps mypy or pyrefly is able to infer something about the types that the other isn't. I would strongly suggest checking their issues page, and if not seeing a report already report it yourself.
While there is an ongoing push to document the typing spec more consistently (https://typing.python.org/), it does not actually cover all the things you can infer from type hints. Different type checkers have made different design choices from mypy and will produce different errors even in the most ideal situation.
This is one of the reasons why I am waiting for these libraries to mature a little more.
it does not actually cover what rules you can check or infer from type hints
Indeed this is the cause of maybe 30% of the warnings I'm seeing… items being added to lists or dicts in some place (or something else making it infer a container type), and pyrefly then refusing other types getting added elsewhere. The most "egregious" one I saw was:
def something(x: list[str]) -> str:
    foo = []
    for i in x:
        foo.append(i)
    return "".join(foo)
Where it complains:
ERROR Argument `str` is not assignable to parameter `object` with type `LiteralString` in function `list.append` [bad-argument-type]
 --> test.py:4:20
4 |         foo.append(i)
Edit: now that I have posted it, this might actually be a bug in the .join type annotation… or something
Edit: nah, it's the loop (and the LiteralString variant of .join is just the first overload listed in the type hints)… https://github.com/facebook/pyrefly/issues/1107 - this is kinda important, I don't think I can use it before this is improved :/
foo: list[str] = []
If so, this is a type checking design choice:
* What can I infer from an empty collection declaration?
* What do I allow further down the code to update inferences further up the code?
I don't know Pyrefly's philosophy here, but I assume it's guided by opinionated coding guidelines inside Meta, not what is perhaps the easiest for users to understand.
As far as their philosophy goes, it's an open issue they're working on, so their philosophy seems to agree this particular pattern should work :)
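For completeness, here is the function from the report with the explicit `list[str]` annotation applied. This is a sketch of the workaround only, not a claim about how Pyrefly will resolve the linked issue. Runtime behavior is the same with or without the annotation.

```python
def something(x: list[str]) -> str:
    # Declaring the element type up front means the checker does not have
    # to infer it from the later append() and join() calls.
    foo: list[str] = []
    for i in x:
        foo.append(i)
    return "".join(foo)


joined = something(["a", "b", "c"])
```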
Today, the “: list[str]” is 11 wasted characters and it’s not as aesthetically pleasing. Tomorrow, you do some refactor and your inferred list[str] becomes a list[int] without you realizing it… I’m sure that sounds silly in this toy example, but just imagine it’s a much more complex piece of code. The fact of the matter is, you declared foo as a list[Any] and you’re passing it to a function that takes an Iterable[str]: it ought to complain at you! Type hints are optional in Python, and you can tell linters to suppress warnings in a scope with a comment too.
That being said, perhaps these more permissive rules might be useful in a legacy codebase where no type annotations exist yet.
Really, it’d be extra nice if they made this behavior configurable, but I’m probably asking for too much there. What’s next, a transpiler? Dogs and cats living together?!
Today, the “: list[str]” is 11 wasted characters and it’s not as aesthetically pleasing. Tomorrow, you do some refactor and your inferred list[str] becomes a list[int] without you realizing it… I’m sure that sounds silly in this toy example, but just imagine it’s a much more complex piece of code.
Hmm. I'm looking at a codebase that is still in a lot of "early flux", where one day I might be looking at a "list[VirtualRouter]" but the next day it's a "list[VirtualRouterSpec]". It's already gone through several refactors and it kinda felt like the type hints were pretty much spot on in terms of effort-benefit. It's not a legacy codebase; it has reasonably good type hint coverage, but it's focused on type hinting interfaces (a few Protocol in there), classes and functions. The type hinting inline in actual code is limited.
I do understand your perspective, but tbh to me it feels like if I went that far I might rather not choose Python to begin with…
HN discussion of above: https://news.ycombinator.com/item?id=44107655
How Well Do New Python Type Checkers Conform? A Deep Dive into Ty, Pyrefly, and Zuban (2025-08-29) https://sinon.github.io/future-python-type-checkers/
I've been leaning on pyright + django-stubs, but wondering if I'm missing something better with fewer gaps and pain points.
We've seen a lot of people have success with the mypy plugin + django-stubs.
Full out-of-the-box support is being actively worked on in Pyrefly: we will have specialized django enum support in the next release and we expect real experimental support by the end of the year. At that time we'll likely post a blog post to announce it [here](https://pyrefly.org/blog/).