My least favorite Rust type is
Range is a foundational type, with magic syntax but a simple definition:
Idx is typically an integral type, but you can make a Range of anything, which will become important later. Here’s a range of Unit:
(In case you were wondering, the Unit range does not contain Unit.)
What’s wrong with Range? This will be a critique on Rust’s own terms.
Range needs random-feeling borrows
Any Rust tutorial will cover borrowing, lifetimes, ownership, etc. You think you understand, then this:
You need the borrow because you can’t like, OWN a number, man!
Heh, no, it’s because, what if you wanted to make a
(try to guess what value
v has above)
Range<Vec> requires the borrow, so the vastly more common
Range<usize> etc. forces it as well.
Range needs random-feeling clones
We’re not done picking on the borrow checker. Range makes you clone even when there’s no ownership transfer:
Range could be
Copy, but some Ranges are also iterators and you might copy one by mistake:
so Ranges which are not used as iterators (many cannot be, like
Range<char>) pay the price, and so does any type which embeds a
This is abusing the borrow checker as a bad linter. Range undermines the story of lifetimes and ownership, making the borrow checker feel arbitrary.
Range is unsure when it is valid
Rust will happily construct a backwards Range, which ends before it starts:
Is this backwards range valid? Its
len() is 0, it
contains() nothing, it yields nothing as an iterator. But if you try to use it to index into a slice, you get a panic! So it looks valid but is primed to explode!
This is because - again - you can make a Range of anything. If you try to enforce that start <= end, you lose the ability to make a Range of non-comparable things. A Range of
dyn Error can’t be invalid, so a Range of int never gets to be.
A practical problem is writing correct bounds checks. For example, consider the
get_unchecked function on slice - it says “an out-of-bounds index is undefined behavior” but never defines what out of bounds means. So how does one even call this function safely?
Restated: a Range with start 0 and length 0 may be out of bounds. That’s wild.
Range hides a footgun
A regex engine (here’s one) often deals in ranges of char, for example
/[a-z]/. Should it use
No! The footgun is
/[\u10FFFF]/, which is the largest char.
Range<char> cannot represent this value, even though it has the bits to do so.
RangeInclusive a better choice? This uses an additional bool field to mean…stuff, and it needs to be separately checked in several places. This is a silly expensive representation, pushing
RangeInclusive<char> to 12 bytes even though it would fit in 8, with bits to spare. Not a good choice for a perf-sensitive regex algorithm.
A Recipe for Rearranging Range
The problem is Range is overloaded: it’s too flexible, it wants to be everything.
- It’s sometimes a set
- It’s sometimes an iterator
- It’s sometimes a slice index
- You can make silly Ranges out of anything
These goals are in tension, and it meets none of them well.
Perhaps it’s too late and we must live with this wart, but a recipe for a different approach:
Limit Range to
Copy + PartialOrdtypes. There may be occasional uses for
Range<BigNum>, but it’s not worth forcing borrowing for the common case.
Now that Range always knows how to compare its ends, enforce that
start <= endat construction time. For Ranges constructed from constants, this will be free. This avoids the “looks valid, actually explodes” problem and will unlock further optimizations:
Give Range an
iter()method, like other collections have. Now Range is not an Iterator, just like Vec is not an Iterator, and Range can be easily made Copy. This does mean writing
for n in (1..10).iter(), but Rust already requires that for collections, so it’s more consistent.
Now that Range is not an Iterator,
RangeInclusivecan drop its extra bool. It would simply be a pair of indexes, and could not be empty (that’s what Swift does).
Thanks for reading!