Change my mind: Floating point should not be the default number representation in high-level programming languages.

Posted by KingSupernova@reddit | programming | View on Reddit | 57 comments

Floating point is useful because it's so fast. But the whole point of high-level languages is to trade performance for saving programmer time. A python program is never going to be as fast as one in C, but it will be a lot easier to write.

Floating point is already a tradeoff in this direction; integer arithmetic is even faster, but less useful, so floating point was a reasonable compromise... 80 years ago. Nowadays, hardware improvements have made performance a non-issue for a huge number of cases. I wouldn't be surprised if more than half of today's actively-developed software could take a 10x speed penalty on its arithmetic operations without breaking. (Just look at how slow most modern apps and webpages are; performance is clearly not a priority for most large companies.)

Floating point has severe drawbacks! The fact that you can't use it for exact math with non-integers is something most programmers have gotten used to, but in reality it's an absolutely *massive* cost. Millions of person-hours have been wasted on learning, debugging, and implementing workarounds for floating point's quirks; all of which would be unnecessary if programming languages supported arbitrary-precision arithmetic out-of-the-box.

Indeed, many programming languages already have separate types for integers vs. floating point. I don't think this division makes sense. The default number type in any high-level language should be one that roughly matches how numbers actually work in the real world, with floating point as a secondary alternative for performance-critical cases.

[-]

BenchEmbarrassed7316@reddit

I don't think this division makes sense.

Why?

[-]

CircumspectCapybara@reddit

Indeed, the distinction makes perfect sense.

Conceptually and logically, there's a clear difference between an integral number of things, and what a floating point number represents.

For example, indices are integer-valued. A fractional index wouldn't make any sense. So separating them out in the type system leads to a more expressive and safe type system in which things that are conceptually different don't get conflated.

[-]

BenchEmbarrassed7316@reddit

By the way, this also applies to unsigned numbers.

Just to remind you that a type is a set of possible values. If you can exclude values from that set that you don't need, that's a good idea.

[-]

Perfect_Field_4092@reddit

OP might mean that if I create a number in JavaScript, and do 1 + 2 it should be the same precision as 0.3 + 0.6 or any combination of numbers with/without decimals. One representation for all numbers without any float oddities. Basically a decimal.

I think this makes sense.

But I also would want to able to create an int specifically for discrete operations like counting and indexing arrays. But that would be my choice to narrow the type or improve performance.

[-]

eikenberry@reddit

I think the author mean a division operator that changes the type of the data from integer input to floating point output. Most languages don't do this and use something like / for division with the remainder dropped and % giving the remainder.

[-]

imachug@reddit

There's literally no reasonable way to implement real numbers other than with floats.

How are you going to represent sin(1)? Either you'll need to force the programmer to specify the exact precision they're aiming for, which they likely aren't qualified to do, or you're going to make it implicit/context-dependent, which is even worse.

Forget sine, how are you going to represent 1 / 7? It causes the same issue, i.e. the only way to represent it exactly is symbolically with fractions. Which complicates everything even further, because if you're trying to keep the values exact, you're suddenly using an unpredictable amount of memory. 1/1 + 1/2 + 1/3 + ... + 1/1000 alone has a 433-digit denominator, good luck doing anything useful with realistic random numbers.

[-]

w1n5t0nM1k3y@reddit

You can still have floating point, but just use base 10 floating points instead of base 2 floating points.

[-]

Valarauka_@reddit

How is that better?

[-]

w1n5t0nM1k3y@reddit

Because you can represent commonly used numbers like 0.1 exactly.

When dealing with financial calculations, which computers do a lot of, you don't have to worry about cases where you add up a bunch of prices with 2 decimal places and end up with a result with more than 2 decimal places.

[-]

hokanst@reddit

In financial systems a value like 0.1 dollar, is better represented by the integer value of 10 cents.

Using the smallest money unit (e.g. cents) avoids most of the decimal issues related to money, as you're dealing with integers rather than floats.

Do note that floats may still show up during things like interest rate calculations and currency conversions.

[-]

w1n5t0nM1k3y@reddit

So how does that work when you need to calculate the tax of 9.975% on a value?

Even simple tax calculations become unwieldly when you try to use integers for currency values.

Having a data type that works in base 10 decimals gets rid of a lot of issues and makes coding anything to do with money so much easier.

[-]

imachug@reddit

So how does that work when you need to calculate the tax of 9.975% on a value?

x * 9975 / 1000 or something like that.

Even simple tax calculations become unwieldly when you try to use integers for currency values.

Surprise, they're unwieldy even with built-in types. You need to round numbers at some point, which in a financial context forces you to follow very specific standards about what should be rounded to which precision where. You'll still have to write a ton of code for that, so it's not really easier.

[-]

w1n5t0nM1k3y@reddit

With .Net decimal data type I can just do

Math.Round(Price * (TaxPercent / 100), 2)

Also, for your calculation to work out properly you would need to do

x * 9975 / 100000

The "or something like that" is exactly why using a data type that deals in base 10 just makes things so much easier. You can think more intuitively about things when you don't have to do weird math to make things work. Using datatypes where calculations just work like using a calculator can be useful in so many ways.

[-]

imachug@reddit

I guess there's a slightly smaller chance of making a mistake like I did with Math.round(Price * 9.975 / 100, 2), but it's not really simpler than Price * 9975 / 100000. But you're also more likely to forget to round values, so I don't know about that.

[-]

w1n5t0nM1k3y@reddit

It also gets more complicated when your tax is a variable.

If your tax is 15% then using your method you would need to do

TaxAmount = Price * 15 / 100

Unless you always want to scale up your taxes to whatever you think the most number of decimals you could possibly have, so you could also do

TaxAmount = Price * 15000 / 100000

But then you have to hope that nobody decides to make a tax rate with 4 digits after the decimal.

[-]

Valarauka_@reddit

If you're doing financial calculations with machine floats you have bigger problems and probably shouldn't be employed.

[-]

w1n5t0nM1k3y@reddit

That's why I think we should have a different data type available. So that we have something that's natively supported that just works.

I never said that people should be using binary floats for financial calculations.

[-]

imachug@reddit

This is specific to financial calculations. Currency does not behave like real numbers do in the actual world: there's very specific rules for rounding, you're unlikely to use anything but basic arithmetic, etc. I'm not opposed to a separate type for financial stuff, but a) it's not a general-purpose type by any means, so I don't see why it has to be the default, b) it's much more complicated than just using base 10, e.g. you should use fixed-point numbers rather than floating-point ones.

[-]

imachug@reddit

How does this improve anything? 1 / 7 is unrepresentable in both decimal and binary. You are not solving any real problems, you're just making them less obvious.

[-]

w1n5t0nM1k3y@reddit

Sure, but common numbers like 0.1 can be represented exactly.

[-]

MayIHaveBaconPlease@reddit

It’s called fixed-point arithmetic and it’s been around forever

[-]

imachug@reddit

I've replied to this elsewhere in the thread. Fixed-point numbers are not scale-invariant and require even more careful analysis than floats to avoid unexpected precision loss. Hardly a good general-purpose choice.

[-]

bethebunny@reddit

There are well studied ways to represent these numbers, and symbolic algebra systems in languages like Mathematica, PARI/GP and Lean do so. Essentially all computations outside some very narrow research fields can be expressed as algebraic numbers extended with I and a couple transcendentals.

As for a 433 digit rational, python already very effectively makes this tradeoff for its integers: small enough ints are represented efficiently with machine-native types, and larger ints are represented as bigints with dynamic precision. In practice it's very rare to go outside the fast domain here, but you can if you want.

In this imagined case where you want to sum the series 1/n (not a common thing to want to do in my experience) would you rather have the result as a floating point that is a (very poor) approximation of the result, or get an exact result but in ~1 microsecond instead of ~100 nanoseconds? Certainly I can see the case for either, which suggests that OP has a point that programming languages should be exploring these number systems.

[-]

imachug@reddit

My point is that it's much harder to avoid slow-down due to long arithmetic with real numbers than it is with integers.

If you want to stay within the fast range of integers, you just need to avoid large numbers. Easy. If you want to satay within the fast range of fractions, you need to... avoid summing up numbers with different denominators? How are you even going to track that, let along intuit?

In the context of a general-purpose language, you have no idea what people are going to do with your types. Maybe they're calculating sizes of widgets, or predicting weather, or analyzing statistical data. Maybe they're drawing diagrams or gradients, or perhaps they're using sin as a random number generator. Floats aren't amazing, but they work well enough for most of these use cases (with an occasional exception in statistics). Fractions would make everything a minefield.

Also, I'm pretty sure you're overestimating the performance of long arithmetic. 10x doesn't seem right. Maybe it's correct for trivial calculations in Python? But not everything is trivial, and not everything is Python.

[-]

eikenberry@reddit

I believe the standard idea is to use 2 integers for each real number. One for each side of the decimal point. This is how money is dealt with and I don't see why it wouldn't work in general. Though IANAM, so apply salt as necessary.

[-]

imachug@reddit

This is called fixed-point. The problem is that, unlike currency, which typically has a well-defined range (say, 0.0001 to 1000000000000000), general-purpose data can be arbitrarily tiny and large. The point of floating-point types is to store numbers with the same precision regardless of scale.

[-]

programming-ModTeam@reddit

This post was removed for violating the "/r/programming is not a support forum" rule. Please see the side-bar for details.

[-]

DrXaos@reddit

For people who care about numbers (scientific programming), speed matters and they are OK with and prefer floats as default and know how it works.

For people who don’t care about numbers, they are also OK with floats.

The people who care about exact math with non integers and don’t use transcendental functions which makes it difficult to do in exact rationals are very few.

[-]

Willing_Value1396@reddit

I unironically don't get your point

[-]

ff3ale@reddit

I think OP is suggesting using a precise decimal number everywhere and taking the performance penalty, with optional outs using ints or floats

[-]

chillebekk@reddit

Same. The difference in performance between integer math and floating point math alone is enough to want to separate the two, and enough for any programmer to want to know when he is doing which operation.

[-]

KawaiiNeko-@reddit

Floating point is useful because it's so fast.

It's so fast only because CPUs have had decades of optimization baked into the hardware. Just saying.

[-]

Newmillstream@reddit

Yeah - Floating point math was famously slow, and still can be if you do a lot of it on the wrong platform, like an 8-bit entry level microcontroller. Even on powerful hardware, if efficiency or scale is important, avoiding needless floating point operations is good practice.

[-]

KerPop42@reddit

This sort of makes me want to republish a bunch of physical constants for binary precision, instead of translating the constants from decimal to binary. After all, they aren't more naturally in decimal.

For example, a number used all the time in orbital dynamics is the product of the Earth's mass and the universal gravitational constant. It's called the standard gravitational parameter. We actually know this value, mu, to a higher precision than the two values that go into it. It's equal to 398600.4418 +/- 0.008 km^3 /s^2 but like, that's assuming there isn't a repeated decimal that's closer to the middle of the bell curve and just impossible to represent in a concise number.

[-]

MaizeGlittering6163@reddit

It's the usual engineering tradeoff. Which of the following three can you do without, as you ain't getting all of them:

Speed - if you don't need to go fast, use arbitrary precision data types. Unbounded range and precision. (Well needs to fit into your address space).
Range - if you can fit exactly into n bits, use machine integers. Very fast and perfect accuracy.
Precision - if you only need n significant digits, use machine floats. Very fast and can deal with a huge range. But don't do accounting in floats

Floats are ideal for science and engineering as every real world measurement is imprecise. Nothing actually moves at exactly 37.3 meters a second you just can't measure more precisely than that on your ship based anemometer. The problem, of course, is that floats were designed in the 1970s for people doing things like integrating in the complex plane in the vicinity of a pole. Lesser mortals don't know that they don't know about floats and build accounting systems with them.

[-]

jhill515@reddit

Discrete Mathematics called and wants you to return your degree.

[-]

CircumspectCapybara@reddit

Most languages don't even given you arbitrary precision integers. At some point your high level type system's primitive types have to map to platform specific primitives.

[-]

jrochkind@reddit

I've thought that too for some time, especially for high-level "scripting" languages. The issue is really with *binary* floating point though, which is really what we generally mean when we say "floats".

IEEE 754 binary floating point should be available --

-- but source code decimal number literals and defaults for things converted from external input etc, should be a *decimal* (possibly still floating point!) representation, like java.math.BigDecimal, or ruby BigDecimal, or python `decimal.Decimal`

[-]

Solumin@reddit

Change my mind: Floating point should not be the default number representation in high-level programming languages.

It isn't.

all of which would be unnecessary if programming languages supported arbitrary-precision arithmetic out-of-the-box.

This is your actual thesis. "Arbitrary-Precision Arithmetic should be the default number representation in high-level programming languages" should be your title. Then you argue about why floats are a bad choice for non-integer math.

Which languages are "high-level" enough that they should be using this by default? Python and Ruby already provide this by default, and Go and Java have it available as separate libraries. JS/TS is the notable exception where the only number representation is a float.

[-]

w1n5t0nM1k3y@reddit

Honestly I agree. I like that .Net has a "Decimal" data type and I use it almost exclusively for non-integer numbers. It works just so much more predictably than binary floating point. Sure it's a little slower, but it's more than fast enough for just about every thing I do. I don't see why every high level language doesn't have an equivalent.

[-]

Rainbows4Blood@reddit

Decimal is such a beautiful datatype and I wish that .NET courses would opt teaching it as the default fractional datatype...

[-]

Leverkaas2516@reddit

I use a variety of languages. The only ones that use floating point by default are JavaScript and Matlab.

many programming languages already have separate types for integers vs. floating point. I don't think this division makes sense.

If you don't like floating point, and you don't think the programmer should be able to choose, then what do you want? Should all numbers be integers?

[-]

Rainbows4Blood@reddit

OP isn't talking about all numbers in their first point. OP is talking about that floating point is the default for fractional numbers which it is in almost every common language. float/double is the first fractional datatype you learn in C++,C#,Java, etc.

It is possible to represent fractionals without using floating point and most high level languages do have implementations for that. They are not the default though, because they are slower to operate with.

[-]

dtechnology@reddit

Arbitrary precision decimals

[-]

kitsnet@reddit

Change my mind: Floating point should not be the default number representation in high-level programming languages.

It has never been in any serious language.

Floating point is useful because it's so fast.

Floating point has only recently became that fast.

But the whole point of high-level languages is to trade performance for saving programmer time. A python program is never going to be as fast as one in C, but it will be a lot easier to write.

Actually, a typical Python program may as well be faster than a "naive" C program with matrix operations on floating points, because it uses highly optimized libraries.

[-]

tadfisher@reddit

/u/KingSupernova, I will change your mind right now:

Don't use JavaScript. No other language encodes integers as floating-point values, which is what I think you mean by "default number representation".

When can I expect my prize?

[-]

inputwtf@reddit

Not everything needs arbitrary precision. This is why we have different types, for different purposes. If you need arbitrary precision, use it. If not, there's no reason to take the performance hit. It's like using a 64 bit integer for a loop variable that only has 10 iterations.

[-]

regular_lamp@reddit

Floating point has severe drawbacks! The fact that you can't use it for exact math with non-integers is something most programmers have gotten used to, but in reality it's an absolutely *massive* cost.

Can you name a non variable length number type that you can use for "exact math"?

[-]