How It Works: SQL Parsing of Number(s), Numeric and Float Conversions
First published on MSDN on Nov 01, 2010

SQL Server documentation (and documentation elsewhere) has always indicated that float values are approximate and that comparing or converting them can be problematic. A recent customer case required me to dig into the single/double precision floating point formats as well as the SQL Server NUMERIC format and its handling.

The application was designed on SQL Server 6.x, when BIGINT did not exist.  In order to handle large integer values the application used a SQL Server float data type and never stored anything other than zeros (0's) in the decimal positions.   The developers knew that float could be imprecise and tried to account for it when building their WHERE clauses using >= style comparisons.   However, they later found that when the numbers got large, such as 576460752305, and the literal carried .000000 (6 zeros) vs. .00000 (5 zeros) in the decimal positions, they could encounter rounding behaviors.

Resolution: In this particular case only whole integer values were being used, so the column should be changed to BIGINT.  BIGINT requires 8 bytes of storage, just like float.  If decimal precision were required the column should be converted to NUMERIC.

Several versions ago the SQL Server parser was changed to use the NUMERIC data type for all numeric literals.   Prior to this, values were treated within domain ranges and types.  For example:

1 = INTEGER
1. = FLOAT – Scale of 0
1.0 = FLOAT – Scale of 1

The problem with the old design was that it led to unwanted conversions in plans (often resulting in table scans).   Using NUMERIC we have a number capable of handling large values with exact precision, which can be manipulated cleanly and avoids the unwanted conversions.

I want to take a specific look at a numeric to float conversion because float is imprecise and floating point arithmetic can fool you.

The sqltypes.h file included with the sql.h header outlines the SQL_NUMERIC_STRUCT structure.

typedef struct tagSQL_NUMERIC_STRUCT
{
    SQLCHAR  precision;
    SQLCHAR  scale;
    SQLCHAR  sign;
    SQLCHAR  val[16];      // SQL_MAX_NUMERIC_LEN
} SQL_NUMERIC_STRUCT;

Simplified, a NUMERIC is a 128 bit integer value.  Take the number, remove the decimal point (ignore it for the 128 bit integer view), store the resulting integer in the val member, and keep the decimal position (the power of 10) in the scale.
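To make that layout concrete, here is a minimal sketch (my addition, not part of the original post, assuming an x86/x64 little-endian Windows build) that packs 576654928035.000000 into the structure: remove the decimal point to get the integer 576654928035000000, record scale = 6 and sign = 1 (positive), and copy the integer into val least-significant byte first, as the ODBC documentation describes.

#include <windows.h>
#include <sql.h>
#include <sqltypes.h>
#include <stdio.h>
#include <string.h>

int main()
{
    SQL_NUMERIC_STRUCT num = {};

    num.precision = 18;                           // total number of digits in 576654928035000000
    num.scale     = 6;                            // digits to the right of the decimal point
    num.sign      = 1;                            // 1 = positive, 0 = negative

    ULONGLONG ullVal = 576654928035000000ui64;    // 576654928035.000000 with the decimal point removed
    memcpy(num.val, &ullVal, sizeof(ullVal));     // on x86/x64 this is already least-significant byte first

    // Read the low 8 bytes back out to confirm the round trip.
    ULONGLONG ullCheck = 0;
    memcpy(&ullCheck, num.val, sizeof(ullCheck));

    printf("precision=%u scale=%d sign=%u value=%llu\n",
           (unsigned)num.precision, (int)num.scale, (unsigned)num.sign, ullCheck);
    return 0;
}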

Here is a small program that shows how a NUMERIC is converted to a float by the SQL Server components.  I used 4 ULONG values for the 128 bit value, and the program shows how the double is accumulated and converted.

#include "stdafx.h"

#include "windows.h"

#include "math.h"

#include "assert.h"

const ULONGLONG x_dwlBaseUI4      = 0x100000000ui64;

const ULONG          x_ulBase10           = 10;

int main(int argc, char* argv[])

{

ULONG         m_l128Val[4] = {};

ULONGLONG     ullVal =      576654928035000000;  //     576654928035.000000

BYTE          bScale =      6;

assert(0xFFFFFFFFFFFFFFFF > ullVal);            // Use only 2 slots for the test

m_l128Val[0] = ullVal & 0xFFFFFFFF;

m_l128Val[1] = (ullVal >> 32) & 0xFFFFFFFF;

int           iulData     = 0;

double        dVal        = 0.0;

ULONGLONG     ullAccum    = 0;

for (iulData = 2; iulData >= 0; iulData--)

dVal = m_l128Val[iulData] + dVal * x_dwlBaseUI4;

dVal /= pow ((double) x_ulBase10, bScale);

//            Testing Sanity Check

for (iulData = 2; iulData >= 0; iulData--)
ullAccum = m_l128Val[iulData] + ullAccum * x_dwlBaseUI4;
assert(ullAccum == ullVal);

dVal = ullAccum / (pow ((double) x_ulBase10, bScale));

return 0;

}

The value in the sample is specifically picked because the floating point arithmetic results in a small change from what appears, to the reader, to be an integer value.  You can break this down to a very simple form: take the large integer that fits in a UINT64 and apply the decimal place based on the power of 10 and the scale.  Notice that the floating point arithmetic results in the addition of 0.00012 to the readable value.

ULONGLONG f6 = 576654928035000000ui64;

double dRetVal1 = f6 / pow((double) 10.0, 6);
double dRetVal2 = f6 / (double) 1000000;
double dRetVal3 = f6 / 1000000.0;

dRetVal1   576654928035.00012   double
dRetVal2   576654928035.00012   double
dRetVal3   576654928035.00012   double

Note: When testing I see differences between a 32-bit project running under WOW64 and a pure x64 project.

SQL Server stores float values as 8-byte, double precision values.   You can see the parser and data storage in action with a simple SELECT using a cast.  The additional 1 at the end of the binary storage represents the 0.00012.

select cast( cast(576654928035.00000 as float) as varbinary)      -- 0x4260C869FD146000
select cast( cast(576654928035.000000 as float) as varbinary)     -- 0x4260C869FD146001

Floating point arithmetic is outlined well on various internet sites so I am not going to repeat the details here.   You can research more using "ANSI/IEEE Standard 754-1985, Standard for Binary Floating-Point Arithmetic" if you want to try to reverse the binary to the floating point value.  I prefer to use '.formats' in the Windows Debugger and let it do the heavy lifting for me.  It is difficult to predict the behavior without running the algorithm for a very broad set of values.

I wanted to know why, when I called atof with 6 digits to the right of the decimal, it returned .000000 and not .000001 like the SQL NUMERIC to floating point conversion was doing.   The answer is an optimization in the atof logic: it appears to consider trailing 0's to the right of the decimal position insignificant, to a degree.  ( _IsZeroMan is used to check for zero mantissa values.)

After running some simple iteration tests I found that the six zeros (.000000) really start showing imprecision relative to the raw integer at 576460752305.000000 and larger values when 64-bit precision is in play.

for (ULONGLONG ullValue = 576654928035000000; ullValue > 476654928035000000 && ullValue < 0xFFFFFFFFFFFFFFFF; ullValue -= 1000000)
{
    double dVal = ullValue / pow(10.0, 6);
    ULONGLONG ullCast = ullValue / pow(10.0, 6);

    if (ullCast != dVal)
    {
        printf("----- %f\r\n", dVal);
    }
}

NOTE: You can't just assume the display at the client is the same as the storage at the SQL Server.  For example, the following print the same value on the client but they don't actually match when a physical/storage comparison takes place.

select cast(576460752305.000000 as float)
select cast(cast(576460752305.000000 as float) as numeric)

if( cast(cast(576460752305.000000 as float) as numeric) = cast(576460752305.000000 as float))
begin
    print('Match')
end
else
begin
    print('NOT A Match')    -- <-------- !!! This branch is printed !!!
end

Looking at the actual storage you can see the difference.

select cast(cast(576460752305.000000 as float) as varbinary)       -- storage (0x4260C6F7A0B61FFF)
select cast(576460752305 as float)                                 -- storage (0x4260C6F7A0B62000)
atof("576460752305.000000") = 576460752305.000000                  -- storage (0x4260C6F7A0B62000)

Breaking this down you can see the actual double precision value when using 64 bit precision.

S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
0 10000100110 0000110001101111011110100000101101100010000000000000

10000100110 = 1062 - 1023 = 39, (2^39) = 549755813888
10000110001101111011110100000101101100010000000000000 = 4722366482882560 (the implicit leading 1 plus the 52 fraction bits)
4722366482882560 / 2^52 (2^52 = 4503599627370496)
4722366482882560 / 4503599627370496 = 1.0485760000028676586225628852844
549755813888 * 1.0485760000028676586225628852844 = 576460752304.99999999999999999999   <-- Close but not exact
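The same decoding can be done mechanically.  The following sketch is my addition, not part of the original post: it splits the 0x4260C6F7A0B61FFF pattern returned by the earlier varbinary cast into its sign, exponent and fraction fields and rebuilds the value, which comes out to 576460752304.9998779..., one representable step below the readable integer.

#include <stdio.h>
#include <string.h>
#include <math.h>

int main()
{
    // Storage bytes reported by SQL Server for cast(576460752305.000000 as float).
    const unsigned long long ullBits = 0x4260C6F7A0B61FFFULL;

    double dStored = 0.0;
    memcpy(&dStored, &ullBits, sizeof(dStored));    // reinterpret the 8 storage bytes as a double

    unsigned long long ullSign     = ullBits >> 63;                                  // 1 sign bit
    long long          llExponent  = (long long)((ullBits >> 52) & 0x7FF) - 1023;    // 11 biased exponent bits
    unsigned long long ullFraction = ullBits & 0xFFFFFFFFFFFFFULL;                   // 52 fraction bits

    // value = (-1)^sign * 2^exponent * (1 + fraction / 2^52)
    double dRebuilt = (ullSign ? -1.0 : 1.0) *
                      ldexp(1.0 + (double)ullFraction / 4503599627370496.0, (int)llExponent);

    printf("unbiased exponent = %lld, 2^%lld = %.0f\n", llExponent, llExponent, ldexp(1.0, (int)llExponent));
    printf("stored  = %.10f\n", dStored);     // 576460752304.9998779297
    printf("rebuilt = %.10f\n", dRebuilt);    // 576460752304.9998779297
    return 0;
}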

If you force different rounding behavior by manipulating the precision and scale you can see the storage values change.

select cast(cast(cast(cast(576460752305.000000 as float(53)) as numeric(38,3)) as float) as varbinary)   -- storage (0x4260c6f7a0b62000)
select cast(cast(cast(cast(576460752305.000000 as float(53)) as numeric(38,4)) as float) as varbinary)   -- storage (0x4260c6f7a0b61fff)

If you look at the display in SQL Server Management Studio for these variants you get the same value of 576460752305 but the actual storage of the float sent from the SQL Server contains the additional precision.

select cast(cast(576460752305.000000 as float(53))  as varbinary)

0x4260C6F7A0B61FFF

2010-11-01 13:31:38.22 spid51      ODS Event: language_exec : Xact 0 ORS#: 1, connId: 0
2010-11-01 13:31:38.22 spid51         Text:select cast(cast(576460752305.000000 as float(53))  as varbinary)
2010-11-01 13:31:38.22 Server      2010-11-01 13:31:38.
2010-11-01 13:31:38.23 Server      spid51
2010-11-01 13:31:38.23 Server      Printing send buffer:
2010-11-01 13:31:38.23 Server      04 01 00 2D 00 33 01 00 81 01 00 00 00 00 00 21   ...-.3.........!
2010-11-01 13:31:38.23 Server      00 A5 1E 00 00 D1 08 00 42 60 C6 F7 A0 B6 1F FF   ........B`......
2010-11-01 13:31:38.23 Server      FD 10 00 C1 00 01 00 00 00 00 00 00 00            .............

select cast(cast(576460752305.0000 as float(53))  as varbinary)

0x4260C6F7A0B62000

2010-11-01 13:30:00.56 spid51      ODS Event: language_exec : Xact 0 ORS#: 1, connId: 0
2010-11-01 13:30:00.57 spid51         Text:select cast(cast(576460752305.0000 as float(53))  as varbinary)
2010-11-01 13:30:00.58 Server      2010-11-01 13:30:00.
2010-11-01 13:30:00.58 Server      spid51
2010-11-01 13:30:00.58 Server      Printing send buffer:
2010-11-01 13:30:00.59 Server      04 01 00 2D 00 33 01 00 81 01 00 00 00 00 00 21   ...-.3.........!
2010-11-01 13:30:00.59 Server      00 A5 1E 00 00 D1 08 00 42 60 C6 F7 A0 B6 20 00   ........B`.... .
2010-11-01 13:30:00.60 Server      FD 10 00 C1 00 01 00 00 00 00 00 00 00            .............

select cast(576460752305.0000 as float(53))
select cast(576460752305.000000 as float(53))

----------------------
576460752305

----------------------
576460752305

Using a different SQL Server client (SQLCMD.exe), the value displays as 576460752304.99988, showing the additional precision.  Looking closer at SSMS, it appears to round to the 3rd decimal position by default.

select cast(576460752305.001901 as float(53))       -- 576460752305.002

For safety, use convert (style 2, shown below) when selecting the data so you can validate the values.

select convert(varchar(32), cast(576460752305.0000 as float(53)), 2)
select convert(varchar(32), cast(576460752305.000000 as float(53)), 2)

--------------------------------
5.764607523050000e+011
--------------------------------
5.764607523049999e+011

I can now get SQL Server and printf to display the value as 576460752305, but using the following you can see that we get different values once various floating point instructions (divsd, mulsd, …) become involved.

double dValueAgain = atof("576460752305.000000");
printf("%f\n", dValueAgain);          // 576460752305.000000 - storage (0x4260c6f7a0b62000)

ULONGLONG ullCheck = 576460752305000000;
double dNew = ullCheck / pow(10.0, 6);
printf("%f\n", dNew);                 // 576460752304.999880 - storage (0x4260c6f7a0b61fff)

Stepping into the logic for atof, there is logic to check for an all-zero mantissa as well as to round the mantissa, which appears to keep the floating point value closer to the readable integer a user might expect from the conversion.  Looking at the assembly for the ULONGLONG division, xmm and/or x87 floating point instructions apply, and depending on the precision control (PC) flags you might get different behaviors.  Reference: http://msdn.microsoft.com/en-us/library/c9676k6h.aspx

I updated my sample to force different precision levels:

unsigned int currentFP = 0;            // requires <float.h> for _controlfp_s, _PC_24, _PC_64, MCW_PC
ULONGLONG ullCheck = 576460752305000000;

_controlfp_s(&currentFP, _PC_24, MCW_PC);
double dNew = ullCheck / pow(10.0, 6);
printf("%f\n", dNew);                  // 576460752305.000000 - storage (0x4260c6f7a0b62000)

_controlfp_s(&currentFP, _PC_64, MCW_PC);
dNew = ullCheck / pow(10.0, 6);
printf("%f\n", dNew);                  // 576460752304.999880 - storage (0x4260c6f7a0b61fff)

To understand this better I looked at the raw, binary format for the double precision float values.

select cast(cast(576460752305 as float) as varbinary)

576460752305       = 1000011000110111101111010000010110110001
0x4260C6F7A0B62000 = 100001001100000110001101111011110100000101101100010000000000000

select cast(cast(576460752304 as float) as varbinary)

576460752304       = 1000011000110111101111010000010110110000
0x4260C6F7A0B60000 = 100001001100000110001101111011110100000101101100000000000000000

Looking at the double precision format details again you can see the difference between the values …305 and … 304 shown below.

S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

576460752305 =    0 10000100110 0000110001101111011110100000101101100010000000000000

10000100110 = 1062 - 1023 = 39, (2^39) = 549755813888
10000110001101111011110100000101101100010000000000000 = 4722366482882560
4722366482882560 / 2^52 (2^52 = 4503599627370496)
4722366482882560 / 4503599627370496 = 1.0485760000028676586225628852844
549755813888 * 1.0485760000028676586225628852844 = 576460752304.99999999999999999999   <-- Close but not exact

Looking at the mathematical formula for this it would be the following.

549755813888 * X = 576460752305

Transform the equation.

576460752305 / 549755813888 = X

Solve for X = 1.0485760000028676586225628852844

576460752304 =    0 10000100110 0000110001101111011110100000101101100000000000000000

10000100110 = 1062 - 1023 = 39, (2^39) = 549755813888   <-- Same exponent
10000110001101111011110100000101101100000000000000000 = 4722366482874368
4722366482874368 / 2^52 (2^52 = 4503599627370496)
4722366482874368 / 4503599627370496 = 1.0485760000010486692190170288086
549755813888 * 1.0485760000010486692190170288086 = 576460752304

Thinking about this more, it becomes clear that the exponent alone can only represent approximately 2^11 (~2048) exact values if you leave the mantissa all zeros (there are special cases for the NaN and infinite states).   With an all-zero mantissa the mathematics is exponent * 1.0, an exact match.  Anything other than 1.0 for the mantissa has the possibility to vary from a strict, whole integer value.

When looking at this for the representation of whole integer values, each power-of-2 step has to represent fractions of the step.  For example, the only integer between 2 and 4 is 3, which is 2 + (2 * .50).  Between 4 and 8 we need to represent 5, 6 and 7, which are the base (4) plus .25, .50 and .75 of the step respectively.   When you get out to larger powers of 2 the spacing between representable values grows and precision becomes a factor, as the small sketch below illustrates.
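To put numbers on that spacing, here is a short sketch (my addition, not from the original post) that uses nextafter to print the gap between adjacent doubles at a few magnitudes.  Near 576460752305 the binary exponent is 39, so the gap is 2^(39-52) = 2^-13 = 0.0001220703125, which is exactly the 0.00012 noise seen in the conversions earlier in this post.

#include <stdio.h>
#include <math.h>
#include <float.h>

int main()
{
    const double dValues[] = { 3.0, 5.0, 1048576.0, 576460752305.0 };

    for (int i = 0; i < (int)(sizeof(dValues) / sizeof(dValues[0])); i++)
    {
        // Distance from this value to the next larger representable double.
        double dGap = nextafter(dValues[i], DBL_MAX) - dValues[i];
        printf("%-16.1f gap to next representable double = %g\n", dValues[i], dGap);
    }

    // Output (x64):
    //   3.0              gap = 4.44089e-16   (2^-51)
    //   5.0              gap = 8.88178e-16   (2^-50)
    //   1048576.0        gap = 2.32831e-10   (2^-32)
    //   576460752305.0   gap = 0.00012207    (2^-13)
    return 0;
}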

Going back to the original statement: floating point quality involves many factors, from precision handling, to which conversion is involved, to what the client vs. the server uses to do the default conversions, and all of these can play a part in what values you see.  Just because you see a number displayed, you may need to go to the actual storage format to determine the raw value.

Bob Dorr - Principal SQL Server Escalation Engineer
