C++ - Emulate "double" using 2 "float"s

- August 15, 2011

I am writing a program for embedded hardware that only supports 32-bit single-precision floating-point arithmetic. The algorithm I am implementing, however, requires 64-bit double-precision addition and comparison. I am trying to emulate the double datatype using a tuple of two floats, so a double d would be emulated as a struct containing the tuple (float d.hi, float d.low).

The comparison should be straightforward using a lexicographic ordering. The addition is a bit tricky, because I am not sure which base I should use. Should it be FLT_MAX? And how can I detect a carry?

How can this be done?

Edit (clarity): I need the extra significant digits rather than the extra range.

The double-float technique uses pairs of single-precision numbers to achieve roughly twice the precision of single-precision arithmetic, accompanied by a slight reduction of the single-precision exponent range (due to intermediate underflow and overflow at the far ends of the range). The basic algorithms were developed by T.J. Dekker and William Kahan in the 1970s. Below I list two recent papers that show how these techniques can be adapted to GPUs. Much of the material covered in these papers is applicable independent of platform, so it should be useful for the task at hand.

http://hal.archives-ouvertes.fr/docs/00/06/33/56/pdf/float-float.pdf Guillaume Da Graça, David Defour. Implementation of float-float operators on graphics hardware. 7th Conference on Real Numbers and Computers, RNC7.

http://andrewthall.org/papers/df64_qf128.pdf Andrew Thall. Extended-precision floating-point numbers for GPU computation.
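To make the idea concrete, here is a minimal sketch of double-float addition and comparison, built on the classic error-free transformations (Knuth's TwoSum and Dekker's FastTwoSum). The names DoubleFloat, two_sum, fast_two_sum, df_add and df_less are my own, not taken from the papers, and this is the simpler, less accurate style of addition rather than a definitive implementation. It assumes strict IEEE 754 single-precision evaluation with round-to-nearest: no -ffast-math, and no intermediate results kept in a wider format. Note that there is no "base" and no carry to detect; the low word simply holds the rounding error of the high word.

```cpp
// Hypothetical type: the value represented is hi + lo, where lo holds the
// part of the value that did not fit into hi (normalized: hi == fl(hi + lo)).
struct DoubleFloat {
    float hi;
    float lo;
};

// Knuth's TwoSum: returns s = fl(a + b) and the rounding error e such that
// s + e == a + b exactly (assuming no overflow). Works for any magnitudes.
static DoubleFloat two_sum(float a, float b) {
    float s  = a + b;
    float bb = s - a;                      // the part of b that made it into s
    float e  = (a - (s - bb)) + (b - bb);  // what was lost to rounding
    return {s, e};
}

// Dekker's FastTwoSum: same result as two_sum with fewer operations,
// but only exact when |a| >= |b| (or a == 0).
static DoubleFloat fast_two_sum(float a, float b) {
    float s = a + b;
    float e = b - (s - a);
    return {s, e};
}

// Simple double-float addition: add the high words exactly, fold both low
// words into the error term, then renormalize into a (hi, lo) pair.
static DoubleFloat df_add(DoubleFloat x, DoubleFloat y) {
    DoubleFloat s = two_sum(x.hi, y.hi);
    float e = s.lo + (x.lo + y.lo);
    return fast_two_sum(s.hi, e);
}

// Lexicographic comparison. Assumption: both operands are normalized,
// e.g. because they were produced by df_add.
static bool df_less(DoubleFloat x, DoubleFloat y) {
    return x.hi < y.hi || (x.hi == y.hi && x.lo < y.lo);
}
```

For example, adding {1.0f, 0.0f} and {1e-10f, 0.0f} with plain float arithmetic gives exactly 1.0f, whereas df_add keeps the small term in the low word, so df_less can still tell the sum apart from 1.0.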
