Don’t Use PHP7 Type Hints for External Data

(Updated: 2015/05/13)

PHP7, expected in 2015 Q4, comes with scalar type hints that support “int”, “float”, “string” and “bool”, in addition to the existing “array” hint. While it is a good thing to have type hints for basic data types, they change the way data was handled in older PHP versions.

 

Current PHP

PHP variables are handled according to their context. A string is converted to an integer or float for arithmetic. Integers and floats are converted to strings when they are used as strings. Variables are converted to the appropriate data type automatically. This is called type juggling.
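As a minimal sketch of type juggling (the function name `jugglingExamples` is only illustrative), the operator decides which type each value is converted to:

```php
<?php
// Type juggling: the operator, not the variable, decides the type.
function jugglingExamples(): array
{
    return [
        'sum'   => '10' + 5,    // numeric string in arithmetic becomes an int
        'ratio' => '1.5' * 2,   // float-like string becomes a float
        'label' => 'ID-' . 42,  // integer used as a string becomes a string
    ];
}

var_dump(jugglingExamples());
```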

PHP’s “int” type is a signed 32 bit integer on 32 bit CPUs, but thanks to type juggling, PHP can compute integers up to 53 bits exactly by using float. PHP does type conversion automatically, and most PHP programmers do not bother to change (cast) the data types of supplied data. This allows PHP programs to receive and send huge numeric values to and from databases, etc.
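A small sketch of this fallback, assuming nothing beyond core PHP (the helper `incrementId` is hypothetical): when a result no longer fits the native int, PHP continues with a float instead of failing.

```php
<?php
// Add one to a numeric string that may exceed the native int range.
function incrementId($id)
{
    return $id + 1; // int when it fits, float (IEEE 754 double) otherwise
}

// 2^32 + 1: an int on 64 bit builds, a float on 32 bit builds.
var_dump(incrementId('4294967296'));

// Integer overflow falls back to float on every build.
var_dump(is_float(PHP_INT_MAX + 1));
```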

※ The fact that Windows builds currently always have a 32 bit integer is ignored here. PHP7 has 64 bit integer support under Windows.

 

PHP7’s Scalar Type Hints

Scalar type hints are simple. They allow “int”/“float”/“string”/“bool” to be used as type hints, and they have 2 modes.

  • Strict mode: does not allow data type conversion at all. The declared data type is required strictly.
  • Weak mode: allows data to be converted to the “int” type or “float” type. (Default)

To use strict mode, the user has to declare it at the top of the script file with “declare(strict_types=1);”.

If there is no “strict_types=1” declaration, PHP uses weak mode, which allows data type conversion to the specified type.
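A minimal sketch of the strict mode declaration follows. The functions `addOne` and `tryAddOne` are made up for illustration, and the sketch assumes a released PHP 7 or later build, where a type violation is reported as a `TypeError` exception.

```php
<?php
declare(strict_types=1); // must be the first statement in the file

function addOne(int $n): int
{
    return $n + 1;
}

// Calls made from this file are checked strictly: "5" is not an int.
function tryAddOne($value): string
{
    try {
        return (string) addOne($value);
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

echo tryAddOne(5), "\n";   // the int is accepted
echo tryAddOne('5'), "\n"; // rejected here; weak mode would convert "5" to 5
```

Strictness is decided by the file that makes the call, so removing the `declare` line switches these calls back to weak mode.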

It’s simple, and it may seem that nothing is wrong with using scalar type hints in your code.

HOWEVER, the new PHP7 scalar type hints change the way PHP handles “integer”/“float”-like variables, i.e. a type hint cannot be used for a “string” that contains an integer/float-like value without careful consideration.

 

How External Data Is Handled Now

External data such as a database record ID is handled as a “string”. In fact, basically all database data is handled as “string” regardless of its column type. This is not limited to database record values; almost all external data, such as $_GET/$_POST, is stored as “string”.

External data is not converted to the corresponding PHP native data type because of precision. Even if the external data specification is an “integer” (or “float”) type, the value does not have to fit into PHP’s native “int” (or “float”) type. PHP’s native “int” type on a 32 bit machine is “signed 32 bit”, which can express values from -2^31 to 2^31-1. It is -2^63 to 2^63-1 on 64 bit machines.

A database record ID does not have to be in the -2^31 to 2^31-1 or -2^63 to 2^63-1 range, for example. Most databases use a “signed 64 bit integer” for record IDs, yet PHP programs on 32 bit machines work as they should.

The magic is that PHP handles external data as “string”, not as “int” (or “float”). As long as the user does not apply arithmetic or an explicit cast, PHP programs handle all external inputs as strings. (JSON is an exception. I’ll mention this later.)

Therefore, database record IDs, etc. are kept correctly and work as they should. Even when users supply a large “integer like string” as a PHP array key, it works as expected. (The behavior described here is for 64 bit machines, Windows excluded. PHP “int” is a signed 32 bit integer under Windows regardless of CPU.)

PHP accepts an “integer like string” as it is and works with large numbers without a hitch.
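As an illustration of the array key case (not the original listing; the record ID and the helper `storeById` are hypothetical), an integer-like string that does not fit the native int is kept as a string key:

```php
<?php
// Store a value under an ID that arrived as a string from outside.
function storeById(array $rows, string $id, string $value): array
{
    $rows[$id] = $value;
    return $rows;
}

// This ID exceeds even a signed 64 bit integer.
$rows = storeById([], '12345678901234567890', 'user record');

// The key is still the original digit string.
var_dump(key($rows));
```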

 

New Scalar Type Hints Require Valid Integer/Float

The new scalar type hints require a valid integer or float regardless of mode. Passing an “integer like string” that exceeds PHP’s int range to a function with an “int” type hint results in a fatal error. Please note that this occurs even in “weak” type hint mode.

PHP7 scalar type hints strictly require a valid integer or float. If an “integer like string” exceeds PHP’s int range, the call fails with a fatal error.
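The failing case can be sketched as follows, in weak mode (no `strict_types` declaration). The function names are illustrative, and the sketch assumes a released PHP 7 or later build, which raises the failure as a catchable `TypeError`.

```php
<?php
function showId(int $id): int
{
    return $id;
}

// Report the outcome as a string instead of letting the error escape.
function tryShowId($value): string
{
    try {
        return (string) showId($value);
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

echo tryShowId('42'), "\n";                   // converted to int 42
echo tryShowId('12345678901234567890'), "\n"; // only digits, yet rejected
```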

“float” type hints behave differently for large float values because the float type supports INF (infinity). A value that is too large for a float is accepted and printed as “INF”. This result is incorrect, because the code only needs to print the given float-like value. Even when a value is within the float range, it loses precision.
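A sketch of both “float” effects, using a made-up function `showFloat` in weak mode: an out-of-range value silently becomes INF, and an in-range value silently loses digits.

```php
<?php
function showFloat(float $value): float
{
    return $value;
}

// Too large for an IEEE 754 double: accepted, but turned into INF.
var_dump(is_infinite(showFloat('1e1000')));

// Within range, but the digits beyond double precision are lost.
echo showFloat('12345678901234567890'), "\n";
```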

 

PHP 5.6 supports GMP integers, but the PHP7 “int” type hint does not accept them as valid “int” at all. An example is omitted.

On a 32 bit CPU, where “int” is a signed 32 bit integer, the same problems appear with far smaller values. Such code works well without the “int” type hint.
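The platform dependence can be sketched like this (the function names are hypothetical; a released PHP 7 or later build is assumed). The same call succeeds or fails depending on the native int size of the build.

```php
<?php
function acceptId(int $id): int
{
    return $id;
}

function tryAcceptId(string $value): string
{
    try {
        return (string) acceptId($value);
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

printf("native int: %d bit\n", PHP_INT_SIZE * 8);

// 2^32: accepted on 64 bit builds, rejected on 32 bit builds.
echo tryAcceptId('4294967296'), "\n";
```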

 

PHP Developers Should Not Use Scalar Type Hints for External Numeric Values

External numeric values could be out of PHP’s int/float range. This means database/JSON/etc. values may not be handled correctly with the new PHP7 scalar type hints.

Database systems support “unsigned int8” (an unsigned 8 byte integer; MySQL supports this) and NUMERIC/DECIMAL data types, which can store huge numbers correctly. PostgreSQL NUMERIC/DECIMAL supports up to 131072 digits before the decimal point and up to 16383 digits after it. Even if the underlying database supports much larger integers/floats, a large integer may result in a fatal error and/or a large float may lose precision because of a type hint.

PHP itself has already learned this lesson. The JSON module blindly converts JSON numbers to PHP’s “int” or “float”. As a result, PHP loses numeric information through casts even when programs are only passing values back and forth. RFC 7159 “6. Numbers” states:

This specification allows implementations to set limits on the range
and precision of numbers accepted. Since software that implements
IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
generally available and widely used, good interoperability can be
achieved by implementations that expect no more precision or range
than these provide, in the sense that implementations will
approximate JSON numbers within the expected precision. A JSON
number such as 1E400 or 3.141592653589793238462643383279 may indicate
potential interoperability problems, since it suggests that the
software that created it expects receiving software to have greater
capabilities for numeric magnitude and precision than is widely
available.

Note that when such software is used, numbers that are integers and
are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
sense that implementations will agree exactly on their numeric
values.

Unless programs agree on a certain type, programmers should not enforce one. Otherwise, programs create issues like the ones the RFC mentions. PHP’s JSON module has/had exactly these issues. Non-destructive JSON parsing for integers was added later. Non-destructive parsing for floats is currently under discussion. (Note: JSON only has a “number” specification, which allows any base 10 integer or float, with or without an exponent.)

The lesson from JSON is “Use string if you do not want to lose information”.
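The JSON lesson can be sketched with `json_decode` and its `JSON_BIGINT_AS_STRING` option. The payload and the helper `decodeId` are made up for illustration.

```php
<?php
// Decode {"id": ...} and return the id, optionally keeping big integers as strings.
function decodeId(string $json, bool $keepBigInt)
{
    $flags = $keepBigInt ? JSON_BIGINT_AS_STRING : 0;
    $data = json_decode($json, true, 512, $flags);
    return $data['id'];
}

$json = '{"id": 12345678901234567890}';

var_dump(decodeId($json, false)); // converted to float: digits are lost
var_dump(decodeId($json, true));  // kept as the original digit string
```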

 

Scalar Type Hints Should Not Be Used Blindly for Arithmetic Either

PHP developers cannot use the “int”/“float” scalar type hints for database/JSON/etc. values. They should not use the “int” type hint blindly for arithmetic either. Let’s consider a simple integer addition function without type hints.

Such a function works well as long as the sum is between -2^53+1 and 2^53-1, regardless of CPU. PHP converts a large integer to float (IEEE 754 double) on a 32 bit CPU. A float can express integers between -2^53+1 and 2^53-1 without losing precision. PHP on a 32 bit CPU can therefore correctly compute integers that exceed the 32 bit integer range by using float.
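A minimal sketch of such an untyped addition function (a reconstruction for illustration, not the original listing):

```php
<?php
// No type hints: PHP picks int or float as needed.
function add($a, $b)
{
    return $a + $b;
}

// Exceeds the 32 bit int range: a float on 32 bit builds, an int on 64 bit builds.
var_dump(add(2147483647, 1));

// Still exact on every build, because the sum is below 2^53.
var_dump(add('9007199254740990', 1));
```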

If a PHP developer uses the “int” type hint, it limits parameters strictly to the native signed integer range. If a return type hint is used, the result is limited to PHP’s “int” range as well. Just as out-of-range parameters raise fatal errors, a return value that exceeds the “int” limit yields a fatal error.
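The typed variant can be sketched as follows, in weak mode. The function names are illustrative, and a released PHP 7 or later build is assumed, where both failures surface as `TypeError`.

```php
<?php
function addInt(int $a, int $b): int
{
    return $a + $b; // on overflow the sum becomes a float
}

function tryAddInt($a, $b): string
{
    try {
        return (string) addInt($a, $b);
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

echo tryAddInt(1, 2), "\n";                      // fits: works
echo tryAddInt(PHP_INT_MAX, 1), "\n";            // return value exceeds int
echo tryAddInt('12345678901234567890', 1), "\n"; // parameter exceeds int
```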

PHP developers cannot use the “float” type hint universally either, because a float (IEEE 754 double) cannot represent every 64 bit integer exactly.
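A short sketch of that limit (the function `asFloat` is hypothetical): 2^53 + 1 has no exact double representation, so two different integers become the same float without any error.

```php
<?php
function asFloat(float $value): float
{
    return $value;
}

// 9007199254740993 is 2^53 + 1; it is silently rounded to 2^53.
var_dump(asFloat('9007199254740993') === asFloat('9007199254740992'));
```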

 

PHP Developer Should Consider IoT Device Clients

Many developers may think “We don’t have to care much about 32 bit CPUs since servers are 64 bit nowadays”. Servers are 64 bit machines, but we cannot forget about IoT device clients.

Basic IoT devices will have 32 bit CPUs for at least a decade, because a 32 bit CPU is enough for their purpose. If PHP libraries are designed only for 64 bit CPUs, PHP developers will have a hard time building PHP apps on these devices.

According to the Embedded Industry Questionnaires’ Result 2012 by the Ministry of Economy, Trade and Industry (METI), a Japanese government ministry (page 14; the survey was done in 2010; the PDF is written in Japanese), about 90% of embedded systems use 32 bit or smaller CPUs when DSP and other are ignored.

  • 0.2% 4 bit CPU
  • 10.0% 8 bit CPU
  • 25.2% 16 bit CPU
  • 46.1% 32 bit CPU
  • 10.8% 64 bit CPU or more
  • 5.4% DSP
  • 2.2% Other

 

What PHP7 Should Have

PHP7’s type hints are too strict even in weak mode. PHP could have

  • Weaker type hint restriction that allows arbitrary number.

or

  • Numeric type hint that allows arbitrary number.

or

  • Have 64 bit “int” for 32 bit CPUs. (This only reduces impact of the issue. Issue with type conversion remains)

In addition to one of these,

  • Validation functions that check int/float candidate values.
  • Function overloading and template.

The current PHP7 does not have validation functions that check whether a value fits in the int/float range of its CPU. The fatal errors raised by type hints are E_RECOVERABLE_ERROR. They can be caught by a custom error handler. However, resuming execution from a custom error handler is much harder than validating data before supplying it to a function.
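One possible userland sketch of such a validation function follows. The name `fitsNativeInt` and its behavior are assumptions made for illustration, not an existing PHP API. It compares digit strings, so no conversion or overflow takes place.

```php
<?php
// Hypothetical helper: true only if $value is a plain decimal integer string
// that fits this build's native int (PHP_INT_MIN .. PHP_INT_MAX).
function fitsNativeInt(string $value): bool
{
    // Optional minus sign, leading zeros ignored, digits only.
    if (!preg_match('/^(-?)0*(\d+)\z/', $value, $m)) {
        return false;
    }
    $digits = $m[2];

    // Largest allowed magnitude depends on the sign.
    $limit = $m[1] === '-'
        ? substr((string) PHP_INT_MIN, 1)
        : (string) PHP_INT_MAX;

    // Fewer digits always fits; more digits never fits.
    if (strlen($digits) !== strlen($limit)) {
        return strlen($digits) < strlen($limit);
    }
    // Same length: plain string comparison orders the digits correctly.
    return strcmp($digits, $limit) <= 0;
}

var_dump(fitsNativeInt('42'));
var_dump(fitsNativeInt('12345678901234567890'));
```

A caller could check a value with this kind of helper first and fall back to string handling when the check fails, instead of recovering from an error afterwards.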

Type hint errors may become exceptions in the final release, but there would still be many cases where programmers need to handle parameter/return value errors by themselves rather than by raising exceptions.

Without function overloading and templates, programmers have to write many functions and use them to handle various types. PHP7 has neither.

 

Conclusion

It is sad to say “Don’t use type hints for this and that”, but I have to. PHP developers should not use “int”/“float” type hints unless they are sure what the data range is.

In general, PHP developers should not use “int”/”float” type hints for

  • Database numeric (integer, decimal) value
  • JSON numeric
  • XML numeric
  • Any other external numeric values

In general, PHP developers should not use the “int” type hint for

  • Arithmetic
  • Numeric array keys

unless they are certain that the data is in the signed 32 bit integer range. (Do not ignore 32 bit CPUs.)

These points are especially important for portable libraries/frameworks.

In addition, PHP developers should realize that the “int” type hint raises fatal errors even when a value contains only digits, which should be valid as an integer. Unlike “int”, the “float” type hint may lose precision significantly without any error.

The current scalar type hint implementation is restrictive, and its usage is limited.

  • Do not use scalar type hints blindly.
  • Do not expect PHP to work as it used to with scalar type hints, i.e. large numbers cause a fatal error or a truncated result.
  • Beware that unconditional casts are evil, as they never raise errors and hide problems.

Otherwise, you will end up with serious bugs, including site-wide DoS and/or interoperability issues. As the lessons from JSON show, if you need to use type hints for numeric scalars

  • Use the “string” type hint for numeric scalars if you need a hint.

You may feel silly using the “string” scalar type hint for numbers, but this is the best solution for external data now.
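A minimal sketch of the “string” hint approach, contrasted with an unconditional cast (the function `normalizeRecordId` and the ID value are hypothetical):

```php
<?php
// Accept a record ID as a string and only check that it is made of digits.
function normalizeRecordId(string $id): string
{
    if ($id === '' || !ctype_digit($id)) {
        throw new InvalidArgumentException('Record ID must contain only digits');
    }
    return $id; // digits are preserved exactly, whatever their magnitude
}

$id = '12345678901234567890';

// With the string hint, the ID survives unchanged.
var_dump(normalizeRecordId($id) === $id);

// An unconditional cast changes the value and raises no error.
var_dump((string) (int) $id === $id);
```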

Simple is better. However, “Make things as simple as possible, but not simpler.” (Albert Einstein) holds true for science and engineering. The current type hint is “too simple”: if a value is an integer, it is converted/forced to PHP’s native “int” type. Yet PHP is used for interacting with external data from browsers, databases, JSON, XML, etc.

Since a reasonable resolution is not available, I hope no one uses PHP7 type hints in an inappropriate manner, library/framework developers at least. However, I saw too many incorrect casts while auditing code in the past. Therefore, I cannot be optimistic.

Comments are appreciated.

 

Where Developers Can Use Basic Type Hints

When a developer is sure that the number range is within a signed 32 bit integer (PHP’s “int” on a 32 bit CPU), the “int” type hint can be used safely.

Safe with “int” type hint:

  • Year
  • Age
  • Top 10 list number
  • Country number
  • Anything that absolutely fits in a signed 32 bit integer.

“float” is safe if the number is a signed 32 bit integer (PHP’s “int” on a 32 bit CPU) or fits an IEEE 754 double.

Safe with “float” type hint:

  • Temperature
  • Distance
  • Weight
  • Height
  • Anything that absolutely fits an IEEE 754 double without losing precision.

Please do not forget that developers MUST NOT pass values that exceed the “int” or “float” limits anywhere basic type hints are applied. Otherwise, you may end up with a fatal error (i.e. DoS) or lost precision.

“array” is safe. Use it. Don’t forget that a non-array value raises a fatal error.

 

Where Developers Can NOT Use Basic Type Hints

These are examples.

Unsafe with “int” and/or “float” type hint:

  • Database record ID
  • Database numeric
  • JSON numeric
  • Numeric values in XML
  • Numeric values in YAML
  • Numeric values from Web browser
  • Any strings that look like numbers

If a developer is absolutely sure that the values fit in a signed 32 bit integer (PHP’s “int” on a 32 bit CPU) or an IEEE 754 double, the developer may use the “int” or “float” hint even for these. Please do not forget that developers MUST NOT pass values that exceed the “int” or “float” limits anywhere basic type hints are applied. Otherwise, you may end up with a fatal error (i.e. DoS) or lost precision.

When writing NON-portable code, a developer may assume that a signed 64 bit integer fits, provided that absolutely all of the related systems/data use signed 64 bit integers.

 
