PHP Internals News: Episode 98: Deprecating utf8_encode and utf8_decode
Thursday, March 3rd 2022, 09:02 GMT London, UKIn this episode of "PHP Internals News" I chat with Rowan Tommins (GitHub, Website, Twitter) about the "Deprecate and Remove utf8_encode and utf8_decode" RFC.
The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news
Transcript
Derick Rethans 0:14Hi, I'm Derick. Welcome to PHP Internals News, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 98. Today I'm talking with Rowan Tommins about the "Deprecate and remove UTF8_encode and UTF8_decode" RFC that he's proposing. Hi, Rowan, would you please introduce yourself?
Rowan Tommins 0:38Hi, I'm Rowan Tommins. I'm a PHP software architect by day and try and contribute back to the community and have been hanging around in the internals mailing list for about 10 years and contributed to make the language better, where I can.
Derick Rethans 0:57Excellent. Yeah, that's how I started out as well, many, many more years before that, to be honest. This RFC, what problem is this trying to solve?
Rowan Tommins 1:08PHP has these two functions, utf8_encode and utf8_decode, which, in themselves, they're not broken. They do what they are designed to do. But they are very frequently misunderstood. Mostly because of their name. And because Character Encodings in general, are not very well understood. People use them wrong, and end up getting in all sorts of pickles that are worse than if the functions weren't there in first place.
Derick Rethans 1:37What are you proposing with the RFC then?
Rowan Tommins 1:39Fundamentally, I'm proposing to remove the functions. As of PHP 8.2, there will be a deprecation notice whenever you use them, and then in 9.0, they would be gone forever, and you wouldn't be able to use them by mistake, because they just wouldn't be there.
Derick Rethans 1:56I reckon there's going to be a way to actually do what people originally intended to do with it at some point, right?
Rowan Tommins 2:02So yeah, there are alternatives to these functions, which are much clearer in what you're doing, and much more flexible in what you can do with them so that they cover the cases that these functions sound like they're going to do, but don't actually do when you understand what they're really doing.
Derick Rethans 2:20I think we'll get back to that a little bit later on. You're wanting to deprecate these functions. But what do these functions actually do?
Rowan Tommins 2:27What they actually do is convert between a character encoding called Latin-1, ISO 8859-1, and UTF-8. So utf8_encode converts from Latin-1 in
Thông Tin
- Chương trình
- Đã xuất bản09:02 UTC 3 tháng 3, 2022
- Xếp hạngSạch