PHP Internals News: Episode 98: Deprecating utf8_encode and utf8_decode

PHP Internals News

PHP Internals News: Episode 98: Deprecating utf8_encode and utf8_decode

Thursday, March 3rd 2022, 09:02 GMT London, UK

In this episode of "PHP Internals News" I chat with Rowan Tommins (GitHub, Website, Twitter) about the "Deprecate and Remove utf8_encode and utf8_decode" RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:14

Hi, I'm Derick. Welcome to PHP Internals News, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 98. Today I'm talking with Rowan Tommins about the "Deprecate and remove UTF8_encode and UTF8_decode" RFC that he's proposing. Hi, Rowan, would you please introduce yourself?

Rowan Tommins 0:38

Hi, I'm Rowan Tommins. I'm a PHP software architect by day and try and contribute back to the community and have been hanging around in the internals mailing list for about 10 years and contributed to make the language better, where I can.

Derick Rethans 0:57

Excellent. Yeah, that's how I started out as well, many, many more years before that, to be honest. This RFC, what problem is this trying to solve?

Rowan Tommins 1:08

PHP has these two functions, utf8_encode and utf8_decode, which, in themselves, they're not broken. They do what they are designed to do. But they are very frequently misunderstood. Mostly because of their name. And because Character Encodings in general, are not very well understood. People use them wrong, and end up getting in all sorts of pickles that are worse than if the functions weren't there in first place.

Derick Rethans 1:37

What are you proposing with the RFC then?

Rowan Tommins 1:39

Fundamentally, I'm proposing to remove the functions. As of PHP 8.2, there will be a deprecation notice whenever you use them, and then in 9.0, they would be gone forever, and you wouldn't be able to use them by mistake, because they just wouldn't be there.

Derick Rethans 1:56

I reckon there's going to be a way to actually do what people originally intended to do with it at some point, right?

Rowan Tommins 2:02

So yeah, there are alternatives to these functions, which are much clearer in what you're doing, and much more flexible in what you can do with them so that they cover the cases that these functions sound like they're going to do, but don't actually do when you understand what they're really doing.

Derick Rethans 2:20

I think we'll get back to that a little bit later on. You're wanting to deprecate these functions. But what do these functions actually do?

Rowan Tommins 2:27

What they actually do is convert between a character encoding called Latin-1, ISO 8859-1, and UTF-8. So utf8_encode converts from Latin-1 in

Bạn cần đăng nhập để nghe các tập có chứa nội dung thô tục.

Luôn cập nhật thông tin về chương trình này

Đăng nhập hoặc đăng ký để theo dõi các chương trình, lưu các tập và nhận những thông tin cập nhật mới nhất.

Chọn quốc gia hoặc vùng

Châu Phi, Trung Đông và Ấn Độ

Châu Á Thái Bình Dương

Châu Âu

Châu Mỹ Latinh và Caribê

Hoa Kỳ và Canada