PHP Internals News

Derick Rethans
PHP Internals News

This is the 'PHP Internals News' podcast, where we discuss the latest PHP news, implementations, and issues with PHP internals developers and other guests.

  1. 06/24/2022

    PHP Internals News: Episode 103: Disjunctive Normal Form (DNF) Types

    PHP Internals News: Episode 103: Disjunctive Normal Form (DNF) Types Friday, June 24th 2022, 09:07 BST London, UK In this episode of "PHP Internals News" I talk with George Peter Banyard (Website, Twitter, GitHub, GitLab) about the "Disjunctive Normal Form Types" RFC that he has proposed with Larry Garfield. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:15 Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 103. Today I'm talking with George Peter Banyard again, this time about a disjunctive normal form types RFC, or DNF, for short, which he's proposing together with Larry Garfield. George Peter, would you please introduce yourself? George Peter Banyard 0:39 Hello, my name is George Peter Banyard, I work on PHP paid part time, by the PHP foundation. Derick Rethans 0:44 Just like last time, we are still got colleagues. George Peter Banyard 0:46 Yes, we are indeed still call it. Derick Rethans 0:48 What is this RFC about? What is it trying to solve? George Peter Banyard 0:52 The problems of this RFC is to be able to mix intersection and union types together. Last year, when intersection types were added to PHP, they were explicitly disallowed to be used with Union types. Because: a) mental framework, b) implementation complexity, because intersection types were already complicated on their own, to try to get them to work with Union types was kind of a big step. So it was done in chunks. And this is the second part of the chunk, being able to use it with Union types in a specific way. Derick Rethans 1:25 What is the specific way? George Peter Banyard 1:27 The specific way is where the disjoint normal form thing comes into play. So the joint normal form just means it's a normalized form of the type, where it's unions of intersections. The reason for that it helps the engine be able to like handle all of the various parts it needs to do, because at one point, it would need to normalize the type anyway. And we currently is just forced on to the developer because it makes the implementation easier. And probably also the source code, it's easier to read. Derick Rethans 1:54 When you say, forcing it up on a developer to check out you basically mean that PHP won't try to normalize any types, but instead throws a compilation error? George Peter Banyard 2:05 Exactly. It's, it's the job of the developer to do the normalization step. The normalization step is pretty easy, because I don't expect people to do too many stuff as intersection types. But as can always be done as a future scope of like adding a normalization step, then you get into the issues of like, maybe not having deterministic code, because normalization steps can take very, very long, and you can't necessarily prove that it will terminate, which is not a great situation to be in. Imagine just having PHP not running at all, because it's stuck in an infinite loop trying to normalize the format. It's just like, oh, I can't compile Derick Rethans 2:39 Would a potential type alias kind of syntax help with that? George Peter Banyard 2:44 Maybe, I'm not really sure. Actually reading like research about it from computer scientists, in functional programming languages, which is everything is compiled on my head. And they have the whole thing was like, well, they need to type type normalize, and especially with type aliases, they haven't really figured out a way yet. So I'm not sure how we are going to figure out a way if experts and PhD students and researchers haven't really figured out a way. Derick Rethans 3:08 And is the reason for that mostly, because PHP, resolves types while it is running code sometimes because it has to overload classes, and then it might find out it is an inherited class, for example? George Peter Banyard 3:19 Yes, I think it's like this weird thing where might maybe PHP has like kind of an advantage, because it doesn't need to, like resolve all of the types at once. And if you have a type alias, it's just oh, if it's used, and you just need to resolve it, and then try to figure it out. There's also the added complexity of like, variance checks, because most functional programming languages, they have variance to some degree, but they don't have the whole inheritance of like typical OOP languages have. It's kind of a very strange field, the fact that yeah, PHP is just like, well, we kind of do stuff at runtime, and you don't necessarily need everything. And it just works is like, well, we'll do. That's mainly the reason why the dev needs to do the normalization step, the form is done. It's also I think, the most easiest to understand, it's just like, Oh, you have this and this, or this group, or stuff, or this group of stuff, or this thing, simple type. The other form would be another normalized form would be conjunctive normal form, which is a list of ANDs of ORs to just have this thing, or X, like (A or B or C) and X and (Y or Z), which I think is harder to understand. Derick Rethans 4:26 What is the exact syntax then? George Peter Banyard 4:28 So the exact syntax is, if you want to have an intersection type was in a union type, you need to like bracket it by parentheses. And then you have like the normal pipe union operator and you can mix it with single types, you can mix it with true, you can mix it with false, which are literal types, which now exist, or just normal, bool types. Derick Rethans 4:48 The parenthesis is actually required. You don't rely on operator precedence to make things work? George Peter Banyard 4:53 Yes. Relying on operator precedence is terrible. Derick Rethans 4:57 Yep, I agree. George Peter Banyard 4:58 I'd say Oh, yeah, but I think I've heard this argument on the list like a couple of times, it's just, oh, yeah, but maths, like, has like, and as priority over like, or, I mean, I did three years of a maths degree and not gonna lie. Maths notation is terrible for most of us. People don't even agree on terminology. I'm just gonna say, let's, let's just do better. Derick Rethans 5:19 I agree. I mean, most coding standards for any sort of variable for like conditions, will already require parenthesis around multiple complex clauses anyway, right? I mean, it's a sensible thing to do, just for readability, in my opinion. So the RFC also talks about a few syntax that you aren't allowed to do, and that you have to normalize or deconstruct yourself, what kinds of things are these? George Peter Banyard 5:41 if you would want to have a type which has an intersection of a class A with at least one other class, so let's say X or Y, but you can always convert it into DNF form, how this type would be, it would be (A and X) or (A and Y). This seems to be the more unusual case, I would imagine. One of the motivating cases of DNF types is to do something like Array or (Traversable and Countable). I don't really see mixing and matching various different object interfaces in differencing, the most useful user land cases to be able to do Array or (Traversable and Countable) so that you can use just count or seeing something as an array, or you have like Traversable and Countable and ArrayAccess. And it's just like, Oh, here's an object, which kind of behaves like an array. Derick Rethans 6:32 I think there's currently another RFC just being proposed, that extends iterator_to_array to multiple types as well to accept more things. So that sort of fits into this category of things to do with iterables and traversals then I suppose. George Peter Banyard 6:49 yeah Derick Rethans 6:50 I'm hoping to talk to the author of that RFC as well. At the moment where two and a half weeks or so before a feature freeze, you now see a whole flurry of RFCs while it was a bit quiet in the last few months. So because you're adding to the type system, that's also usually has consequences for variance rules, or rather, how inheriting works with return types and argument types, as well as property types. What do DNF types mean for these variance checks? George Peter Banyard 7:19 The variance is checks, kind of follow the similar rules as before. So property types are easy. They are invariant, so you can't change them. You can reorder types, like was in your union if you want to. But that was already the case with Union types previously, because PHP will just check that, well, the types match. So contravariant, you can always restrict types, meaning you can either add intersections, or you can remove unions, br

  2. 06/02/2022

    PHP Internals News: Episode 102: Add True Type

    PHP Internals News: Episode 102: Add True Type Thursday, June 2nd 2022, 09:06 BST London, UK In this episode of "PHP Internals News" I talk with George Peter Banyard (Website, Twitter, GitHub, GitLab) about the "Add True Type" RFC that he has proposed. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:00 Hi I'm Derick. Welcome to PHP internals news, the podcast dedicated to explaining the latest developments in the PHP language. This is episode 102. Today I'm talking with George Peter Banyard about the Add True Type RFC that he's proposing. Hello George Peter, would you please introduce yourself? George Peter Banyard 0:33 Hello, my name is George Peter Banyard, I work part time for the PHP Foundation. And I work on the documentation. Derick Rethans 0:40 Very well. We're co workers really aren't we? George Peter Banyard 0:43 Yes, indeed, we all co workers. Derick Rethans 0:45 Excellent. We spoke in the past about related RFCs. I remember, which one was that again? George Peter Banyard 0:51 Making null and false stand alone types Derick Rethans 0:53 That's the one I was thinking of him. But what is this RFC about? George Peter Banyard 0:56 So this RFC is about adding true as a single type. So we have false, which is one part of the Boolean type, but we don't have true. Now the reasons for that are a bit like historical in some sense, although it's only from PHP 8.0. So talking about something historical. When it's only a year ago, it's a bit weird. The main reason was that like PHP has many internal functions, which return false on failure. So that was a reason to include it in the Union types RFC, so that we could probably document these types because I know it would be like, string and Boolean when it could only return false and never true. So which is a bit pointless and misleading, so that was the point of adding false. And this statement didn't apply to true for the most part. With PHP 8, we did a lot of warning to value error promotions, or type error promotions, and a lot of cases where a lot of functions which used to return false, stopped returning false, and they would throw an exception instead. These functions now always return true, but we can't type them as true because we don't have it, and have so they are typed as bool, which is kind of also misleading in the same sense, with the union type is like, well, it only returns false. So no point using the boolean, but these functions always return true. But if you look at the type signature, you can see like, well, I need to cater to the case where the returns true and when returns false. Derick Rethans 2:19 Do they return true or throw an exception? George Peter Banyard 2:22 Yeah, so they either return true, or they either throw an exception. If you would design these functions from scratch, you would make them void, but legacy... and we did, I know it was like PHP 8.0, we did change a couple of functions from true to void. But then you get into these weird shenanigans where like, if you use the return value of the function in a in an if statement, null gets because in PHP, any function does return a value, even a void function, which returns null. Null gets coerced to false. So you now get like, basically a BC break, which you can't really? Yeah, we did a bit and then probably we sort of, it's probably a bad idea. That's also the point of like, making choices, things that are static analysers can be like, more informants being like, Okay, your if statement is kind of pointless here. Derick Rethans 3:06 Yeah, you don't want to end up breaking BC. Now, we already had false and bool, you're adding true to this. How does that work with Union types? Can you make a union type of true or false? George Peter Banyard 3:18 No. So there are two reasons mainly. A. true and false is the same as like boolean, which is like just use Boolean in this case. But you can say, well, it's more specific, so just allow it. So that's would be reasonable. But the problem is, false has different semantics than boolean. False does not coerce values. So it only accepts false as a literal value. Whereas boolean, if you're not in strict type, which is a lot of code, it will cause values like zero to false one, or any other integers to true. It will coerce every other integer to true, like the true type follows the behaviour of false of being a value type. So it only accepts true, you would get into this weird distinction of does true or false, mean exactly true or false? Or do you get the same behaviour as using the boolean type? Derick Rethans 4:07 So I would say that true or false would than be more restrictive than bool. George Peter Banyard 4:12 Exactly, which is a bit of a problem, because PHP internally has true and false and separate types, which also makes the implementation of this RFC extremely easy, because PHP already makes the distinction of them. But at the same time, the boolean type is just a union of the bitmask of true and false. You can't really distinguish between the types, true or false, or the boolean type within the type system. Currently just does it by checking if it only has one then it can do like two checks. Specifically, you would need to add like an extra flag. I mean, it's doable, but it's just like, Well, who knows which semantics we want? Therefore, just leave it for future discussion because I'm not very keen on it to be fair. Derick Rethans 4:55 True or false are really only useful for return values and not so much for arguments types, because if you have an argument that that always must be true, then it's kind of pointless to have of course. George Peter Banyard 5:05 Same as like it was with the null type RFC. Although there might be one case where PHP internal functions might change the value to true for an argument, I can maybe two types, would be like with the define function, this thing being like case insensitive or case sensitive, I don't remember what the parameter actually; could actually either be false or true, because at the moment, I think emits a notice, things do like the this thing is not supported, therefore the values what was ignored. But we could conceivably see that in PHP 9, we would actually implement this as a proper like: Okay, this only accepts true, yes, this argument is pointless, but it's in the middle of the function signature, so you can't really move it. The spl_register_overload function has like as its second argument, the throw on error or not, which since PHP 8 only accepts true, but it's in the middle of the function. The last argument is still very useful. It's prepend, instead of append the autoloader, I think, or might be the other way around, check the docs. Since PHP 8, this only accepts true. So if you pass in false, it will emit a notice and saying you'd like this argument has just been ignored. So whatever. But we can't really remove the argument. Because well, it's, if you use the third argument, as with positional arguments, then you would change like the signature and you would break it. Now, we don't have a way to enforce in PHP to use named arguments, because that would be a solution. It's just like, well, if you want to set this argument, you need to use named arguments, but we can't do that. Otherwise, then creating a new function, which has an alias, which is also kind of terrible. That would be one of the maybe only cases where you would actually get like true as a as an argument Derick Rethans 6:39 is that now currently bool? And there's a specific check for it? George Peter Banyard 6:42 It's currently bool, and if you pass in false enrolment, like a warning, or notice. Derick Rethans 6:47 How would inheritance work? As return types, you can always make them smaller, right? More restrictive. George Peter Banyard 6:53 Yes, that's also the thing. But that already exists in some sense a problem of. Like if you go from boolean to false, you're already restricting the type. And that problem existed, even before the restricting, well allowing false as a stand-alone type if you had like, as a union, because you could always say like, I don't know. That problem already existed with Union types. Because you could have something like overturn an array or bool and then you change it to either an array or false. And then if you try to return like zero, then you will get like a coercion problem. So the same problem applies with true, because it only affects return values. And like you control the code within a function compared to like how you pass it, that's less of an issue. It applies also, with argument types where you can go from true to like boolean, or true and like a union type. Derick Rethans 7:44 So there's nothing surprising here. I see that the RFC also talks

  3. 05/19/2022

    PHP Internals News: Episode 101: More Partially Supported Callable Deprecations

    PHP Internals News: Episode 101: More Partially Supported Callable Deprecations Thursday, May 19th 2022, 09:05 BST London, UK In this episode of "PHP Internals News" I talk with Juliette Reinders Folmer (Website, Twitter, GitHub) about the "More Partially Supported Callable Deprecations" RFC that she has proposed. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP internals news, the podcast dedicated to explaining the latest developments in the PHP language. This is episode 101. Today I'm talking with Juliette Reinders Folmer, about the Expand Deprecation Notice Scope for Partially supported Callables RFC that she's proposing. That's quite a mouthful. I think you should shorten the title. Juliette, would you please introduce yourself? Juliette Reinders Folmer 0:37 You're starting with the hardest questions, because introducing myself is something I never know how to do. So let's just say I'm a PHP developer and I work in open source, nearly all the time. Derick Rethans 0:50 Mostly related to WordPress as far as I understand? Juliette Reinders Folmer 0:52 Nope, mostly related to actually CLI tools. Things like PHP Unit polyfills. Things like PHP Code Sniffer, PHP parallel Lint. I spend the majority of my time on CLI tools, and only a small portion of my time consulting on the things for WordPress, like keeping that cross version compatible. Derick Rethans 1:12 All right, very well. I actually did not know that. So I learned something new already. Juliette Reinders Folmer 1:16 Yeah, but it's nice. You give me the chance now to correct that image. Because I notice a lot of people see me in within the PHP world as the voice of WordPress and vice versa, by the way in WordPress world to see me as far as PHP. And in reality, I do completely different things. There is a perception bias there somewhere and which has slipped in. Derick Rethans 1:38 It's good to clear that up then. Juliette Reinders Folmer 1:39 Yeah, thank you. Derick Rethans 1:40 Let's have a chat about the RFC itself then. What is the problem that is RFC is trying to solve? Juliette Reinders Folmer 1:46 There was an RFC or 8.2 which has already been approved in October, which deprecates partially supported callables. Now for those people listening who do not know enough about that RFC, partially supported callables are callables which you can call via a function like call_user_func that which you can't assign to variable and then call as a variable. Sometimes you can call them just by using the syntax which you used for defining the callable, so not as variable but as the actual literal. Derick Rethans 2:20 And as an example here, that is, for example, static colon colon function name, for example. Juliette Reinders Folmer 2:26 Absolutely, yeah. Derick Rethans 2:27 Which you can use with call_user_func by having two array elements. You can call it with literal syntax, but you can't assign it to a variable and then call it. Do I get that, right? Juliette Reinders Folmer 2:36 Absolutely. That's it. There's eight of those. And basically, the original RFC from Nikita proposed to deprecate support for them in 8.2, add deprecation notices and remove support for them altogether in PHP nine. And the original RFC explicitly excluded two particular things from those deprecation notices. That's the callable type and using the syntaxes in combination with the is_callable function, where you're checking if the syntax is callable. The argument used in the original RFC was to keep those side effect free. The problem with this is that with the callable type, this means you go from absolutely no notice or nothing, to a fatal error in PHP 9. Everything works, and you're not getting any notification. But in PHP 9, its fatal error at the moment that callable is being passed to a function. Derick Rethans 3:31 This is the callable type in function declarations. Juliette Reinders Folmer 3:33 Yeah, absolutely. And with is_callable, I discovered a pattern in my wanderings across the world where people use the syntax in is_callable, but then use it in a literal call. So not using call_user_func, not using a variable to call it, but it's callable static double colon method name, and then called static double colon method name as literal. And that pattern basically, for valid calls would mean that that function would no longer be called in PHP 9 without any notification whatsoever. Derick Rethans 4:13 So it's a silent change that you can't detect at all. Juliette Reinders Folmer 4:17 Yeah, which to me sounded dangerous. I started asking some questions about that. But six weeks ago, the conclusion was, well, maybe this should be changed. But as this was explicit in the original RFC, we can't just change it. We need to have a new RFC to basically amend the original RFC and remove the exception for these two situations and allow them to throw deprecation notices. Derick Rethans 4:44 What are you proposing to change with this RFC than? Juliette Reinders Folmer 4:47 What this RFC is proposing is simply to remove the exception that the callable type and is_callable are not throwing a deprecation notice. This RFC is proposing that they should throw a deprecation notice, so that more of these type situations can be discovered in time for PHP 9 to prevent users getting fatal errors. Derick Rethans 5:08 Now, of course, we have no idea when PHP nine is actually showing up, but I don't think it will be this year. Well, I know it won't be this year, and it certainly won't be be next year neither, I think. Juliette Reinders Folmer 5:17 That's all the same. I mean, it makes there'll be two, three years ahead, but it doesn't really make sense to have the main deprecation in 8.2 and then have the additional deprecation in 8.4 or something. Derick Rethans 5:29 Absolutely. Juliette Reinders Folmer 5:30 It's a lot more logical to have it all in in the same version. Because it's all related. It's basically the same thing without the exception for callable type. And is_callable. Derick Rethans 5:42 Although there is no current application, would this be able to be found if you had like a comprehensive test suite? Juliette Reinders Folmer 5:48 Yes and no. Yes, you can find this with a test suite. But one, you're presuming that there are tests. Two, that the tests covered the effected code with enough path coverage. Three, imagine a test you've written yourself at some point in the past where which affected callables, you might have, you know, a data provider where you say: Okay, valid callable function, which you've mocked or, you know, closure, which you've put in and second, this function does not exist. Okay, so now you're testing this function, which at some point in its logic has a callable, and expects that type to receive that type. But are you actually testing with the specific deprecated partially supported callables? Even if you have a test, and the test covers the affected code, if you do not test with one of these eight syntaxes, which has been deprecated, you still cannot detect it. And then, four, you still need to make sure that the tests are routinely run, and in open source, that's generally not a problem. Most open source projects, use GitHub actions by now to run the tests automatically on every pull request, etc. But, have the tests been turned on to actually run against PHP 8.2. Are the tests run against pull requests? I mean, there are still plenty of projects, which don't do that kind of thing. Yes, you can detect it with a good test suite. But there's a lot of caveats when you will not detect it. And more importantly, you will not be able to detect it until PHP 9. Derick Rethans 7:23 Yes, when your code and stops behaving as you were expecting it to be. Juliette Reinders Folmer 7:28 Yeah, because in 8.2, you're gonna get deprecation notices for everything else, but these two situations. But not in 8.2, not in 8.3, not in 8.4, and then whatever eights we're gonna get until nine, you will not be able to detect without deprecation notices, until PHP 9 actually removes support for these partials deprecated callables. Yes, but no. Derick Rethans 7:53 We already touched a little bit on how you found out for the need for this RFC or for changing behaviours. But as people have stated in the past, adding deprecation notices is a BC break. That's a subject that w

  4. 03/24/2022

    PHP Internals News: Episode 100: Sealed Classes

    PHP Internals News: Episode 100: Sealed Classes Thursday, March 24th 2022, 09:04 GMT London, UK In this episode of "PHP Internals News" I talk with Saif Eddin Gmati (Website, Twitter, GitHub) about the "Sealed Classes" RFC that he has proposed. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP internals news, the podcast dedicated to explaining the latest developments in the PHP language. This is episode 100. Today I'm talking with Saif Eddin Gmati about the sealed classes RFC that they're proposing. Saif, would you please introduce yourself? Saif Eddin Gmati 0:31 Hello, my name is Saif Eddin Gmati. I work as a Senior programmer at Les-Tilleuls.coop. I'm an open source enthusiast and contributor. Derick Rethans 0:39 Let's dive straight into this RFC. What is the problem that you're trying to solve with it? Saif Eddin Gmati 0:43 Sealed classes just like enums and tagged unions allow developers to define their data models in a way where invalid state becomes less likely. It also eliminates the need to handle unknown subtypes for a specific model, as using sealed classes to define models gives us an idea on what child types would be available at run time. Sealing also provides us with a way for restricting inheritance or the use of a specific trait. For example, if we look at logger trait from the PSR log package that could be sealed to logger interface. This way, we ensure that every use of this trait is coming from a logger not from any other class. Derick Rethans 1:24 I'm just reading through this RFC tomorrow, again, and something I didn't pick up on reading to it last time. It states that PHP already has sort of two sealed classes. Unknown Speaker 1:35 Yes, the throwable class in PHP can only be implemented by extending either error or exception. The same applies for DateTime interface, which can only be implemented by extending DateTime class or DateTime Immutable class. Derick Rethans 1:52 Because PHP itself doesn't allow you to implement either throwable or DateTimeInterface. I haven't quite realized that that these are also sealed classes really. What is sort of the motivation behind wanting to introduce sealed classes? Unknown Speaker 2:06 The main motivation for this feature comes from Hack the programming language. Hack contains a lot of interesting type concepts that I think personally, PHP could benefit from and sealed classes is one of those concepts. Derick Rethans 2:18 What kind of syntax are you proposing? Saif Eddin Gmati 2:21 The syntax I'm proposing actually there is three syntax options for the RFC currently, but the main syntax is inspired by both Hack and Java. It's more similar to the syntax used in Java as Hack uses attributes. Personally, I have been I guess, using attributes from the start as I personally see sealing and finalizing similar as both effects how inheritance work for a specific class. Having sealed implemented as an attribute while final uses a keyword brings more inconsistency into the language which is why I have decided not to include attributes as a syntax option. Derick Rethans 2:56 In my opinion, attributes shouldn't be used for any kind of syntax things. What they should be used for is attaching information to already existing things. And by using attributes again, to extend syntax, you sort of putting this syntax parsing in two different places , right? You're putting it both in the syntax as well as in attributes. I asked what the syntax is, but I don't think he actually mentioned what the syntax is. Saif Eddin Gmati 3:20 The syntax the main set next proposed for the RFC is using sealed and permit as keywords we first have the sealed modifier which is added in front of the class similar to how final or abstract modifiers are used. We also have the permit clause which is basically a list allows you to name a specific classes that are able to inherit from this specific type. Derick Rethans 3:43 So when you say type here, is that just interfaces and classes or something else as well? Saif Eddin Gmati 3:48 It's classes interfaces and traits. Traits are allowed to add sealing but they are not allowed to permit. Okay for example, an interface is not allowed to permit a trait because a trait cannot implement an interface Derick Rethans 4:03 In the language itself, when does this get enforced? Saif Eddin Gmati 4:06 This inheritance restriction gets enforced when loading a class. So let's say we are loading Class A currently if this class extends B, we check if B is sealed. And if it is we check if B allows A to extend it. But when loading a specific sealed class, nothing gets actually checked. We just take the permit clause classes and store them and move on. Derick Rethans 4:32 It only gets checks if you're trying to implement an interface. Saif Eddin Gmati 4:36 This gets enforced when trying to implement an interface, extend that class, or use it trait. Derick Rethans 4:41 Okay. What are general use cases for this feature? Saif Eddin Gmati 4:45 General use cases for a feature are for example, implementing programming concepts such as Option which is a type that can only have two subtypes. One is Some, other is None. Another concept is the Result where only two subtypes are possible, either success or failure. Another use case is to restrict inheritance. As I mentioned before, for example, logger trait from the PSR log package is a trait that implements some of the method methods in logger interface, and expects whoever is using that trait to implement the rest. However, there is no restriction by the language regarding this, we can seal this trait to a logger interface ensuring that only loggers are allowed use this trait. Derick Rethans 5:34 When you say that Option has like the value Some or None, just sound like an enum to me. How should I think differently about enums and sealed classes here? Saif Eddin Gmati 5:43 Enums cannot hold a dynamic value. You can have a value but you cannot have a dynamic value, however, tagged unions will allow you to implement option the same way. Tagged unions are that useful only for this specific case, there is some other cases such as the one I mentioned for traits that cannot actually be implemented using the tagged unions. There is also the I don't know how to say this. Let's say we have a type A that sealed and permitting only B and C. And this case A on itself, as long as it's not an abstract class, is by itself a type. Can be used as a normal class, you can create an instance and use it normally. However with tagged unions, the option itself would not be a type, you either have some or none. That's the main difference between tagged unions until classes Derick Rethans 6:37 A tagged union PHP doesn't have them. So how does a tagged union relate to enums? Saif Eddin Gmati 6:43 With tagged unions as the, there is an RFC that's still in draft, I suppose that uses actually it is built on top of enums that that's why. Derick Rethans 6:55 I reckon once that gets closer to completion, I'll end up talking to the author of that RFC. So something I'm wondering, can a sealed type permit only one other type? Or does it have to be more than one? Saif Eddin Gmati 7:10 No, it can permit only one type. Let's say we have class A that only permits B. However, another thing is class B does not actually have to extend A, like if A is permitting B, B does not actually have to implement A. It's still useful because another class called C can extend B and implement A, so an instance of A B can still exists. Derick Rethans 7:36 I'm not quite sure whether I understood that. If you have an interface that says A permits B, then B is not required to implement A, mostly because the moment you loads class B, you don't even know it exists, right? Because it doesn't refer to it. Saif Eddin Gmati 7:54 Yes. Derick Rethans 7:55 It's just going to break anything? Saif Eddin Gmati 7:57 Hopefully not. The only break would be in the new reserved keywords which are sealed and permits. So those cannot be used as identifiers any more, but depending on the syntax choice, if for example, the second syntax choice wins which that would only take the permits keyword. If the third syntax choice is chosen then no new reserved keywords will be introduced so there will be no breaks. Deri

  5. 03/10/2022

    PHP Internals News: Episode 99: Allow Null and False as Standalone Types

    PHP Internals News: Episode 99: Allow Null and False as Standalone Types Thursday, March 10th 2022, 09:04 GMT London, UK In this episode of "PHP Internals News" I talk with George Peter Banyard (Website, Twitter, GitHub, GitLab) about the "Allow Null and False as Standalone Types" RFC that he has proposed. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:15 Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explain the latest developments in the PHP language. This is episode 99. Today I'm talking with George Peter Banyard, about the Allow null and false at standalone types RFC that he's proposing. Hello, George Peter, would you please introduce yourself? George Peter Banyard 0:36 Hello, my name is George Peter Banyard. I work on the PHP language, and I'm an Imperial student in maths in my free time. Derick Rethans 0:44 Are you're trying to say you're a student in your free time or contribute to PHP in your free time? George Peter Banyard 0:49 I feel like at this time, it's like, both are true at the same time. Derick Rethans 0:53 Let's hop into this RFC. It is titled allow null and false as standalone types. What is the problem that it is trying to solve? George Peter Banyard 1:02 This is the second iteration of this RFC. So the first one was to just allow null initially, and null is the unit type In type theory parlance of PHP, ie the type which only has one value. So null is a type and a value. And the main issue is that when for leads more with like inhabitants, and like the Liskov substitution principle. If you have like a method, like the parent method, which can be told like either null or an object, and your implementation in a child class always returns null, for various reasons, maybe because it doesn't support this feature, or whatever is out, or... If your child method only returns null, currently, you can't document, that you can't type this properly, you can document it in a doc comment or something like that. But due to how PHP type handling works, you need to specify at least like another type with null in the union. Basically resort to always saying like mimicking the parent signature, when you could be more specific. This was the main use case I initially went into. Derick Rethans 2:08 If I understand correctly, you can't just have an inherited method that has hinted as to just return null? George Peter Banyard 2:14 Exactly. If you always return null, maybe because you always work or something like that, then you must still declare the return type as like null or exception, which is not a concrete because you say what, like why never fail. And like static analysers, if they can figure it out that you're using a child class, they can't maybe like do some assumptions or work further down that like what you're doing is redundant or things like that. So that's one of the main reasons I initially went with it. And I didn't add false initially, because it was like, well, false, it's not really a type properly. It's, it's what's called a value type. False is one value from the Boolean type. And I was like, Well, okay, we're just going to limit it to like, being the type theory purist, limited to proper types, where null is a proper type, although it's a bit sometimes misunderstood, I feel in the PHP community at large. And then people were like, well, if we add null, then by the only type-ish thing, which you can use in a type declaration, or whatever, which can't be used in a return type on its own, is false. And it's just weird. So why not add it in full. So that was the second thing as to why I added it. Some of PHP internal's functions being terribly designed because they were designed back in the early noughties, return null on success and false on failure, which you can't probably type at the moment. Currently, we need to type them as like Boolean or null, but true can never be returned in this case. And there are some other some other people have reached out to me it's like, well, yeah, but I always return false in this case. Or I also return always true in this case, although true, we have this weird asymmetry that we have false as a value type and not true. Derick Rethans 3:49 What was the reason for having false but not true? George Peter Banyard 3:53 When the union type RFC got discussed and passed for PHP 8.0, false was added, because a lot of traditional behaviour of PHP internal functions, was to return false on failure, instead of the technically more correct thing would be to return null. Because loads of functions return a false on failure, and saying that like in returns, these types, or a Boolean would be basically lying because you could never have true, false was included in it. With the restrictions that you can only use false as the complement with other types. So you need to do for example, array, or false, you couldn't just use false. Derick Rethans 4:37 Would it also mean that you can define a return type of a method that inherited a method that returns a bool, as false? George Peter Banyard 4:48 Yes, that would be now possible with the amended proposal. Yeah, which goes back to this weird a symmetry, we're probably. Adding true to make a complete would be a future RFC to do. Derick Rethans 5:00 Now, we've talked about return types. But I guess the reverse applies to arguments? George Peter Banyard 5:06 Arguments and property types also would, would be allowed to, like declare themselves as like null or false. The usefulness here is way more limited. Because if you declare an argument to be of type null, then basically you can only ever pass a null to it. And then therefore, the type doesn't do anything. Derick Rethans 5:26 But in an inherited method, you could then widen it. George Peter Banyard 5:31 Yes, exactly. You could always say: Well, this argument exists, it's always null. If you extend like your class or message, then you can add other types. But in theory, you can already do that by adding like an argument at the end of the message, because that's LSP compliant. The case for, and properties of those, because they are typing, they're in like their beads. Kind of debatable why you would do that. But it's just that like, well, if you accept types at one point, just restricting them like somewhere else gets very weird. At this point is more like look at the human review, or like use static analysis for the analyser to tell you like this argument is redundant and just remove it or this property doesn't make any sense. Because if it can only ever be null, why does it even exist in the first place? Derick Rethans 6:13 Right now, you can already use false in union types, but why not with null or false? George Peter Banyard 6:19 That goes back to the when a union type RFC got introduced. Null got added as a keyword. Before you could only use the question mark, before a type to make the type nullable. If you have a more complex union type, to not use the question mark in front of it. Therefore, the null keyword got added as a proper type. And because the logic was, Well, you shouldn't ever be able to return just null. Because then that function is kind of equivalent to void. Because of that, it was said that like, Well, okay, null and false basically have like kind of the same status is that like, if you just want to use null on its own, you're doing something kind of weird. And if you're returning more than false, like that signature is very strange. I think when that was discussed, nobody knew initially that an actual PHP function within one of the extensions, like in core had such a weird signature. Which mainly, we just started discovering that after this got, like accepted and we could like actually start properly typing the internal functions, and then you discover these weird edge cases where sounds like, that's a bit strange, can't properly document it. We just need to make like a note on the PHP documentation side. And like the type signature kind of lies to you. PHP's type hierarchy is a bit strange, void kind of lives on its own. So if the function is marked as void, it must always like any child inheritance, or whatever needs to be void. And when you type return in the function body, you need to always use return with like a semicolon afterwards, you can't even return null. Although, under the hood, PHP will always return a value when you call a function, even if the function is void, which will be null. Derick Rethans 7:58 The RFC also talks about question mark null, what is that supposed to be? Is that null or null? George Peter Banyard 8:03 PHP has this limited type redundancy checks at compile time. It will basically check if you're duplicating types. So if you write for example, int or int, even if it's capitalized differently, PHP was like, okay, just

  6. 03/03/2022

    PHP Internals News: Episode 98: Deprecating utf8_encode and utf8_decode

    PHP Internals News: Episode 98: Deprecating utf8_encode and utf8_decode Thursday, March 3rd 2022, 09:02 GMT London, UK In this episode of "PHP Internals News" I chat with Rowan Tommins (GitHub, Website, Twitter) about the "Deprecate and Remove utf8_encode and utf8_decode" RFC. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP Internals News, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 98. Today I'm talking with Rowan Tommins about the "Deprecate and remove UTF8_encode and UTF8_decode" RFC that he's proposing. Hi, Rowan, would you please introduce yourself? Rowan Tommins 0:38 Hi, I'm Rowan Tommins. I'm a PHP software architect by day and try and contribute back to the community and have been hanging around in the internals mailing list for about 10 years and contributed to make the language better, where I can. Derick Rethans 0:57 Excellent. Yeah, that's how I started out as well, many, many more years before that, to be honest. This RFC, what problem is this trying to solve? Rowan Tommins 1:08 PHP has these two functions, utf8_encode and utf8_decode, which, in themselves, they're not broken. They do what they are designed to do. But they are very frequently misunderstood. Mostly because of their name. And because Character Encodings in general, are not very well understood. People use them wrong, and end up getting in all sorts of pickles that are worse than if the functions weren't there in first place. Derick Rethans 1:37 What are you proposing with the RFC then? Rowan Tommins 1:39 Fundamentally, I'm proposing to remove the functions. As of PHP 8.2, there will be a deprecation notice whenever you use them, and then in 9.0, they would be gone forever, and you wouldn't be able to use them by mistake, because they just wouldn't be there. Derick Rethans 1:56 I reckon there's going to be a way to actually do what people originally intended to do with it at some point, right? Rowan Tommins 2:02 So yeah, there are alternatives to these functions, which are much clearer in what you're doing, and much more flexible in what you can do with them so that they cover the cases that these functions sound like they're going to do, but don't actually do when you understand what they're really doing. Derick Rethans 2:20 I think we'll get back to that a little bit later on. You're wanting to deprecate these functions. But what do these functions actually do? Rowan Tommins 2:27 What they actually do is convert between a character encoding called Latin-1, ISO 8859-1, and UTF-8. So utf8_encode converts from Latin-1 into UTF-8, utf8_decode does the opposite. And that's all they do. Their names make it sound like they're some kind of fix all the UTF 8 things in my text. But they are actually just these one very specific conversion, which is occasionally useful, but not clear from their names. Derick Rethans 3:01 It's certainly how I have seen it used in the past, where people just throw everything and the kitchen sink at it, and expecting it to be valid UTF 8, and then at the end, decode. I mean, the decoding was not even part much of this, right? It's just throw everything at it, and then magically it will all be UTF 8. But I reckon that's not really quite the case. When and how does that go wrong? Rowan Tommins 3:26 So what actually ends up happening is, because text doesn't know what encoding it's in. Something that people misunderstand about character encoding is they think it's like, the text is a certain colour, and the computer knows what colour it is. And if you tell the computer to make it a different colour, then it will work. But it's not like that. In the computer, there's just the sequence of binary. And the encoding is how to read that binary as text. And if you tell the computer to read it as Latin 1, it will read it as Latin 1. If you take to convert from Latin 1 to UTF 8, it will assume the input is Latin 1, it will convert to UTF 8 on that basis. If your text actually wasn't Latin 1 in the first place, you're just going to end up with garbage. And some of the worst cases of that is when you already have UTF 8, and then you run utf8_encode on it, because the language doesn't know that you've already got UTF 8, so it tries to read its Latin 1, write it out ass UTF 8 and you get this weird Mojibake. I don't know pronouncing that right. Derick Rethans 4:27 I think it's pronounced Mojibake. Rowan Tommins 4:30 Mojibake. Derick Rethans 4:31 It's a Japanese term, because clearly these things, these issues happened with Japanese text quite a lot because they have a lot more different and difficult characters and encodings as well. With which things often go wrong though? Rowan Tommins 4:44 Using an unco on text that's already UTF 8 is obviously a big one. Usually obvious, but occasionally people just getting a muddle with that. The other thing that often happens is confusing with similar encoding. Latin 1 is often mistaken for a different coding windows 1252. To the extent that web pages labelled as Latin 1, web browsers will assume that they're actually in Windows 1252. These PHP functions don't make that assumption. If your text is actually in Windows 1252, and it's been mislabelled Latin 1, you might still think you're doing the right thing. So I've got Latin 1 text, but you haven't. And then the characters that are different, are going to get mangled again. And there's a few other related encodings that often look the same. There are a few other encodings that look the same at a glance that again, will go wrong on any character that's different between the different encodings. Derick Rethans 5:43 How could a function tell which encoding a certain text was in? Rowan Tommins 5:49 It's tricky. There are libraries out there that try to do it. Some encodings that are sequences of bits that aren't a valid character. So if any of those appear, it's definitely not in that encoding. Unfortunately, a lot of encodings, every pattern of bits has a meaning. It's just not necessarily mean. So you can't look at the string and just tell at a glance. The only way I've seen that does it effectively, is trying to guess based on what language text it might be in. If your text suddenly has a load of symbols in the middle of sentences, you're probably using the wrong encoding. If it's suddenly got a load of capital letters, in the middle of words, you're probably using the wrong encoding. So you can make guesses like that, that ultimately, there are only ever guesses. Derick Rethans 6:38 It's only always going to be a guess, right? You can't really tell for certain what it it is, which I've seen people assume that she can just tell. We have concluded that utf8_encode and decode don't actually do what they say they don't magically encode everything to UTF 8. What if things go wrong? How are errors handled? Rowan Tommins 6:58 If you're converting from Latin 1 into UTF 8, there Latin 1 covers all 256 possible eight bit binary strings. Those will correspond directly to a single mapping in Unicode and therefore in UTF 8. So there are no errors as such, when that happens, but it might not be what you want. One of the most notable ones that's different between these encodings is Latin 1 was standardized in 1985, the Euro didn't exist, then. The euro symbol doesn't have an encoding in Latin 1. If you've got a euro sign, you haven't got Latin 1 text, but you might think you've got Latin 1 text, and it will just encode it to what to a control character, which is where the windows 1252 code page puts the euro symbol, it replaces some control characters in Latin 1. One of the reasons why these character encodings are so easily confused is they've all nicely built to being compatible on top of each other. Latin 1 is deliberately an extension of ASCII. Windows 1252 is deliberately an extension of Latin 1, replacing some control characters. UTF 8 is also based on Latin 1, the first section of Unicode is actually the Latin 1, characters UTF 8 will encode and slightly differently so that it can carry on above 256. So in that direction, you can't actually get an error, you could just get a string, that doesn't make sense. Going back the other way. Unicode has, I think, potentially 11 million or something, and actually, at least a million assigned code points. Latin 1 only has 256. So you can't map all those back. And this function, the utf8_decode just replaces any that it can't match with the question mark. Similarly, if the input string isn't valid UTF 8. Again, if you've just misunderstood what strings doing and you haven't actually got a UTF 8 string in the first place, any sequence that doesn't look like valid UTF 8, again, just gets replaced with a question mark. Completely silently

  7. 01/27/2022

    PHP Internals News: Episode 97: Redacting Parameters

    PHP Internals News: Episode 97: Redacting Parameters Thursday, January 27th 2022, 09:09 GMT London, UK In this episode of "PHP Internals News" I chat with Tim Düsterhus (GitHub) about the "Redacting Parameters in Back Traces" RFC. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:00 Before we start with this episode, I want to apologize for the bad audio quality. Instead of using my nice mic I managed to use to one built into my computer. I hope you'll still enjoy the episode. Derick Rethans 0:30 Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 97. Today I'm talking with Tim Düsterhus about Redacting Parameters in Backtraces RFC that he's proposing. Tim, would you please introduce yourself? Tim Düsterhus 0:50 Hi, Derick, thank you for inviting me. I am Tim Düsterhus, and I'm a developer at WoltLab. We are building a web application suite for you to build online communities. Derick Rethans 0:59 Thanks for coming on this morning. What is the problem that you're trying to solve with this RFC? Tim Düsterhus 1:05 If everything is going well, we don't need this RFC. But errors can and will happen and our application might encounter some exceptional situation, maybe some request to an external service fails. And so the application throws an error, this exception will bubble up a stack trace and either be caught, or go into a global exception handler. And then basically, in both cases, the exception will be logged into the error log. If it can be handled, we want to make the admin side aware of the issues so they can maybe fix their networking. If it is unable to be handled because of a programming error, we need to log it as well to fix the bug. In our case, we have the exception in the error log. And what happens next? In our case, we have many, many lay person administrators that run a community for their hobby, they're not really programmers with no technical expertise. And we also have a strong customers help customers environment. What do those customers do? They grab their error log and post it within our forums in public. Now in our forum, we have the error log with the full stack trace, including all sensitive values, maybe user passwords, if the Authentication Service failed, or something else, that should not really happen. In our case, it's lay person administrators. But I'm also seeing that experienced developers can make this mistake. I am triaging issues with an open source software written in C. And I've sometimes seeing system administrators posting their full core dump, including their TLS certificates there, and they don't really realize what they have just done. That's really an issue that affects laypersons, and professional administrators the same. In our case, our application attempts to strip those sensitive information from this backtrace. We have a custom exception handler that scans the full stack face, tries to match up class names and method names e.g. the PDO constructor to scrub the database password. And now recently, we have extended this stripping to also strip anything from parameters that are called password, secret, or something like that. That mostly works well. But in any case, this exception handler will miss sensitive information because it needs to basically guess what parameters are sensitive values and which don't. And also our exception handler grew very complex because to match up those parameters, it needs to use reflection. And any failures within the exception handler cannot really be recovered from, if the exception handler fails, you're out of luck. Derick Rethans 3:51 Quite a few things to think of to make sure that you're not sharing any secrets. And I certainly have seen almost doing this myself. We now know what the problem is. How is this RFC proposing to fix this? Tim Düsterhus 4:03 Primarily, we want to propose a standardized way for applications or libraries to indicate which parameters hold sensitive values. Our custom exception handler uses reflection as we said before, and it only matches up the parameter's names, but we also have this attribute I am proposing, SensitiveParameter within our application itself. Any parameter names that are not definitely sensitive can be attributed with this attribute. But this only works within our software, but not with any third party libraries we are using, e.g. for encryption or whatever there is. Primarily we want to propose a standardized way an attribute that is in PHP core, anyone can use that and everyone knows what this attribute means. Secondarily, the RFC is proposing a default implementation to keep the exception handler simple. As I said before, we are using reflection. This is very complex, it does not work with the require_once or include_once family, because that are not functions. We need to handle this case to not try to attempt to reflect on those non functions when redacting any parameters. This is complex. And we want to simplify that. Derick Rethans 5:20 From what I understand this is then a way to make sure that there's a standardized method for marking arguments as being sensitive. And because this is that now standardized, only one solution to the problem has to be found right? Tim Düsterhus 5:34 Basically, not every library is using their own attributes, possibly, or we can match parameter names that are not like password, secret, but it can be documented: hey, if you are using sensitive parameters, you should put this attribute and then those exception handlers will be aware that this attribute is sensitive and can strip it, or in case of the RFC PHP itself, will already strip those parameters from the stack trace. Derick Rethans 6:04 You're suggesting that PHP standard way of showing stack traces also takes care of the sensitive parameter here? Tim Düsterhus 6:11 Yes, exactly. Derick Rethans 6:13 Which internal PHP functions are likely to get this attribute? Tim Düsterhus 6:16 Basically anything with a parameter called password or secret, as I said before, examples include PDO's constructor, the database password will be in there and possibly also the user name or host name, which might be considered sensitive. But the password is the most important thing I have on my list. ldap_bind, which possibly includes user passwords; the password_hash function; possibly various OpenSSL functions. One will need to look and this list can be extended in the future as well, if someone realizes we missed anything. Derick Rethans 6:55 Now, I know sometimes that there's a problem where an application connects to the wrong server with PDO. And as you say, the host name was also in this PDO constructor, would it not then make debugging that specific case harder because the hostname would also be redacted from the stack traces? Tim Düsterhus 7:14 The attribute I am proposing as the parameter attribute, each parameter can be sensitive or non sensitive. We would need to decide whether we consider the hostname sensitive or not. It usually is not. So I would not put the attribute on the host name, or on the DSN string in the first parameter. The password definitely is sensitive. And the username possibly is a grey area. By default, I probably would not put the attribute there. But this is something that needs to be discussed in the greater community possibly. Derick Rethans 7:47 I saw in the RFC that when you request a stack trace in PHP with get back trace or whatever the name of this function is, is that the sensitive parameters are being replaced by an object of the class SensitiveParameter. Why did you pick that instead of just a string, saying something like "redacted". Tim Düsterhus 8:06 We cannot force users to put the attribute only on parameters that take strings. If we use a redacted string we might violate the type hint. If a function takes some key pair class, or an option of a key pair class, this usually is a sensitive attribute, we cannot simply put a string there. We can but then we would violate the typing. And as we violate the typing in at least some of the cases, we can also violate it in all of the cases and then make it very clear that this parameter was redacted and not a real value that just looks like a string "redacted". Exception handlers would be able to use an instanceof SensitiveParameter check to possibly make it more user friendly when they render the stack trace. When you using an GUI to handle your exceptions as such a Sentry can show some placeholder instead of pretending it's a real string in there. Derick Rethans 8:07 And of course, the string "redacted" can already exist as an argument value yet anyway, right? Tim Düsterhus 9:12 Yeah. Derick Rethans 9:13 Where would attribute be checked?

  8. 12/16/2021

    PHP Internals News: Episode 96: User Defined Operator Overloads

    PHP Internals News: Episode 96: User Defined Operator Overloads Thursday, December 16th 2021, 09:24 GMT London, UK In this episode of "PHP Internals News" I chat with Jordan LeDoux (GitHub) about the "User Defined Operator Overloads" RFC. The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode's MP3 file, and it's available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news Transcript Derick Rethans 0:14 Hi, I'm Derick. Welcome to PHP internals news, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 96. Today I'm talking with Jordan, about a user defined operator overloads RFC that he's proposing. Jordan, would you please introduce yourself? Jordan LeDoux 0:33 My name is Jordan LeDoux. I've been working in PHP for quite a while now. This is the second time I have ventured to propose an RFC. Derick Rethans 0:44 What was the first one? Jordan LeDoux 0:45 The first one was the "never for parameter types", which was much more exploratory. And we talked about it a little bit. And it generated a lot of good discussion that contributed to kind of the idea formation, which was what I hope to get out of it. Derick Rethans 1:01 Okay, but that didn't end up making it into a PHP release. As far as I understand, right? Jordan LeDoux 1:07 No, I withdrew it actually, it was clear that the better way to approach the problem it was trying to solve was with a much more comprehensive solution. That particular solution was something that only required a seven line change to the engine. So I wanted to see if it was something people were okay with, or thought was a decent idea for that particular problem, much more comprehensive, like template classes, or something like that is probably the better route to go. Derick Rethans 1:35 Well, I think the RFC that we're talking about today, is going to require quite a bit more than seven lines of code? Jordan LeDoux 1:41 Quite a bit more. Yeah. Derick Rethans 1:42 So what is this RFC that we're talking about today? Jordan LeDoux 1:45 Well, user defined operator overloads is a way for PHP developers to define the ways in which objects interact with specific operators. So for instance, the plus operator, the plus sign. It's a way for those objects to kind of define their own logic as far as how that's handled, which right now, as of PHP 8.0, those were all switched to type errors. So it's not possible currently to write any code that doesn't result in a fatal error, where objects are used with operators. Derick Rethans 2:25 Usually, I ask about every RFC, what problem are you trying to solve this? So what problem are you trying to solve this RFC? Jordan LeDoux 2:31 The biggest problem that this solves is that objects contain, so objects in most programs represent a value or multiple values that have a program context. That's the most powerful thing about objects is they're contextual, and they understand the state, they understand what state the object is in, and sometimes even what state the whole program is in. And that's necessary for a lot of things. Like for instance, if you're tracking a distance, you know, you might measure that meters, and that would have a number you might have 30 meters of distance, but it also has a unit of meters. You could just represent that as an int. And then the program just knows internally, hey this is always in meters. But if you need to convert that to a different unit, then that becomes: Okay, well, now I need a special case some things, or I need a function just for converting, and I need to remember which unit my number is in. In a lot of cases, you handle that with objects because objects understand state, and they understand state transitions, which is what a lot of methods are about; transitioning the state of the object from one state to another. Operators are also about state transitions. And they're about very specific kinds of state transitions. It's natural in a lot of ways to think that you, you should be able to define how those two things interact. But currently, it's just not possible within PHP. Derick Rethans 4:00 Well, does them this magic operator overloading? Jordan LeDoux 4:04 It allows PHP developers to define an implementation logic, which is much like you define a function body that describes how does this object interact with this operator. That's essentially it. There's a lot of other details as to how it does that and what are the restrictions, but that's really the core of the idea. Derick Rethans 4:26 And in what kind of situations would you use that? Jordan LeDoux 4:28 A lot of them are situations where you're doing very complicated mathematics, or scientific computing or machine learning or things of that nature, where you are going to routinely encounter numbers that have state to them or that have multiple dimensions to them. So for instance, vector mathematics is one where the way that vectors interact with a lot of the operators that we're familiar with, like the multiplication sign is very different than how the number five interacts with the multiplication sign. Complex numbers is another one, you know, to multiply two complex numbers together, you have to treat it like a polynomial where you're multiplying it with the FOIL method: first, outside, inside, last. You know, there's a lot of those sorts of circumstances. But it also could potentially be very useful for some things that are not really mathematical but more quality of life for PHP developers. For instance, scalar objects is something that a lot of developers in PHP have, you know, wanted for a while. It's a thing that's a little more difficult to pin down, how exactly would you go about doing this within the engine, and it's a thing that the engine would kind of have to be very opinionated about by its nature. PHP developers can't provide their own scalar objects. And the main reason for this is that scalars interact with operators and objects can't. So simply allowing PHP developers to define a way for objects to interact with operators would allow user land to develop their own scalar object replacements. It wouldn't make every scalar that object; scalar objects within the engine still has, it's a separate feature. And it's still a thing that would be desirable, probably to a lot of people. But it gets quite a bit of the way there. Derick Rethans 6:20 It is always interesting that people come up with the example of complex numbers, because I'm not sure how useful that is in a PHP user land context. And then beyond the scalars, I then sometimes struggle to see where this could be used. With the only exception is probably doing calculations with money related issues. The moment you bring up operator overloading, you'll also get people to say that this is going to get abused. Examples of that, in my opinion at least, is where in C++ you have like the operator to put things into the stream and stuff like that. What answer would you have to kind of comments? Jordan LeDoux 6:58 Abuse of operator overloads to do things that can create unmaintainable code, because that's really the concern for developers is, does a language feature promote code that's difficult to maintain, that's difficult to understand, that's difficult to follow, and develop, and you know, work with. The RFC, the way that I've gone about this implementation, has had that in mind, because I also have experienced that. This is not a thing where I coming down from the academic high tower with, you know, whatever my my concept of this is, and no, no real world experience with these things. I share a lot of those concerns. Actually, I think this is a very useful feature that has a lot of applications I've encountered. I have had to work with matrix maths, I have had to work with complex numbers, I've had to work with arbitrary precision numbers, and all of those situations would have been served so much better by having operator overloads. I was fighting with the language the entire time, I was trying to do those. But I understand you know, in a lot of web applications, those are not common problems to encounter. My experience of that isn't typical. The thing about the way that it's done is it tries to head off a lot of the ways that it could be misused. An example of that is that the RFC requires typing of the parameters. You can't define an operator method and leave the types blank. If you do, then you get a fatal error during compile. It tells you you must explicitly define a type. And the reason for this is that blank types are assumed to be mixed. So it's the same as putting mixed for the type within the engine. And a mixed type says I can take anything, it doesn't matter what you give me, I can take anything. But that simply isn't true for operators. It's never true. Because even if you think hey, I can accept floats, ints, I can accept any objects, I can figure something out with them. You know, even if you think that's true, what happens when somebody passes you a

Ratings & Reviews

4.9
out of 5
7 Ratings

About

This is the 'PHP Internals News' podcast, where we discuss the latest PHP news, implementations, and issues with PHP internals developers and other guests.

To listen to explicit episodes, sign in.

Stay up to date with this show

Sign in or sign up to follow shows, save episodes, and get the latest updates.

Select a country or region

Africa, Middle East, and India

Asia Pacific

Europe

Latin America and the Caribbean

The United States and Canada