From fulop@inf.u-szeged.hu Tue Apr 12 18:05:03 2005 Date: Tue, 12 Apr 2005 16:53:51 +0200 From: Fülöp Zoltán To: scamel@elte.hu, gsd@elte.hu Cc: Fülöp Zoltán Subject: Family polymorphisms ... Tisztelt Szerzok! Csatolva kuldom a biralatot a Family polymorphisms in Java c. cikkukrol, amelyet az Actca Cyberneticahoz nyujtottak be kozlesre. Sajnos a cikket a jelenlegi formajaban nem tudjuk kozlesre elfogadni. Kerem, hogy sziveskedjenek a cikket a biralatnak megfeleloen atdolgozni es elkuldeni aprilis 28-ig. Amennyiben a biralo elegedett lesz a javitassal, a javitott valtozatot el tudjuk fogadni kozlesre. Udvozlettel, Fulop Zoltan -- ++++++++++++++++++++++++++++++++++++++++++ Prof. Dr. Zoltán Fülöp University of Szeged Department of Foundations of Computer Science H-6701 Szeged POB 652 tel. +36-62-544302 fax. +36-62-546397 fulop@inf.u-szeged.hu http://www.inf.u-szeged.hu/~fulop [ Part 2: "Attached Text" ] --------------------------------- Paper review - Acta Cybernetica --------------------------------- Title: Family Polymorphism in Java Authors: I. Zolyomi and Z. Porkolab Result: Acceptance only recommended after a non-trivial revision Summary: This paper describes an approach to family polymorphism which is based on generics in Java 1.5. It uses a "marker" type argument to produce multiple type-distinct versions of each class in a family of classes, thus providing a separate class family for each marker type. Overall evaluation: The approach in this paper does not provide true family polymorphism; in fact it is a much less powerful variant of an approach which has been known at least since 1998, which could be described as 'family monomorphism'. Hence, these older results should be studied and considered, and the contribution of this paper should be reworked such that it is in fact new and useful. Maybe the approach using marker type arguments is a nice shortcut which may be used in some cases where the 1998 approach is not needed. Comments: The paper is well-structured and easy to read, but it contains a number of elements which are not treated with sufficient clarity, or which reveal genuine misunderstandings. Because of this, the paper should be revised before acceptance. In the following the main problems are described: Figure 1 shows two small inheritance hierarchies, and from the context it would be reasonable for the reader to assume that they are intended to illustrate family polymorphism at work. However, they actually illustrate the typical main-stream 'solution' to a problem which could be addressed using family polymorphism (or even family monomorphism, which will be described below). As it stands, it is a simply a standard covariance problem, in that subclasses are related to subclasses (like Horse/Grass), but the type system insists that subclasses are related to general ones (like Horse/Food), and this calls for dynamic checks as described. In order to actually consider family polymorphism in this context it would make sense to describe the 'family polymorphic' version of this design, where the covariance problem disappears. This would be something like the following: class Eating { virtual class Animal { abstract void eat(Food); } virtual class Food { ... } } class HerbivoreEating extends Eating { furtherbound class Animal { void eat(Food) {...} } furtherbound class Food { ... } } class CarnivoreEating extends Eating { furtherbound class Animal { void eat(Food) {...} } furtherbound class Food { void eat(Food) {...} } } The family contains two classes, Animal and Food. Derived families will inevitably contain classes with the same names, because derived families are created by putting additional constraints on the classes in the family ("here, this class [in the example: Animal] must also be a subclass of blahblah [in the example: some Horse-related stuff]"). If we want to use different names for the members of a derived family we can introduce aliases: class HerbivoreEating extends Eating { furtherbound class Animal { void eat(Food) {...} } furtherbound class Food { ... } alias Horse = Animal; alias Grass = Food; } The point in family polymorphism is that it is now safe to use the classes of a family together, no matter what family it is: void feed(final Eating E, E.Animal a, E.Food f) { a.eat(f); // safe, and statically checked } We could then call this method as follows: final HerbivoreEating HE = new HerbivoreEating(); feed(HE, new HE.Horse(), new HE.Grass()); or using polymorphism already before the call: final Eating E = new HerbivoreEating(); feed(E, new HE.Animal(), new HE.Food()); We can't use the derived-family-specific names like 'Horse' here, because it might as well have been a CarnivoreEating family or some other family derived from Eating. Of course, we could introduce 'import' like statements to avoid mentioning the family everywhere: /* This is a Java+Fam.Pol source code file */ import java.lang.util.*; // ordinary imports here as needed family Eating E = new HerbivoreEating(); import E.*; // the rest of the file can now use the classes Animal and Food, // which will refer to E.Animal and E.Food. class Foo { ... void bar(Animal a, Food f) { a.eat(f); } ... } ... /* end of file */ The binding of E to the new instance of HerbivoreEating might take place in another file, in some configuration or deployment setting, or whatever. The point is that a family object could be treated like a package (with the additional ability to hold state). So, the family polymorphism approach gives rise to a quite different inheritance hierarchy (for which there is no standard UML notation, of course): ------------------------ | Eating | | -------- ------ | | | Animal | | Food | | | | is | | is | | | | ...... | | .... | | | -------- ------ | ------------------------ ^ | ---------------------------------- | | ------------------------ ------------------------ | HerbivoreEating | | CarnivoreEating | | -------- ------ | | -------- ------ | | | Animal | | Food | | | | Animal | | Food | | | | is | | is | | | | is | | is | | | | also | | also | | | | also | | also | | | | ...... | | .... | | | | ...... | | .... | | | -------- ------ | | -------- ------ | ------------------------ ------------------------ If you only present the two separate hierarchies of Fig.1 to the reader, (s)he will have no chance to know how this would actually be handled using family polymorphism. Typo: On page 4, in Lion.eat, the if-condition is upside down: You should throw the exception when 'f instanceof Meat' is false, not true. Type: On page 5, method Animal.eat, there is no parameter name, but an ellipsis indicates that there is a method body. Please add an argument name. Thinko: On page 5, you state "Trust in parameter, do not check" in the method Lion.eat. Unless you describe in what way you want to modify the language, the reader must assume that you intend this to be Java code, and the Java compiler would not let you trust any such thing without checking. LaTeX problem: On page 7, several occurrences of type applications (the name of a parameterized class + some type arguments) contain upside-down exclamation/question marks, probably because LaTeX requires using a source code font (\texttt{..}) or something like that in order to show '<' and '>' as themselves. In Sect.4 you describe an approach based on marker classes. It is true that you can use marker classes as type arguments in order to create any number of distinct "copies" of the classes, and they will be able to refer to each other, such that Animal can only be used together with Food, etc. However, this is a very incomplete solution to the problem of expressing and managing class families. The first problem is that it does not help you express the types of other members of the family in a way which can be adjusted when you create derived families. Assume (not unreasonably) that you want different interfaces to the kind of Food which is Grass and the kind which is Meat. Just assume that we want a method 'isGreen' on Grass and 'isFresh' on Meat. In this case a Horse would probably want to choose grass which is green and the Lion would prefer meat which is fresh, and this means that the derived Animal classes would need to know about the extended interfaces of the derived Food classes, as in // in Carnivore.Animal void eat(Food f) { if (f.isFresh()) ... } // in Herbivore.Animal void eat(Food f) { if (f.isGreen()) ... } Your marker classes would not help you here. In fact, there is a well-known solution to this problem, see for instance @InProceedings{BruceOderskyWadler98, author = "Kim B. Bruce and Martin Odersky and Philip Wadler", title = "A Statically Safe Alternative to Virtual Types", volume = "1445", series = "Lecture Notes in Computer Science", pages = "523--549", booktitle = "ecoop98", address = "Brussels, Belgium", publisher = "Springer-Verlag", year = "1998", month = jul, } (note that those people stopped claiming that virtual types are unsafe several years ago). The solution is to use type arguments in a different manner, not just as markers but actually to describe the members of the family, in this case Animal and Food: class Animal, F extends Food> { void eat(F f) { ... } } class Food, F extends Food> { ... } The type variables A and F stand for Animal and Food, respectively, and the type arguments must be repeated when using Animal and Food because those classes are parameterized in the first place. So the whole things gets rather verbose, but it works. This makes it possible to _update_ the types of the class family members in derived families: class Horse, F extends Grass> extends Animal { ... } class Grass, F extends Grass> extends Food { ... } class Lion, F extends Meat> extends Animal { ... } class Meat, F extends Meat> extends Food { ... } Now we must instantiate these families in order to use them, and this is done by taking a fixed point (which motivates the 'F' in the name): class HorseF extends Horse {} class GrassF extends Grass {} class LionF extends Lion {} class MeatF extends Meat {} With this setup it is possible for each of the class family members to use the extended interfaces of the other members, e.g., because the 'eat' method in Horse knows that the argument of type F is a subclass of Grass, i.e., it will have 'isGreen()' and whatever else we know about grass. This is a natural and useful extension of your approach using marker type arguments, but it is also well-known since 1998, so you should certainly know about it, and you ought to create something which is a little bit better than that, rather than a much less capable version of it - which is what you have presented in the current version of the paper. However, even with these improvements it is not family polymorphism, it is family _monomorphism_ because there is no subtype relation between anything associated with the general family Animal/Food and the derived families Horse/Grass and Lion/Meat. So you could not create any object or any other entity which is capable of playing the role as a 'package', which is what you can do with true family polymorphism. Still, family monomorphism is definitely also useful, because it allows you to avoid duplicating the shared code in the general family, and you can use parametric polymorphism (so-called polymorphic methods in Java 1.5) in order to write one piece of code that operates on both families: , F extends Food> void feed(A a, F f) { a.eat(f); } We could then call this poly-method as follows: feed(new HorseF(), new GrassF()); or feed(new LionF(), new MeatF()); Type inference on these two invocations would select A=HorseF and F=GrassF, respectively A=LionF and F=MeatF, and these choices are acceptable because they satisfy the constraints given on the type argument declarations of feed: the arguments must have types AA and FF such that it is statically known that the constraints AA extends Animal FF extends Food are both satisfied. The only missing piece compared to family polymorphism is now that there is no way prove that these constraints are satisfied except for knowing it up front (because AA and FF are type arguments which are declared to satisfy these constraints or something that implies them), or by using compile time constant types such as HorseF and GrassF where the constraints can be verified directly by checking the declarations. In summary, this means that family monomorphism is capable of handling a family _only_ when that family is known exactly at compile-time or it is known exactly somewhere and then passed through a number of type arguments which preserve the required constraints, which means that the exact family is known somewhere else in the program. Finally, there is also a smaller notational problem in this paper: The frequent occurrence of source code constructs with an ampersand, as in 'eat(Food&)', seems to indicate that the authors are thinking in terms of C++ source code, which would not be appropriate given that the text around these pieces of source code refer to Java in so many ways (and Java is even in the title). So - please - use correct Java syntax for the example source code or indicate it very clearly when the source code is not intended to be Java or even Java-like.