Make super() work in a class definition

Foreword

Thanks to everyone who gave me valuable feedback on my other proposal, Make name lookups follow MRO within a class definition. I am now convinced that, in trying to make the mental model of a subclass consistent between attribute lookups on a class object and name lookups within a class body, the proposed approach of redefining the class scope would indeed create too much potential for compatibility issues given the maturity of Python.

Special thanks to @Gouvernathor, who suggested the idea of making super() work in a class definition. I had considered the possibility before making that proposal, since it was also suggested in the popular StackOverflow question, but decided at the time that making a bare name lookup follow MRO would be the cleaner approach.

Now onto the actual proposal.

Motivation

The current usage of super() requires a class object, either passed in as the first argument or obtained through __class__ in the current namespace. This prevents the usage of super() during the execution of a class body, where the class object is not yet created.
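To make this concrete, here is a quick check of today's behaviour (a sketch; the exact error message may vary across CPython versions): zero-argument super() fails during class-body execution because no __class__ cell exists there.

```python
class Base:
    foo = ('a',)

# Zero-argument super() relies on a compiler-provided __class__ cell,
# which exists only inside methods, not during class-body execution.
try:
    class Child(Base):
        foo = super().foo + ('b',)
    failed = None
except RuntimeError as exc:
    failed = exc

print(failed)
```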

So while a subclass designed as an extension of a base class can have methods that call super() to gain easy access to attributes of the base classes, within the class definition itself one needs to explicitly reference the name of the base class in order to access its attributes:

class Base:
    __slots__ = 'foo',

    def __init__(self, foo):
        self.foo = foo

class Child(Base):
    __slots__ = Base.__slots__ + ('bar',)

    def __init__(self, foo, bar):
        super().__init__(foo)
        self.bar = bar

This violates the DRY principle, hindering any change to the class name; to quote the Rationale of PEP 3135:

The goal of this proposal is to generalize the usage of super() so it can intuitively work in a class definition like it does in a method:

class Base:
    __slots__ = 'foo',

    def __init__(self, foo):
        self.foo = foo

class Child(Base):
    __slots__ = super().__slots__ + ('bar',)

    def __init__(self, foo, bar):
        super().__init__(foo)
        self.bar = bar

Research

Although the current super() requires a class object to work, we can see from the implementation of its attribute-getter descriptor _super_lookup_descr that all it really needs from the given class object is its MRO:

And the MRO of a class is currently obtained by calling the mro method of the class, defined in its metaclass:

Similarly, the actual implementation of the mro method, mro_implementation_unlocked, needs only the bases from a given class object:
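That only the bases matter can be illustrated in pure Python today by linearizing a throwaway class built from a bases tuple (a sketch; mro_from_bases and _TempClass are illustrative names, not existing APIs):

```python
class A: pass
class B(A): pass

def mro_from_bases(bases):
    # Build a throwaway class so that type.mro runs the C3 linearization,
    # then drop the throwaway class itself from the result.
    return type('_TempClass', tuple(bases), {}).mro()[1:]

print([c.__name__ for c in mro_from_bases((B, A))])  # ['B', 'A', 'object']
```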

The Proposal

From the research above, it is now apparent that we can bypass super()'s requirement of a class object with the following implementation.

First, let the class builder make the calculated MRO available as __mro__ to the namespace in which the class body executes. The MRO is calculated based solely on the given bases, without a class object. More on that in the second point that follows.

Note that the actual implementation should insert __mro__ only if the parser finds a call to super() in the class body, just like the parser only inserts __class__ into the namespace of a method if it finds a use of super() in the method:

import builtins
from types import resolve_bases, _calculate_meta

def build_class(func, name, *bases, metaclass=NewType, **kwargs):
    metaclass = _calculate_meta(metaclass, bases)
    namespace = metaclass.__prepare__(name, bases, **kwargs)
    resolved_bases = resolve_bases(bases)
    namespace['__mro__'] = metaclass.mro(resolved_bases)
    exec(func.__code__, globals(), namespace)
    if resolved_bases is not bases:
        namespace['__orig_bases__'] = bases
    del namespace['__mro__']
    return metaclass(name, resolved_bases, namespace, **kwargs)

builtins.__build_class__ = build_class
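The parser behaviour referenced above can be observed through co_freevars today: the __class__ cell is created only for methods that mention super() or __class__ (a quick check, not part of the proposal):

```python
class C:
    def with_super(self):
        # Mentioning super() makes the compiler create the __class__ cell.
        return super().__init__

    def without(self):
        return None

print(C.with_super.__code__.co_freevars)  # ('__class__',)
print(C.without.__code__.co_freevars)     # ()
```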

Secondly, make the type.mro method test whether its sole argument is a class; if so, it does what it currently does, getting the bases from the class. Otherwise, the argument is treated as the bases for the rest of the MRO calculation.

Since type.mro does not expose the MRO-calculation logic as a separate API, the following Python implementation, for illustration purposes, creates an intermediate class from the given base classes and calls its mro method in order to emulate calling type.mro with a sequence of bases. The actual implementation should make mro_implementation_unlocked take a PyObject *cls_or_bases instead: treat cls_or_bases as a class and call lookup_tp_bases(type) only if it passes a type check, or else assign it directly to bases:

class NewType(type):
    def mro(cls_or_bases):
        if isinstance(cls_or_bases, type):
            return super().mro()
        return __class__('_TempClass', cls_or_bases, {}).mro()

Finally, make the no-argument form of super() test whether __mro__ is in the namespace after failing to find __class__ there. If it is, meaning super() was called from a class body, use __mro__ as the MRO instead of calculating it from a class object.

Since the current implementation of super.__init__ looks up locals directly from the current frame and therefore can't be called from an overriding method of a subclass, the following Python implementation, for emulation purposes, skips the conditional statement that checks for the presence of __class__ and includes only the fallback logic that handles __mro__ in a class definition, again creating an intermediate class so that the current super().__init__ can be called with a class object. The actual implementation should be a conditional statement within the _super_lookup_descr function that uses __mro__ instead of lookup_tp_mro(su_obj_type):

import sys

class class_super(super):
    def __init__(self):
        cls = type('_TempClass', tuple(sys._getframe(1).f_locals['__mro__']), {})
        super().__init__(cls, cls)

so that:

class Base:
    __slots__ = 'foo',

    def __init__(self, foo):
        self.foo = foo

class Child(Base):
    __slots__ = class_super().__slots__ + ('bar',)

    def __init__(self, foo, bar):
        super().__init__(foo)
        self.bar = bar

print(Child.__slots__) # outputs ('foo', 'bar')
child = Child(1, 2)
print(child.foo) # outputs 1
print(child.bar) # outputs 2

Demo: mlEiPT - Online Python3 Interpreter & Debugging Tool - Ideone.com

Backward Compatibility

Maintainers of metaclasses that define a custom mro method should be advised, if their metaclass is to support this feature, to make mro check the type of its first argument and use it directly as the bases for MRO calculation when it isn't a class.
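A minimal sketch of such a defensive mro method under the proposal's calling convention (TolerantMeta and _Tmp are hypothetical names; the bases-tuple branch is emulated here with a throwaway class):

```python
# Hypothetical defensive pattern for metaclasses with a custom mro method:
# accept either a class (current behaviour) or a bases tuple (proposed).
class TolerantMeta(type):
    def mro(cls_or_bases):
        if isinstance(cls_or_bases, type):
            return type.mro(cls_or_bases)  # current path: a real class
        # Proposed path: linearize a throwaway class built from the bases,
        # then drop the throwaway class itself.
        return type('_Tmp', tuple(cls_or_bases), {}).mro()[1:]

class A: pass

class X(A, metaclass=TolerantMeta):  # normal class creation still works
    pass

print([c.__name__ for c in TolerantMeta.mro((A,))])  # ['A', 'object']
```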

Performance Impact

The existing code base is minimally affected performance-wise: the parser performs an additional check for the presence of super() in class definitions, and type.mro performs an additional type check on its first argument.

2 Likes

I’ll quote my reply to your previous topic for why I think this is unnecessary, unhelpful and potentially misleading and harmful:

9 Likes

I think David is pretty much spot on; the only thing I have to add is regarding this part of the motivation:

I know that it was quoted as the “DRY rule” from that other PEP, but DRY is not a rule, it’s a principle, and IMO it should not be applied simply to remove all repetition. Correct me if I’m wrong, but the only part of your example that comes anywhere close to “violating DRY” is this part:

Where I presume you’re talking about the repeated __slots__.

To me that is a misunderstanding of DRY, which is more about not needlessly repeating larger parts of code and abstracting them instead, and only doing so when it makes sense! This is just normal code IMO, and DRY is not applicable.

2 Likes

I think the DRY rationale (which was the initial rationale for the no-params super call) makes sense. You shouldn’t have to repeat what superclass you want to take the member/method of, unless you want a specific one and intentionally skip over the MRO.

No, the DRY violation is the fact that in current python this use of super() is not possible, and you have to write attr = SuperClass.attr + ("foo",) (and reminder to @blhsing that the __slots__ example is wrong, you shouldn’t repeat the same slot between a superclass and a subclass). The violation comes from repeating SuperClass, not attr, after already writing SuperClass in the class signature.


I have issues with the proposed implementation, since it creates a single-use subclass for each use of the new super(), and for other reasons too; but I realize it’s just a prototype and not written in stone.

Other than that, I think having the inherited attribute and/or method dynamically resolved during class-body execution is prone to problems, and that the enum.auto-like behavior of having a placeholder object be resolved at subclass instantiation (= after the class block has finished executing) is better.

1 Like

That is also not a violation IMO.

In that case I don’t see how this is a violation due to Sup being repeated:

class Sub(Sup):
    def meth(self):
        rv = Sup.meth(self)
        rv.more()
        return rv

Or this due to Sub being repeated:

class Sub(Sup):
    def meth(self):
        rv = super(Sub, self).meth()
        rv.more()
        return rv

Yet according to you, this is not:

class Sub(Sup):
    att = Sup.att + "more"

First of all, I don’t like the DRY principle at all because it stops people from actually thinking about what abstractions to use and why they are useful. In the top code block, you specify that you want to use a method from one specific superclass out of possibly many. This is probably what people should use most of the time in most methods.

In the second code block you just want to use a method from any superclass in the MRO. This is useful for __init__ and some other special methods, but in a lot of places where super() is used it should probably be replaced by calling a specific superclass’s method.

So I disagree that any of your examples have anything to do with DRY and instead they are examples of thoughtful abstractions that have their own specific uses.

9 Likes

__slots__ is already inherited and cumulative. The following works fine:

class Child(Base):
    __slots__ = 'bar',
    def __init__(self, foo, bar):
        super().__init__(foo)
        self.bar = bar

or more DRYly,

class Base:
    __slots__ = 'foo',
    def __init__(self, *, foo, **kwargs):
        super().__init__(**kwargs)
        self.foo = foo


class Child(Base):
    __slots__ = 'bar',
    def __init__(self, *, bar, **kwargs):
        super().__init__(**kwargs)
        self.bar = bar

c = Child(foo=1, bar=2)
1 Like

The DRY issue is completely orthogonal to my objection and was part of my “at best” scenario. The point of super() is not to avoid repetition[1].

I’ll quote the Python documentation on super():

There are two typical use cases for super. In a class hierarchy with single inheritance, super can be used to refer to parent classes without naming them explicitly, thus making the code more maintainable. This use closely parallels the use of super in other programming languages.

The second use case is to support cooperative multiple inheritance in a dynamic execution environment. This use case is unique to Python and is not found in statically compiled languages or languages that only support single inheritance. This makes it possible to implement “diamond diagrams” where multiple base classes implement the same method. Good design dictates that such implementations have the same calling signature in every case (because the order of calls is determined at runtime, because that order adapts to changes in the class hierarchy, and because that order can include sibling classes that are unknown prior to runtime).

While you can accomplish the first use-case, the second use-case is what’s relevant here. You can’t accomplish it with what’s being proposed here and that’s what makes enabling super() without arguments inside a class body so dangerous, because people will almost certainly misunderstand what super enables in a class body vs. inside a method.
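For reference, a minimal sketch of that second use case, which inherently needs method-time resolution: each __init__ delegates via super(), so every class in the diamond runs exactly once, in an MRO order determined at call time.

```python
class Root:
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

class Left(Root):
    def __init__(self, **kwargs):
        self.left = True
        super().__init__(**kwargs)

class Right(Root):
    def __init__(self, **kwargs):
        self.right = True
        super().__init__(**kwargs)

class Both(Left, Right):
    pass

# MRO of Both: Both -> Left -> Right -> Root -> object,
# so both initializers run via the cooperative super() chain.
b = Both()
print(b.left, b.right)  # True True
```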


  1. you can argue that the point of being able to omit its parameters is to avoid repetition ↩︎

5 Likes

Yes, the DRY point is exactly about being able to omit super’s parameters. That’s why my second example, using super-with-parameters, still violates DRY, and why the only DRY-compliant solution to my first and second examples is to use super() without parameters.
But I was more answering to Jacob than to you, I think I misused the Reply button. Your point was that the proposed feature isn’t useful and has risks, my only point with respect to that was that DRY compliance does make the feature useful.


And now I’m answering to Jacob.

No, if you take into account the “second use case” of what David quoted above, singling out a single ancestor and bypassing the MRO is a very specific thing, to do only if you know what you’re doing (typically to skip one particular ancestor). This typically leads to behaviors where the method of the second parent, in multiple-inheritance, is not called at all. It is very bad as a common practice, quite the opposite of being “what people should use most of the time in most methods”.


And now back to David.

I don’t see why the second use-case would be blocked by what’s proposed here. It would possibly (probably) not work with Ben’s proposed implementation, but it could work very well if you do it my enum.auto-like way, where all the super and inheritance behavior are allowed to take place.

EDIT: actually I think I do. And you do have a point, I’ll have to think about it.

Yeah, I think I overstated how often you should do it, but I find that often when you call super() you have a specific superclass in mind, and in that case it makes sense to just call that one’s method. Calling with super() can lead to the wrong superclass’s method being called if the MRO is changed later. Of course, that might cause the other issues that have been brought up. One should think about which approach to use, and you’re probably right that super() should be the default.

I’d go as far as saying that even if you have a superclass in mind when writing the method (and unless it’s a “protected”-like private method that subclassers shouldn’t be concerned about), you still should use super(), because otherwise you prevent the full inheritance-OOP process from taking place and stop subclasses from wedging in other ancestors between the class you’re writing and the superclass you’re targeting. That wedged-in ancestor will not be a “wrong superclass”, because that’s not a bug, it’s a feature of object-oriented programming allowing multiple inheritance.

But we’re drifting off-topic, as that only applies to methods.

A method in a subclass can, thanks to super() calls, implement a mixture of various superclasses’ implementations of that method. That mixture even changes for a given class depending on whether the method is called on an instance of the class itself or of a subclass.
But the super() resolution here happens only at one particular time, at class instantiation at the latest, and it does not change at subclassing time. The best a subclass can do is pick one superclass and take its value, and maybe do things to that value, such as my + "more", but that’s not part of what we wanted to add to class-block super(). The point is, a class-body super() cannot make a mixture of several ancestors’ attribute values, and that sets it apart from a crucial feature of super() as it currently exists.

TL;DR : what David said.

As a result, I agree that extending super() to do this is a bad idea.
It may be a good idea, in order for people not to use the att = Sup.att + ... syntax, and in order not to support what I described as a bad practice above, to allow something like att = ances().att + ..., but the proposed ances (which is of course a bad name for a builtin) should not be named super.

1 Like

To give a simple, albeit rather pathological, case, you can manipulate a class’ __bases__ at runtime:

>>> class A: pass
...
>>> class B: pass
...
>>> class C(B, A): pass
...
>>> C.__bases__
(<class '__main__.B'>, <class '__main__.A'>)
>>> C.__bases__ = tuple(reversed(C.__bases__))
>>> C.__bases__
(<class '__main__.A'>, <class '__main__.B'>)
>>> C.__mro__
(<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>)

Given that super() is “supposed” to follow the MRO, this would give at best unexpected results. You could argue that it’s “obvious” that a calculation done at class definition time must use the bases as defined at the time the statement is evaluated, but that’s a pretty subtle distinction that’s likely to be missed by users who just think of super() as “magic”, and assume that they can use it in a class definition just like they do in a method.

The long and short of this is that it’s not even remotely clear what it would even mean for super() to “work” in a class definition, except on a superficial level. And that superficial understanding breaks down once you try to use any of Python’s more advanced dynamic features.

9 Likes

Agreed. The second use case is not applicable at class-definition time, and making it truly magical, as implied by what super() is known for, such that a class attribute dependent on super() is dynamically evaluated when accessed later, would require too much trickery with descriptors, wrappers and introspection, with still too many possible gotchas to make the gain worthwhile.

To satisfy the first use case for future reference, though, I’m posting my revised prototype with the intermediate classes removed (with an MRO implementation borrowed from this SO answer), and the name super replaced with parent to avoid sounding magical, in case someone wants to play with the idea in the future:

import sys
import builtins
from types import resolve_bases, _calculate_meta

class parent:
    def __getattribute__(self, name, _not_found=object()):
        for base in sys._getframe(1).f_locals['__mro__']:
            if (obj := getattr(base, name, _not_found)) is not _not_found:
                if hasattr(obj, '__get__'):
                    return obj.__get__(base)
                return obj
        raise AttributeError(name)

class NewType(type):
    def mro(cls_or_bases):
        mro = []
        bases = cls_or_bases
        if isinstance(cls_or_bases, type):
            bases = bases.__bases__
            mro.append(cls_or_bases)
        # Include the bases order itself as a merge constraint, as C3 requires.
        mro += NewType._mro_merge(
            [list(base.__mro__) for base in bases] + [list(bases)])
        return mro

    @staticmethod
    def _mro_merge(mros):
        mros = [mro for mro in mros if mro]  # drop exhausted sequences
        if not mros:
            return []
        for candidate, *_ in mros:
            if all(candidate not in tail for _, *tail in mros):
                return [candidate] + NewType._mro_merge([
                    tail if head is candidate else [head, *tail]
                    for head, *tail in mros
                ])
        raise TypeError('Cannot create a consistent method resolution order '
            '(MRO) for bases')

def build_class(func, name, *bases, metaclass=NewType, **kwargs):
    metaclass = _calculate_meta(metaclass, bases)
    namespace = metaclass.__prepare__(name, bases, **kwargs)
    resolved_bases = resolve_bases(bases)
    namespace['__mro__'] = metaclass.mro(resolved_bases)
    exec(func.__code__, globals(), namespace)
    if resolved_bases is not bases:
        namespace['__orig_bases__'] = bases
    del namespace['__mro__']
    return metaclass(name, resolved_bases, namespace, **kwargs)

builtins.__build_class__ = build_class

that’s a pretty subtle distinction that’s likely to be missed by users who just think of super() as “magic”, and assume that they can use it in a class definition just like they do in a method

I think that this argument in particular is not very convincing. Users who are aware of the possibility of modifying a class’s bases/MRO after its creation, and who understand Python on a level sufficient to use such meta-programming tricks correctly, are unlikely to assume that super() in class bodies uses any “magic” that would allow it to re-calculate the resulting value after an MRO change.

In fact, I think that this “static” behaviour of class-level super() is consistent with how some other parts of the language work (decorators, default arguments, metaclasses). Most language features in Python behave “eagerly”, executing when code flow reaches the appropriate place and then “statically” applying the required changes to the object.

And the exceptions only prove the rule. The defining feature of descriptors is that they break this rule and they (almost?) always involve a function or lambda that makes it pretty clear that some kind of “action at a distance” is intended. I think that all other counter examples are also obviously special (like doing stuff in threads or with async or in __del__).

In the vast majority of cases when a line of Python code is executed, the effects are immediate and have a well defined, non-lazy result, so I don’t think that the proposed static behaviour of class-level super() is all that unintuitive.
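A minimal illustration of that eager, definition-time behaviour, using default arguments:

```python
import time

# The default expression is evaluated once, when the `def` statement
# executes -- much like expressions in a class body.
def stamp(t=time.time()):
    return t

first = stamp()
time.sleep(0.01)
second = stamp()
print(first == second)  # True: the default was captured at definition time
```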

I’ve noticed a lot of recent ideas about drastically changing the semantics of classes, all of which seem to stem from people trying to make inheritance do things it has no reason to do, and to significantly increase the complexity of the language so that users can avoid knowing how things interact. Inheritance should be cooperative; you should know that the type you are deriving from is compatible. Functions exist; not everything needs to be a method of a class, and said functions can accept multiple objects to compose behavior from each of them, or case/match and do type dispatch.

12 Likes

Perhaps a more convincing case for the impossibility of this: the MRO depends not only on the class itself, but also on its descendants:

class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

What should super() in the class definition of B refer to? When you instantiate a B object directly, super() will give you A. When you instantiate a D object, then C comes after B in its MRO.
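The same point can be made executable with methods, where the zero-argument super() in B resolves against the MRO of the instance's class, not against B's own bases:

```python
class A:
    def who(self):
        return 'A'

class B(A):
    def who(self):
        return 'B->' + super().who()

class C(A):
    def who(self):
        return 'C->' + super().who()

class D(B, C):
    pass

print(B().who())  # 'B->A'    -- super() in B reaches A
print(D().who())  # 'B->C->A' -- the same super() in B now reaches C
```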

3 Likes

Very well said.

Honestly, I scarcely use inheritance at all in my projects nowadays (the largest, I think, is on the order of 5k lines), let alone any complex multiple-inheritance tricks. Composition and duck typing are just such powerful tools that I end up not seeing a need for anything else. Even with composition, it’s rare that I actually feel any compulsion to write a bunch of delegation boilerplate. Code like that smells to me: the underlying problem should really be solved by a better creation pattern (e.g. a subtype really just represents a subset of possible instances of the base, so they should just be created by a factory such as a classmethod), or a more flexible behavioural pattern (in particular the Strategy pattern, which is trivial in Python because functions are first-class).

5 Likes