Luca Chiodini

What the global statement really means in Python

Teaser

Consider this Python program:

foo = 42

def f():
    global foo
    print(foo)  # prints 42
    
f()

What does global mean? What does it do? You may be under the impression that the global statement (i.e., the line of code that starts with the keyword global) is necessary to make the variable foo “visible” inside the function f, and therefore for this program to work properly. If that is what you think, or if you are not sure whether this claim is correct, I encourage you to continue reading to learn about part of Python’s Execution Model.

Preamble

The goal of this explanation is to help you understand when and why the global statement is needed. To accomplish this, we will introduce two key concepts: code blocks and bindings. Before diving in, take a moment to read the rest of this section.

Prerequisites and limitations

We refer to the latest Python version at the time of this writing (Python 3.10). This explanation uses as few of Python’s language features as possible. It assumes that you are familiar with the following constructs:

  • Function definition and function call
  • Numbers (and the == operator to compare them)
  • Lists (of numbers)
  • Assignment, if and for statements
  • The built-in print function

Everything else is out of scope. In particular, the following features are explicitly excluded:

  • Definition of classes
  • Import from modules
  • with and try statements

Code blocks

A Python program is made of code blocks. In the subset of Python we are considering, only two pieces of source code constitute a block: the body of a function and a module. For the purpose of this explanation, you only need to be aware that each Python source file contains a top-level module that includes all its source code.

Here is the content of a hypothetical file containing a Python program. Try to answer for yourself: how many blocks are there?

def f():
    def g():
        print(42)

The correct answer is three: one block for the top-level module, one for the body of the function f (which just consists of the definition of a function g) and one for the body of the function g.

Let’s play with a different example: a file contains this source code. How many blocks are there?

def printAllTen():
    for val in [10, 20, 30]:
        if val == 10:
            print(val)

printAllTen()

If you answered two, give yourself a pat on the back. Surprisingly, the body of the for statement does not constitute a block (remember: only function bodies and modules). For the same reason, the body of the if statement also does not constitute another block. Note that this is different from what happens in many other languages, such as C/C++ and Java, where blocks are created with curly braces { ... } (also used in loops and if statements). This is not the case in Python, and more importantly, indentation does not determine blocks.
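One way to see that for and if do not create blocks is to use, after the loop, names that were bound inside it. The following sketch is an illustration and not part of the original example: the function name last_seen and the variable found are hypothetical.

```python
def last_seen():
    for val in [10, 20, 30]:
        if val == 20:
            found = val  # bound inside the if, which is inside the for
    # We are still in the body of last_seen, the only block besides the module.
    # Both names are visible here because neither for nor if created a block.
    return [val, found]

print(last_seen())  # prints [30, 20]
```

If the loop and the if statement introduced blocks of their own, as curly braces do in C/C++ and Java, val and found would not be accessible in the return statement.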

Bindings

Scope

For the limited purposes of this explanation, the scope of a name is the region of code in which that name is visible. Observe that in a program, the same name can refer to more than one entity. For example, a program can contain two different variables both named x. In this case, each variable has its own scope.

In Python, a scope is made of blocks. The scope of a name includes the block where the name is introduced and any blocks contained within that block that do not introduce a different binding for that name.

When a name is used in a block, it is resolved (to figure out its value) using the nearest enclosing scope.

Let’s re-examine the program presented in the teaser but without any global statement:

foo = 42

def f():
    print(foo)  # prints 42
    
f()

The name foo is used inside the block determined by the body of the function f. Since this block does not introduce foo, it is resolved using the nearest enclosing scope rule. We “climb up” one level and reach the top-level block, in which foo is found. It gets successfully evaluated to 42, which is the value that will be printed. Look mum, no global!
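The same rule applies when more than one level has to be climbed. In the following sketch (the names outer and inner are hypothetical, chosen only for illustration), neither function body introduces foo, so the lookup goes from the body of inner, through the body of outer, up to the top-level block.

```python
foo = 42

def outer():
    def inner():
        # foo is introduced neither in inner nor in outer:
        # it is resolved in the top-level block.
        return foo
    return inner()

print(outer())  # prints 42
```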

To understand why global exists, then, continue reading.

What binding means

Consider this fragment of code (valid both in C/C++ and Java):

int foo;
// ...
foo = 42;

In the first line, we are declaring a variable of type int named foo. Later on, we can use an assignment to set the value of foo to 42. You might be wondering why we are spending time to walk through such a simple program: the highlight, here, is that you are able to first declare a variable with a name and only later assign a value to that variable.

Python does not support declarations. This means that we have to immediately associate a value with a name whenever we introduce a name. This operation is called binding: associating a value with a name. Binding operations introduce names.

In Python, the example presented above would simply be:

foo = 42

This assignment statement binds the value on its right to the name on its left in the only block present, the top-level module.

Not only the assignment binds

The assignment statement is not the only construct that binds. Among others (the full list is described in the language specification), the following constructs also bind names:

  • a function definition binds the name of the function to the function;
  • when a function is called, the parameter names are bound to the arguments used to call the function;
  • a for loop binds the name in its header to the values provided by the iterator.
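As a sketch of these three constructs at work (the names double, n, item and total are hypothetical and do not come from the original text), the comments below mark each binding operation:

```python
def double(n):  # the def statement binds the name double to the function
    return n * 2

total = 0  # an assignment binds total
for item in [1, 2, 3]:  # the for loop binds item to 1, then 2, then 3
    # Each call to double binds the parameter name n to the argument.
    total = total + double(item)

print(total)  # prints 12
print(item)   # prints 3: item is still bound after the loop ends
```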

Re-binding

Consider this Python program:

foo = 42
# ...
foo = 10
print(foo)  # prints 10

In the previous section, we stated that an assignment binds a value to a name. What happens when there are two assignments to the same name in the same block? A re-binding operation occurs: the name on the left side of the assignment, which was already introduced earlier, is re-bound to a new (potentially different) value. In our example, foo is initially associated with the value 42; later on, a new assignment causes re-binding and foo is then associated with the value 10.

In general, a binding operation causes re-binding of a name when that name was already introduced in the same block.
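Re-binding is not specific to assignments: any binding operation on a name already introduced in the same block re-binds it. In this sketch, which is an illustration rather than an example from the original text, a for loop and then a function definition re-bind the name foo in the top-level block (after_loop is a hypothetical helper name).

```python
foo = 42
print(foo)  # prints 42

for foo in [1, 2, 3]:  # the for loop re-binds foo on every iteration
    print(foo)         # prints 1, then 2, then 3

after_loop = foo       # 3: the last value bound by the loop

def foo():             # the function definition re-binds foo once more
    print(10)

foo()  # prints 10: foo now refers to a function, not a number
```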

Things are no different if we move away from the boring, top-level block. In this program:

def f():
    foo = 42
    # ...
    foo = 10
    print(foo)  # prints 10

f()

foo is bound and re-bound inside the block constituted by the body of the function f. Even after calling f from the top-level block, when we try to access foo in that block we get back an error, as expected.

def f():
    foo = 42
    # ...
    foo = 10
    print(foo)  # prints 10

f()
print(foo)  # NameError: name 'foo' is not defined

Binding the same name in two blocks

We are approaching the core of this explanation. Carefully consider this Python program and try to convince yourself, using the knowledge acquired so far, that indeed the call to print inside the function f will print 10 and the one in the last line of the program will print 42.

foo = 42

def f():
    foo = 10
    print(foo)  # prints 10

f()
print(foo)  # prints 42

Here is the explanation. The assignment in the first line is contained in the top-level block: when executed, it binds the value 42 to the name foo. The assignment inside the function f is instead contained in the block determined by the body of the function f. Therefore, when executed, it binds the value 10 to the name foo in that block, regardless of whether foo is present in some enclosing block. No re-binding occurs in this scenario because the two binding operations happen in different blocks.

A more elaborate example:

def f():
    foo = 42
    def g():
        def h():
            foo = 10
            def i():
                print(foo)  # prints 10
            i()
        h()
        print(foo)  # prints 42
    g()

f()

Here a variable named foo is introduced in the body of h and its scope includes the bodies of the functions h and i (two blocks). Another variable named foo is introduced in the body of the function f and its scope includes the bodies of the functions f and g but not the bodies of h and i. To find out what gets printed at a given point, we need to look at the nearest enclosing scope.

Why global

If you have read this far, you might be wondering why one would ever need a global statement. Take another look at the Python program we analyzed a little earlier:

foo = 42

def f():
    foo = 10
    print(foo)  # prints 10

f()
print(foo)  # prints 42

It introduces a different foo inside f. How would you, instead, assign a different value to the foo variable whose name is bound to 42 and that was introduced in the top-level block?

There is no way to achieve that behavior with the features we have covered so far! To effectively change the value of foo introduced by the binding operation at the top-level, we need to perform an assignment. But, since Python lacks declarations, assigning some value to foo inside the body of the function f will introduce a “new” foo bound to the new value in that block.

This explains why we need the global statement, in which the keyword global is followed by a comma-separated list of names (identifiers). The global statement behaves like a binding operation: subsequent binding operations on those names in the same block will re-bind the names bound at the top level. This means that a subsequent assignment would effectively change the value of the global variable.

Moreover, in the enclosed blocks, the uses of those names refer to the names bound at the top level. In fact, this behavior is analogous to that of a binding operation: it does not extend to enclosed blocks that themselves introduce those names.

Let’s add just the global statement as the first line inside the body of the function f.

foo = 42

def f():
    global foo
    foo = 10
    print(foo)  # prints 10

f()
print(foo)  # prints 10

With this addition, the subsequent assignment that involves the name foo inside the body of the function f will cause a re-binding of the name foo that was previously bound to 42 in the top-level block. From that point onwards, the name foo will be associated with the value 10.

The next statement inside the function is going to look up the value associated with the name foo. The call to print is therefore going to print 10. After returning from f, we continue executing the next statements that belong to the top-level block. At this point, foo is clearly bound to 10, which will be printed.

We have finally solved the mystery, hurray!
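The rule about enclosed blocks can also be checked with a small sketch. The nested functions g and h are hypothetical additions to the program above: g only uses foo, so it sees the global one, while h performs its own assignment and therefore introduces a separate foo.

```python
foo = 42

def f():
    global foo
    foo = 10  # re-binds the foo of the top-level block

    def g():
        print(foo)  # prints 10: g does not introduce foo, so the global one is used

    def h():
        foo = 99    # no global statement in h: this introduces a new foo in h
        print(foo)  # prints 99

    g()
    h()

f()
print(foo)  # prints 10: the assignment in h did not touch the global foo
```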

There is more

The rest of this explanation is not needed to understand the example presented in the teaser, but you might still find it very helpful to improve your understanding of Python’s Execution Model.

Where a name can be used without errors

In Python, a name can be bound at any place in a block, and its scope extends to the entire block, even before the place where the binding occurs. However, that does not mean that at runtime we can freely access the name everywhere in the block without errors!

We already saw an instance of NameError in an earlier example. When a name is local to a function but is used before it has been bound to a value, we get an UnboundLocalError, a subclass of NameError. Take a look at this example:

def f():
    print(foo)  # UnboundLocalError: local variable 'foo' referenced before assignment
    foo = 10

f()

By scanning the entire source code for name binding operations, we can statically determine the set of names that are in scope in any given block. Python statically knows that the name foo is in scope for the block determined by the body of the function f, given that it is the target of an assignment statement, which is a binding operation. The problem is that at runtime, when the first line of function f is executed, the name of the local variable foo has not yet been bound to a value; hence the error.
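That this decision is made statically, and not by following the execution, can be seen in the following sketch, where the assignment to foo is never executed. Note that it goes beyond the subset of Python used in this explanation: a try statement is used only so that the program can record the error and keep running, and the name outcome is hypothetical.

```python
foo = 42

def f():
    if 1 == 2:
        foo = 10  # never executed, but still a binding operation in this block
    print(foo)    # foo is local to f here, and it has not been bound

try:
    f()
    outcome = "printed 42"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)  # prints UnboundLocalError
```

The mere presence of the assignment in the body of f is enough to make foo local to that block, so the top-level foo is never considered.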

Here is a tricky example for you to analyze: what will happen when we execute this program, and why?

count = 0

def inc():
    count = count + 1
    print(count)
    
inc()

You probably tried to run this example and got an UnboundLocalError. Why does that happen? Shouldn’t there be “two count variables”, one global and one local to the function inc? The reality is more subtle, but you are already well equipped to answer these questions.

We said that the source code is scanned to find operations that bind names. Inside the function inc we have one such operation, the assignment statement, which has as a target the identifier count. There is no global statement, which means that the assignment is going to bind a “new” count that will be in scope for the current block, regardless of what might be already bound in enclosing blocks.

This local variable count is in scope for the whole block, but only gets bound to a value when the assignment statement is executed. To accomplish that, at runtime, Python first has to evaluate the right-hand side to figure out which value needs to be associated with count.

The right-hand side contains a use of the name count, which needs to be resolved according to the usual rules. Since there is a local count in scope for the current block, the name will refer to that binding and not to the one at the top level. But the name has not yet been bound at the point in the execution when the right-hand side is evaluated, and therefore an UnboundLocalError is raised.
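Given what we said about the global statement, one possible fix (a sketch, not the only way to restructure this program) is to declare count as global inside inc, so that the assignment re-binds the top-level count instead of introducing a local one:

```python
count = 0

def inc():
    global count       # count now refers to the name bound at the top level
    count = count + 1  # the right-hand side reads the global count, then re-binds it
    print(count)

inc()  # prints 1
inc()  # prints 2
```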

nonlocal statement

We have seen that Python provides a global statement to allow, in enclosed blocks, re-binding names previously bound at the top-level (i.e., the global module). What if we want to re-bind a name that was bound in a block that is not the local one, but also not the global one? The nonlocal statement comes to the rescue with exactly this semantics.

Here is an example in which the nonlocal statement causes the assignment inside the function g to re-bind the name foo introduced in the nearest enclosing block (excluding the global one), which in this case is the block determined by the body of the function f. There, foo was bound to 42 and is now going to be re-bound to 10.

def f():
    foo = 42
    def g():
        nonlocal foo
        foo = 10
    g()
    print(foo)  # prints 10
    
f()
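For comparison, here is a sketch of the same program with and without the nonlocal statement. The function names f_without and f_with are hypothetical, chosen only to keep both variants in one file. Without nonlocal, the assignment in g introduces a new foo in the body of g, and the one bound in the enclosing function is left untouched.

```python
def f_without():
    foo = 42
    def g():
        foo = 10  # introduces a new foo, local to g
    g()
    return foo    # still 42

def f_with():
    foo = 42
    def g():
        nonlocal foo
        foo = 10  # re-binds the foo introduced in f_with
    g()
    return foo    # now 10

print(f_without())  # prints 42
print(f_with())     # prints 10
```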

Conclusion

Wow, this was a blast! We have covered a significant part of Python’s Execution Model, and with this knowledge you should now be able to reason properly about the need for global (and more!).

This explanation is grounded in the official language specification, which covers in-depth some more aspects not explained here (e.g., free variables and closures).

If you find issues or have ideas on how to improve this explanation, feel free to drop me a line!

Authors

I am Luca Chiodini. This work would not have been possible without Igor Moreno Santos: the good bits of this explanation are because of him, and any remaining error is my fault. We are both PhD students working under the excellent supervision of Prof. Matthias Hauswirth at the LuCE Research Lab.