A tiny Python called Snek

By Jake Edge
January 22, 2020
LCA

Keith Packard is no stranger to the linux.conf.au stage; he has spoken on a wide variety of topics since he started going to the conference in 2004 (which was held in Adelaide, where organizers apparently had a lot of ice cream for attendees). One of his talks at this year's conference was on an education-focused project that he has been working on for around a year: a version of Python called "Snek" targeting embedded processors. He gave a look at some of the history of his work with 10-12 year-old students that led to the development of Snek as well as some plans for the language—and hardware to run it on—moving forward.

Teaching

He has been involved with teaching programming by way of building robots out of Lego and attaching computers to them. After that, the students "drive them around the classroom". It has been "fantastic fun"; he has been working with an "amazing educator" who recently retired after 30 years of teaching science and technology. His children were taught by her, so he felt he had "an obligation to help out".

[Keith Packard]

The students have had three or four years of Lego-based instruction by the time they reach the class he assists, so they are "really good at building Lego". Over the length of the class, the instructors teach three separate languages: Logo, ROBOLAB for Lego RCX devices, and, until recently, C++ for Arduino. Teaching ten-year-olds C++ was not an enjoyable experience, he said.

When he joined the program, he was given a real, live Apple II machine to use for teaching the students. It had a controller box that could be used to make "static robots"; they couldn't move around, since they were connected to the controller, but they could have moving parts. His slides [PDF] show a picture of the Apple II and a small pen plotter made out of Lego. Sadly, the Apple II died under his watch, though it had lasted for almost 40 years; "I probably plugged the card in wrong, turned it on, and smoke was emitted".

Logo was used for programming the static robots. It is a transliteration of the Lisp programming language. It seemed a bit odd to be teaching a language from the 1950s to children, "but it is kind of cool". He showed a short program that would talk to the robot in Logo and blink a light once per second; it was easily readable, with simple syntax and easy-to-use primitives to interface with the device.

Arduino

He had a lot of experience teaching Arduino programming to high school and college students, so he decided to give that a try after the Apple II died. Arduino devices are quite inexpensive; they are something that can be taken out of a Lego robotics project and moved into other kinds of projects. A number of the students developed much more sophisticated robots using the Arduino after they completed the class, which they showed at science fairs.

But, since the Arduino programming environment uses C++, students would need to be taught that language. C++ has a complicated syntax for students of that age, so lots of the instruction involved dealing with curly braces and semicolons. The performance of the resulting programs was amazing, he said. One of the students built a "linebot", which is a robot that follows a line of darker tape on a floor or table. That student modified the program when they were in 11th grade using their calculus knowledge to switch to a proportional-integral-derivative (PID) algorithm. That was possible because of the performance of C++ on the Arduino; the same is not true of any of the other languages being used as they simply are not fast enough to run something like PID.
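For readers unfamiliar with PID control, here is a minimal textbook sketch in Python. It is not the student's code, which the talk did not show; the `PID` class, the gains, and the toy `simulate()` model (a robot whose offset from the line changes at the commanded steering rate) are all assumptions made for illustration.

```python
class PID:
    """Textbook discrete PID controller (illustrative, not the student's code)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error, dt):
        # Accumulate the integral term and estimate the derivative term.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps=200, dt=0.01):
    """Toy model (an assumption): the offset changes at the steering rate."""
    pid = PID(kp=8.0, ki=1.0, kd=0.1)  # hypothetical gains
    offset = 1.0  # start one unit away from the line
    for _ in range(steps):
        steering = pid.update(-offset, dt)  # error = target (0) - offset
        offset += steering * dt
    return offset
```

Each control step needs several floating-point multiplications and a division, and the loop has to run often enough to keep up with the robot, which is why the speed of the language matters here.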

He showed the Arduino code needed to blink a light, which was far more complex than the Logo example. It required two functions; "I have no idea why". The amount of typing needed to enter that code is substantial; it would take the students around 30 minutes just to type it in, which was not a profitable use of their time, or his. Beyond that, there is no C++ interactive mode. The experimentation cycle requires compile and download steps each time the program is changed. Code was reused and modified over the duration of the class, which helped reduce the typing burden; once that problem was identified, the teaching was changed so that students only needed to type in a full program once.

It was "really cool" that they could build standalone robots using a free-software development environment, he said. It was also nice that they were teaching the students skills that they could take beyond the Lego environment and use in other areas. Some of the skills the students learned were specific to the Lego environment, however, so Packard and the other instructors figured out ways to help the students understand which skills were more widely applicable versus those that would only apply in Lego world.

He showed a video of a "fireworks" display that the students programmed; it can be seen in the YouTube video of the talk, just after the 12-minute mark. It was built with a single Arduino driving a bunch of LED controllers to display fireworks of various sorts and in different colors. He wired up the pulse-width modulation (PWM) controller boards for the display; the 10th and 11th grade students then took around three weeks to get it all working. It was clearly something that was too complex for the younger students, which made the instructors realize that advanced Arduino programming was not something they could teach at that level.

Make it easier

He taught C++ for about five years before deciding that he wanted to rethink things. He had used BASIC as a child and wanted an environment that was similar to that: one where you could simply type at a command prompt and see the results right away. But BASIC is essentially a dead language and he wanted something that the students could use elsewhere after the class was done. He came to the conclusion that he wanted "something a lot like Python".

So he came up with a Python subset called Snek. It is Python compatible; his test suite runs on both Snek and Python, which must produce identical answers. Ideally he would have liked a screen and keyboard directly attached to the robot, but that turns out to be painful to do, so he settled for a laptop with an integrated development environment (IDE) that had a simple editor and an interaction window. There are two IDEs available for Snek. The first is snekde, which he wrote in Python; it is meant to look the same as the Logo environment the students were used to. After he got that working, he came across the Mu IDE, which looked much like what he had written "but was competently done", he said with a grin.

The requirements for working with the robot and Snek are pretty minimal: a USB connection from the host to the robot running Snek and a Python program running on the host. Windows 10 has finally joined Linux and macOS in not requiring drivers to talk to USB serial devices, so it is all pretty straightforward. It is easier to install than the Arduino environment because there is no need for a compiler.

There are a few reasons he wanted to use USB to connect to the robots. For one thing, the USB connection will also charge the battery of the device; it was pretty rare to need to plug in just for a recharge as the batteries tend to stay topped up as a matter of course while working with the board. Another option would be Bluetooth, but that does not charge batteries and it turns out that teachers have to spend an inordinate amount of time ensuring that the pairing of devices and tablets is done properly for a roomful of students. "Wires kinda solve that problem in a really straightforward fashion."

He showed the dozen-line Snek program for a simple line-following robot (as well as a video of it in action). The program "looks pretty Pythonic" and is easy to understand, he said. He can teach a program like that in 40 minutes or so, which means that students can get something going rather quickly.
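The program from the slides is not reproduced here. As a rough idea of how little logic a simple line follower needs, here is a sketch in plain Python; `follow_line()`, the sensor threshold, the motor power values, and the assumption that a low reading means dark tape are all hypothetical, and the hardware is replaced by stand-in functions so the code runs on a laptop.

```python
def follow_line(read_sensor, set_motors, steps, threshold=0.5):
    """Bang-bang line follower: steer toward the tape edge on every step."""
    for _ in range(steps):
        if read_sensor() < threshold:
            # Assumed dark tape under the sensor: veer right, off the tape.
            set_motors(left=1.0, right=0.3)
        else:
            # Assumed bright floor under the sensor: veer left, toward the tape.
            set_motors(left=0.3, right=1.0)


# Stand-ins for real hardware: canned sensor readings and a command log.
readings = iter([0.2, 0.8, 0.8, 0.1])
commands = []
follow_line(lambda: next(readings),
            lambda left, right: commands.append((left, right)),
            steps=4)
```

On a real robot the two callbacks would read a light sensor and drive the motors; the control logic itself stays this short.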

There are other options for implementations of Python that run on small devices: MicroPython and its descendant, CircuitPython. (LWN covered a PyCon keynote on CircuitPython in 2019 and looked at MicroPython in 2015.) Those two are "fantastic", he said, but they are much larger than Snek. CircuitPython (256KB) is roughly four times the size of Snek (64KB) for the same board, he said.

He showed a CircuitPython version of the code to turn an LED on and off based on a switch, which was nearly as complex as an Arduino version, though it is written in a language with much simpler syntax. There are various imports required along with configuration of the switch and LED, but the six-line Snek version dispenses with all of that. CircuitPython is as capable as the full Arduino, but that also means he has to teach a bunch of preliminaries before the students can actually make the robot do anything. Snek internally sets up and configures the device so that students can do things with it right away, including directly from the interaction window.

Snek implementation

There are directly interpreted languages, such as BASIC and Lisp, where the source code is executable. At the other end of the spectrum are the fully compiled languages like C++, Rust, Fortran, and so on. In the middle are the bytecode languages, such as Perl, Ruby, JavaScript, Python, and, now, Snek. Those languages are compiled to bytecode, then the bytecode is run on a virtual machine. We often think of those languages as being interpreted, "which is semi-true", but there is a bytecode compiler as part of the language implementation, he said.
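The compile step is easy to observe in standard CPython (not Snek), using the built-in `compile()` function and the `dis` module. The expression and variable names below are arbitrary, and the exact opcode names differ between CPython versions.

```python
import dis

# Compile an expression to a code object without running it.
code = compile("a + b * 2", "<example>", "eval")

# The bytecode is a plain bytes object; dis decodes it into instructions.
opnames = [ins.opname for ins in dis.get_instructions(code)]

# The virtual machine then executes the compiled code object.
result = eval(code, {"a": 1, "b": 3})
```

Printing `opnames` shows the loads, arithmetic, and return instructions that the virtual machine will run.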

So a language like Python needs both a compiler and an interpreter to run the bytecode on the Python virtual machine; that meant it would be more complicated to implement Python than it would be for a language like BASIC. There are four separate components that went into Snek: the compiler, virtual machine, hardware support for a particular target, and memory manager. For BASIC, the memory-management was simple, but Python requires garbage collection. All of that meant that a Python implementation was going to be larger than the BASIC implementations from his youth.

In order to keep it small, though, he needed to pare down the kinds of objects that Snek would support. Instead of having an integer type, which is actually an unnatural thing to students of that age since they have already learned about fractions and decimal numbers, Snek only has 32-bit floating point numbers. Tuples, lists, and dictionaries are supported as are strings so that messages can be emitted. Compiled functions as objects are an integral part of Python so those are part of Snek as well.

He wanted to have 32-bit floats, but he also wanted to represent all objects in 32 bits. He remembered a trick that uses the fact that IEEE 32-bit floats have 16 million "not-a-number" (NaN) values. For Snek, everything that is not a NaN is a 32-bit float, but NaNs actually encode a type tag and pointer in the unused 23-bit fraction portion of the "float", which was a "cute hack".
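The talk did not give the exact bit layout, so the following Python sketch of the NaN-boxing idea uses an assumed one: a 3-bit type tag and a 19-bit heap offset packed into the fraction, with one fraction bit always set so the value can never be mistaken for infinity. All constant and function names are hypothetical.

```python
import struct

EXP_MASK = 0x7F800000     # all eight exponent bits set: infinity or NaN
FRAC_MASK = 0x007FFFFF    # the 23-bit fraction
QUIET_BIT = 0x00400000    # keeps the fraction non-zero, so never infinity
TAG_SHIFT = 19            # assumed layout: 3-bit tag in fraction bits 19-21
OFFSET_MASK = 0x0007FFFF  # assumed layout: 19-bit heap offset in bits 0-18

TAG_LIST, TAG_STRING = 1, 2  # hypothetical tags; tag 0 is left for a real NaN


def float_to_bits(x):
    """Return the 32-bit pattern of x stored as an IEEE single."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def box(tag, offset):
    """Hide a type tag and heap offset inside a NaN bit pattern."""
    assert 0 < tag < 8 and 0 <= offset <= OFFSET_MASK
    return EXP_MASK | QUIET_BIT | (tag << TAG_SHIFT) | offset


def is_boxed(bits):
    """True if bits is a NaN carrying a non-zero tag; otherwise it is a float."""
    is_nan = (bits & EXP_MASK) == EXP_MASK and (bits & FRAC_MASK) != 0
    return is_nan and ((bits >> TAG_SHIFT) & 0x7) != 0


def unbox(bits):
    """Recover (tag, offset) from a boxed value."""
    return (bits >> TAG_SHIFT) & 0x7, bits & OFFSET_MASK
```

With this layout, `unbox(box(TAG_LIST, 1234))` gives back the tag and offset, while the bit pattern of an ordinary number such as 1.5 fails the `is_boxed()` test and is used directly as a float.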

He started writing the Snek compiler using Bison to generate the parser, but the tables it created "were ginormous". Instead of writing yet another recursive descent parser, which are "really tedious to fix", he remembered a table-driven parser generator for LL grammars that he had written in Lisp 40 years ago. He took those ideas and turned them into the Lola parser generator, which generates "intensely compacted parser tables"; the goal is to sacrifice some parsing performance for much smaller tables.
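Lola's table format was not described in the talk, but the general shape of a table-driven LL(1) parser can be sketched in a few lines of Python. The grammar, the `TABLE` contents, and the `parse()` function below are made up for illustration and say nothing about how Lola compacts its tables.

```python
# Hypothetical grammar, for illustration only:
#   E  -> T E'
#   E' -> + T E' | (empty)
#   T  -> n | ( E )
TABLE = {
    ("E", "n"): ["T", "E'"],
    ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [],
    ("E'", "$"): [],
    ("T", "n"): ["n"],
    ("T", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T"}


def parse(tokens):
    """Return True if tokens (a list of 'n', '+', '(', ')') match the grammar."""
    tokens = list(tokens) + ["$"]  # "$" marks the end of input
    stack = ["$", "E"]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            # Look up which production to expand for this lookahead token.
            production = TABLE.get((top, look))
            if production is None:
                return False
            stack.extend(reversed(production))
        elif top == look:
            pos += 1  # terminal matched: consume the token
        else:
            return False
    return pos == len(tokens)
```

All of the grammar knowledge lives in the table, so the driver loop stays the same no matter how large the language is; shrinking the parser is then a matter of encoding the table compactly.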

He also tried Flex for creating the lexer, but that had similar size problems so he wrote a lexer by hand. Like the earliest C compilers, Snek's compiler emits bytecodes directly from the parser, which makes it easy to see how language constructs are turned into bytecode in the parser code. The entire compiler, including lexer, parser, and code generator, is 1500 lines of code. The Lola-generated parser was around 5KB of code and data, while the Bison-generated one was over 10KB.

The Snek virtual machine is a stack machine with an explicit accumulator register; he has done a number of virtual-machine implementations along the way and has settled on this style since it typically reduces the number of pushes and pops required. It has 61 opcodes, 39 of which are expression operators (e.g. +, =). It is a non-recursive implementation in order to keep the C stack within known bounds.
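A toy model in Python may make the accumulator idea concrete. The four opcodes and the `run()` function are hypothetical and are not Snek's actual instruction set; the point is only that the right-hand operand of each operator stays in the accumulator instead of being pushed.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs; return the accumulator."""
    stack = []
    acc = 0.0
    for op, arg in program:
        if op == "num":      # load a constant into the accumulator
            acc = arg
        elif op == "push":   # save the accumulator as a pending left operand
            stack.append(acc)
        elif op == "add":    # acc = saved left operand + acc
            acc = stack.pop() + acc
        elif op == "mul":    # acc = saved left operand * acc
            acc = stack.pop() * acc
        else:
            raise ValueError("unknown opcode: " + op)
    return acc


# Hypothetical bytecode for (1 + 2) * 3
program = [("num", 1.0), ("push", None), ("num", 2.0), ("add", None),
           ("push", None), ("num", 3.0), ("mul", None)]
```

Running `run(program)` evaluates the expression with two pushes, one per operator, since only left operands ever reach the stack.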

The memory manager was "a fun piece". He knew that he could not use two heaps, a scheme often used for garbage collection in which the active objects are copied to a new heap before switching to it; "that's not going to work when I have 2KB of RAM". So he did a mark-and-sweep garbage collector that also compacts the objects. It is done incrementally so that there are no long pauses for garbage collection. And, once again, he needed to avoid recursion.
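The logic of marking without recursion and then compacting within a single heap can be sketched in Python. This is not Snek's implementation: the heap model (a list of objects, each a list of indices of the objects it references) and the `collect()` function are assumptions, and the sketch freely uses a worklist and an index map that a 2KB target could not afford.

```python
def collect(heap, roots):
    """Mark live objects, then slide them down in place and fix references.

    heap:  list of objects; each object is a list of heap indices it refers to.
    roots: list of heap indices reachable from outside the heap.
    Both lists are modified in place.
    """
    # Mark phase: an explicit worklist replaces recursion.
    marked = [False] * len(heap)
    worklist = list(roots)
    while worklist:
        i = worklist.pop()
        if not marked[i]:
            marked[i] = True
            worklist.extend(heap[i])

    # Compute where each live object will land after compaction.
    new_index = {}
    for i, live in enumerate(marked):
        if live:
            new_index[i] = len(new_index)

    # Compact: a live object only ever moves to a lower (or equal) index,
    # so sliding in increasing order never overwrites an unprocessed object.
    for i in range(len(heap)):
        if marked[i]:
            heap[new_index[i]] = [new_index[ref] for ref in heap[i]]
    del heap[len(new_index):]
    roots[:] = [new_index[r] for r in roots]


# Objects 1 and 3 refer to each other and are reachable; 0 and 2 are garbage.
heap = [[], [3], [], [1]]
roots = [3]
collect(heap, roots)
```

After the call, the two live objects occupy the first two slots and their references (and the root) have been rewritten to the new positions.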

"Python has some tricky bits." Lexical white space is difficult to parse but "it is absolutely fantastic for teaching". He learned some interesting things about Python's function parameters, he said, including the differences between positional and named parameters along with how those interact with required and optional parameters. Python shares constants used within the same compilation block, but he found that to be too difficult for Snek, so it is not implemented.
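The parameter rules in question look like this in standard Python. The `blink()` function is a made-up example, and nothing here is a statement about which of these forms Snek supports.

```python
def blink(pin, times=3, *, delay=0.5):
    """pin is required; times is optional; delay is optional and keyword-only."""
    return (pin, times, delay)


by_position = blink(13, 5)                     # both arguments positional
by_name = blink(times=2, pin=13, delay=0.1)    # named, in any order
defaults = blink(13)                           # optional parameters omitted

try:
    blink(13, 5, 0.1)  # delay cannot be passed positionally
    too_many_positional = False
except TypeError:
    too_many_positional = True
```

An implementation has to bind each argument to the right parameter in all of these cases, and reject the invalid ones, which is where the complexity comes from.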

Snekboard

He closed the talk with a mention of the Snekboard crowdfunding campaign that he is running at Crowd Supply. He did not want to design hardware for the project, but he needed a board that was battery-powered and could drive Lego motors, which require 9V. So he designed the Snekboard to satisfy those requirements; it will run Snek or CircuitPython and will hopefully be available in the next few months.

In the opening, Packard said that he has returned to linux.conf.au many times since Adelaide, hoping to repeat the ice-cream experience there, but had so far been disappointed. The conference organizers may have been listening as there were two ice-cream events in the two days following Packard's talk. That seems like it might help entice him back for more LCAs down the road.

[I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Gold Coast for linux.conf.au.]

Index entries for this article
Conference: linux.conf.au/2020
Python: Embedded
Python: Implementations


A tiny Python called Snek

Posted Jan 22, 2020 21:49 UTC (Wed) by vadim (subscriber, #35271) [Link]

Finding out why Arduino uses two functions (setup() and loop()) is easy, just look in the code. You'll find this in main:
    setup();

    for (;;) {
        loop();
        if (serialEventRun) serialEventRun();
    }

So, I take it that this situation has 4 reasons for existing:

  1. It saves the need to know how to make a loop. This is useful because nearly any Arduino program will need to loop, so that's one thing less to figure out before getting something that works.
  2. It avoids the rather obtuse syntaxes of "for(;;)" or "while(1)", which a new student won't yet understand, and would need additional explanation.
  3. It avoids the need for the user to call serialEventRun(), which is even more obscure to newbies, not to speak of the 'if' bit.
  4. It allows for more processing behind the scenes to be inserted if needed, without requiring existing code to be modified.

So I would say on the balance it's a good thing. It allows getting down to business as quickly as possible, and does its best to avoid the student having to write arcane incantations that they don't understand the purpose of, and which they could screw up.

A tiny Python called Snek

Posted Jan 22, 2020 22:08 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

It also saves a level of indentation, making code a bit more readable.

A tiny Python called Snek

Posted Jan 22, 2020 22:59 UTC (Wed) by keithp (subscriber, #5140) [Link]

Thanks for the explanation; I was unaware of the call to serialEventRun. Useful bits!

From a teaching perspective, I did have to explain the overall structure (setup called once, loop called again and again), but your comment about avoiding the need to teach for(;;) or while(1) resonates with my experience. It's faster to explain in words what the system does than get students to type the loop syntax in by hand.

A tiny Python called Snek

Posted Jan 23, 2020 11:20 UTC (Thu) by ceplm (subscriber, #41334) [Link]

Hasn’t he just reinvented Lua? I mean for the small embedded hardware, that should be it, shouldn't it?

A tiny Python called Snek

Posted Jan 23, 2020 12:10 UTC (Thu) by dgm (subscriber, #49227) [Link]

I'm not sure if Lua (or eLua) could be made to run on the target platform without reimplementing the interpreter.

A tiny Python called Snek

Posted Jan 31, 2020 20:35 UTC (Fri) by simcop2387 (subscriber, #101710) [Link]

It's not generally possible with the default ATMEGA328 processors on arduinos. Porting the code and everything generally works fine and is easy to do but the real problem is the lack of memory on them, only 2KB of RAM. That causes a number of problems for LUA to do anything non-trivial due to the way the VM and everything works. That said I use a cortex m3 + lua to run my christmas lights. You don't need much more ram for it to suddenly be useful, about 4K seems to be the spot in my experience, and the 32kb on the arm processor I use is obviously morebetter.

A tiny Python called Snek

Posted Jan 23, 2020 20:53 UTC (Thu) by keithp (subscriber, #5140) [Link]

The main goal of the Snek project was to build a small, easy to teach language that would be useful later on. To me, that meant doing something based on Python.

My daughter took an introductory programming class as part of her geology degree. The class used Python, but most of the time was spent learning things that are also in Snek. Most of the examples and classwork would run (unchanged) using Snek instead of Python.

A tiny Lua for Arduino-scale machines could be a fun project though. Our robotics program teaches students three languages (Snek, Logo and ROBOLAB) and lets them choose which language to use for further work. When the OS 9 Macs needed to run Logo are no longer available, we'll need to find something new...

A tiny Python called Snek

Posted Jan 23, 2020 21:13 UTC (Thu) by ceplm (subscriber, #41334) [Link]

> When the OS 9 Macs needed to run Logo are no longer available, we'll need to find something new...

What’s wrong with UCB Logo? https://en.wikipedia.org/wiki/Logo_(programming_language)

A tiny Python called Snek

Posted Jan 24, 2020 6:06 UTC (Fri) by keithp (subscriber, #5140) [Link]

There's lots of potentially interesting languages to teach new programmers, but our program focuses on building robots (with Lego components!), which means having hardware to connect the computer with motors and sensors. Finding hardware that can do that turns out to be non-trivial.

When the old Macintosh machines stop working, the interface hardware will no longer have anything to connect to. Those use a proprietary serial protocol hooked up to the original Macintosh serial ports. Maybe those interfaces could be connected to a modern machine?

For Snek, I had hoped to use existing Arduino-compatible hardware, but couldn't find any integrated boards capable of driving 9V motors and servos. So I ended up building my own, which has resulted in a Crowd Supply campaign... Somehow this feels a lot like yak shaving.

A tiny Python called Snek

Posted Jan 23, 2020 22:58 UTC (Thu) by SEJeff (guest, #51588) [Link]

I find it interesting that this is supposed to be smaller than micropython and works on the esp32, but not the smaller esp8266, which micropython works on.

A tiny Python called Snek

Posted Jan 24, 2020 5:56 UTC (Fri) by keithp (subscriber, #5140) [Link]

Yeah, the ESP32 isn't exactly tiny. But, it's something I had lying around, so naturally I ported Snek to it. Somehow, I tend to collect tiny microcontroller boards. They're like coat hangers around here.

MicroPython uses a couple hundred kB of ROM, which is amazingly small considering how much Python that includes. Snek is a much less capable language, but it can squeeze down to about 32kB of ROM if you leave out the math functions.

Snek currently has ports for:

  • Adafruit Crickit FeatherWing (Atmel SAMD21)
  • Arduino Duemilanove (Atmel ATmega 328P)
  • ESP32
  • Adafruit Feather M0 Express (Atmel SAMD21)
  • SiFive HiFive1 Revb (RISC-V)
  • Adafruit ItsyBitsy (Atmel ATmega 32u4)
  • Arduino Mega (Atmel ATmega 2560)
  • Adafruit Metro M0 (Atmel SAMD21)
  • Arduino Nano 33 IoT (Atmel SAMD21)
  • Adafruit Circuit Playground Express (Atmel SAMD21)
  • POSIX (for testing and debugging)
  • QEMU for ARM (Cortex M3)
  • QEMU for RISC-V (rv32imac)
  • Arduino µduino from Crowd Supply (Atmel ATmega 32u4)

These range from 32kB of ROM and 2kB of RAM all the way up to several hundred kB of ROM and a bunch of RAM.

A tiny Python called Snek

Posted Jan 24, 2020 14:38 UTC (Fri) by SEJeff (guest, #51588) [Link]

If a little hardware fairy were to gift you an adafruit huzzah esp8266, would you consider porting to it?

A tiny Python called Snek

Posted Jan 24, 2020 18:36 UTC (Fri) by keithp (subscriber, #5140) [Link]

Sure! I've already used the esp8266 toolchain packaged in Debian as I tested the Picolibc port done by Jonathan McDowell.

A tiny Python called Snek

Posted Feb 5, 2020 9:23 UTC (Wed) by zoobab (guest, #9945) [Link]

Any plan for STM32 boards? especially the popular and cheap STM32F103? They are ARM Cortex M3, so it should be close to the other ARM boards already supported...

Lost his magic Smoke...

Posted Jan 24, 2020 0:58 UTC (Fri) by rahvin (guest, #16953) [Link]

Even with the wonders of modern tech and the durability of old tech, Magic Smoke strikes again! I've heard those old Apple II's have a pretty limited amount of magic smoke.

A tiny Python called Snek

Posted Jan 30, 2020 22:34 UTC (Thu) by ejr (subscriber, #51652) [Link]

If you use signaling NaNs for other types, you can run through a sequence of arithmetic operations and check if they were actually numbers after the fact. Old technique.


Copyright © 2020, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds