Who Begat Python? Knowing your Interpreter
Divya Goswami (~divya32)
Python, our favorite language and the reason for this great conference, has been winning hearts as people build applications in every domain, be it ML, data science, or cybersecurity. Yet not many know (or actually WANT to know) that Python's reference implementation, CPython, is written in C. The original Python was written by Guido van Rossum, and everything in it, from the OOP machinery and the integration of various libraries to every other high-level language feature, is written in and compiled from C code. The python binary that resides in /usr/bin/ can therefore be loaded into a debugger and stepped through to visit various regions, from where objects (lists, dicts, tuples) live to where imported libraries (os, sys, etc.) reside. The language known for being ONLY interpreted is actually compiled to bytecode first and then run on what is called the Python VM.
My talk will unfurl the key directories in the CPython source tree, unravel the __pycache__ mystery, and show you what bytecode is and how it gets executed on a stack-based virtual machine. Python gives us ample material to trace function calls and to show us how a function looks once it has been compiled to bytecode.
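As a small illustration of the tracing mentioned above, here is a minimal sketch using the standard-library dis module. The function add is a hypothetical stand-in, not the example from the talk, and the exact opcodes printed differ between CPython versions.

```python
import dis


def add(a, b):
    # Hypothetical example function; any small function would do.
    return a + b


# Collect the names of the instructions the compiler produced for add().
opnames = [ins.opname for ins in dis.get_instructions(add)]

# Print the human-readable disassembly (output varies by Python version).
dis.dis(add)
```

The raw bytes behind this listing are available as add.__code__.co_code, which is what the interpreter loop actually reads.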
Basic Talk flow:
The talk will walk you through:
- Cloning and compiling Python from source, then running it
- Exploring the Grammar, AST, Objects, Include, and Lib directories of the Python source tree
- Conversion of a hello.py file to its .pyc bytecode
- The various steps traversed during the bytecode transformation
- Bytecode interpretation, i.e. running the final compiled version of the code
- Introduction to the Python VM: what is a stack-based VM?
- How the bytecode gets interpreted (specifically the ceval.c file)
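The conversion steps listed above can be sketched from within Python itself using only the standard library. This is an illustration under assumptions, not the talk's own demo: the contents of hello.py are made up, and the 16-byte .pyc header is assumed to match CPython 3.7 or later.

```python
import ast
import marshal
import py_compile
import tempfile
from pathlib import Path

SOURCE = 'message = "Hello, PyCon!"\n'  # hypothetical contents of hello.py

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "hello.py"
    src.write_text(SOURCE)

    # Step 1: parse the source text into an abstract syntax tree.
    tree = ast.parse(SOURCE, filename=str(src))

    # Step 2: compile the AST into a code object that holds the bytecode.
    code = compile(tree, filename=str(src), mode="exec")

    # Step 3: write the .pyc file; by default it lands in __pycache__.
    pyc_path = Path(py_compile.compile(str(src)))

    # Step 4: read the .pyc back. Assumes a 16-byte header (CPython 3.7+),
    # after which the marshalled code object follows.
    raw = pyc_path.read_bytes()
    loaded = marshal.loads(raw[16:])

# Step 5: run the code object that was loaded from the .pyc file.
namespace = {}
exec(loaded, namespace)
print(namespace["message"])
```

Printing pyc_path shows where the cached file was written, which is a quick way to see what the __pycache__ directory holds.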
Fun and trivia
I have also prepared some quirky challenges throughout the talk to keep everyone on their toes! Do check out my gist link below for the complete details of the talk. I have divided it into several sections.
Talk Outline (Breakdown of 30mins):
Who Begat Python
history - 5 mins
Why did I choose this topic?
Intro to python, through the creator's eye - 1 min
Refer to the book - 1 min
Intro to Python for the newbies - "Whetting Your Appetite" is the perfect way to get introduced to the powers of Python.
Meme break - 1st trivia question
Python Source Code - 5 mins
Cloning the repo and running the python binary - 30 sec
Explain directories - 4 mins
Example hello.py file
Conversion using Example - 10 mins
- Read source code (convert hello.py to the interpreter-level source code) - 5 mins
Parse and generate AST
Produce bytecode - 5 mins
Conclude using instaviz - last minutes
Second trivia question
Run the bytecode - 10 mins
Visualizing the Python VM - 5 mins
Running sample on vm and show stack - 5mins
Solution to first question
Solution for second question as homework
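To make the "running a sample on the VM and showing the stack" part of the outline concrete, here is a toy stack machine. It is a deliberately simplified sketch: the opcode names are borrowed loosely from CPython, the run function and the sample program are hypothetical, and the real interpreter loop in ceval.c is written in C and is far more involved.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a value stack."""
    stack = []
    variables = {}
    for opcode, arg in program:
        if opcode == "LOAD_CONST":
            stack.append(arg)                 # push a constant
        elif opcode == "STORE_NAME":
            variables[arg] = stack.pop()      # pop into a named variable
        elif opcode == "LOAD_NAME":
            stack.append(variables[arg])      # push a variable's value
        elif opcode == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)        # replace two operands by their sum
        elif opcode == "RETURN_VALUE":
            return stack.pop()                # result is whatever is on top
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # Show the stack after each instruction.
        print(f"{opcode:<12} {arg!s:<5} stack={stack}")
    return None


# Roughly what "x = 2; return x + 3" might look like as instructions.
program = [
    ("LOAD_CONST", 2),
    ("STORE_NAME", "x"),
    ("LOAD_NAME", "x"),
    ("LOAD_CONST", 3),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
result = run(program)
```

Every instruction only pushes to or pops from the top of the stack, which is the defining trait of a stack-based VM.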
Prerequisites:
- Acquaintance with C code style
- Know that Python is a programming language.
- Difference between a compiled language and an interpreted language
I am an independent security researcher with a DevOps background. I love debugging and disassembling code more than writing it. I'm an open-source contributor, previously with OWASP and currently working with the Open Mainframe Project, a collaboration between The Linux Foundation and IBM Z. I blog occasionally and have a keen interest in system architecture. I am still an undergraduate student and wish to rule over Python (a long way to go yet). I have previously given talks on open-source tools used in data collection with OSINT techniques, and on Content Security Policy bypass using polyglot XSS attacks.