1

Functional Programming in Python



Pycon India 2010
September 25-26, 2010

Draft
Dhananjay Nene
Blog         Twitter

Out of scope

  • Debate between OO, imperative and functional programming styles
  • Debate between pure FP vs. Mixed paradigm programming
  • Debate between idiomatic Python constructs visavis those encouraged by functional programming
  • The actual process of deciding whether to apply functional programming

Why functional programming ?

  • Functions as building blocks. Allows construction of complex logic using composition and higher order functions
  • No side effects and immutability leads to reduced likelihood of errors and easier ability to unit test individual functions
  • Superior ability to deal with concurrency and multi=threading

Python <==> Functional Programming

  • Python is not a ground up functional programming language
  • Python treats functions as first class citizens. Thats the basic construct to enable functional programming
  • Python also supports a lot of other useful constructs for functional programming. However it is deficient in some.
  • It is possible to write the same logic imperatively and using functional programming.

Python constructs for Functional Programming

  • Functions
  • Function references and invocation
  • Function composition (eg. decorators)
  • Sequences, slices
  • Iterators, generators
  • List comprehnsions
  • lambda
  • map(), reduce() and filter()
  • functools and itertools modules

Python Constructs : Functions

# Double the value
def double(x) :
return x * 2

# Use a function reference
twice = double
assert twice(4) == 8

# ** apply_twice is a higher order function **
def apply_twice(function, argument) :
return function(function(argument))
assert apply_twice(double,4) == 16

def convert_to_string(function) :
def inner(*args,**kwargs) :
return str(function(*args,**kwargs))
return inner

@convert_to_string
def triple(x) :
return 3 * x;

Python Constructs : Partial Functions

from functools import partial

def power(x,y) :
return x ** y

square = partial(power, y = 2)
cube = partial(power, y = 3)

assert (square(5), cube(7)) == (25, 343)

Python Constructs : Generators

# Generator
def fibonacci() :
prev, current = 0,1
while True :
yield current
prev, current = current, prev + current

# Generator instantiation
gen = fibonacci()

# Generator usage
for i in range(10) :
print gen.next()

Python Constructs : Iterators

# Iterator to iterate through field names of an object
class AttrIterator(object) :
def __init__(self,obj) :
self.attrs = list(obj.__dict__.keys())
self.counter = 0
def __iter__(self) : return self
def next(self) :
if self.counter >= len(self.attrs) : raise StopIteration
attr = self.attrs[self.counter]
self.counter += 1
return attr

# Class
class Something(object) :
def __init__(self,a,b,c) :
self.a, self.b, self.c = a, b, c
def __iter__(self) : return AttrIterator(self)

# Iterator usage
for attr in Something(1,2,3) :
print attr

Python Constructs : List Comprehensions

one_to_ten = tuple(range(1,11))
assert one_to_ten == (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# Simple list comprehension
squares_of_odds = tuple(x * x for x in one_to_ten if x % 2 != 0)
assert squares_of_odds == (1, 9, 25, 49, 81)

# Lambda
double = lambda x : 2 * x
assert double(5) == 10

# filter()
odd_only = filter(lambda x : x % 2 != 0, one_to_ten)
assert odd_only == (1, 3, 5, 7, 9)

# map()
doubled_list = map(double,one_to_ten)
assert doubled_list == [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# reduce()
sum = reduce(lambda acc, x : acc + x, one_to_ten, 0)
assert sum == 55

Python Constructs : itertools

import itertools as it
one_to_ten = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

assert tuple(it.islice(one_to_ten,5)) == (1, 2, 3, 4, 5)

assert tuple(it.islice(it.count(10),5)) == (10,11,12,13,14)

assert tuple(it.islice(it.repeat(5),3)) == (5,5,5)

assert tuple(it.izip((1,2,3),('a','b','c'))) ==
((1,'a'),(2,'b'),(3,'c'))

assert tuple(it.chain((1,2,3),(4,5))) == (1,2,3,4,5)

assert tuple(it.combinations((1,2,3),2)) ==
((1, 2), (1, 3), (2, 3))

assert tuple(it.islice(it.cycle((1,2,3)),5)) == (1,2,3,1,2)

Immutable Objects

  • Immutable object is an object whose state cannot be modified after it is created
  • When an object is known to be immutable its reference can always be used safely. (Context: Concurrency)
  • Python has no notion of immutable objects. You can
    • Avoid all immutable operations on python structures
    • Hand craft your own structures or use libraries such as pysistence
  • If you do multiprocessing, but no multithreading, immutability still useful but less so.

Sample immutable object

class Immutable(object):
def __setattr__(self, *args):
raise TypeError("can't modify immutable instance")
__delattr__ = __setattr__
def __init__(self, **kwargs):
self.__dict__.update(kwargs)
def __str__(self):
return "Immutable %s %s" % (self.__class__.__name__,
self.__dict__)
__repr__ = __str__

class Person(Immutable):
def __init__(self,name, pincode):
super(Person,self).__init__(
**{'name':name, 'pincode' : pincode})

me = Person("Dhananjay", 411001)
# Careful. The following will still work
me.__dict__['name'] = 'Nene'
# The following raises an exception
me.name = "Foobar"

Concurrency and Parallel Processing

  • We cannot but at this stage refer to the GIL
  • For computationally (non IO) intensive tasks, GIL will constrain ability to leverage multiple cores.
  • For large grained computation blocks, you can use multiprocessing instead.
  • The multiprocessing module (PEP-371) eases that pain.

Sample Message Passing Code (1 of 4)

import multiprocessing
import os
import time
from collections import defaultdict
import re

class Actor(multiprocessing.Process):
def __init__(self,function,*args,**kwargs):
multiprocessing.Process.__init__(self,*args,**kwargs)
self.queue = multiprocessing.Manager().Queue()
self.function = function
self.start()
def run(self):
while True :
val = self.queue.get()
if val[0] is None :
break
self.function(*val)

Sample Message Passing Code (2 of 4)

def map_word(wdlist,word):
# Locate the dictionary & increment the wordcount
worddict = wdlist[hash(word) % mapper.reducer_count]
worddict[word] = worddict.get(word,0) + 1
return wdlist

def mapper(filename):
with(open(filename,"r")) as file :
worddicts = reduce(
# using mapword function
map_word,
# being called for each word
(word.lower()
for word in re.findall('\w+', file.read())),
# starting point being a list of empty dicts
list({} for _ in range(mapper.reducer_count)))

# put each dictionary in the appropriate queue to reduce
map(lambda i : mapper.reducer_queues[i].put(
("put", worddicts[i],filename)),
range(mapper.reducer_count))

Sample Message Passing Code (3 of 4)

def initmapper(*args, **kwargs):
setattr(mapper,'reducer_queues',args[0])
setattr(mapper,'reducer_count',len(args[0]))

def reducer(*message):
if message[0] == "put" :
for word, count in message[1].items() :
reducer.worddict[word][message[2]] += count
elif message[0] == "init" :
setattr(reducer,'worddict',defaultdict(
lambda : defaultdict(lambda : 0)))
elif message[0] == "show" :
for word, filedict in reducer.worddict.items() :
for filename,count in filedict.items() :
print word, filename, count

Sample Message Passing Code (4 of 4)

if __name__ == "__main__" :
reducers = list(Actor(reducer) for i in range(5))
reducer_queues = list(reducer.queue for reducer in reducers)
for reducer_queue in reducer_queues :
reducer_queue.put(("init",))
pool = multiprocessing.Pool(
processes=4,initializer=initmapper,
initargs=(reducer_queues,))
for filename in tuple(os.path.join(dirpath,name)
for dirpath, dirname, filename in
os.walk("../data/20_newsgroups")
for name in filename) :
pool.apply_async(mapper, (filename,))
pool.close()
pool.join()

for reducer_queue in reducer_queues :
reducer_queue.put(("show",))
reducer_queue.put((None,))
for reducer in reducers : reducer.join()

Conway's Game of Life

  • A board consisting of number of rows and columns
  • Each cell in the board is either alive or dead
  • With each iteration, the number of live neighbours of each cell are counted
  • Depending on the current status of the cell, and the number of live neighbours, the next status of the cell is determined.

Conway's Game of Life: OO + imperative way

class Cell(object):
def __init__(self,row,col,state):
self.row = row
self.col = col
self.neighbours = {}
self.state = state
self.marked = False
def set_neighbour(self,c):
neighbour = self.neighbours.get((c.row,c.col),None)
if not neighbour :
self.neighbours[(cell.row,cell.col)] = cell
def live_neighbours(self):
live = 0
for cell in self.neighbours.values() :
if cell.state :
live += 1
return live

def cycle(self):
live = self.live_neighbours()
if self.state :
if live < 2 : self.marked = False
elif live > 3 : self.marked = False
else : self.marked = True
elif live == 3 :
self.marked = True
else :
self.marked = False

def switch(self):
self.state = self.marked

class Board(object):
def __init__(self,rows,cols,live):
self.cells = []
self.rows = rows
self.cols = cols
for row_nbr in range(rows) :
row = []
self.cells.append(row)
for col_nbr in range(cols) :
if (row_nbr,col_nbr) in live :
row.append(Cell(row_nbr,col_nbr,True))
else :
row.append(Cell(row_nbr,col_nbr,False))
for row_nbr in range(rows) :
for col_nbr in range(cols) :
for row_offset in (-1,0,1) :
for col_offset in (-1,0,1) :
self.set_neighbours(
row_nbr,col_nbr,
row_offset,col_offset)
def set_neighbours(self,row_nbr,col_nbr,
row_offset,col_offset):
n_row = row_nbr + row_offset
n_col = col_nbr + col_offset
if 0 <= n_row < rows :
if 0 <= n_col < cols :
if (n_row != row_nbr) or \
(n_col != col_nbr) :
self.cells[row_nbr][col_nbr]\
.set_neighbour(
self.cells[n_row][n_col])

def show(self):
buffer = []
for row in range(self.rows) :
row_buffer = []
for col in range(self.cols) :
if self.cells[row][col].state == True :
row_buffer.append('*')
else :
row_buffer.append('_')
buffer.append("".join(row_buffer))
return "\n".join(buffer)

def cycle(self):
for row in range(self.rows) :
for col in range(self.cols) :
self.cells[row][col].cycle()
for row in range(self.rows) :
for col in range(self.cols) :
self.cells[row][col].switch()

if __name__ == "__main__" :
# print board.show()
start = time.time()
board = Board(5,5,((2,1),(2,2),(2,3)))
for i in range(100000) :
board.cycle()
stop = time.time()
print "Time:", stop - start

Game of Life: Functional way

def generate_board(n_row, n_col, live):
return tuple(tuple(
((row,col) in live, neighbours(row,col,n_row,n_col))
for col in range(n_col)) for row in range(n_row) )

def neighbours(row, col, n_row,n_col):
return tuple((r,c)
for r in range(max(0,row-1),min(n_row-1,row+1)+1)
for c in range(max(0,col-1),min(n_col-1,col+1)+1)
if (r != row) or (c != col))

def next_state(row, board):
for state,nbrs in row :
live = sum(1 for r, c in nbrs if board[r][c][0])
yield state,2<=live<= 3 if state else live == 3,nbrs

def cycle(board):
return tuple(tuple(
(new,nns) for _, new, nns in row)
for row in tuple(tuple(next_state(row,board))
for row in board))

def render(board):
return "\n".join(("".join('*' if col else '_'
for col,_ in row)) for row in board)

import time
start = time.time()
board = generate_board(5,5,((2,1),(2,2),(2,3)))
for i in range(100000) :
board = cycle(board)
stop = time.time()
print "Time:", stop - start

Characteristics of Functional Code

  • Code looks very different, often non idiomatic. Readibility may need to be reevaluated
  • Brain needs to be retaught.
  • Many small classes become tuples or dicts
  • Many methods will become functions
  • Larger function blocks likely to be broken down into many smaller functions
  • Most functions are likely to have no side-effects
    • Lesser likelihood of defects being introduced
    • Much easier to test