# Functions and Libraries

## Introduction

• A language should not include everything anyone could ever want
• Instead, it should allow developers to express every abstraction they want [Steele 1999]
• Define functions to create higher-level operations
• Group them in libraries to keep them manageable

## You Can Skip This Lecture If...

• You know how to define a function in Python
• You know what default parameter values are
• You know what `import` does
• You are familiar with the `sys`, `math`, and `os` libraries

## Defining Functions

• Define a new function using `def`
• Parameter names follow in parentheses
```
def double(x):
    return x * 2

print double(5)
print double(['basalt', 'granite'])
```
```
10
['basalt', 'granite', 'basalt', 'granite']
```
• Parameters have no declared types: `double` works on any value that supports `* 2`

## Returning Values

• Finish the function at any time using `return`
```
def sign(x):
    if x < 0:
        return -1
    if x == 0:
        return 0
    return 1
```
• Functions with `return` statements scattered through them are hard to understand
• Have to read the function line by line to figure out what it might do
• In general:
• Use early returns at the start of the function to handle special cases
• And then one return at the end to handle the general case
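The guideline above might look like this in practice. The function `mean`, and its choice of reporting `0.0` for an empty list, are assumptions made for illustration rather than part of the lecture's code; `print(...)` is written with parentheses so that the sketch runs under both Python 2 and Python 3.

```python
def mean(values):
    # Special case handled up front: an empty list has no mean,
    # so this sketch (by assumption) reports 0.0.
    if not values:
        return 0.0

    # General case: a single return at the end.
    total = 0.0
    for v in values:
        total += v
    return total / len(values)

print(mean([]))             # 0.0
print(mean([1, 2, 3, 6]))   # 3.0
```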

## Everything Returns Something

• Functions without explicit `return` statements return `None`
• And `return` on its own is the same as `return None`
```
def hello():
    print 'HELLO'

def world():
    print 'WORLD'
    return

print hello()
print world()
```
```
HELLO
None
WORLD
None
```
• The more consistent functions are about the types of the things they return, the better
• If a function can return `None`, an integer, or a list, the caller will have to write an `if` statement
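As a rough illustration of the point about consistent return types, the hypothetical function `find_igneous` below always returns a list, so its caller can loop over the result without first testing for `None`. The name and the rock list are assumptions chosen to match the lecture's examples.

```python
def find_igneous(rock_names):
    # Always return a list, even when nothing matches, so callers
    # can loop over the result without checking for None first.
    result = []
    for name in rock_names:
        if name in ['basalt', 'granite']:
            result.append(name)
    return result

# No 'if' is needed here: looping over an empty list does nothing.
for name in find_igneous(['shale', 'sandstone']):
    print(name)

print(len(find_igneous(['basalt', 'shale'])))   # 1
```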

## Scope

• Python manages variables using a call stack
```
# Global variable.
rock_type = 'unknown'

# Function that creates local variable.
def classify(rock_name):
    if rock_name in ['basalt', 'granite']:
        rock_type = 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        rock_type = 'sedimentary'
    else:
        rock_type = 'metamorphic'
    print 'in function, rock_type is', rock_type

# Call the function to prove that it uses its local 'rock_type'.
print "before function, rock_type is", rock_type
classify('sandstone')
print "after function, rock_type is", rock_type
```
```
before function, rock_type is unknown
in function, rock_type is sedimentary
after function, rock_type is unknown
```
• Figure 9.1: Call Stack

• When a function is called, Python creates a new stack frame
• A table of name/value pairs
• Parameters are just local variables that are automatically initialized
• When a variable is referenced, Python looks for it in:
• The top stack frame, then
• The global variables
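The lookup order described above can be seen in a small sketch. The function names `report` and `shadow` are hypothetical: the first only reads `rock_type`, so the lookup falls through to the global variable, while the second assigns to it and so gets a local variable of its own.

```python
rock_type = 'unknown'          # global variable

def report():
    # No local variable called rock_type exists in this frame,
    # so the lookup falls through to the global one.
    return 'report sees ' + rock_type

def shadow():
    # Assigning inside the function creates a new local variable
    # in this call's stack frame; the global is untouched.
    rock_type = 'igneous'
    return 'shadow sees ' + rock_type

print(report())                          # report sees unknown
print(shadow())                          # shadow sees igneous
print('global is still ' + rock_type)    # global is still unknown
```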

## Parameter Passing Rules

• Python copies variables' values when passing them to functions
• But remember: variables hold references to lists
• So the parameters are aliases
• Not an issue for strings, numbers, and Booleans, since they are immutable
```
def add_salt(first, second):
    first += "salt"
    second += ["salt"]

str = "rock"
seq = ["gneiss", "shale"]

print "before"
print "str is:", str
print "seq is:", seq

add_salt(str, seq)

print "after"
print "str is:", str
print "seq is:", seq
```
```
before
str is: rock
seq is: ['gneiss', 'shale']
after
str is: rock
seq is: ['gneiss', 'shale', 'salt']
```

Figure 9.2: Parameter Passing

## Making Copies

• To pass a copy of a list into a function, slice it
• `values[:]` is the same as `values[0:len(values)]`
• …which is a slice of `values` that includes the entire list…
• …and slicing creates a new list
```
def add_salt(first, second):
    first += "salt"
    second += ["salt"]

str = "rock"
seq = ["gneiss", "shale"]

print "before"
print "str is:", str
print "seq is:", seq

add_salt(str, seq[:])

print "after"
print "str is:", str
print "seq is:", seq
```
```
before
str is: rock
seq is: ['gneiss', 'shale']
after
str is: rock
seq is: ['gneiss', 'shale']
```

Figure 9.3: Passing Slices
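A short sketch of what slicing does and does not copy. The variable names are illustrative only. The last few lines show a limitation worth knowing about: a slice is a shallow copy, so lists nested inside the original are still shared with the copy.

```python
values = ['gneiss', 'shale']
duplicate = values[:]

print(duplicate == values)    # True: same contents
print(duplicate is values)    # False: two separate list objects

duplicate.append('salt')
print(len(values))            # 2: the original is unchanged

# The copy is shallow: inner lists are still shared.
nested = [['gneiss'], ['shale']]
shallow = nested[:]
shallow[0].append('salt')
print(nested[0] == ['gneiss', 'salt'])   # True: the inner list was shared
```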

## Default Parameter Values

• You can specify default values for parameters when defining a function
• Just “assign” some value to the parameter in the definition
```
def total(values, start=0, end=None):

    # If no values given, total is zero.
    if not values:
        return 0

    # If no end specified, use the entire sequence.
    if end is None:
        end = len(values)

    # Calculate.
    result = 0
    for i in range(start, end):
        result += values[i]
    return result
```
• The parameters actually passed when the function is called are matched up left to right
```
numbers = [10, 20, 30]
print "numbers being added:", numbers
print "total(numbers, 0, 3):", total(numbers, 0, 3)
print "total(numbers, 2):", total(numbers, 2)
print "total(numbers):", total(numbers)
```
```
numbers being added: [10, 20, 30]
total(numbers, 0, 3): 60
total(numbers, 2): 30
total(numbers): 60
```
• All parameters with defaults must come after all parameters without them
• Otherwise, matching values to parameters would be ambiguous
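The matching rules above can be tried out with a small sketch. The function `describe` and its parameters are hypothetical. The final part uses the built-in `compile` to show that Python rejects a definition in which a parameter without a default follows one with a default; the exact wording of Python's own error message varies between versions, so the sketch prints its own.

```python
def describe(name, kind='rock', hardness=None):
    # Build a description from whichever arguments were supplied.
    text = name + ' is a ' + kind
    if hardness is not None:
        text += ' of hardness ' + str(hardness)
    return text

print(describe('quartz'))                 # quartz is a rock
print(describe('quartz', 'mineral'))      # quartz is a mineral
print(describe('quartz', 'mineral', 7))   # quartz is a mineral of hardness 7

# A parameter without a default may not follow one with a default:
# the definition is rejected before it ever runs.
try:
    compile("def bad(start=0, values): pass", "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)                           # True
```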

## Functions Are Objects

• A function is just another object
• Happens to be an object you can call, just as strings and lists happen to be objects you can index
• `def` is just a shorthand for “create a function, and assign it to a variable”
```
def circumference(r):
    return 2 * 3.14159 * r

circ = circumference

print 'circumference(1.0):', circumference(1.0)
print 'circ(2.0):', circ(2.0)
```
```
circumference(1.0): 6.28318
circ(2.0): 12.56636
```
• Figure 9.4: Functions As Objects

• This means you can:
• Redefine functions (just as you can reassign values to variables)
• Create aliases for functions
• Pass functions as parameters
• Store functions in lists

## Function Object Examples

• Example: apply a function to each value in a list
```
def apply_to_list(function, values):
    result = []
    for v in values:
        temp = function(v)
        result.append(temp)
    return result

print apply_to_list(circumference, [0.1, 1.0, 10.0])
```
```
[0.62831800000000004, 6.2831799999999998, 62.831800000000001]
```
• Example: apply several functions to a single value
```
def area(r):
    return 3.14159 * r * r

def color(r):
    return "unknown"

def apply_each(functions, value):
    result = []
    for f in functions:
        temp = f(value)
        result.append(temp)
    return result

functions = [circumference, area, color]
print apply_each(functions, 1.0)
```
```
[6.2831799999999998, 3.1415899999999999, 'unknown']
```

## Function Attributes

• Every function has an attribute called `__name__`
• The name it was originally defined with
• Handy when debugging
```
def sedimentary(rock_name):
    return rock_name in ['sandstone', 'shale']

sed = sedimentary

print 'original name:', sedimentary.__name__
print 'name of alias:', sed.__name__
```
```
original name: sedimentary
name of alias: sedimentary
```
• Python uses leading and trailing double underscores to mark names that are reserved for its own use
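One way `__name__` can help when debugging is to label each result with the function that produced it. The helper `trace_each` and the function `perimeter` below are hypothetical names used only for this sketch.

```python
def area(r):
    return 3.14159 * r * r

def perimeter(r):
    return 2 * 3.14159 * r

def trace_each(functions, value):
    # Label each result with the name the function was defined with.
    lines = []
    for f in functions:
        label = f.__name__ + '(' + str(value) + ')'
        lines.append(label + ' = ' + str(f(value)))
    return lines

for line in trace_each([area, perimeter], 1.0):
    print(line)
```

Because the label comes from the function object itself, it stays correct even if the function is passed around under a different variable name.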

## Creating Modules

• Every Python file is automatically also a module (or library)
• After `import geology`, refer to the contents of `geology.py` as `geology.thing`
• Just like the methods and attributes of an object
• Put this in `geology.py`
```
def rock_type(rock_name):
    if rock_name in ['basalt', 'granite']:
        return 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        return 'sedimentary'
    else:
        return 'metamorphic'
```
• And this in `analysis.py`
```
import geology

for r in ['granite', 'gneiss']:
    print r, 'is', geology.rock_type(r)
```
• When `analysis.py` runs, it prints this
```
granite is igneous
gneiss is metamorphic
```

## Module Scope

• Each module is its own scope
• A function looks for names in its own stack frame first, then in the module where it was defined, and only then in Python's built-in names
• Put this in `outer.py`
```
manager = "Albus Dumbledore"
import inner
print "outer:", manager
print "inner:", inner.get_manager()
```
• And this in `inner.py`
```
manager = "Lucius Malfoy"
def get_manager():
    return manager
```
• Running `outer.py` produces this:
```
outer: Albus Dumbledore
inner: Lucius Malfoy
```

## Other Ways to Import

• `import geology as g`, then call `g.print_version()`
• `from geology import print_version`, then call `print_version()`
• `from geology import *` imports everything from `geology`
• Almost always a bad idea
• The next version of the module might add a function with the same name as something you're importing from elsewhere
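The name-collision risk can be seen without `import *` at all. This sketch uses the standard `math` and `cmath` modules, which both define `sqrt`; importing the second one silently replaces the first. With `import *`, the same replacement can happen without the name ever appearing in your own source code.

```python
import math as m                 # refer to contents as m.thing
from math import sqrt            # bring one name into this module

print(m.floor(2.5) == 2)         # True
print(sqrt(4.0))                 # 2.0

# A second import of the same name silently replaces the first.
from cmath import sqrt
print(sqrt(4.0))                 # (2+0j): now the complex version
```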

## Import Executes Statements

• `import` is a statement
• Executed when Python encounters it, just like any other statement
• The statements in a module are executed as it is loaded
• Assignment and `def` are statements
• You can use conditionals, loops, and anything else, too
• Put this in `geology.py`
```
print 'loading geology module'

def rock_type(rock_name):
    if rock_name in ['basalt', 'granite']:
        return 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        return 'sedimentary'
    else:
        return 'metamorphic'
```
• Then run `analysis.py`:
```
loading geology module
granite is igneous
gneiss is metamorphic
```

## Knowing Who You Are

• Inside a module, `__name__` is set to:
• The module's name, if it is being imported
• Or the string `"__main__"`, if it is the main program
• Often used to include self-tests in the module
• When the module is run from the command line, the self-tests are executed
• When it's loaded by other code, the tests are skipped
```
def is_rock(name):
    return name in ['basalt', 'granite', 'sandstone', 'shale']

if __name__ == '__main__':
    tests = [['basalt', True],  ['gingerale', False],
             [12345678, False], ['sandstone', True]]
    for (value, expected) in tests:
        actual = is_rock(value)
        if actual == expected:
            print 'pass'
        else:
            print 'fail'
```
```
$ python self_test.py
pass
pass
pass
pass
$ python
>>> import self_test
>>> self_test.is_rock('sugar')
False
```

## The System Library

• Most commonly used library in Python is the system library `sys`
• Information about the Python interpreter (e.g., version number and copyright notice)
• Information about the environment (e.g., what operating system the program is running on)
• Advanced features that mere mortals should never meddle with
| Type | Name | Purpose | Example | Result |
|---|---|---|---|---|
| Data | `argv` | The program's command line arguments | `sys.argv[0]` | `"myscript.py"` (or whatever your program is called) |
| | `maxint` | Largest positive value that can be represented by Python's basic integer type | `sys.maxint` | `2147483647` |
| | `path` | List of directories that Python searches when importing modules | `sys.path` | `['/home/greg/pylib', '/Python24/lib', '/Python24/lib/site-packages']` |
| | `platform` | What type of operating system Python is running on | `sys.platform` | `"win32"` |
| | `stdin` | Standard input | `sys.stdin.readline()` | (Typically) the next line of input from the keyboard |
| | `stdout` | Standard output | `sys.stdout.write('****')` | (Typically) print four stars to the screen |
| | `stderr` | Standard error | `sys.stderr.write('Program crashing!\n')` | Print an error message to the screen |
| | `version` | What version of Python this is | `sys.version` | `"2.4 (#60, Feb 9 2005, 19:03:27) [MSC v.1310 32 bit (Intel)]"` |
| Function | `exit` | Exit from Python, returning a status code to the operating system | `sys.exit(0)` | Terminates program with status 0 |

Table 9.1: The Python Runtime System Library

## Command-Line Arguments

• `sys.argv` contains the program's command-line arguments
• Program's name is always `sys.argv[0]`
```
import sys

for i in range(len(sys.argv)):
    print i, sys.argv[i]
```
```
$ python command_line.py
0 command_line.py
$ python command_line.py first second
0 command_line.py
1 first
2 second
```
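Since the program's name always occupies `sys.argv[0]`, code that only cares about the real arguments usually slices it off. The function `summarize` is a hypothetical helper; it takes the argument list as a parameter so that it can be tried with a hand-made list instead of a real command line.

```python
def summarize(argv):
    # argv[0] is the program's name; the real arguments follow it.
    program = argv[0]
    arguments = argv[1:]
    count = str(len(arguments))
    return program + ' received ' + count + ' argument(s)'

print(summarize(['command_line.py']))
print(summarize(['command_line.py', 'first', 'second']))
```

A real program would `import sys` and call `summarize(sys.argv)`.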

## Standard I/O

• `sys.stdin` and `sys.stdout` are standard input and output
• Normally connected to the keyboard and screen
• If you redirect, or use a pipe, the operating system connects them to files or other programs
• `sys.stderr` is connected to standard error
```
import sys

count = 0
for line in sys.stdin:
    count += 1

sys.stdout.write('read ' + str(count) + ' lines')
```
```
$ python standard_io.py < standard_io.py
read 7 lines
```
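Because `sys.stdin` is just an object that yields lines, line-counting code can be written against any such object and tried out without a keyboard or a redirect. In this sketch the helper `count_lines` is hypothetical, and an `io.StringIO` object stands in for redirected input.

```python
import sys
import io

def count_lines(stream):
    # Works on any object that yields lines when looped over,
    # including sys.stdin.
    count = 0
    for line in stream:
        count += 1
    return count

# A StringIO object stands in for redirected input here.
fake_input = io.StringIO(u'basalt\ngranite\nshale\n')
sys.stdout.write('read ' + str(count_lines(fake_input)) + ' lines\n')
```

Run as written, this reports three lines; a real program would call `count_lines(sys.stdin)`.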

## The Python Search Path

• `sys.path` is the list of places Python is allowed to look to find modules for import
• Initialized from the `PYTHONPATH` environment variable, plus defaults that depend on the installation
• Directory containing the program being run is automatically put at the start of this list
• If `sys.path` is `['/home/swc/lib', '/Python24/lib']`, then `import geology` will try:
• `./geology.py`
• `/home/swc/lib/geology.py`
• `/Python24/lib/geology.py`
• Then fail
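The search described above can be approximated in a few lines. This is a deliberately simplified sketch: the function `find_module_file` is hypothetical, and it only looks for a file called `name.py`, whereas the real import machinery also handles packages, compiled files, and more. A temporary directory is created so that the example does not depend on what is installed.

```python
import os
import tempfile

def find_module_file(name, search_path):
    # Try each directory in order and return the first match.
    for directory in search_path:
        candidate = os.path.join(directory, name + '.py')
        if os.path.isfile(candidate):
            return candidate
    # Nothing matched: report that with an empty string.
    return ''

# Build a scratch directory containing an empty geology.py.
scratch = tempfile.mkdtemp()
module_file = os.path.join(scratch, 'geology.py')
open(module_file, 'w').close()

found = find_module_file('geology', ['/no/such/directory', scratch])
print(found == module_file)      # True: second directory matched

missing = find_module_file('chemistry', [scratch])
print(missing == '')             # True: every directory was tried

# Clean up the scratch files.
os.remove(module_file)
os.rmdir(scratch)
```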

## Exiting

• `sys.exit` terminates the program
• Returns an integer status code to the operating system
• 0 indicates successful execution (“zero errors”)
• Non-zero is an error code
• Yes, it's the opposite of what you'd expect…
• If you don't exit explicitly, Python returns 0 when the program finishes normally
• So when your program detects a problem, please use `sys.exit(1)` or something similar so that the operating system knows something has gone wrong
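A sketch of how a program might choose its exit status. The names `KNOWN_ROCKS`, `count_unknown`, and `exit_status` are hypothetical. `sys.exit` works by raising a `SystemExit` exception, which is caught here only so that the example can display the status instead of terminating.

```python
import sys

KNOWN_ROCKS = ['basalt', 'granite', 'sandstone', 'shale']

def count_unknown(names):
    # Report each unrecognized name on standard error and count them.
    errors = 0
    for name in names:
        if name not in KNOWN_ROCKS:
            sys.stderr.write('unknown rock: ' + name + '\n')
            errors += 1
    return errors

def exit_status(names):
    # 0 means "zero errors"; anything else signals a problem.
    if count_unknown(names) > 0:
        return 1
    return 0

# Caught only for display; a real program would let it propagate.
try:
    sys.exit(exit_status(['basalt', 'gingerale']))
except SystemExit as error:
    print('exit status: ' + str(error.code))   # exit status: 1
```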

## The Math Library

• Much of Python's standard library is just wrappers around standard C libraries
• See later how to wrap libraries yourself
• Example: the `math` library
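| Type | Name | Purpose | Example | Result |
|---|---|---|---|---|
| Constant | `e` | Base of natural logarithms | `e` | `2.71828…` |
| | `pi` | Ratio of a circle's circumference to its diameter | `pi` | `3.14159…` |
| Function | `ceil` | Ceiling | `ceil(2.5)` | `3.0` |
| | `floor` | Floor | `floor(-2.5)` | `-3.0` |
| | `exp` | Exponential | `exp(1.0)` | `2.71828…` |
| | `log` | Logarithm (natural, unless a base is given) | `log(4.0)` | `1.38629…` |
| | | | `log(4.0, 2.0)` | `2.0` |
| | `log10` | Base-10 logarithm | `log10(4.0)` | `0.60205…` |
| | `pow` | Power | `pow(2.5, 2.0)` | `6.25` |
| | `sqrt` | Square root | `sqrt(9.0)` | `3.0` |
| | `cos` | Cosine | `cos(pi)` | `-1.0` |
| | `asin` | Arc sine | `asin(-1.0)` | `-1.5707…` |
| | `hypot` | Euclidean norm √(x² + y²) | `hypot(2, 3)` | `3.60555…` |
| | `degrees` | Convert from radians to degrees | `degrees(pi)` | `180.0` |
| | `radians` | Convert from degrees to radians | `radians(45)` | `0.78539…` |

Table 9.2: The Python Math Library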
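As a small example of combining these functions, the hypothetical function `polar` below converts a point to a distance and an angle. It uses `math.atan2`, which is not listed in the table above but belongs to the same module.

```python
import math

def polar(x, y):
    # Distance from the origin, and angle in degrees measured
    # counter-clockwise from the positive x axis.
    distance = math.hypot(x, y)
    angle = math.degrees(math.atan2(y, x))
    return distance, angle

distance, angle = polar(3.0, 4.0)
print('distance: %.3f' % distance)     # distance: 5.000
print('angle: %.3f degrees' % angle)   # angle: 53.130 degrees
```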

## Working with the File System

• The `os` module is an interface between Python and the operating system
• Tries to hide the differences between different operating systems
• But there's only so much it can do
| Type | Name | Purpose | Example | Result |
|---|---|---|---|---|
| Constant | `curdir` | The symbolic name for the current directory. | `os.curdir` | `.` on Linux or Windows. |
| | `pardir` | The symbolic name for the parent directory. | `os.pardir` | `..` on Linux or Windows. |
| | `sep` | The separator character used in paths. | `os.sep` | `/` on Linux, `\` on Windows. |
| | `linesep` | The end-of-line marker used in text files. | `os.linesep` | `\n` on Linux, `\r\n` on Windows. |
| Function | `listdir` | List the contents of a directory. | `os.listdir('/tmp')` | The names of all the files and directories in `/tmp` (except `.` and `..`). |
| | `mkdir` | Create a new directory. | `os.mkdir('/tmp/scratch')` | Make the directory `/tmp/scratch`. Use `os.makedirs` to make several directories at once. |
| | `remove` | Delete a file. | `os.remove('/tmp/workingfile.txt')` | Delete the file `/tmp/workingfile.txt`. |
| | `rename` | Rename (or move) a file or directory. | `os.rename('/tmp/scratch.txt', '/home/swc/data/important.txt')` | Move the file `/tmp/scratch.txt` to `/home/swc/data/important.txt`. |
| | `rmdir` | Remove a directory. | `os.rmdir('/home/swc')` | Probably not something you want to do… Use `os.removedirs` to remove several directories at once. |
| | `stat` | Get information about a file or directory. | `os.stat('/home/swc/data/important.txt')` | Find out when `important.txt` was created, how large it is, etc. |

Table 9.3: The Python Operating System Library
```
import sys, os

print 'initial working directory:', os.getcwd()
os.chdir(sys.argv[1])
print 'moved to:', os.getcwd()
print 'contents:', os.listdir(os.curdir)
```
```
$ python os_example.py ~/swc
initial working directory: /home/dmalfoy/swc/lec/inc/py03
moved to: /home/dmalfoy/swc
contents: ['.svn', 'conf', 'config.mk', 'data', 'depend.mk', 'thesis']
```

## File and Directory Status

• `os.stat` returns an object whose members have information about a file or directory, including:
• `st_size`: size in bytes
• `st_atime`: time of most recent access
• `st_mtime`: time of most recent modification
```
import sys
import os

for filename in sys.argv[1:]:
    status = os.stat(filename)
    print filename, status.st_size, status.st_atime
```
```
$ python stat_file.py . stat_file.py
. 0 1137971715
stat_file.py 141 1137971715
```
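The times reported by `os.stat` are seconds since the epoch, which are hard to read. One option, sketched here with a hypothetical helper called `describe`, is to convert them with the standard `time.ctime` function. The output depends on the machine and the moment it is run, so none is shown.

```python
import os
import time

def describe(path):
    # Format size and last-modification time for one file or directory.
    status = os.stat(path)
    modified = time.ctime(status.st_mtime)
    size = str(status.st_size)
    return path + ': ' + size + ' bytes, modified ' + modified

print(describe(os.curdir))
```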

## Manipulating Pathnames

• `os` has a submodule called `os.path`
• Manipulate pathnames correctly and efficiently
• Do not write your own functions for this—the rules are trickier than you think
| Type | Name | Purpose | Example | Result |
|---|---|---|---|---|
| Function | `abspath` | Create normalized absolute pathnames. | `os.path.abspath('../jeevan/bin/script.py')` | `/home/jeevan/bin/script.py` (if executed in `/home/gvwilson`) |
| | `basename` | Return the last portion of a path (i.e., the filename, or the last directory name). | `os.path.basename('/tmp/scratch/junk.data')` | `junk.data` |
| | `dirname` | Return all but the last portion of a path. | `os.path.dirname('/tmp/scratch/junk.data')` | `/tmp/scratch` |
| | `exists` | Return `True` if a pathname refers to an existing file or directory. | `os.path.exists('./scribble.txt')` | `True` if there is a file called `scribble.txt` in the current working directory, `False` otherwise. |
| | `getatime` | Get the last access time of a file or directory (like `os.stat`). | `os.path.getatime('.')` | `1112109573` (which means that the current directory was last read or written at 10:19:33 EST on March 29, 2005). |
| | `getmtime` | Get the last modification time of a file or directory (like `os.stat`). | `os.path.getmtime('.')` | `1112109502` (which means that the current directory was last modified 71 seconds before the time shown above). |
| | `getsize` | Get the size of something in bytes (like `os.stat`). | `os.path.getsize('py03.swc')` | `29662`. |
| | `isabs` | `True` if its argument is an absolute pathname. | `os.path.isabs('tmp/data.txt')` | `False` |
| | `isfile` | `True` if its argument identifies an existing file. | `os.path.isfile('tmp/data.txt')` | `True` if a file called `./tmp/data.txt` exists, and `False` otherwise. |
| | `isdir` | `True` if its argument identifies an existing directory. | `os.path.isdir('tmp')` | `True` if the current directory has a subdirectory called `tmp`. |
| | `join` | Join pathname fragments to create a full pathname. | `os.path.join('/tmp', 'scratch', 'data.txt')` | `"/tmp/scratch/data.txt"` |
| | `normpath` | Normalize a pathname (i.e., remove redundant slashes, uses of `.` and `..`, etc.). | `os.path.normpath('tmp/scratch/../other/file.txt')` | `"tmp/other/file.txt"` |
| | `split` | Return both of the values returned by `os.path.dirname` and `os.path.basename`. | `os.path.split('/tmp/scratch.dat')` | `('/tmp', 'scratch.dat')` |
| | `splitext` | Split a path into two pieces `root` and `ext`, such that `ext` is the last piece beginning with a `"."`. | `os.path.splitext('/tmp/scratch.dat')` | `('/tmp/scratch', '.dat')` |

Table 9.4: The Python Pathname Library
```
import os

print 'does /home/swc exist?', os.path.exists('/home/swc')
print 'is it a directory?', os.path.isdir('/home/swc')
print 'what is its configuration directory?', os.path.join('/home/swc', 'conf')
print 'where is the configuration file?', os.path.split('/home/swc/conf/current.conf')
```
```
does /home/swc exist? True
is it a directory? True
what is its configuration directory? /home/swc\conf
where is the configuration file? ('/home/swc/conf', 'current.conf')
```
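The `os.path` functions are often combined to derive one pathname from another. In this sketch the helpers `backup_name` and `sibling`, and the `.backup` naming scheme, are assumptions made for illustration. Neither function touches the file system; they only manipulate strings.

```python
import os

def backup_name(path):
    # Split '/tmp/scratch.dat' into ('/tmp/scratch', '.dat') and
    # rebuild it as '/tmp/scratch.backup.dat'.
    root, ext = os.path.splitext(path)
    return root + '.backup' + ext

def sibling(path, new_name):
    # Replace the last component of 'path' with 'new_name'.
    directory = os.path.dirname(path)
    return os.path.join(directory, new_name)

print(backup_name('/tmp/scratch.dat'))     # /tmp/scratch.backup.dat

neighbor = sibling('/tmp/scratch/junk.data', 'clean.data')
print(os.path.basename(neighbor))          # clean.data
print(os.path.dirname(neighbor))           # /tmp/scratch
```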

## Summary

• The real measure of a programming language is how well it supports modularization
• Use functions, libraries, and other techniques to keep programs comprehensible
• Remember, you're really writing them for other human beings

## Exercises

Exercise 9.1:

Write a function that takes two strings called `text` and `fragment` as arguments, and returns the number of times `fragment` appears in the second half of `text`. Your function must not create a copy of the second half of `text`. (Hint: read the documentation for `string.count`.)

Exercise 9.2:

What does the Python keyword `global` do? What are some reasons not to write code that uses it?

Exercise 9.3:

Python allows you to import all the functions and variables in a module at once, making them local names. For example, if the module is called `values`, and contains a variable called `Threshold` and a function called `limit`, then after the statement `from values import *`, you can refer directly to `Threshold` and `limit`, rather than having to use `values.Threshold` or `values.limit`. Explain why this is generally considered a bad thing to do, even though it reduces the amount programmers have to type.

Exercise 9.4:

`sys.stdin`, `sys.stdout`, and `sys.stderr` are variables, which means that you can assign to them. For example, if you want to change where `print` sends its output, you can do this:

```
import sys

print 'this goes to stdout'
temp = sys.stdout
sys.stdout = open('temporary.txt', 'w')
print 'this goes to temporary.txt'
sys.stdout = temp
```

Do you think this is a good programming practice? When and why do you think its use might be justified?

Exercise 9.5:

`os.stat(path)` returns an object whose members describe various properties of the file or directory identified by `path`. Using this, write a function that will determine whether or not a file is more than one year old.

Exercise 9.6:

Write a Python program that takes two years (such as 1997 and 2007) as its arguments, and prints out the number of days between the 15th of each month, from January of the first year until December of the last year.

Exercise 9.7:

Write a simple version of `which` in Python. Your program should check each directory on the caller's path (in order) to find an executable program that has the name given to it on the command line.

Exercise 9.8:

In the default parameter value example, why does `total` use a default value of `None` for `end`, rather than an integer such as 0 or -1?

Exercise 9.9:

What does the `*` in front of the parameter `extras` mean in the following code example?

```
def total(*extras):
    result = 0
    for e in extras:
        result += e
    return result
```

Hint: look at the following three examples:

```
print total()
print total(19)
print total(2, 3, 5)
```

Exercise 9.10:

Use the `os.path`, `stat`, and `time` modules to write a program that finds all files in a directory whose names end with a specific suffix, and which are more than a certain number of days old. For example, if your program is run as `oldfiles /tmp .backup 10`, it will print a list of all files in the `/tmp` directory whose names end in `.backup` that are more than 10 days old.

Exercise 9.11:

The previous lecture ended by showing several different ways to copy files using Python. Read the documentation for the `shutil` module, and see if there's a simpler way.

Exercise 9.12:

Consider the short program shown below:

```def add_and_max(new_value, collection=[]):
    collection.append(new_value)
    return max(collection)