Added sections 2-4
This commit is contained in:
11
Notes/03_Program_organization/00_Overview.md
Normal file
@@ -0,0 +1,11 @@
|
||||
# Overview
|
||||
|
||||
In this section you will learn:
|
||||
|
||||
* How to organize larger programs.
|
||||
* Defining and working with functions.
|
||||
* Exceptions and Error handling.
|
||||
* Basic module management.
|
||||
* Script writing.
|
||||
|
||||
Python is great for short scripts, one-off problems, prototyping, testing, etc.
|
||||
275
Notes/03_Program_organization/01_Script.md
Normal file
@@ -0,0 +1,275 @@
|
||||
# 3.1 Python Scripting
|
||||
|
||||
In this part we look more closely at the practice of writing Python
|
||||
scripts.
|
||||
|
||||
### What is a Script?
|
||||
|
||||
A *script* is a program that runs a series of statements and stops.
|
||||
|
||||
```python
|
||||
# program.py
|
||||
|
||||
statement1
|
||||
statement2
|
||||
statement3
|
||||
...
|
||||
```
|
||||
|
||||
We have been writing scripts to this point.
|
||||
|
||||
### A Problem
|
||||
|
||||
If you write a useful script, it will grow in features and
|
||||
functionality. You may want to apply it to other related problems.
|
||||
Over time, it might become a critical application. And if you don't
|
||||
take care, it might turn into a huge tangled mess. So, let's get
|
||||
organized.
|
||||
|
||||
### Defining Things
|
||||
|
||||
You must always define things before they get used later on in a program.
|
||||
|
||||
```python
|
||||
def square(x):
|
||||
return x*x
|
||||
|
||||
a = 42
|
||||
b = a + 2 # Requires that `a` is defined
|
||||
|
||||
z = square(b) # Requires `square` and `b` to be defined
|
||||
```
|
||||
|
||||
**The order is important.**
|
||||
You almost always put the definitions of variables and functions near the beginning.
|
||||
|
||||
### Defining Functions
|
||||
|
||||
It is a good idea to put all of the code related to a single *task* in one place.

```python
import csv

def read_prices(filename):
    prices = {}
    with open(filename) as f:
        f_csv = csv.reader(f)
        for row in f_csv:
            prices[row[0]] = float(row[1])
    return prices
```
|
||||
|
||||
A function also simplifies repeated operations.
|
||||
|
||||
```python
|
||||
oldprices = read_prices('oldprices.csv')
|
||||
newprices = read_prices('newprices.csv')
|
||||
```
|
||||
|
||||
### What is a Function?
|
||||
|
||||
A function is a named sequence of statements.
|
||||
|
||||
```python
|
||||
def funcname(args):
|
||||
statement
|
||||
statement
|
||||
...
|
||||
return result
|
||||
```
|
||||
|
||||
*Any* Python statement can be used inside.
|
||||
|
||||
```python
|
||||
def foo():
|
||||
import math
|
||||
print(math.sqrt(2))
|
||||
help(math)
|
||||
```
|
||||
|
||||
There are no *special* statements in Python.
|
||||
|
||||
### Function Definition
|
||||
|
||||
Functions can be *defined* in any order.
|
||||
|
||||
```python
def foo(x):
    bar(x)

def bar(x):
    statements

# OR
def bar(x):
    statements

def foo(x):
    bar(x)
```
|
||||
|
||||
Functions must only be defined before they are actually *used* (or called) during program execution.
|
||||
|
||||
```python
|
||||
foo(3) # foo must be defined already
|
||||
```
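
As a small runnable sketch of this rule (the names `area` and `square` are illustrative and not part of the course files): the body of `area` mentions `square` before `square` exists. That is fine, because the name is only looked up when `area` is actually called.

```python
def area(side):
    # `square` is looked up when this line runs, not when `area` is defined
    return square(side)

try:
    area(3)                  # `square` does not exist yet
except NameError as e:
    print('Too early:', e)

def square(x):
    return x * x

print(area(3))               # 9, now that `square` is defined
```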
|
||||
|
||||
Stylistically, it is probably more common to see functions defined in a *bottom-up* fashion.
|
||||
|
||||
### Bottom-up Style
|
||||
|
||||
Functions are treated as building blocks.
|
||||
The smaller/simpler blocks go first.
|
||||
|
||||
```python
|
||||
# myprogram.py
|
||||
def foo(x):
|
||||
...
|
||||
|
||||
def bar(x):
|
||||
...
|
||||
foo(x) # Defined above
|
||||
...
|
||||
|
||||
def spam(x):
|
||||
...
|
||||
bar(x) # Defined above
|
||||
...
|
||||
|
||||
spam(42) # Code that uses the functions appears at the end
|
||||
```
|
||||
|
||||
Later functions build upon earlier functions.
|
||||
|
||||
### Function Design
|
||||
|
||||
Ideally, functions should be a *black box*.
|
||||
They should only operate on passed inputs and avoid global variables
|
||||
and mysterious side-effects. Main goals: *Modularity* and *Predictability*.
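
To make the contrast concrete, here is a minimal sketch. `TAX_RATE`, `total_with_global` and `total` are hypothetical names chosen for illustration only.

```python
TAX_RATE = 0.2               # Hypothetical global setting

def total_with_global(price):
    # The result depends on a global that any other code may change
    return price * (1 + TAX_RATE)

def total(price, tax_rate):
    # Black box: the result depends only on the arguments
    return price * (1 + tax_rate)

print(total(100, 0.25))      # 125.0
```

The second version is easier to test and reuse because everything it needs arrives through its arguments.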
|
||||
|
||||
### Doc Strings
|
||||
|
||||
A good practice is to include documentation in the form of
doc-strings. A doc-string is a string written as the first statement
of the function body, immediately after the `def` line. Doc-strings feed `help()`, IDEs and other tools.
|
||||
|
||||
```python
|
||||
def read_prices(filename):
|
||||
'''
|
||||
Read prices from a CSV file of name,price
|
||||
'''
|
||||
prices = {}
|
||||
with open(filename) as f:
|
||||
f_csv = csv.reader(f)
|
||||
for row in f_csv:
|
||||
prices[row[0]] = float(row[1])
|
||||
return prices
|
||||
```
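
The doc-string is stored on the function object, which is where tools find it. A minimal sketch, using an illustrative `square` function rather than `read_prices` so that it runs without a data file:

```python
import inspect

def square(x):
    '''
    Return x multiplied by itself.
    '''
    return x * x

print(square.__doc__)            # The raw doc-string, including indentation
print(inspect.getdoc(square))    # Return x multiplied by itself.
```

Calling `help(square)` at the interactive prompt displays the same text.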
|
||||
|
||||
### Type Annotations
|
||||
|
||||
You can also add some optional type annotations to your function definitions.
|
||||
|
||||
```python
|
||||
def read_prices(filename: str) -> dict:
|
||||
'''
|
||||
Read prices from a CSV file of name,price
|
||||
'''
|
||||
prices = {}
|
||||
with open(filename) as f:
|
||||
f_csv = csv.reader(f)
|
||||
for row in f_csv:
|
||||
prices[row[0]] = float(row[1])
|
||||
return prices
|
||||
```
|
||||
|
||||
The annotations do nothing at run time. They are purely informational.
|
||||
They may be used by IDEs, code checkers, etc.
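
A short sketch showing that annotations are recorded but not enforced. The function `double` is a made-up example.

```python
def double(x: int) -> int:
    return x * 2

print(double.__annotations__)   # {'x': <class 'int'>, 'return': <class 'int'>}
print(double('ab'))             # abab  (the str argument is not rejected)
```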
|
||||
|
||||
## Exercises
|
||||
|
||||
In section 2, you wrote a program called `report.py` that printed out a report showing the performance of a stock portfolio.
|
||||
This program consisted of some functions. For example:
|
||||
|
||||
```python
|
||||
# report.py
|
||||
import csv
|
||||
|
||||
def read_portfolio(filename):
|
||||
'''
|
||||
Read a stock portfolio file into a list of dictionaries with keys
|
||||
name, shares, and price.
|
||||
'''
|
||||
portfolio = []
|
||||
with open(filename) as f:
|
||||
rows = csv.reader(f)
|
||||
headers = next(rows)
|
||||
|
||||
for row in rows:
|
||||
record = dict(zip(headers, row))
|
||||
stock = {
|
||||
'name' : record['name'],
|
||||
'shares' : int(record['shares']),
|
||||
'price' : float(record['price'])
|
||||
}
|
||||
portfolio.append(stock)
|
||||
return portfolio
|
||||
...
|
||||
```
|
||||
|
||||
However, there were also portions of the program that just performed a series of scripted calculations.
|
||||
This code appeared near the end of the program. For example:
|
||||
|
||||
```python
|
||||
...
|
||||
|
||||
# Output the report
|
||||
|
||||
headers = ('Name', 'Shares', 'Price', 'Change')
|
||||
print('%10s %10s %10s %10s' % headers)
|
||||
print(('-' * 10 + ' ') * len(headers))
|
||||
for row in report:
|
||||
print('%10s %10d %10.2f %10.2f' % row)
|
||||
...
|
||||
```
|
||||
|
||||
In this exercise, we’re going to take this program and organize it a little more strongly around the use of functions.
|
||||
|
||||
### (a) Structuring a program as a collection of functions
|
||||
|
||||
Modify your `report.py` program so that all major operations,
|
||||
including calculations and output, are carried out by a collection of
|
||||
functions. Specifically:
|
||||
|
||||
* Create a function `print_report(report)` that prints out the report.
|
||||
* Change the last part of the program so that it is nothing more than a series of function calls and no other computation.
|
||||
|
||||
### (b) Creating a function for program execution
|
||||
|
||||
Take the last part of your program and package it into a single function `portfolio_report(portfolio_filename, prices_filename)`.
|
||||
Have the function work so that the following function call creates the report as before:
|
||||
|
||||
```python
|
||||
portfolio_report('Data/portfolio.csv', 'Data/prices.csv')
|
||||
```
|
||||
|
||||
In this final version, your program will be nothing more than a series
|
||||
of function definitions followed by a single function call to
|
||||
`portfolio_report()` at the very end (which executes all of the steps
|
||||
involved in the program).
|
||||
|
||||
By turning your program into a single function, it becomes easy to run it on different inputs.
|
||||
For example, try these statements interactively after running your program:
|
||||
|
||||
```python
>>> portfolio_report('Data/portfolio2.csv', 'Data/prices.csv')
... look at the output ...
>>> files = ['Data/portfolio.csv', 'Data/portfolio2.csv']
>>> for name in files:
        print(f'{name:-^43s}')
        portfolio_report(name, 'Data/prices.csv')
        print()

... look at the output ...
>>>
```
|
||||
|
||||
[Next](02_More_functions)
|
||||
491
Notes/03_Program_organization/02_More_functions.md
Normal file
@@ -0,0 +1,491 @@
|
||||
# 3.2 More on Functions
|
||||
|
||||
This section fills in a few more details about how functions work and are defined.
|
||||
|
||||
### Calling a Function
|
||||
|
||||
Consider this function:
|
||||
|
||||
```python
|
||||
def read_prices(filename, debug):
|
||||
...
|
||||
```
|
||||
|
||||
You can call the function with positional arguments:
|
||||
|
||||
```python
|
||||
prices = read_prices('prices.csv', True)
|
||||
```
|
||||
|
||||
Or you can call the function with keyword arguments:
|
||||
|
||||
```python
|
||||
prices = read_prices(filename='prices.csv', debug=True)
|
||||
```
|
||||
|
||||
### Default Arguments
|
||||
|
||||
Sometimes you want an optional argument.
|
||||
|
||||
```python
|
||||
def read_prices(filename, debug=False):
|
||||
...
|
||||
```
|
||||
|
||||
If a default value is assigned, the argument is optional in function calls.
|
||||
|
||||
```python
|
||||
d = read_prices('prices.csv')
|
||||
e = read_prices('prices.dat', True)
|
||||
```
|
||||
|
||||
*Note: Arguments with defaults must appear at the end of the arguments list (all non-optional arguments go first).*
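
The following sketch illustrates both points. The body of `read_prices` here is a stub that just echoes its arguments, an assumption made so the example runs without a data file.

```python
def read_prices(filename, debug=False):
    # Stub body for illustration; the real function would read the file
    return (filename, debug)

print(read_prices('prices.csv'))               # ('prices.csv', False)
print(read_prices('prices.csv', debug=True))   # ('prices.csv', True)

# Putting a defaulted argument first is rejected when the `def` is compiled
try:
    compile('def bad(debug=False, filename): pass', '<example>', 'exec')
except SyntaxError as e:
    print('SyntaxError:', e.msg)
```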
|
||||
|
||||
### Prefer keyword arguments for optional arguments
|
||||
|
||||
Compare and contrast these two different calling styles:
|
||||
|
||||
```python
|
||||
parse_data(data, False, True) # ?????
|
||||
|
||||
parse_data(data, ignore_errors=True)
|
||||
parse_data(data, debug=True)
|
||||
parse_data(data, debug=True, ignore_errors=True)
|
||||
```
|
||||
|
||||
Keyword arguments improve code clarity.
|
||||
|
||||
### Design Best Practices
|
||||
|
||||
Always give short, but meaningful names to function arguments.
|
||||
|
||||
Someone using a function may want to use the keyword calling style.
|
||||
|
||||
```python
|
||||
d = read_prices('prices.csv', debug=True)
|
||||
```
|
||||
|
||||
Python development tools will show the names in help features and documentation.
|
||||
|
||||
### Return Values
|
||||
|
||||
The `return` statement returns a value.
|
||||
|
||||
```python
|
||||
def square(x):
|
||||
return x * x
|
||||
```
|
||||
|
||||
If `return` is given no value, or there is no `return` statement at all, `None` is returned.
|
||||
|
||||
```python
|
||||
def bar(x):
|
||||
statements
|
||||
return
|
||||
|
||||
a = bar(4) # a = None
|
||||
|
||||
# OR
|
||||
def foo(x):
|
||||
statements # No `return`
|
||||
|
||||
b = foo(4) # b = None
|
||||
```
|
||||
|
||||
### Multiple Return Values
|
||||
|
||||
Functions can only return one value.
|
||||
However, a function may return multiple values by returning a tuple.
|
||||
|
||||
```python
|
||||
def divide(a,b):
|
||||
q = a // b # Quotient
|
||||
r = a % b # Remainder
|
||||
return q, r # Return a tuple
|
||||
```
|
||||
|
||||
Usage example:
|
||||
|
||||
```python
|
||||
x, y = divide(37,5) # x = 7, y = 2
|
||||
|
||||
x = divide(37, 5) # x = (7, 2)
|
||||
```
|
||||
|
||||
### Variable Scope
|
||||
|
||||
Programs assign values to variables.
|
||||
|
||||
```python
|
||||
x = value # Global variable
|
||||
|
||||
def foo():
|
||||
y = value # Local variable
|
||||
```
|
||||
|
||||
Variable assignments occur outside and inside function definitions.
|
||||
Variables defined outside are "global". Variables inside a function are "local".
|
||||
|
||||
### Local Variables
|
||||
|
||||
Variables inside functions are private.
|
||||
|
||||
```python
|
||||
def read_portfolio(filename):
|
||||
portfolio = []
|
||||
for line in open(filename):
|
||||
fields = line.split()
|
||||
s = (fields[0],int(fields[1]),float(fields[2]))
|
||||
portfolio.append(s)
|
||||
return portfolio
|
||||
```
|
||||
|
||||
In this example, `filename`, `portfolio`, `line`, `fields` and `s` are local variables.
|
||||
Those variables are not retained or accessible after the function call.
|
||||
|
||||
```pycon
|
||||
>>> stocks = read_portfolio('stocks.dat')
|
||||
>>> fields
|
||||
Traceback (most recent call last):
|
||||
  File "<stdin>", line 1, in <module>
|
||||
NameError: name 'fields' is not defined
|
||||
>>>
|
||||
```
|
||||
|
||||
They also can't conflict with variables found elsewhere.
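
A minimal sketch of that last point, using a hypothetical `split_line` function: the local `fields` inside the function and the global `fields` outside are two separate variables.

```python
fields = 'global value'

def split_line(line):
    fields = line.split()    # A new local `fields`; the global is untouched
    return fields

print(split_line('AA 100 32.20'))   # ['AA', '100', '32.20']
print(fields)                       # global value
```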
|
||||
|
||||
### Global Variables
|
||||
|
||||
Functions can freely access the values of globals.
|
||||
|
||||
```python
|
||||
name = 'Dave'
|
||||
|
||||
def greeting():
|
||||
print('Hello', name) # Using `name` global variable
|
||||
```
|
||||
|
||||
However, assigning to a name inside a function does not modify the global of the same name:
|
||||
|
||||
```python
|
||||
name = 'Dave'
|
||||
|
||||
def spam():
|
||||
name = 'Guido'
|
||||
|
||||
spam()
|
||||
print(name) # prints 'Dave'
|
||||
```
|
||||
|
||||
**Remember: All assignments in functions are local.**
|
||||
|
||||
### Modifying Globals
|
||||
|
||||
If you must modify a global variable you must declare it as such.
|
||||
|
||||
```python
|
||||
name = 'Dave'
|
||||
def spam():
|
||||
global name
|
||||
name = 'Guido' # Changes the global name above
|
||||
```
|
||||
|
||||
The global declaration must appear before its use. Having seen this,
know that it is considered poor form. In fact, try to avoid `global` entirely
if you can. If you need a function to modify some kind of state outside
of the function, it's better to use a class instead (more on this later).
|
||||
|
||||
### Argument Passing
|
||||
|
||||
When you call a function, the argument variables are names for passed values.
|
||||
If mutable data types are passed (e.g. lists, dicts), they can be modified *in-place*.
|
||||
|
||||
```python
|
||||
def foo(items):
|
||||
items.append(42) # Modifies the input object
|
||||
|
||||
a = [1, 2, 3]
|
||||
foo(a)
|
||||
print(a) # [1, 2, 3, 42]
|
||||
```
|
||||
|
||||
**Key point: Functions don't receive a copy of the input arguments.**
|
||||
|
||||
### Reassignment vs Modifying
|
||||
|
||||
Make sure you understand the subtle difference between modifying a value and reassigning a variable name.
|
||||
|
||||
```python
|
||||
def foo(items):
|
||||
items.append(42) # Modifies the input object
|
||||
|
||||
a = [1, 2, 3]
|
||||
foo(a)
|
||||
print(a) # [1, 2, 3, 42]
|
||||
|
||||
# VS
|
||||
def bar(items):
|
||||
items = [4,5,6] # Reassigns `items` variable
|
||||
|
||||
b = [1, 2, 3]
|
||||
bar(b)
|
||||
print(b) # [1, 2, 3]
|
||||
```
|
||||
|
||||
*Reminder: Variable assignment never overwrites memory. The name is simply bound to a new value.*
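
One way to observe the difference is to compare object identity with `is`. This is only a sketch; `append_item` and `rebind` are illustrative names that return their local `items` so the caller can inspect it.

```python
def append_item(items):
    items.append(42)         # Mutates the list the caller passed in
    return items

def rebind(items):
    items = [4, 5, 6]        # Binds the local name to a brand-new list
    return items

a = [1, 2, 3]
print(append_item(a) is a)   # True: same object, now [1, 2, 3, 42]
print(rebind(a) is a)        # False: `a` still refers to the original list
print(a)                     # [1, 2, 3, 42]
```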
|
||||
|
||||
## Exercises
|
||||
|
||||
This exercise involves a lot of steps and putting concepts together from past exercises.
|
||||
The final solution is only about 25 lines of code, but take your time and make sure you understand each part.
|
||||
|
||||
A central part of your `report.py` program focuses on the reading of
|
||||
CSV files. For example, the function `read_portfolio()` reads a file
|
||||
containing rows of portfolio data and the function `read_prices()`
|
||||
reads a file containing rows of price data. In both of those
|
||||
functions, there are a lot of low-level "fiddly" bits and similar
|
||||
features. For example, they both open a file and wrap it with the
|
||||
`csv` module and they both convert various fields into new types.
|
||||
|
||||
If you were doing a lot of file parsing for real, you’d probably want
|
||||
to clean some of this up and make it more general purpose. That's
|
||||
our goal.
|
||||
|
||||
Start this exercise by creating a new file called `fileparse.py`. This is where we will be doing our work.
|
||||
|
||||
### (a) Reading CSV Files
|
||||
|
||||
To start, let’s just focus on the problem of reading a CSV file into a
|
||||
list of dictionaries. In the file `fileparse.py`, define a simple
|
||||
function that looks like this:
|
||||
|
||||
```python
# fileparse.py
import csv

def parse_csv(filename):
    '''
    Parse a CSV file into a list of records
    '''
    with open(filename) as f:
        rows = csv.reader(f)

        # Read the file headers
        headers = next(rows)
        records = []
        for row in rows:
            if not row:    # Skip rows with no data
                continue
            record = dict(zip(headers, row))
            records.append(record)

    return records
```
|
||||
|
||||
This function reads a CSV file into a list of dictionaries while
|
||||
hiding the details of opening the file, wrapping it with the `csv`
|
||||
module, ignoring blank lines, and so forth.
|
||||
|
||||
Try it out:
|
||||
|
||||
Hint: `python3 -i fileparse.py`.
|
||||
|
||||
```pycon
|
||||
>>> portfolio = parse_csv('Data/portfolio.csv')
|
||||
>>> portfolio
|
||||
[{'price': '32.20', 'name': 'AA', 'shares': '100'}, {'price': '91.10', 'name': 'IBM', 'shares': '50'}, {'price': '83.44', 'name': 'CAT', 'shares': '150'}, {'price': '51.23', 'name': 'MSFT', 'shares': '200'}, {'price': '40.37', 'name': 'GE', 'shares': '95'}, {'price': '65.10', 'name': 'MSFT', 'shares': '50'}, {'price': '70.44', 'name': 'IBM', 'shares': '100'}]
|
||||
>>>
|
||||
```
|
||||
|
||||
This is great except that you can’t do any kind of useful calculation with the data because everything is represented as a string.
|
||||
We’ll fix this shortly, but let’s keep building on it.
|
||||
|
||||
### (b) Building a Column Selector
|
||||
|
||||
In many cases, you’re only interested in selected columns from a CSV file, not all of the data.
|
||||
Modify the `parse_csv()` function so that it optionally allows user-specified columns to be picked out as follows:
|
||||
|
||||
```python
|
||||
>>> # Read all of the data
|
||||
>>> portfolio = parse_csv('Data/portfolio.csv')
|
||||
>>> portfolio
|
||||
[{'price': '32.20', 'name': 'AA', 'shares': '100'}, {'price': '91.10', 'name': 'IBM', 'shares': '50'}, {'price': '83.44', 'name': 'CAT', 'shares': '150'}, {'price': '51.23', 'name': 'MSFT', 'shares': '200'}, {'price': '40.37', 'name': 'GE', 'shares': '95'}, {'price': '65.10', 'name': 'MSFT', 'shares': '50'}, {'price': '70.44', 'name': 'IBM', 'shares': '100'}]
|
||||
|
||||
>>> # Read some of the data
|
||||
>>> shares_held = parse_csv('Data/portfolio.csv', select=['name','shares'])
|
||||
>>> shares_held
|
||||
[{'name': 'AA', 'shares': '100'}, {'name': 'IBM', 'shares': '50'}, {'name': 'CAT', 'shares': '150'}, {'name': 'MSFT', 'shares': '200'}, {'name': 'GE', 'shares': '95'}, {'name': 'MSFT', 'shares': '50'}, {'name': 'IBM', 'shares': '100'}]
|
||||
>>>
|
||||
```
|
||||
|
||||
An example of a column selector was given in Section 2.5.
|
||||
However, here’s one way to do it:
|
||||
|
||||
```python
# fileparse.py
import csv

def parse_csv(filename, select=None):
    '''
    Parse a CSV file into a list of records
    '''
    with open(filename) as f:
        rows = csv.reader(f)

        # Read the file headers
        headers = next(rows)

        # If a column selector was given, find indices of the specified columns.
        # Also narrow the set of headers used for resulting dictionaries
        if select:
            indices = [headers.index(colname) for colname in select]
            headers = select
        else:
            indices = []

        records = []
        for row in rows:
            if not row:    # Skip rows with no data
                continue
            # Filter the row if specific columns were selected
            if indices:
                row = [ row[index] for index in indices ]

            # Make a dictionary
            record = dict(zip(headers, row))
            records.append(record)

    return records
```
|
||||
|
||||
There are a number of tricky bits to this part. Probably the most important one is the mapping of the column selections to row indices.
|
||||
For example, suppose the input file had the following headers:
|
||||
|
||||
```pycon
|
||||
>>> headers = ['name', 'date', 'time', 'shares', 'price']
|
||||
>>>
|
||||
```
|
||||
|
||||
Now, suppose the selected columns were as follows:
|
||||
|
||||
```pycon
|
||||
>>> select = ['name', 'shares']
|
||||
>>>
|
||||
```
|
||||
|
||||
To perform the proper selection, you have to map the selected column names to column indices in the file.
|
||||
That’s what this step is doing:
|
||||
|
||||
```pycon
|
||||
>>> indices = [headers.index(colname) for colname in select ]
|
||||
>>> indices
|
||||
[0, 3]
|
||||
>>>
|
||||
```
|
||||
|
||||
In other words, "name" is column 0 and "shares" is column 3.
|
||||
When you read a row of data from the file, the indices are used to filter it:
|
||||
|
||||
```pycon
|
||||
>>> row = ['AA', '6/11/2007', '9:50am', '100', '32.20' ]
|
||||
>>> row = [ row[index] for index in indices ]
|
||||
>>> row
|
||||
['AA', '100']
|
||||
>>>
|
||||
```
|
||||
|
||||
### (c) Performing Type Conversion
|
||||
|
||||
Modify the `parse_csv()` function so that it optionally allows type-conversions to be applied to the returned data.
|
||||
For example:
|
||||
|
||||
```pycon
|
||||
>>> portfolio = parse_csv('Data/portfolio.csv', types=[str, int, float])
|
||||
>>> portfolio
|
||||
[{'price': 32.2, 'name': 'AA', 'shares': 100}, {'price': 91.1, 'name': 'IBM', 'shares': 50}, {'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 51.23, 'name': 'MSFT', 'shares': 200}, {'price': 40.37, 'name': 'GE', 'shares': 95}, {'price': 65.1, 'name': 'MSFT', 'shares': 50}, {'price': 70.44, 'name': 'IBM', 'shares': 100}]
|
||||
|
||||
>>> shares_held = parse_csv('Data/portfolio.csv', select=['name', 'shares'], types=[str, int])
|
||||
>>> shares_held
|
||||
[{'name': 'AA', 'shares': 100}, {'name': 'IBM', 'shares': 50}, {'name': 'CAT', 'shares': 150}, {'name': 'MSFT', 'shares': 200}, {'name': 'GE', 'shares': 95}, {'name': 'MSFT', 'shares': 50}, {'name': 'IBM', 'shares': 100}]
|
||||
>>>
|
||||
```
|
||||
|
||||
You already explored this in Exercise 2.7. You'll need to insert the
|
||||
following fragment of code into your solution:
|
||||
|
||||
```python
|
||||
...
|
||||
if types:
|
||||
row = [func(val) for func, val in zip(types, row) ]
|
||||
...
|
||||
```
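
To see what that fragment does in isolation, here is a small sketch applied to a single row taken from the portfolio data shown earlier. `zip()` pairs each conversion function with the value in the same position.

```python
types = [str, int, float]
row = ['AA', '100', '32.20']

# Pair each conversion function with its column value and apply it
converted = [func(val) for func, val in zip(types, row)]
print(converted)             # ['AA', 100, 32.2]
```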
|
||||
|
||||
### (d) Working with Headers
|
||||
|
||||
Some CSV files don’t include any header information.
|
||||
For example, the file `prices.csv` looks like this:
|
||||
|
||||
```csv
|
||||
"AA",9.22
|
||||
"AXP",24.85
|
||||
"BA",44.85
|
||||
"BAC",11.27
|
||||
...
|
||||
```
|
||||
|
||||
Modify the `parse_csv()` function so that it can work with such files by creating a list of tuples instead.
|
||||
For example:
|
||||
|
||||
```python
|
||||
>>> prices = parse_csv('Data/prices.csv', types=[str,float], has_headers=False)
|
||||
>>> prices
|
||||
[('AA', 9.22), ('AXP', 24.85), ('BA', 44.85), ('BAC', 11.27), ('C', 3.72), ('CAT', 35.46), ('CVX', 66.67), ('DD', 28.47), ('DIS', 24.22), ('GE', 13.48), ('GM', 0.75), ('HD', 23.16), ('HPQ', 34.35), ('IBM', 106.28), ('INTC', 15.72), ('JNJ', 55.16), ('JPM', 36.9), ('KFT', 26.11), ('KO', 49.16), ('MCD', 58.99), ('MMM', 57.1), ('MRK', 27.58), ('MSFT', 20.89), ('PFE', 15.19), ('PG', 51.94), ('T', 24.79), ('UTX', 52.61), ('VZ', 29.26), ('WMT', 49.74), ('XOM', 69.35)]
|
||||
>>>
|
||||
```
|
||||
|
||||
To make this change, you’ll need to modify the code so that the first
|
||||
line of data isn’t interpreted as a header line. Also, you’ll need to
|
||||
make sure you don’t create dictionaries as there are no longer any
|
||||
column names to use for keys.
|
||||
|
||||
### (e) Picking a different column delimiter
|
||||
|
||||
Although CSV files are pretty common, it’s also possible that you could encounter a file that uses a different column separator such as a tab or space.
|
||||
For example, the file `Data/portfolio.dat` looks like this:
|
||||
|
||||
```csv
|
||||
name shares price
|
||||
"AA" 100 32.20
|
||||
"IBM" 50 91.10
|
||||
"CAT" 150 83.44
|
||||
"MSFT" 200 51.23
|
||||
"GE" 95 40.37
|
||||
"MSFT" 50 65.10
|
||||
"IBM" 100 70.44
|
||||
```
|
||||
|
||||
The `csv.reader()` function allows a different delimiter to be given as follows:
|
||||
|
||||
```python
|
||||
rows = csv.reader(f, delimiter=' ')
|
||||
```
|
||||
|
||||
Modify your `parse_csv()` function so that it also allows the delimiter to be changed.
|
||||
|
||||
For example:
|
||||
|
||||
```pycon
|
||||
>>> portfolio = parse_csv('Data/portfolio.dat', types=[str, int, float], delimiter=' ')
|
||||
>>> portfolio
|
||||
[{'price': 32.2, 'name': 'AA', 'shares': 100}, {'price': 91.1, 'name': 'IBM', 'shares': 50}, {'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 51.23, 'name': 'MSFT', 'shares': 200}, {'price': 40.37, 'name': 'GE', 'shares': 95}, {'price': 65.1, 'name': 'MSFT', 'shares': 50}, {'price': 70.44, 'name': 'IBM', 'shares': 100}]
|
||||
>>>
|
||||
```
|
||||
|
||||
If you’ve made it this far, you’ve created a nice library function that’s genuinely useful.
|
||||
You can use it to parse arbitrary CSV files, select out columns of
|
||||
interest, perform type conversions, without having to worry too much
|
||||
about the inner workings of files or the `csv` module.
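
For reference, here is one possible way the pieces from parts (a) through (e) might fit together. It is a sketch, not the official solution: it assumes the parameter names `select`, `types`, `has_headers` and `delimiter` used in the examples above, and it does no error checking (for instance, passing `select` together with `has_headers=False` simply fails inside `headers.index()`).

```python
import csv

def parse_csv(filename, select=None, types=None, has_headers=True, delimiter=','):
    '''
    Parse a CSV file into a list of records (dicts, or tuples if no headers)
    '''
    with open(filename) as f:
        rows = csv.reader(f, delimiter=delimiter)

        # Read the file headers, if any
        headers = next(rows) if has_headers else []

        # Map selected column names to indices and narrow the headers
        if select:
            indices = [headers.index(colname) for colname in select]
            headers = select
        else:
            indices = []

        records = []
        for row in rows:
            if not row:      # Skip rows with no data
                continue
            if indices:
                row = [row[index] for index in indices]
            if types:
                row = [func(val) for func, val in zip(types, row)]
            if headers:
                record = dict(zip(headers, row))
            else:
                record = tuple(row)
            records.append(record)

    return records
```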
|
||||
|
||||
Nice!
|
||||
|
||||
[Next](03_Error_checking)
|
||||
393
Notes/03_Program_organization/03_Error_checking.md
Normal file
@@ -0,0 +1,393 @@
|
||||
# 3.3 Error Checking
|
||||
|
||||
This section discusses some aspects of error checking and exception handling.
|
||||
|
||||
### How programs fail
|
||||
|
||||
Python performs no checking or validation of function argument types or values.
|
||||
A function will work on any data that is compatible with the statements in the function.
|
||||
|
||||
```python
|
||||
def add(x, y):
|
||||
return x + y
|
||||
|
||||
add(3, 4) # 7
|
||||
add('Hello', 'World') # 'HelloWorld'
|
||||
add('3', '4') # '34'
|
||||
```
|
||||
|
||||
If there are errors in a function, they will show up at run time (as an exception).
|
||||
|
||||
```python
|
||||
def add(x, y):
|
||||
return x + y
|
||||
|
||||
>>> add(3, '4')
|
||||
Traceback (most recent call last):
|
||||
...
|
||||
TypeError: unsupported operand type(s) for +:
|
||||
'int' and 'str'
|
||||
>>>
|
||||
```
|
||||
|
||||
To verify code, there is a strong emphasis on testing (covered later).
|
||||
|
||||
### Exceptions
|
||||
|
||||
Exceptions are used to signal errors.
|
||||
To raise an exception yourself, use the `raise` statement.
|
||||
|
||||
```python
|
||||
if name not in names:
|
||||
raise RuntimeError('Name not found')
|
||||
```
|
||||
|
||||
To catch an exception use `try-except`.
|
||||
|
||||
```python
|
||||
try:
|
||||
authenticate(username)
|
||||
except RuntimeError as e:
|
||||
print(e)
|
||||
```
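
Putting the two fragments together gives a runnable sketch. The `names` list and the body of `authenticate` are assumptions made for illustration; the fragments above do not define them.

```python
names = ['Dave', 'Guido']

def authenticate(username):
    # Hypothetical check: signal a problem by raising an exception
    if username not in names:
        raise RuntimeError('Name not found')
    return True

try:
    authenticate('Paula')
except RuntimeError as e:
    print(e)                 # Name not found
```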
|
||||
|
||||
### Exception Handling
|
||||
|
||||
Exceptions propagate to the first matching `except`.
|
||||
|
||||
```python
|
||||
def grok():
|
||||
...
|
||||
raise RuntimeError('Whoa!') # Exception raised here
|
||||
|
||||
def spam():
|
||||
grok() # Call that will raise exception
|
||||
|
||||
def bar():
|
||||
try:
|
||||
spam()
|
||||
except RuntimeError as e: # Exception caught here
|
||||
...
|
||||
|
||||
def foo():
|
||||
try:
|
||||
bar()
|
||||
except RuntimeError as e: # Exception does NOT arrive here
|
||||
...
|
||||
|
||||
foo()
|
||||
```
|
||||
|
||||
To handle the exception, use the `except` block. You can add any statements you want to handle the error.
|
||||
|
||||
```python
def grok():
    ...
    raise RuntimeError('Whoa!')

def bar():
    try:
        grok()
    except RuntimeError as e:   # Exception caught here
        statements              # Use these statements
        statements
        ...

bar()
```
|
||||
|
||||
After handling, execution resumes with the first statement after the `try-except`.
|
||||
|
||||
```python
def grok():
    ...
    raise RuntimeError('Whoa!')

def bar():
    try:
        grok()
    except RuntimeError as e:   # Exception caught here
        statements
        statements
        ...
    statements                  # Resumes execution here
    statements                  # And continues here
    ...

bar()
|
||||
```
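
The control flow can be traced with a runnable sketch. The `steps` list is an illustrative device for recording which statements actually execute.

```python
steps = []

def grok():
    steps.append('grok')
    raise RuntimeError('Whoa!')
    steps.append('never reached')    # Skipped: the exception left grok()

def bar():
    try:
        grok()
        steps.append('after grok')   # Skipped as well
    except RuntimeError as e:
        steps.append('handled')      # Exception caught here
    steps.append('resumed')          # First statement after the try-except

bar()
print(steps)                         # ['grok', 'handled', 'resumed']
```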
|
||||
|
||||
### Built-in Exceptions
|
||||
|
||||
There are about two dozen built-in exceptions.
The list below is not exhaustive. Check the documentation for more.
|
||||
|
||||
```python
|
||||
ArithmeticError
|
||||
AssertionError
|
||||
EnvironmentError
|
||||
EOFError
|
||||
ImportError
|
||||
IndexError
|
||||
KeyboardInterrupt
|
||||
KeyError
|
||||
MemoryError
|
||||
NameError
|
||||
ReferenceError
|
||||
RuntimeError
|
||||
SyntaxError
|
||||
SystemError
|
||||
TypeError
|
||||
ValueError
|
||||
```
|
||||
|
||||
### Exception Values
|
||||
|
||||
Most exceptions have an associated value. It contains more information about what's wrong.
|
||||
|
||||
```python
|
||||
raise RuntimeError('Invalid user name')
|
||||
```
|
||||
|
||||
This value is passed to the variable supplied in `except`.
|
||||
|
||||
```python
|
||||
try:
|
||||
...
|
||||
except RuntimeError as e: # `e` holds the value raised
|
||||
...
|
||||
```
|
||||
|
||||
The value is an instance of the exception type. However, it often looks like a string when
|
||||
printed.
|
||||
|
||||
```python
|
||||
except RuntimeError as e:
|
||||
print('Failed : Reason', e)
|
||||
```
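
A short sketch of what the caught value looks like when inspected. The `caught` variable is an illustrative extra, needed because Python clears `e` when the `except` block ends.

```python
try:
    raise RuntimeError('Invalid user name')
except RuntimeError as e:
    print(type(e))           # <class 'RuntimeError'>
    print(e.args)            # ('Invalid user name',)
    print(str(e))            # Invalid user name
    caught = e               # Keep a reference; `e` is cleared after the block
```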
|
||||
|
||||
### Catching Multiple Errors
|
||||
|
||||
You can catch different kinds of exceptions with multiple `except` blocks.
|
||||
|
||||
```python
|
||||
try:
|
||||
...
|
||||
except LookupError as e:
|
||||
...
|
||||
except RuntimeError as e:
|
||||
...
|
||||
except IOError as e:
|
||||
...
|
||||
except KeyboardInterrupt as e:
|
||||
...
|
||||
```
|
||||
|
||||
Alternatively, if the block to handle them is the same, you can group them:
|
||||
|
||||
```python
|
||||
try:
|
||||
...
|
||||
except (IOError,LookupError,RuntimeError) as e:
|
||||
...
|
||||
```
|
||||
|
||||
### Catching All Errors
|
||||
|
||||
To catch any exception, use `Exception` like this:
|
||||
|
||||
```python
|
||||
try:
|
||||
...
|
||||
except Exception:
|
||||
print('An error occurred')
|
||||
```
|
||||
|
||||
In general, writing code like that is a bad idea because you'll have no idea
|
||||
why it failed.
|
||||
|
||||
### Wrong Way to Catch Errors
|
||||
|
||||
Here is the wrong way to use exceptions.
|
||||
|
||||
```python
|
||||
try:
|
||||
go_do_something()
|
||||
except Exception:
|
||||
print('Computer says no')
|
||||
```
|
||||
|
||||
This swallows all possible errors. It may make it impossible to debug
|
||||
when the code is failing for some reason you didn't expect at all
|
||||
(e.g. uninstalled Python module, etc.).
|
||||
|
||||
### Somewhat Better Approach
|
||||
|
||||
This is a more sane approach.
|
||||
|
||||
```python
|
||||
try:
|
||||
go_do_something()
|
||||
except Exception as e:
|
||||
print('Computer says no. Reason :', e)
|
||||
```
|
||||
|
||||
It reports a specific reason for failure. It is almost always a good
|
||||
idea to have some mechanism for viewing/reporting errors when you
|
||||
write code that catches all possible exceptions.
|
||||
|
||||
In general though, it's better to catch the error more narrowly. Only
|
||||
catch the errors you can actually deal with. Let other errors pass to
|
||||
other code.
|
||||
|
||||
### Reraising an Exception
|
||||
|
||||
Use `raise` to propagate a caught error.
|
||||
|
||||
```python
|
||||
try:
|
||||
go_do_something()
|
||||
except Exception as e:
|
||||
print('Computer says no. Reason :', e)
|
||||
raise
|
||||
```
|
||||
|
||||
It allows you to take action (e.g. logging) and pass the error on to the caller.
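For example, the following sketch logs a failure and then re-raises it. The `risky()` and `run()` functions are made up for illustration, and the `logging` module stands in for whatever reporting mechanism you use.

```python
import logging

log = logging.getLogger(__name__)

def risky():
    raise RuntimeError('disk on fire')

def run():
    try:
        risky()
    except Exception as e:
        log.error('risky() failed: %s', e)
        raise    # Re-raises the same exception with its original traceback

try:
    run()
except RuntimeError as e:
    print('Caller still sees:', e)
```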
|
||||
|
||||
### Exception Best Practices
|
||||
|
||||
Don't catch exceptions. Fail fast and loud. If it's important, someone
|
||||
else will take care of the problem. Only catch an exception if you
|
||||
are *that* someone. That is, only catch errors where you can recover
|
||||
and sanely keep going.
|
||||
|
||||
### `finally` statement
|
||||
|
||||
It specifies code that must run regardless of whether or not an exception occurs.
|
||||
|
||||
```python
|
||||
lock = Lock()
|
||||
...
|
||||
lock.acquire()
|
||||
try:
|
||||
...
|
||||
finally:
|
||||
lock.release() # this will ALWAYS be executed. With and without exception.
|
||||
```
|
||||
|
||||
Commonly used to properly manage resources (especially locks, files, etc.).
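To see the ordering, here is a minimal sketch that records events in a list instead of using a real lock. The `work()` function and `events` list are illustrative only.

```python
events = []

def work(fail):
    events.append('acquire')
    try:
        if fail:
            raise ValueError('boom')
        events.append('work done')
    finally:
        # Runs on both the normal path and the exception path
        events.append('release')

work(False)
try:
    work(True)
except ValueError:
    events.append('caught')

print(events)
# ['acquire', 'work done', 'release', 'acquire', 'release', 'caught']
```

Note that `'release'` is recorded before `'caught'`: the `finally` block runs before the exception reaches the caller.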
|
||||
|
||||
### `with` statement
|
||||
|
||||
In modern code, `try-finally` is often replaced with the `with` statement.
|
||||
|
||||
```python
|
||||
lock = Lock()
|
||||
with lock:
|
||||
# lock acquired
|
||||
...
|
||||
# lock released
|
||||
```
|
||||
|
||||
A more familiar example:
|
||||
|
||||
```python
|
||||
with open(filename) as f:
|
||||
# Use the file
|
||||
...
|
||||
# File closed
|
||||
```
|
||||
|
||||
It defines a usage *context* for a resource. When execution leaves that context,
|
||||
resources are released. `with` only works with certain objects.
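The objects that work with `with` are those that define the special methods `__enter__()` and `__exit__()`. The `Resource` class below is a hypothetical sketch of that protocol, not something used elsewhere in the course.

```python
class Resource:
    def __init__(self):
        self.is_open = False

    def __enter__(self):
        # Called when the with-block is entered
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with-block is left, with or without an exception
        self.is_open = False
        return False    # Do not suppress exceptions

r = Resource()
with r:
    print(r.is_open)    # True
print(r.is_open)        # False
```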
|
||||
|
||||
## Exercises
|
||||
|
||||
### (a) Raising exceptions
|
||||
|
||||
The `parse_csv()` function you wrote in the last section allows
|
||||
user-specified columns to be selected, but that only works if the
|
||||
input data file has column headers.
|
||||
|
||||
Modify the code so that an exception gets raised if both the `select`
|
||||
and `has_headers=False` arguments are passed.
|
||||
For example:
|
||||
|
||||
```python
|
||||
>>> parse_csv('Data/prices.csv', select=['name','price'], has_headers=False)
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
File "fileparse.py", line 9, in parse_csv
|
||||
raise RuntimeError("select argument requires column headers")
|
||||
RuntimeError: select argument requires column headers
|
||||
>>>
|
||||
```
|
||||
|
||||
Having added this one check, you might ask if you should be performing
|
||||
other kinds of sanity checks in the function. For example, should you
|
||||
check that the filename is a string, that `types` is a list, or anything
|
||||
of that nature?
|
||||
|
||||
As a general rule, it’s usually best to skip such tests and to just
|
||||
let the program fail on bad inputs. The traceback message will point
|
||||
at the source of the problem and can assist in debugging.
|
||||
|
||||
The main reason for adding the above check is to avoid running the code
|
||||
in a nonsensical mode (e.g., using a feature that requires column
|
||||
headers, but simultaneously specifying that there are no headers).
|
||||
|
||||
This indicates a programming error on the part of the calling code.
|
||||
|
||||
### (b) Catching exceptions
|
||||
|
||||
The `parse_csv()` function you wrote is used to process the entire
|
||||
contents of a file. However, in the real world, it’s possible that
|
||||
input files might have corrupted, missing, or dirty data. Try this
|
||||
experiment:
|
||||
|
||||
```python
|
||||
>>> portfolio = parse_csv('Data/missing.csv', types=[str, int, float])
|
||||
Traceback (most recent call last):
|
||||
File "<stdin>", line 1, in <module>
|
||||
File "fileparse.py", line 36, in parse_csv
|
||||
row = [func(val) for func, val in zip(types, row)]
|
||||
ValueError: invalid literal for int() with base 10: ''
|
||||
>>>
|
||||
```
|
||||
|
||||
Modify the `parse_csv()` function to catch all `ValueError` exceptions
|
||||
generated during record creation and print a warning message for rows
|
||||
that can’t be converted.
|
||||
|
||||
The message should include the row number and information about the reason why it failed.
|
||||
To test your function, try reading the file `Data/missing.csv` above.
|
||||
For example:
|
||||
|
||||
```python
|
||||
>>> portfolio = parse_csv('Data/missing.csv', types=[str, int, float])
|
||||
Row 4: Couldn't convert ['MSFT', '', '51.23']
|
||||
Row 4: Reason invalid literal for int() with base 10: ''
|
||||
Row 7: Couldn't convert ['IBM', '', '70.44']
|
||||
Row 7: Reason invalid literal for int() with base 10: ''
|
||||
>>>
|
||||
>>> portfolio
|
||||
[{'price': 32.2, 'name': 'AA', 'shares': 100}, {'price': 91.1, 'name': 'IBM', 'shares': 50}, {'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 40.37, 'name': 'GE', 'shares': 95}, {'price': 65.1, 'name': 'MSFT', 'shares': 50}]
|
||||
>>>
|
||||
```
|
||||
|
||||
### (c) Silencing Errors
|
||||
|
||||
Modify the `parse_csv()` function so that parsing error messages can be silenced if explicitly desired by the user.
|
||||
For example:
|
||||
|
||||
```python
|
||||
>>> portfolio = parse_csv('Data/missing.csv', types=[str,int,float], silence_errors=True)
|
||||
>>> portfolio
|
||||
[{'price': 32.2, 'name': 'AA', 'shares': 100}, {'price': 91.1, 'name': 'IBM', 'shares': 50}, {'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 40.37, 'name': 'GE', 'shares': 95}, {'price': 65.1, 'name': 'MSFT', 'shares': 50}]
|
||||
>>>
|
||||
```
|
||||
|
||||
Error handling is one of the most difficult things to get right in
|
||||
most programs. As a general rule, you shouldn’t silently ignore
|
||||
errors. Instead, it’s better to report problems and to give the user
|
||||
an option to silence the error message if they choose to do so.
|
||||
|
||||
[Next](04_Modules)
|
||||
317
Notes/03_Program_organization/04_Modules.md
Normal file
@@ -0,0 +1,317 @@
|
||||
# 3.4 Modules
|
||||
|
||||
This section introduces the concept of modules.
|
||||
|
||||
### Modules and import
|
||||
|
||||
Any Python source file is a module.
|
||||
|
||||
```python
|
||||
# foo.py
|
||||
def grok(a):
|
||||
...
|
||||
def spam(b):
|
||||
...
|
||||
```
|
||||
|
||||
The `import` statement loads and *executes* a module.
|
||||
|
||||
```python
|
||||
# program.py
|
||||
import foo
|
||||
|
||||
a = foo.grok(2)
|
||||
b = foo.spam('Hello')
|
||||
...
|
||||
```
|
||||
|
||||
### Namespaces
|
||||
|
||||
A module is a collection of named values and is sometimes said to be a *namespace*.
|
||||
The names are all of the global variables and functions defined in the source file.
|
||||
After importing, the module name is used as a prefix. Hence the *namespace*.
|
||||
|
||||
```python
|
||||
import foo
|
||||
|
||||
a = foo.grok(2)
|
||||
b = foo.spam('Hello')
|
||||
...
|
||||
```
|
||||
|
||||
The module name is tied to the file name (foo -> foo.py).
|
||||
|
||||
### Global Definitions
|
||||
|
||||
Everything defined in the *global* scope is what populates the module
|
||||
namespace (`foo` in our previous example). Consider two modules
|
||||
that define the same variable `x`.
|
||||
|
||||
```python
|
||||
# foo.py
|
||||
x = 42
|
||||
def grok(a):
|
||||
...
|
||||
```
|
||||
|
||||
```python
|
||||
# bar.py
|
||||
x = 37
|
||||
def spam(a):
|
||||
...
|
||||
```
|
||||
|
||||
In this case, the `x` definitions refer to different variables. One
|
||||
is `foo.x` and the other is `bar.x`. Different modules can use the
|
||||
same names and those names won't conflict with each other.
|
||||
|
||||
**Modules are isolated.**
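The standard library shows the same thing. As a quick sketch, `math` and `cmath` both define a function named `sqrt`, and the two never collide because each lives in its own namespace.

```python
import math
import cmath

print(math.sqrt(4))      # 2.0
print(cmath.sqrt(-1))    # 1j

# Same name, different objects in different namespaces
print(math.sqrt is cmath.sqrt)    # False
```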
|
||||
|
||||
### Modules as Environments
|
||||
|
||||
Modules form an enclosing environment for all of the code defined inside.
|
||||
|
||||
```python
|
||||
# foo.py
|
||||
x = 42
|
||||
|
||||
def grok(a):
|
||||
print(x)
|
||||
```
|
||||
|
||||
*Global* variables are always bound to the enclosing module (same file).
|
||||
Each source file is its own little universe.
|
||||
|
||||
### Module Execution
|
||||
|
||||
When a module is imported, *all of the statements in the module
|
||||
execute* one after another until the end of the file is reached. The
|
||||
contents of the module namespace are all of the *global* names that
|
||||
are still defined at the end of the execution process. If there are
|
||||
scripting statements that carry out tasks in the global scope
|
||||
(printing, creating files, etc.) you will see them run on import.
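The following sketch makes this visible. It writes a throwaway module to a temporary directory (the name `exec_demo` is made up for this example) and imports it. The `print()` inside the module runs during the import, and a name deleted before the end of the file does not appear in the namespace.

```python
import importlib
import os
import sys
import tempfile

source = (
    "print('loading demo module')\n"
    "x = 42\n"
    "temp = 1\n"
    "del temp\n"
)

with tempfile.TemporaryDirectory() as dirname:
    with open(os.path.join(dirname, 'exec_demo.py'), 'w') as f:
        f.write(source)
    sys.path.insert(0, dirname)
    try:
        importlib.invalidate_caches()
        exec_demo = importlib.import_module('exec_demo')   # Prints the message
    finally:
        sys.path.remove(dirname)

print(exec_demo.x)                   # 42
print(hasattr(exec_demo, 'temp'))    # False
```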
|
||||
|
||||
### `import as` statement
|
||||
|
||||
You can change the name of a module as you import it:
|
||||
|
||||
```python
|
||||
import math as m
|
||||
def rectangular(r, theta):
|
||||
x = r * m.cos(theta)
|
||||
y = r * m.sin(theta)
|
||||
return x, y
|
||||
```
|
||||
|
||||
It works the same as a normal import. It just renames the module in that one file.
|
||||
|
||||
### `from` module import
|
||||
|
||||
This picks selected symbols out of a module and makes them available locally.
|
||||
|
||||
```python
|
||||
from math import sin, cos
|
||||
|
||||
def rectangular(r, theta):
|
||||
x = r * cos(theta)
|
||||
y = r * sin(theta)
|
||||
return x, y
|
||||
```
|
||||
|
||||
It allows parts of a module to be used without having to type the module prefix.
|
||||
Useful for frequently used names.
|
||||
|
||||
### Comments on importing
|
||||
|
||||
Variations on import do *not* change the way that modules work.
|
||||
|
||||
```python
|
||||
import math as m
|
||||
# vs
|
||||
from math import cos, sin
|
||||
...
|
||||
```
|
||||
|
||||
Specifically, `import` always executes the *entire* file and modules
|
||||
are still isolated environments.
|
||||
|
||||
The `import module as` statement is only manipulating the names.
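A short check of that claim: whichever form of import you use, the names refer to the same underlying module and function objects.

```python
import math
import math as m
from math import cos

print(m is math)          # True
print(cos is math.cos)    # True
```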
|
||||
|
||||
### Module Loading
|
||||
|
||||
Each module loads and executes only *once*.
|
||||
*Note: Repeated imports just return a reference to the previously loaded module.*
|
||||
|
||||
`sys.modules` is a dict of all loaded modules.
|
||||
|
||||
```python
|
||||
>>> import sys
|
||||
>>> sys.modules.keys()
|
||||
['copy_reg', '__main__', 'site', '__builtin__', 'encodings', 'encodings.encodings', 'posixpath', ...]
|
||||
>>>
|
||||
```
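You can verify the load-once behavior yourself. In this sketch, `json` is just a convenient standard library module; a second import hands back the object already stored in `sys.modules`.

```python
import sys
import json

print('json' in sys.modules)    # True

import json as again            # Does not reload or re-execute json
print(again is json)                    # True
print(again is sys.modules['json'])     # True
```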
|
||||
|
||||
### Locating Modules
|
||||
|
||||
Python consults a path list (`sys.path`) when looking for modules.
|
||||
|
||||
```python
|
||||
>>> import sys
|
||||
>>> sys.path
|
||||
[
|
||||
'',
|
||||
'/usr/local/lib/python36/python36.zip',
|
||||
'/usr/local/lib/python36',
|
||||
...
|
||||
]
|
||||
```
|
||||
|
||||
The current working directory is usually first.
|
||||
|
||||
### Module Search Path
|
||||
|
||||
`sys.path` contains the search paths.
|
||||
|
||||
You can manually adjust if you need to.
|
||||
|
||||
```python
|
||||
import sys
|
||||
sys.path.append('/project/foo/pyfiles')
|
||||
```
|
||||
|
||||
Paths are also added via environment variables.
|
||||
|
||||
```python
|
||||
% env PYTHONPATH=/project/foo/pyfiles python3
|
||||
Python 3.6.0 (default, Feb 3 2017, 05:53:21)
|
||||
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)]
|
||||
>>> import sys
|
||||
>>> sys.path
|
||||
['','/project/foo/pyfiles', ...]
|
||||
```
|
||||
|
||||
## Exercises
|
||||
|
||||
For this exercise involving modules, it is critically important to
|
||||
make sure you are running Python in a proper environment. Modules
|
||||
are usually where programmers first encounter problems with the current working
|
||||
directory or with Python's path settings.
|
||||
|
||||
### (a) Module imports
|
||||
|
||||
In section 3, we created a general purpose function `parse_csv()` for parsing the contents of CSV datafiles.
|
||||
|
||||
Now, we’re going to see how to use that function in other programs.
|
||||
First, start in a new shell window. Navigate to the folder where you
|
||||
have all your files. We are going to import them.
|
||||
|
||||
Start Python interactive mode.
|
||||
|
||||
```shell
|
||||
bash % python3
|
||||
Python 3.6.1 (v3.6.1:69c0db5050, Mar 21 2017, 01:21:04)
|
||||
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
|
||||
Type "help", "copyright", "credits" or "license" for more information.
|
||||
>>>
|
||||
```
|
||||
|
||||
Once you’ve done that, try importing some of the programs you
|
||||
previously wrote. You should see their output exactly as before.
|
||||
Just to emphasize, importing a module runs its code.
|
||||
|
||||
```python
|
||||
>>> import bounce
|
||||
... watch output ...
|
||||
>>> import mortgage
|
||||
... watch output ...
|
||||
>>> import report
|
||||
... watch output ...
|
||||
>>>
|
||||
```
|
||||
|
||||
If none of this works, you’re probably running Python in the wrong directory.
|
||||
Now, try importing your `fileparse` module and getting some help on it.
|
||||
|
||||
```python
|
||||
>>> import fileparse
|
||||
>>> help(fileparse)
|
||||
... look at the output ...
|
||||
>>> dir(fileparse)
|
||||
... look at the output ...
|
||||
>>>
|
||||
```
|
||||
|
||||
Try using the module to read some data:
|
||||
|
||||
```python
|
||||
>>> portfolio = fileparse.parse_csv('Data/portfolio.csv',select=['name','shares','price'], types=[str,int,float])
|
||||
>>> portfolio
|
||||
... look at the output ...
|
||||
>>> pricelist = fileparse.parse_csv('Data/prices.csv',types=[str,float], has_headers=False)
|
||||
>>> pricelist
|
||||
... look at the output ...
|
||||
>>> prices = dict(pricelist)
|
||||
>>> prices
|
||||
... look at the output ...
|
||||
>>> prices['IBM']
|
||||
106.11
|
||||
>>>
|
||||
```
|
||||
|
||||
Try importing a function so that you don’t need to include the module name:
|
||||
|
||||
```python
|
||||
>>> from fileparse import parse_csv
|
||||
>>> portfolio = parse_csv('Data/portfolio.csv', select=['name','shares','price'], types=[str,int,float])
|
||||
>>> portfolio
|
||||
... look at the output ...
|
||||
>>>
|
||||
```
|
||||
|
||||
### (b) Using your library module
|
||||
|
||||
In section 2, you wrote a program `report.py` that produced a stock report like this:
|
||||
|
||||
```shell
|
||||
Name Shares Price Change
|
||||
---------- ---------- ---------- ----------
|
||||
AA 100 39.91 7.71
|
||||
IBM 50 106.11 15.01
|
||||
CAT 150 78.58 -4.86
|
||||
MSFT 200 30.47 -20.76
|
||||
GE 95 37.38 -2.99
|
||||
MSFT 50 30.47 -34.63
|
||||
IBM 100 106.11 35.67
|
||||
```
|
||||
|
||||
Take that program and modify it so that all of the input file
|
||||
processing is done using functions in your `fileparse` module. To do
|
||||
that, import `fileparse` as a module and change the `read_portfolio()`
|
||||
and `read_prices()` functions to use the `parse_csv()` function.
|
||||
|
||||
Use the interactive example at the start of this exercise as a guide.
|
||||
Afterwards, you should get exactly the same output as before.
|
||||
|
||||
### (c) Using more library imports
|
||||
|
||||
In section 1, you wrote a program `pcost.py` that read a portfolio and computed its cost.
|
||||
|
||||
```python
|
||||
>>> import pcost
|
||||
>>> pcost.portfolio_cost('Data/portfolio.csv')
|
||||
44671.15
|
||||
>>>
|
||||
```
|
||||
|
||||
Modify the `pcost.py` file so that it uses the `report.read_portfolio()` function.
|
||||
|
||||
### Commentary
|
||||
|
||||
When you are done with this exercise, you should have three
|
||||
programs. `fileparse.py` which contains a general purpose
|
||||
`parse_csv()` function. `report.py` which produces a nice report, but
|
||||
also contains `read_portfolio()` and `read_prices()` functions. And
|
||||
finally, `pcost.py` which computes the portfolio cost, but makes use
|
||||
of the code written for the `report.py` program.
|
||||
|
||||
[Next](05_Main_module)
|
||||
299
Notes/03_Program_organization/05_Main_module.md
Normal file
@@ -0,0 +1,299 @@
|
||||
# 3.5 Main Module
|
||||
|
||||
This section introduces the concept of a main program or main module.
|
||||
|
||||
### Main Functions
|
||||
|
||||
In many programming languages, there is a concept of a *main* function or method.
|
||||
|
||||
```c
|
||||
// c / c++
|
||||
int main(int argc, char *argv[]) {
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
```java
|
||||
// java
|
||||
class myprog {
|
||||
public static void main(String args[]) {
|
||||
...
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
This is the first function that executes when an application is launched.
|
||||
|
||||
### Python Main Module
|
||||
|
||||
Python has no *main* function or method. Instead, there is a *main*
|
||||
module. The *main module* is the source file that runs first.
|
||||
|
||||
```bash
|
||||
bash % python3 prog.py
|
||||
...
|
||||
```
|
||||
|
||||
Whatever module you give to the interpreter at startup becomes *main*. Its name doesn't matter.
|
||||
|
||||
### `__main__` check
|
||||
|
||||
It is standard practice for modules that can run as a main script to use this convention:
|
||||
|
||||
```python
|
||||
# prog.py
|
||||
...
|
||||
if __name__ == '__main__':
|
||||
# Running as the main program ...
|
||||
statements
|
||||
...
|
||||
```
|
||||
|
||||
Statements enclosed inside the `if` statement become the *main* program.
|
||||
|
||||
### Main programs vs. library imports
|
||||
|
||||
Any file can either run as main or as a library import:
|
||||
|
||||
```bash
|
||||
bash % python3 prog.py # Running as main
|
||||
```
|
||||
|
||||
```python
|
||||
import prog
|
||||
```
|
||||
|
||||
In both cases, `__name__` is the name of the module. However, it will only be set to `__main__` if
|
||||
running as main.
|
||||
|
||||
As a general rule, you don't want statements that are part of the main
|
||||
program to execute on a library import. So, it's common to have an `if-`check in code
|
||||
that might be used either way.
|
||||
|
||||
```python
|
||||
if __name__ == '__main__':
|
||||
# Does not execute if loaded with import ...
|
||||
```
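To see what `__name__` holds in each situation, a minimal sketch (using the standard `json` module as the imported library):

```python
import json

# An imported module sees its own module name
print(json.__name__)    # json

# The file you run directly sees '__main__' instead
if __name__ == '__main__':
    print('Running as the main program')
```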
|
||||
|
||||
### Program Template
|
||||
|
||||
Here is a common program template for writing a Python program:
|
||||
|
||||
```python
|
||||
# prog.py
|
||||
# Import statements (libraries)
|
||||
import modules
|
||||
|
||||
# Functions
|
||||
def spam():
|
||||
...
|
||||
|
||||
def blah():
|
||||
...
|
||||
|
||||
# Main function
|
||||
def main():
|
||||
...
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
```
|
||||
|
||||
### Command Line Tools
|
||||
|
||||
Python is often used for command-line tools.
|
||||
|
||||
```bash
|
||||
bash % python3 report.py portfolio.csv prices.csv
|
||||
```
|
||||
|
||||
It means that the scripts are executed from the shell /
|
||||
terminal. Common use cases are for automation, background tasks, etc.
|
||||
|
||||
### Command Line Args
|
||||
|
||||
The command line is a list of text strings.
|
||||
|
||||
```bash
|
||||
bash % python3 report.py portfolio.csv prices.csv
|
||||
```
|
||||
|
||||
This list of text strings is found in `sys.argv`.
|
||||
|
||||
```python
|
||||
# In the previous bash command
|
||||
sys.argv # ['report.py', 'portfolio.csv', 'prices.csv']
|
||||
```
|
||||
|
||||
Here is a simple example of processing the arguments:
|
||||
|
||||
```python
|
||||
import sys
|
||||
|
||||
if len(sys.argv) != 3:
|
||||
raise SystemExit(f'Usage: {sys.argv[0]} ' 'portfile pricefile')
|
||||
portfile = sys.argv[1]
|
||||
pricefile = sys.argv[2]
|
||||
...
|
||||
```
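One way to make this logic easier to test is to put it in a function that takes the argument list as a parameter. The `parse_args()` helper below is a hypothetical sketch, not part of the course code.

```python
def parse_args(argv):
    """Return (portfile, pricefile) from an argv-style list."""
    if len(argv) != 3:
        raise SystemExit(f'Usage: {argv[0]} portfile pricefile')
    return argv[1], argv[2]

portfile, pricefile = parse_args(['report.py', 'portfolio.csv', 'prices.csv'])
print(portfile, pricefile)    # portfolio.csv prices.csv
```

In a real script you would call `parse_args(sys.argv)`. Passing the list explicitly lets you try the function from the interactive prompt without touching the real command line.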
|
||||
|
||||
### Standard I/O
|
||||
|
||||
Standard Input / Output (or stdio) are files that work the same as normal files.
|
||||
|
||||
```python
|
||||
sys.stdout
|
||||
sys.stderr
|
||||
sys.stdin
|
||||
```
|
||||
|
||||
By default, print is directed to `sys.stdout`. Input is read from
|
||||
`sys.stdin`. Tracebacks and errors are directed to `sys.stderr`.
|
||||
|
||||
Be aware that *stdio* could be connected to terminals, files, pipes, etc.
|
||||
|
||||
```bash
|
||||
bash % python3 prog.py > results.txt
|
||||
# or
|
||||
bash % cmd1 | python3 prog.py | cmd2
|
||||
```
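Because output may be redirected, it is common to send diagnostics to `sys.stderr` so they don't end up mixed into the results. A minimal sketch, with a hypothetical `report()` helper:

```python
import sys

def report(message, error=False):
    # Normal output goes to stdout, diagnostics go to stderr
    stream = sys.stderr if error else sys.stdout
    print(message, file=stream)

report('AA 100 32.20')
report('Warning: missing price for IBM', error=True)
```

With `python3 prog.py > results.txt`, only the first line would land in the file. The warning would still appear on the terminal.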
|
||||
|
||||
### Environment Variables
|
||||
|
||||
Environment variables are set in the shell.
|
||||
|
||||
```bash
|
||||
bash % export NAME=dave
|
||||
bash % export RSH=ssh
|
||||
bash % python3 prog.py
|
||||
```
|
||||
|
||||
`os.environ` is a dictionary that contains these values.
|
||||
|
||||
```python
|
||||
import os
|
||||
|
||||
name = os.environ['NAME'] # 'dave'
|
||||
```
|
||||
|
||||
Changes are reflected in any subprocesses later launched by the program.
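A short sketch of both points. The variable name `GREETING` is made up for this example. `os.environ.get()` avoids a `KeyError` when a variable is not set, and a child process launched afterwards sees the modified environment.

```python
import os
import subprocess
import sys

# Use .get() with a default when the variable may be missing
name = os.environ.get('NAME', 'unknown')

# Set a variable, then launch a child Python process that reads it
os.environ['GREETING'] = 'hello'
result = subprocess.run(
    [sys.executable, '-c', "import os; print(os.environ['GREETING'])"],
    capture_output=True,
    text=True,
)
print(result.stdout.strip())    # hello
```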
|
||||
|
||||
### Program Exit
|
||||
|
||||
Program exit is handled through exceptions.
|
||||
|
||||
```python
|
||||
raise SystemExit
|
||||
raise SystemExit(exitcode)
|
||||
raise SystemExit('Informative message')
|
||||
```
|
||||
|
||||
An alternative.
|
||||
|
||||
```python
|
||||
import sys
|
||||
sys.exit(exitcode)
|
||||
```
|
||||
|
||||
A non-zero exit code indicates an error.
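Since `sys.exit()` works by raising `SystemExit`, the exit code can be observed like any other exception attribute. A minimal sketch, with a made-up `run()` function:

```python
import sys

def run():
    sys.exit(2)    # Same effect as: raise SystemExit(2)

try:
    run()
except SystemExit as e:
    code = e.code

print('exit code:', code)    # exit code: 2
```

Normally you would not catch `SystemExit`. It is caught here only to show that the exit code travels with the exception.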
|
||||
|
||||
### The `#!` line
|
||||
|
||||
On Unix, the `#!` line can launch a script as Python.
|
||||
Add the following to the first line of your script file.
|
||||
|
||||
```python
|
||||
#!/usr/bin/env python3
|
||||
# prog.py
|
||||
...
|
||||
```
|
||||
|
||||
It requires the executable permission.
|
||||
|
||||
```bash
|
||||
bash % chmod +x prog.py
|
||||
# Then you can execute
|
||||
bash % ./prog.py
|
||||
... output ...
|
||||
```
|
||||
|
||||
*Note: The Python Launcher on Windows also looks for the `#!` line to indicate language version.*
|
||||
|
||||
### Script Template
|
||||
|
||||
Here is a common code template for Python programs that run as command-line scripts:
|
||||
|
||||
```python
|
||||
#!/usr/bin/env python3
|
||||
# prog.py
|
||||
|
||||
# Import statements (libraries)
|
||||
import modules
|
||||
|
||||
# Functions
|
||||
def spam():
|
||||
...
|
||||
|
||||
def blah():
|
||||
...
|
||||
|
||||
# Main function
|
||||
def main(argv):
|
||||
# Parse command line args, environment, etc.
|
||||
...
|
||||
|
||||
if __name__ == '__main__':
|
||||
import sys
|
||||
main(sys.argv)
|
||||
```
|
||||
|
||||
## Exercises
|
||||
|
||||
### (a) `main()` functions
|
||||
|
||||
In the file `report.py` add a `main()` function that accepts a list of command line options and produces the same output as before.
|
||||
You should be able to run it interactively like this:
|
||||
|
||||
```python
|
||||
>>> import report
|
||||
>>> report.main(['report.py', 'Data/portfolio.csv', 'Data/prices.csv'])
|
||||
Name Shares Price Change
|
||||
---------- ---------- ---------- ----------
|
||||
AA 100 39.91 7.71
|
||||
IBM 50 106.11 15.01
|
||||
CAT 150 78.58 -4.86
|
||||
MSFT 200 30.47 -20.76
|
||||
GE 95 37.38 -2.99
|
||||
MSFT 50 30.47 -34.63
|
||||
IBM 100 106.11 35.67
|
||||
>>>
|
||||
```
|
||||
|
||||
Modify the `pcost.py` file so that it has a similar `main()` function:
|
||||
|
||||
```python
|
||||
>>> import pcost
|
||||
>>> pcost.main(['pcost.py', 'Data/portfolio.csv'])
|
||||
Total cost: 44671.15
|
||||
>>>
|
||||
```
|
||||
|
||||
### (b) Making Scripts
|
||||
|
||||
Modify the `report.py` and `pcost.py` programs so that they can execute as a script on the command line:
|
||||
|
||||
```bash
|
||||
bash $ python3 report.py Data/portfolio.csv Data/prices.csv
|
||||
Name Shares Price Change
|
||||
---------- ---------- ---------- ----------
|
||||
AA 100 39.91 7.71
|
||||
IBM 50 106.11 15.01
|
||||
CAT 150 78.58 -4.86
|
||||
MSFT 200 30.47 -20.76
|
||||
GE 95 37.38 -2.99
|
||||
MSFT 50 30.47 -34.63
|
||||
IBM 100 106.11 35.67
|
||||
|
||||
bash $ python3 pcost.py Data/portfolio.csv
|
||||
Total cost: 44671.15
|
||||
```
|
||||
132
Notes/03_Program_organization/06_Design_discussion.md
Normal file
@@ -0,0 +1,132 @@
|
||||
# 3.6 Design Discussion
|
||||
|
||||
In this section we consider some design decisions made in code so far.
|
||||
|
||||
### Filenames versus Iterables
|
||||
|
||||
Compare these two programs that return the same output.
|
||||
|
||||
```python
|
||||
# Provide a filename
|
||||
def read_data(filename):
|
||||
records = []
|
||||
with open(filename) as f:
|
||||
for line in f:
|
||||
...
|
||||
records.append(r)
|
||||
return records
|
||||
|
||||
d = read_data('file.csv')
|
||||
```
|
||||
|
||||
```python
|
||||
# Provide lines
|
||||
def read_data(lines):
|
||||
records = []
|
||||
for line in lines:
|
||||
...
|
||||
records.append(r)
|
||||
return records
|
||||
|
||||
with open('file.csv') as f:
|
||||
d = read_data(f)
|
||||
```
|
||||
|
||||
* Which of these functions do you prefer? Why?
|
||||
* Which of these functions is more flexible?
|
||||
|
||||
|
||||
### Deep Idea: "Duck Typing"
|
||||
|
||||
[Duck Typing](https://en.wikipedia.org/wiki/Duck_typing) is a computer programming concept to determine whether an object can be used for a particular purpose. It is an application of the [duck test](https://en.wikipedia.org/wiki/Duck_test).
|
||||
|
||||
> If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.
|
||||
|
||||
In our previous example that reads the lines, our `read_data` expects
|
||||
any iterable object. Not just the lines of a file.
|
||||
|
||||
```python
|
||||
def read_data(lines):
|
||||
records = []
|
||||
for line in lines:
|
||||
...
|
||||
records.append(r)
|
||||
return records
|
||||
```
|
||||
|
||||
This means that we can use it with other *lines*.
|
||||
|
||||
```python
|
||||
# A CSV file
|
||||
lines = open('data.csv')
|
||||
data = read_data(lines)
|
||||
|
||||
# A zipped file
|
||||
lines = gzip.open('data.csv.gz','rt')
|
||||
data = read_data(lines)
|
||||
|
||||
# The Standard Input
|
||||
lines = sys.stdin
|
||||
data = read_data(lines)
|
||||
|
||||
# A list of strings
|
||||
lines = ['ACME,50,91.1','IBM,75,123.45', ... ]
|
||||
data = read_data(lines)
|
||||
```
|
||||
|
||||
There is considerable flexibility with this design.
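Here is a runnable sketch of the idea. The body of `read_data()` is filled in with an assumed record layout (`name,shares,price`) purely for illustration. The same function accepts a list of strings and an in-memory file object, because all it requires is something it can iterate over.

```python
import io

def read_data(lines):
    records = []
    for line in lines:
        # Assumed layout for this sketch: name,shares,price
        name, shares, price = line.strip().split(',')
        records.append((name, int(shares), float(price)))
    return records

from_list = read_data(['ACME,50,91.1', 'IBM,75,123.45'])
from_file = read_data(io.StringIO('ACME,50,91.1\nIBM,75,123.45\n'))

print(from_list == from_file)    # True
```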
|
||||
|
||||
*Question: Shall we embrace or fight this flexibility?*
|
||||
|
||||
### Library Design Best Practices
|
||||
|
||||
Code libraries are often better served by embracing flexibility.
|
||||
Don't restrict your options. With great flexibility comes great power.
|
||||
|
||||
## Exercise
|
||||
|
||||
### (a) From filenames to file-like objects
|
||||
|
||||
In this section, you worked on a file `fileparse.py` that contained a
|
||||
function `parse_csv()`. The function worked like this:
|
||||
|
||||
```pycon
|
||||
>>> import fileparse
|
||||
>>> portfolio = fileparse.parse_csv('Data/portfolio.csv', types=[str,int,float])
|
||||
>>>
|
||||
```
|
||||
|
||||
Right now, the function expects to be passed a filename. However, you
|
||||
can make the code more flexible. Modify the function so that it works
|
||||
with any file-like/iterable object. For example:
|
||||
|
||||
```
|
||||
>>> import fileparse
|
||||
>>> import gzip
|
||||
>>> with gzip.open('Data/portfolio.csv.gz', 'rt') as f:
|
||||
... port = fileparse.parse_csv(f, types=[str,int,float])
|
||||
...
|
||||
>>> lines = ['name,shares,price', 'AA,100,34.23', 'IBM,50,91.1', 'HPE,75,45.1']
|
||||
>>> port = fileparse.parse_csv(lines, types=[str,int,float])
|
||||
>>>
|
||||
```
|
||||
|
||||
In this new code, what happens if you pass a filename as before?
|
||||
|
||||
```
|
||||
>>> port = fileparse.parse_csv('Data/portfolio.csv', types=[str,int,float])
|
||||
>>> port
|
||||
... look at output (it should be crazy) ...
|
||||
>>>
|
||||
```
|
||||
|
||||
With flexibility comes power and with power comes responsibility. Sometimes you'll
|
||||
need to be careful.
|
||||
|
||||
### (b) Fixing existing functions
|
||||
|
||||
Fix the `read_portfolio()` and `read_prices()` functions in the
|
||||
`report.py` file so that they work with the modified version of
|
||||
`parse_csv()`. This should only involve a minor modification.
|
||||
Afterwards, your `report.py` and `pcost.py` programs should work
|
||||
the same way they always did.
|
||||