link experiment

David Beazley
2020-05-26 09:21:19 -05:00
parent 9aec32e1a9
commit b5244b0e61
24 changed files with 4265 additions and 2 deletions

@@ -181,4 +181,4 @@ exercise work. For example:
>>>
```
[Next](02_Hello_world)
[Previous:Table of Contents](../Contents) [Next:1.2 Hello World](02_Hello_world)

@@ -475,4 +475,4 @@ an identifying filename and line number.
* Fix the error
* Run the program successfully
[Next](03_Numbers)
[Previous:1.1 Introducing Python](01_Python) [Next:1.3 Numbers](03_Numbers)

@@ -0,0 +1,11 @@
# Overview
The inner workings of Python Objects.
In this section we will cover:
* A few more details about how objects work.
* How objects are represented.
* Details of attribute access.
* Data encapsulation techniques.

@@ -0,0 +1,620 @@
# 5.1 Dictionaries Revisited
The Python object system is largely built on dictionaries. This
section discusses how.
### Dictionaries, Revisited
Remember that a dictionary is a collection of named values.
```python
stock = {
'name' : 'GOOG',
'shares' : 100,
'price' : 490.1
}
```
Dictionaries are commonly used for simple data structures.
However, they are used for critical parts of the interpreter and may be the *most important type of data in Python*.
### Dicts and Modules
In a module, a dictionary holds all of the global variables and functions.
```python
# foo.py

x = 42
def bar():
    ...

def spam():
    ...
```
If you inspect `foo.__dict__` or `globals()`, you'll see the dictionary.
```python
{
'x' : 42,
'bar' : <function bar>,
'spam' : <function spam>
}
```
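As a small check of this idea, the sketch below builds a module object by hand with `types.ModuleType` (the module name `foo` and the attributes `x` and `y` are only illustrative) and shows that attribute access and dictionary access see the same data.

```python
import types

# Create an empty module object (a hypothetical stand-in for foo.py).
foo = types.ModuleType('foo')

# Setting an attribute on the module stores it in the module dictionary.
foo.x = 42
print(foo.__dict__['x'])            # 42

# Writing to the dictionary makes the name visible as an attribute.
foo.__dict__['y'] = 10
print(foo.y)                        # 10

# vars() returns the same dictionary object.
print(vars(foo) is foo.__dict__)    # True
```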
### Dicts and Objects
User defined objects also use dictionaries for both instance data and classes.
In fact, the entire object system is mostly an extra layer that's put on top of dictionaries.
A dictionary holds the instance data, `__dict__`.
```python
>>> s = Stock('GOOG', 100, 490.1)
>>> s.__dict__
{'name' : 'GOOG','shares' : 100, 'price': 490.1 }
```
You populate this dict (and instance) when assigning to `self`.
```python
class Stock(object):
    def __init__(self,name,shares,price):
        self.name = name
        self.shares = shares
        self.price = price
```
The instance data, `self.__dict__`, looks like this:
```python
{
'name': 'GOOG',
'shares': 100,
'price': 490.1
}
```
**Each instance gets its own private dictionary.**
```python
s = Stock('GOOG',100,490.1) # {'name' : 'GOOG','shares' : 100, 'price': 490.1 }
t = Stock('AAPL',50,123.45) # {'name' : 'AAPL','shares' : 50, 'price': 123.45 }
```
If you created 100 instances of some class, there are 100 dictionaries sitting around holding data.
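The following sketch illustrates that instance dictionaries are independent: changing one instance leaves the other untouched. It reuses the `Stock` class from this section.

```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

s = Stock('GOOG', 100, 490.1)
t = Stock('AAPL', 50, 123.45)

# Two instances, two separate dictionaries.
print(s.__dict__ is t.__dict__)    # False

# Modifying one instance does not affect the other.
s.shares = 75
print(s.__dict__)                  # {'name': 'GOOG', 'shares': 75, 'price': 490.1}
print(t.__dict__)                  # {'name': 'AAPL', 'shares': 50, 'price': 123.45}
```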
### Class Members
A separate dictionary also holds the methods.
```python
class Stock(object):
    def __init__(self,name,shares,price):
        self.name = name
        self.shares = shares
        self.price = price

    def cost(self):
        return self.shares * self.price

    def sell(self,nshares):
        self.shares -= nshares
```
The dictionary is in `Stock.__dict__`.
```python
{
'cost': <function>,
'sell': <function>,
'__init__': <function>
}
```
### Instances and Classes
Instances and classes are linked together.
The `__class__` attribute refers back to the class.
```python
>>> s = Stock('GOOG', 100, 490.1)
>>> s.__dict__
{ 'name': 'GOOG', 'shares': 100, 'price': 490.1 }
>>> s.__class__
<class '__main__.Stock'>
>>>
```
The instance dictionary holds data unique to each instance, whereas the class dictionary holds data collectively shared by *all* instances.
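A short sketch of this split, using a cut-down version of the `Stock` class: the method `cost` lives only in the class dictionary, while both instances point back at the same class.

```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    def cost(self):
        return self.shares * self.price

s = Stock('GOOG', 100, 490.1)
t = Stock('AAPL', 50, 123.45)

# Both instances link to the same class object.
print(s.__class__ is Stock)          # True
print(s.__class__ is t.__class__)    # True

# The method is stored once, in the class dictionary.
print('cost' in s.__dict__)          # False
print('cost' in Stock.__dict__)      # True
```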
### Attribute Access
When you work with objects, you access data and methods using the `.` operator.
```python
x = obj.name # Getting
obj.name = value # Setting
del obj.name # Deleting
```
These operations are directly tied to the dictionaries sitting underneath the covers.
### Modifying Instances
Operations that modify an object update the underlying dictionary.
```python
>>> s = Stock('GOOG', 100, 490.1)
>>> s.__dict__
{ 'name':'GOOG', 'shares': 100, 'price': 490.1 }
>>> s.shares = 50 # Setting
>>> s.date = '6/7/2007' # Setting
>>> s.__dict__
{ 'name': 'GOOG', 'shares': 50, 'price': 490.1, 'date': '6/7/2007' }
>>> del s.shares # Deleting
>>> s.__dict__
{ 'name': 'GOOG', 'price': 490.1, 'date': '6/7/2007' }
>>>
```
### Reading Attributes
Suppose you read an attribute on an instance.
```python
x = obj.name
```
The attribute may exist in two places:
* Local instance dictionary.
* Class dictionary.
Both dictionaries must be checked. First, check in local `__dict__`.
If not found, look in `__dict__` of class through `__class__`.
```python
>>> s = Stock(...)
>>> s.name
'GOOG'
>>> s.cost()
49010.0
>>>
```
This lookup scheme is how the members of a *class* get shared by all instances.
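The two-step search can be sketched as a plain function. This is a simplified model only: the helper name `lookup` is hypothetical, it ignores inheritance, and it returns the raw function for a method instead of the bound method that Python actually produces.

```python
def lookup(obj, name):
    """Simplified model of reading obj.name (no inheritance, no binding)."""
    # Step 1: check the instance dictionary.
    if name in obj.__dict__:
        return obj.__dict__[name]
    # Step 2: follow __class__ and check the class dictionary.
    cls = obj.__class__
    if name in cls.__dict__:
        return cls.__dict__[name]
    raise AttributeError(name)

class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    def cost(self):
        return self.shares * self.price

s = Stock('GOOG', 100, 490.1)
print(lookup(s, 'name'))                            # GOOG (found on the instance)
print(lookup(s, 'cost') is Stock.__dict__['cost'])  # True (found on the class)
print(lookup(s, 'cost')(s) == s.cost())             # True
```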
### How inheritance works
Classes may inherit from other classes.
```python
class A(B, C):
    ...
```
The base classes are stored in a tuple in each class.
```python
>>> A.__bases__
(<class '__main__.B'>, <class '__main__.C'>)
>>>
```
This provides a link to parent classes.
### Reading Attributes with Inheritance
First, check in local `__dict__`. If not found, look in `__dict__` of class through `__class__`.
If not found in class, look in base classes through `__bases__`.
### Reading Attributes with Single Inheritance
In inheritance hierarchies, attributes are found by walking up the inheritance tree.
```python
class A(object): pass
class B(A): pass
class C(A): pass
class D(B): pass
class E(D): pass
```
With single inheritance, there is a single path to the top.
You stop with the first match.
### Method Resolution Order or MRO
Python precomputes an inheritance chain and stores it in the `__mro__` attribute on the class.
```python
>>> E.__mro__
(<class '__main__.E'>, <class '__main__.D'>,
 <class '__main__.B'>, <class '__main__.A'>,
 <class 'object'>)
>>>
```
This chain is called the **Method Resolution Order**.
To find an attribute, Python walks the MRO. The first match wins.
### MRO in Multiple Inheritance
There is no single path to the top with multiple inheritance.
Let's take a look at an example.
```python
class A(object): pass
class B(object): pass
class C(A, B): pass
class D(B): pass
class E(C, D): pass
```
What happens when you do this?
```python
e = E()
e.attr
```
A similar search process is carried out, but what is the order? That's a problem.
Python uses *cooperative multiple inheritance*.
These are some rules about class ordering:
* Children before parents
* Parents go in order
The MRO is computed using those rules.
```python
>>> E.__mro__
(
<class 'E'>,
<class 'C'>,
<class 'A'>,
<class 'D'>,
<class 'B'>,
<class 'object'>)
>>>
```
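You can confirm this ordering yourself. The sketch below rebuilds the five classes from the example and prints only the class names from `E.__mro__`. Note that `C` comes before its parents `A` and `B`, and that `B` is held back until after `D`, because `D` is a child of `B`.

```python
class A(object): pass
class B(object): pass
class C(A, B): pass
class D(B): pass
class E(C, D): pass

# Print just the names in the method resolution order.
print([cls.__name__ for cls in E.__mro__])
# ['E', 'C', 'A', 'D', 'B', 'object']
```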
### An Odd Code Reuse
Consider two completely unrelated objects:
```python
class Dog(object):
    def noise(self):
        return 'Bark'

    def chase(self):
        return 'Chasing!'

class LoudDog(Dog):
    def noise(self):
        # Code commonality with LoudBike
        return super().noise().upper()
```
And
```python
class Bike(object):
    def noise(self):
        return 'On Your Left'

    def pedal(self):
        return 'Pedaling!'

class LoudBike(Bike):
    def noise(self):
        # Code commonality with LoudDog
        return super().noise().upper()
```
There is a code commonality in the implementation of `LoudDog.noise()` and
`LoudBike.noise()`. In fact, the code is exactly the same.
### The "Mixin" Pattern
The *Mixin* pattern is a class with a fragment of code.
```python
class Loud(object):
    def noise(self):
        return super().noise().upper()
```
This class is not usable in isolation.
It mixes with other classes via inheritance.
```python
class LoudDog(Loud, Dog):
    pass

class LoudBike(Loud, Bike):
    pass
```
This is one of the primary uses of multiple inheritance in Python.
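Putting the pieces together, here is a minimal runnable sketch of the mixin at work. The classes are the ones from this section, trimmed to the `noise()` method.

```python
class Dog(object):
    def noise(self):
        return 'Bark'

class Bike(object):
    def noise(self):
        return 'On Your Left'

class Loud(object):
    # Mixin: upper-cases whatever the next class in line produces.
    def noise(self):
        return super().noise().upper()

class LoudDog(Loud, Dog):
    pass

class LoudBike(Loud, Bike):
    pass

print(LoudDog().noise())     # BARK
print(LoudBike().noise())    # ON YOUR LEFT
```

The same `Loud` code serves both classes without knowing anything about dogs or bikes.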
### Why `super()`
Always use `super()` when overriding methods.
```python
class Loud(object):
    def noise(self):
        return super().noise().upper()
```
`super()` delegates to the *next class* on the MRO.
The tricky bit is that you don't know what it is when you create the Mixin.
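To see what "next class on the MRO" means in practice, this sketch prints the MRO of a class built from the mixin. It also shows that `Loud` on its own has nothing to delegate to, so calling `noise()` on a bare `Loud` instance fails.

```python
class Dog(object):
    def noise(self):
        return 'Bark'

class Loud(object):
    def noise(self):
        return super().noise().upper()

class LoudDog(Loud, Dog):
    pass

# Inside Loud.noise(), super() moves one step along this chain.
print([cls.__name__ for cls in LoudDog.__mro__])
# ['LoudDog', 'Loud', 'Dog', 'object']

# Used alone, the class after Loud is object, which has no noise() method.
try:
    Loud().noise()
except AttributeError as e:
    print('Loud alone fails:', e)
```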
### Some Cautions
Multiple inheritance is a powerful tool. Remember that with power comes responsibility.
Frameworks / libraries sometimes use it for advanced features involving composition of components.
## Exercises
In Exercise 4.1, you defined a class `Stock` that represented a holding of stock.
In this exercise, we will use that class.
### (a) Representation of Instances
At the interactive shell, inspect the underlying dictionaries of the two instances you created:
```python
>>> from stock import Stock
>>> goog = Stock('GOOG',100,490.10)
>>> ibm = Stock('IBM',50, 91.23)
>>> goog.__dict__
... look at the output ...
>>> ibm.__dict__
... look at the output ...
>>>
```
### (b) Modification of Instance Data
Try setting a new attribute on one of the above instances:
```python
>>> goog.date = '6/11/2007'
>>> goog.__dict__
... look at output ...
>>> ibm.__dict__
... look at output ...
>>>
```
In the above output, you'll notice that the `goog` instance has an attribute `date` whereas the `ibm` instance does not.
It is important to note that Python really doesn't place any restrictions on attributes.
The attributes of an instance are not limited to those set up in the `__init__()` method.
Instead of setting an attribute, try placing a new value directly into the `__dict__` object:
```python
>>> goog.__dict__['time'] = '9:45am'
>>> goog.time
'9:45am'
>>>
```
Here, you really notice the fact that an instance is just a layer on
top of a dictionary. *Note: it should be emphasized that direct
manipulation of the dictionary is uncommon—you should always write
your code to use the (.) syntax.*
### (c) The role of classes
The definitions that make up a class definition are shared by all instances of that class.
Notice that all instances have a link back to their associated class:
```python
>>> goog.__class__
... look at output ...
>>> ibm.__class__
... look at output ...
>>>
```
Try calling a method on the instances:
```python
>>> goog.cost()
49010.0
>>> ibm.cost()
4561.5
>>>
```
Notice that the name *cost* is not defined in either `goog.__dict__`
or `ibm.__dict__`. Instead, it is being supplied by the class
dictionary. Try this:
```python
>>> Stock.__dict__['cost']
... look at output ...
>>>
```
Try calling the `cost()` method directly through the dictionary:
```python
>>> Stock.__dict__['cost'](goog)
49010.0
>>> Stock.__dict__['cost'](ibm)
4561.5
>>>
```
Notice how you are calling the function defined in the class definition and how the `self` argument gets the instance.
Try adding a new attribute to the `Stock` class:
```python
>>> Stock.foo = 42
>>>
```
Notice how this new attribute now shows up on all of the instances:
```python
>>> goog.foo
42
>>> ibm.foo
42
>>>
```
However, notice that it is not part of the instance dictionary:
```python
>>> goog.__dict__
... look at output and notice there is no 'foo' attribute ...
>>>
```
The reason you can access the `foo` attribute on instances is that
Python always checks the class dictionary if it can't find something
on the instance itself.
This part of the exercise illustrates something known as a class
variable. Suppose, for instance, you have a class like this:
```python
class Foo(object):
    a = 13                  # Class variable
    def __init__(self,b):
        self.b = b          # Instance variable
```
In this class, the variable `a`, assigned in the body of the class itself, is a *class variable*.
It is shared by all of the instances that get created.
```python
>>> f = Foo(10)
>>> g = Foo(20)
>>> f.a # Inspect the class variable (same for both instances)
13
>>> g.a
13
>>> f.b # Inspect the instance variable (differs)
10
>>> g.b
20
>>> Foo.a = 42 # Change the value of the class variable
>>> f.a
42
>>> g.a
42
>>>
```
### (d) Bound Methods
A subtle feature of Python is that invoking a method actually involves
two steps and something known as a bound method.
```python
>>> s = goog.sell
>>> s
<bound method Stock.sell of Stock('GOOG',100,490.1)>
>>> s(25)
>>> goog.shares
75
>>>
```
Bound methods actually contain all of the pieces needed to call a method.
For instance, they keep a record of the function implementing the method:
```python
>>> s.__func__
<function sell at 0x10049af50>
>>>
```
This is the same value as found in the `Stock` dictionary.
```python
>>> Stock.__dict__['sell']
<function sell at 0x10049af50>
>>>
```
Take a close look at both references to `0x10049af50`. They are the same in `s` and `Stock.__dict__['sell']`.
Bound methods also record the instance, which is the `self` argument.
```python
>>> s.__self__
Stock('GOOG',75,490.1)
>>>
```
When you invoke the function using `()` all of the pieces come together.
For example, calling `s(25)` actually does this:
```python
>>> s.__func__(s.__self__, 25) # Same as s(25)
>>> goog.shares
50
>>>
```
### (e) Inheritance
Make a new class that inherits from `Stock`.
```python
>>> class NewStock(Stock):
        def yow(self):
            print('Yow!')

>>> n = NewStock('ACME', 50, 123.45)
>>> n.cost()
6172.5
>>> n.yow()
Yow!
>>>
```
Inheritance is implemented by extending the search process for attributes.
The `__bases__` attribute has a tuple of the immediate parents:
```python
>>> NewStock.__bases__
(<class 'stock.Stock'>,)
>>>
```
The `__mro__` attribute has a tuple of all parents, in the order that
they will be searched for attributes.
```python
>>> NewStock.__mro__
(<class '__main__.NewStock'>, <class 'stock.Stock'>, <class 'object'>)
>>>
```
Here's how the `cost()` method of instance `n` above would be found:
```python
>>> for cls in n.__class__.__mro__:
        if 'cost' in cls.__dict__:
            break

>>> cls
<class 'stock.Stock'>
>>> cls.__dict__['cost']
<function cost at 0x101aed598>
>>>
```
[Next](02_Classes_encapsulation)

@@ -0,0 +1,335 @@
# 5.2 Classes and Encapsulation
When writing classes, it is common to try and encapsulate internal details.
This section introduces a few common Python programming idioms for this, including
private variables and properties.
### Public vs Private.
One of the primary roles of a class is to encapsulate data and internal
implementation details of an object. However, a class also defines a
*public* interface that the outside world is supposed to use to
manipulate the object. This distinction between implementation
details and the public interface is important.
### A Problem
In Python, almost everything about classes and objects is *open*.
* You can easily inspect object internals.
* You can change things at will.
* There is no strong notion of access-control (i.e., private class members)
That is an issue when you are trying to isolate details of the *internal implementation*.
### Python Encapsulation
Python relies on programming conventions to indicate the intended use
of something. These conventions are based on naming. There is a
general attitude that it is up to the programmer to observe the rules
as opposed to having the language enforce them.
### Private Attributes
Any attribute name with leading `_` is considered to be *private*.
```python
class Person(object):
    def __init__(self, name):
        self._name = name
```
As mentioned earlier, this is only a programming style. You can still access and change it.
```python
>>> p = Person('Guido')
>>> p._name
'Guido'
>>> p._name = 'Dave'
>>>
```
### Simple Attributes
Consider the following class.
```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

s = Stock('GOOG', 100, 490.1)
s.shares = 50
```
Suppose later you want to add a validation.
```python
s.shares = '50' # Raise a TypeError, this is a string
```
How would you do it?
### Managed Attributes
You might introduce accessor methods.
```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.set_shares(shares)
        self.price = price

    # Function that layers the "get" operation
    def get_shares(self):
        return self._shares

    # Function that layers the "set" operation
    def set_shares(self, value):
        if not isinstance(value, int):
            raise TypeError('Expected an int')
        self._shares = value
```
Too bad that this breaks all of our existing code: `s.shares = 50` becomes `s.set_shares(50)`.
### Properties
There is an alternative approach to the previous pattern.
```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    @property
    def shares(self):
        return self._shares

    @shares.setter
    def shares(self, value):
        if not isinstance(value, int):
            raise TypeError('Expected int')
        self._shares = value
```
Normal attribute access now triggers the getter and setter under `@property` and `@shares.setter`.
```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    # Triggered with `s.shares`
    @property
    def shares(self):
        return self._shares

    # Triggered with `s.shares = ...`
    @shares.setter
    def shares(self, value):
        if not isinstance(value, int):
            raise TypeError('Expected int')
        self._shares = value
```
With this pattern, there are *no changes* needed to the source code that uses the class.
The new *setter* is also called when there is an assignment within the class.
```python
class Stock(object):
    def __init__(self, name, shares, price):
        ...
        # This assignment calls the setter below
        self.shares = shares
        ...

    ...
    @shares.setter
    def shares(self, value):
        if not isinstance(value, int):
            raise TypeError('Expected int')
        self._shares = value
```
There is often a confusion between a property and the use of private names.
Although a property internally uses a private name like `_shares`, the rest
of the class (not the property) can continue to use a name like `shares`.
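The sketch below puts the property version of `Stock` through its paces. It shows that ordinary assignment is validated, that the check also runs inside `__init__()`, and that the value is stored under the private name `_shares` even though all other code uses `shares`.

```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares      # Goes through the setter below
        self.price = price

    @property
    def shares(self):
        return self._shares

    @shares.setter
    def shares(self, value):
        if not isinstance(value, int):
            raise TypeError('Expected int')
        self._shares = value

s = Stock('GOOG', 100, 490.1)
s.shares = 50                     # Existing code keeps working
print(s.shares)                   # 50
print(s.__dict__)                 # {'name': 'GOOG', '_shares': 50, 'price': 490.1}

try:
    s.shares = '50'               # Rejected by the setter
except TypeError as e:
    print('TypeError:', e)        # TypeError: Expected int

try:
    Stock('GOOG', '100', 490.1)   # Rejected during __init__() as well
except TypeError as e:
    print('TypeError:', e)        # TypeError: Expected int
```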
Properties are also useful for computed data attributes.
```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    @property
    def cost(self):
        return self.shares * self.price
    ...
```
This allows you to drop the extra parentheses, hiding the fact that it's actually a method:
```python
>>> s = Stock('GOOG', 100, 490.1)
>>> s.shares # Instance variable
100
>>> s.cost # Computed Value
49010.0
>>>
```
### Uniform access
The last example shows how to put a more uniform interface on an object.
If you don't do this, an object might be confusing to use:
```python
>>> s = Stock('GOOG', 100, 490.1)
>>> a = s.cost() # Method
49010.0
>>> b = s.shares # Data attribute
100
>>>
```
Why is the `()` required for the cost, but not for the shares? A property
can fix this.
### Decorator Syntax
The `@` syntax is known as *decoration*.
It specifies a modifier that's applied to the function definition that immediately follows.
```python
...
@property
def cost(self):
    return self.shares * self.price
```
It's kind of like a macro. More details in Section 7.
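As a rough illustration of what the decorator does, the sketch below writes the same thing without the `@` syntax: the function is defined first and then the name is rebound to the result of calling `property()` on it. The integer price is chosen only to keep the arithmetic exact.

```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

    def cost(self):
        return self.shares * self.price
    cost = property(cost)         # Same effect as putting @property above def cost

s = Stock('GOOG', 100, 490)
print(s.cost)                                        # 49000
print(isinstance(Stock.__dict__['cost'], property))  # True
```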
### `__slots__` Attribute
You can restrict the set of attribute names.
```python
class Stock(object):
    __slots__ = ('name','_shares','price')
    def __init__(self, name, shares, price):
        self.name = name
        ...
```
It will raise an error for other attributes.
```python
>>> s.price = 385.15
>>> s.prices = 410.2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Stock' object has no attribute 'prices'
```
Although this prevents errors and restricts usage of objects, it's actually used for performance:
it makes Python use memory more efficiently.
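Here is a minimal runnable sketch of the restriction. It uses a simplified `Stock` with plain attributes in `__slots__` (no property), which is an assumption made to keep the example short. The exact wording of the error message varies between Python versions, so it is printed and not shown here.

```python
class Stock(object):
    __slots__ = ('name', 'shares', 'price')

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

s = Stock('GOOG', 100, 490.1)
s.price = 385.15                  # Fine: 'price' is listed in __slots__

try:
    s.prices = 410.2              # Typo: 'prices' is not listed
except AttributeError as e:
    print('AttributeError:', e)
```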
### Final Comments on Encapsulation
Don't go overboard with private attributes, properties, slots,
etc. They serve a specific purpose and you may see them when reading
other Python code. However, they are not necessary for most
day-to-day coding.
## Exercises
### (a) Simple properties
Properties are a useful way to add "computed attributes" to an object.
In Exercise 4.1, you created an object `Stock`. Notice that on your
object there is a slight inconsistency in how different kinds of data
are extracted:
```python
>>> from stock import Stock
>>> s = Stock('GOOG', 100, 490.1)
>>> s.shares
100
>>> s.price
490.1
>>> s.cost()
49010.0
>>>
```
Specifically, notice how you have to add the extra `()` to `cost` because it is a method.
You can get rid of the extra `()` on `cost()` if you turn it into a property.
Take your `Stock` class and modify it so that the cost calculation works like this:
```python
>>> s.cost
49010.0
>>>
```
Try calling `s.cost()` as a function and observe that it doesn't work now that `cost` has been defined as a property.
```python
>>> s.cost()
... fails ...
>>>
```
### (b) Properties and Setters
Modify the `shares` attribute so that the value is stored in a private
attribute and that a pair of property functions are used to ensure
that it is always set to an integer value.
Here is an example of the expected behavior:
```python
>>> s = Stock('GOOG',100,490.10)
>>> s.shares = 50
>>> s.shares = 'a lot'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: expected an integer
>>>
```
### (c) Adding slots
Modify the `Stock` class so that it has a `__slots__` attribute.
Then, verify that new attributes can't be added:
```python
>>> from stock import Stock
>>> s = Stock('GOOG', 100, 490.10)
>>> s.name
'GOOG'
>>> s.blah = 42
... see what happens ...
>>>
```
When you use `__slots__`, Python actually uses a more efficient internal representation of objects.
What happens if you try to inspect the underlying dictionary of `s` above?
```python
>>> s.__dict__
... see what happens ...
>>>
```
It should be noted that `__slots__` is most commonly used as an
optimization on classes that serve as data structures. Using slots
will make such programs use far less memory and run a bit faster.

@@ -0,0 +1,14 @@
# Overview
A simple definition of *Iteration*: Looping over items.
```python
a = [2,4,10,37,62]
# Iterate over a
for x in a:
    ...
```
This is a very common pattern. Loops, list comprehensions, etc.
Most programs do a huge amount of iteration.

@@ -0,0 +1,313 @@
# 6.1 Iteration Protocol
This section looks at the process of iteration.
### Iteration Everywhere
Many different objects support iteration.
```python
a = 'hello'
for c in a: # Loop over characters in a
    ...

b = { 'name': 'Dave', 'password':'foo'}
for k in b: # Loop over keys in dictionary
    ...

c = [1,2,3,4]
for i in c: # Loop over items in a list/tuple
    ...

f = open('foo.txt')
for x in f: # Loop over lines in a file
    ...
```
### Iteration: Protocol
Let's take an inside look at the `for` statement.
```python
for x in obj:
    # statements
```
What happens under the hood?
```python
_iter = obj.__iter__()        # Get iterator object
while True:
    try:
        x = _iter.__next__()  # Get next item
    except StopIteration:     # No more items
        break
    # statements ...
```
All the objects that work with the `for-loop` implement this low-level iteration protocol.
Example: Manual iteration over a list.
```python
>>> x = [1,2,3]
>>> it = x.__iter__()
>>> it
<list_iterator object at 0x590b0>
>>> it.__next__()
1
>>> it.__next__()
2
>>> it.__next__()
3
>>> it.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
```
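The same protocol can be written with the built-in functions `iter()` and `next()`, which call `__iter__()` and `__next__()` for you. The sketch below wraps the expanded loop in a helper so it can be tried on different objects; the name `for_each` is hypothetical and exists only for this illustration.

```python
def for_each(obj, action):
    """Hand-written equivalent of: for x in obj: action(x)"""
    it = iter(obj)                # Same as obj.__iter__()
    while True:
        try:
            x = next(it)          # Same as it.__next__()
        except StopIteration:     # No more items
            break
        action(x)

collected = []
for_each([1, 2, 3], collected.append)
for_each('hi', collected.append)
print(collected)                  # [1, 2, 3, 'h', 'i']
```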
### Supporting Iteration
Knowing about iteration is useful if you want to add it to your own objects.
For example, making a custom container.
```python
class Portfolio(object):
    def __init__(self):
        self.holdings = []

    def __iter__(self):
        return self.holdings.__iter__()
    ...

port = Portfolio()
for s in port:
    ...
```
## Exercises
### (a) Iteration Illustrated
Create the following list:
```python
a = [1,9,4,25,16]
```
Manually iterate over this list. Call `__iter__()` to get an iterator and
call the `__next__()` method to obtain successive elements.
```python
>>> i = a.__iter__()
>>> i
<list_iterator object at 0x64c10>
>>> i.__next__()
1
>>> i.__next__()
9
>>> i.__next__()
4
>>> i.__next__()
25
>>> i.__next__()
16
>>> i.__next__()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
```
The `next()` built-in function is a shortcut for calling
the `__next__()` method of an iterator. Try using it on a file:
```python
>>> f = open('Data/portfolio.csv')
>>> f.__iter__() # Note: This returns the file itself
<_io.TextIOWrapper name='Data/portfolio.csv' mode='r' encoding='UTF-8'>
>>> next(f)
'name,shares,price\n'
>>> next(f)
'"AA",100,32.20\n'
>>> next(f)
'"IBM",50,91.10\n'
>>>
```
Keep calling `next(f)` until you reach the end of the
file. Watch what happens.
### (b) Supporting Iteration
On occasion, you might want to make one of your own objects support
iteration--especially if your object wraps around an existing
list or other iterable. In a new file `portfolio.py`, define the
following class:
```python
# portfolio.py
class Portfolio(object):
    def __init__(self, holdings):
        self._holdings = holdings

    @property
    def total_cost(self):
        return sum([s.cost for s in self._holdings])

    def tabulate_shares(self):
        from collections import Counter
        total_shares = Counter()
        for s in self._holdings:
            total_shares[s.name] += s.shares
        return total_shares
```
This class is meant to be a layer around a list, but with some
extra methods such as the `total_cost` property. Modify the `read_portfolio()`
function in `report.py` so that it creates a `Portfolio` instance like this:
```python
# report.py
...

import fileparse
from stock import Stock
from portfolio import Portfolio

def read_portfolio(filename):
    '''
    Read a stock portfolio file into a list of dictionaries with keys
    name, shares, and price.
    '''
    with open(filename) as file:
        portdicts = fileparse.parse_csv(file,
                                        select=['name','shares','price'],
                                        types=[str,int,float])

    portfolio = [ Stock(d['name'], d['shares'], d['price']) for d in portdicts ]
    return Portfolio(portfolio)
...
```
Try running the `report.py` program. You will find that it fails spectacularly due to the fact
that `Portfolio` instances aren't iterable.
```python
>>> import report
>>> report.portfolio_report('Data/portfolio.csv', 'Data/prices.csv')
... crashes ...
```
Fix this by modifying the `Portfolio` class to support iteration:
```python
class Portfolio(object):
    def __init__(self, holdings):
        self._holdings = holdings

    def __iter__(self):
        return self._holdings.__iter__()

    @property
    def total_cost(self):
        return sum([s.shares*s.price for s in self._holdings])

    def tabulate_shares(self):
        from collections import Counter
        total_shares = Counter()
        for s in self._holdings:
            total_shares[s.name] += s.shares
        return total_shares
```
After you've made this change, your `report.py` program should work again. While you're
at it, fix up your `pcost.py` program to use the new `Portfolio` object. Like this:
```python
# pcost.py
import report
def portfolio_cost(filename):
    '''
    Computes the total cost (shares*price) of a portfolio file
    '''
    portfolio = report.read_portfolio(filename)
    return portfolio.total_cost
...
```
Test it to make sure it works:
```python
>>> import pcost
>>> pcost.portfolio_cost('Data/portfolio.csv')
44671.15
>>>
```
### (c) Making a more proper container
If making a container class, you often want to do more than just
iteration. Modify the `Portfolio` class so that it has some other
special methods like this:
```python
class Portfolio(object):
    def __init__(self, holdings):
        self._holdings = holdings

    def __iter__(self):
        return self._holdings.__iter__()

    def __len__(self):
        return len(self._holdings)

    def __getitem__(self, index):
        return self._holdings[index]

    def __contains__(self, name):
        return any([s.name == name for s in self._holdings])

    @property
    def total_cost(self):
        return sum([s.shares*s.price for s in self._holdings])

    def tabulate_shares(self):
        from collections import Counter
        total_shares = Counter()
        for s in self._holdings:
            total_shares[s.name] += s.shares
        return total_shares
```
Now, try some experiments using this new class:
```python
>>> import report
>>> portfolio = report.read_portfolio('Data/portfolio.csv')
>>> len(portfolio)
7
>>> portfolio[0]
Stock('AA', 100, 32.2)
>>> portfolio[1]
Stock('IBM', 50, 91.1)
>>> portfolio[0:3]
[Stock('AA', 100, 32.2), Stock('IBM', 50, 91.1), Stock('CAT', 150, 83.44)]
>>> 'IBM' in portfolio
True
>>> 'AAPL' in portfolio
False
>>>
```
One important observation about this--generally code is considered
"Pythonic" if it speaks the common vocabulary of how other parts of
Python normally work. For container objects, supporting iteration,
indexing, containment, and other kinds of operators is an important
part of this.
[Next](02_Customizing_iteration)

@@ -0,0 +1,265 @@
# 6.2 Customizing Iteration
This section looks at how you can customize iteration using a generator.
### A problem
Suppose you wanted to create your own custom iteration pattern.
For example, a countdown.
```python
>>> for x in countdown(10):
... print(x, end=' ')
...
10 9 8 7 6 5 4 3 2 1
>>>
```
There is an easy way to do this.
### Generators
A generator is a function that defines iteration.
```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1
```
For example:
```python
>>> for x in countdown(10):
... print(x, end=' ')
...
10 9 8 7 6 5 4 3 2 1
>>>
```
A generator is any function that uses the `yield` statement.
The behavior of generators is different than a normal function.
Calling a generator function creates a generator object. It does not execute the function.
```python
def countdown(n):
    # Added a print statement
    print('Counting down from', n)
    while n > 0:
        yield n
        n -= 1
```
```python
>>> x = countdown(10)
# Nothing is printed: the function body has not started running
>>> x
# x is a generator object
<generator object countdown at 0x58490>
>>>
```
The function only executes on `__next__()` call.
```python
>>> x = countdown(10)
>>> x
<generator object countdown at 0x58490>
>>> x.__next__()
Counting down from 10
10
>>>
```
`yield` produces a value, but suspends the function execution.
The function resumes on next call to `__next__()`.
```python
>>> x.__next__()
9
>>> x.__next__()
8
```
When the generator function returns, the iteration ends by raising a `StopIteration` exception.
```python
>>> x.__next__()
1
>>> x.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
```
*Observation: A generator function implements the same low-level protocol that the `for` statement uses on lists, tuples, dicts, files, etc.*
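A short sketch of that observation: the generator object is its own iterator, so the built-in `iter()` and `next()` functions work on it directly, and anything that consumes an iterable (such as `list()`) can drain what is left.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

g = countdown(3)
print(iter(g) is g)       # True: the generator is its own iterator
print(next(g))            # 3
print(list(g))            # [2, 1] (the remaining items)
print(next(g, 'done'))    # done (exhausted, so the default is returned)
```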
## Exercises
### (a) A Simple Generator
If you ever find yourself wanting to customize iteration, you should
always think generator functions. They're easy to write---make
a function that carries out the desired iteration logic and use `yield`
to emit values.
For example, try this generator that searches a file for lines containing
a matching substring:
```python
>>> def filematch(filename, substr):
        with open(filename, 'r') as f:
            for line in f:
                if substr in line:
                    yield line

>>> for line in open('Data/portfolio.csv'):
        print(line, end='')

name,shares,price
"AA",100,32.20
"IBM",50,91.10
"CAT",150,83.44
"MSFT",200,51.23
"GE",95,40.37
"MSFT",50,65.10
"IBM",100,70.44
>>> for line in filematch('Data/portfolio.csv', 'IBM'):
        print(line, end='')

"IBM",50,91.10
"IBM",100,70.44
>>>
```
This is kind of interesting--the idea that you can hide a bunch of
custom processing in a function and use it to feed a for-loop.
The next example looks at a more unusual case.
### (b) Monitoring a streaming data source
Generators can be an interesting way to monitor real-time data sources
such as log files or stock market feeds. In this part, we'll
explore this idea. To start, follow the next instructions carefully.
The program `Data/stocksim.py` is a program that
simulates stock market data. As output, the program constantly writes
real-time data to a file `stocklog.csv`. In a
separate command window go into the `Data/` directory and run this program:
```bash
bash % python3 stocksim.py
```
If you are on Windows, just locate the `stocksim.py` program and
double-click on it to run it. Now, forget about this program (just
let it run). Using another window, look at the file
`Data/stocklog.csv` being written by the simulator. You should see
new lines of text being added to the file every few seconds. Again,
just let this program run in the background---it will run for several
hours (you shouldn't need to worry about it).
Once the above program is running, let's write a little program to
open the file, seek to the end, and watch for new output. Create a
file `follow.py` and put this code in it:
```python
# follow.py
import os
import time
f = open('Data/stocklog.csv')
f.seek(0, os.SEEK_END) # Move file pointer 0 bytes from end of file

while True:
    line = f.readline()
    if line == '':
        time.sleep(0.1)   # Sleep briefly and retry
        continue
    fields = line.split(',')
    name = fields[0].strip('"')
    price = float(fields[1])
    change = float(fields[4])
    if change < 0:
        print(f'{name:>10s} {price:>10.2f} {change:>10.2f}')
```
If you run the program, you'll see a real-time stock ticker. Under the hood,
this code is kind of like the Unix `tail -f` command that's used to watch a log file.
Note: The use of the `readline()` method in this example is
somewhat unusual in that it is not the usual way of reading lines from
a file (normally you would just use a `for`-loop). However, in
this case, we are using it to repeatedly probe the end of the file to
see if more data has been added (`readline()` will either
return new data or an empty string).
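The probing behavior described in the note can be checked in isolation. The following is a minimal sketch, assuming a throwaway temporary file instead of the real `stocklog.csv`; the file name `sample.csv` and the sample rows are made up for illustration.

```python
import os
import tempfile

# Create a scratch file that already contains one line (hypothetical data)
path = os.path.join(tempfile.mkdtemp(), 'sample.csv')
with open(path, 'w') as out:
    out.write('"AA",39.63\n')

reader = open(path)
reader.seek(0, os.SEEK_END)      # Skip everything already in the file

first = reader.readline()        # Nothing new yet, so this is ''

with open(path, 'a') as out:     # Simulate the writer appending a row
    out.write('"IBM",102.79\n')

second = reader.readline()       # The newly appended row is now returned
reader.close()

print(repr(first))
print(repr(second))
```

An empty string therefore means "no new data yet", not "the file is finished", which is why the loop in `follow.py` sleeps and retries instead of stopping.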
### (c) Using a generator to produce data
If you look at the code in part (b), the first part of the code is producing
lines of data whereas the statements at the end of the `while` loop are consuming
the data. A major feature of generator functions is that you can move all
of the data production code into a reusable function.
Modify the code in part (b) so that the file-reading is performed by
a generator function `follow(filename)`. Make it so the following code
works:
```python
>>> for line in follow('Data/stocklog.csv'):
        print(line, end='')

... Should see lines of output produced here ...
```
Modify the stock ticker code so that it looks like this:
```python
if __name__ == '__main__':
    for line in follow('Data/stocklog.csv'):
        fields = line.split(',')
        name = fields[0].strip('"')
        price = float(fields[1])
        change = float(fields[4])
        if change < 0:
            print(f'{name:>10s} {price:>10.2f} {change:>10.2f}')
```
### (d) Watching your portfolio
Modify the `follow.py` program so that it watches the stream of stock
data and prints a ticker showing information for only those stocks
in a portfolio. For example:
```python
if __name__ == '__main__':
    import report

    portfolio = report.read_portfolio('Data/portfolio.csv')

    for line in follow('Data/stocklog.csv'):
        fields = line.split(',')
        name = fields[0].strip('"')
        price = float(fields[1])
        change = float(fields[4])
        if name in portfolio:
            print(f'{name:>10s} {price:>10.2f} {change:>10.2f}')
```
Note: For this to work, your `Portfolio` class must support the
`in` operator. See the last exercise and make sure you implement the
`__contains__()` method.
### Discussion
Something very powerful just happened here. You moved an interesting iteration pattern
(reading lines at the end of a file) into its own little function. The `follow()` function
is now this completely general purpose utility that you can use in any program. For
example, you could use it to watch server logs, debugging logs, and other similar data sources.
That's kind of cool.
[Next](03_Producers_consumers)

View File

@@ -0,0 +1,301 @@
# 6.3 Producers, Consumers and Pipelines
Generators are a useful tool for setting up various kinds of producer/consumer
problems and dataflow pipelines. This section discusses that.
### Producer-Consumer Problems
Generators are closely related to various forms of *producer-consumer* problems.
```python
# Producer
def follow(f):
    ...
    while True:
        ...
        yield line        # Produces value in `line` below
        ...

# Consumer
for line in follow(f):    # Consumes value from `yield` above
    ...
```
`yield` produces values that `for` consumes.
### Generator Pipelines
You can use this aspect of generators to set up processing pipelines (like Unix pipes).
*producer* &rarr; *processing* &rarr; *processing* &rarr; *consumer*
Processing pipes have an initial data producer, some set of intermediate processing stages and a final consumer.
**producer** &rarr; *processing* &rarr; *processing* &rarr; *consumer*
```python
def producer():
    ...
    yield item
    ...
```
The producer is typically a generator, although it could also be a list or some other sequence.
`yield` feeds data into the pipeline.
*producer* &rarr; *processing* &rarr; *processing* &rarr; **consumer**
```python
def consumer(s):
    for item in s:
        ...
```
The consumer is a for-loop. It gets items and does something with them.
*producer* &rarr; **processing** &rarr; **processing** &rarr; *consumer*
```python
def processing(s):
    for item in s:
        ...
        yield newitem
        ...
```
Intermediate processing stages simultaneously consume and produce items.
They might modify the data stream.
They can also filter (discarding items).
*producer* &rarr; *processing* &rarr; *processing* &rarr; *consumer*
```python
def producer():
    ...
    yield item          # yields the item that is received by the `processing`
    ...

def processing(s):
    for item in s:      # Comes from the `producer`
        ...
        yield newitem   # yields a new item
        ...

def consumer(s):
    for item in s:      # Comes from the `processing`
        ...
```
Code to set up the pipeline:
```python
a = producer()
b = processing(a)
c = consumer(b)
```
You will notice that data incrementally flows through the different functions.
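To watch the incremental flow, here is a minimal runnable sketch of the three stages. The sample numbers and the even-number filter are assumptions chosen only for illustration.

```python
def producer():
    # Source stage: yields raw items (hypothetical sample values)
    for n in [1, 2, 3, 4, 5, 6]:
        yield n

def processing(s):
    # Intermediate stage: discards odd numbers and squares the rest
    for item in s:
        if item % 2 == 0:
            yield item * item

def consumer(s):
    # Final stage: a plain for-loop that pulls items through the pipeline
    results = []
    for item in s:
        print('Got', item)
        results.append(item)
    return results

a = producer()
b = processing(a)
c = consumer(b)
```

Nothing runs until `consumer()` starts looping. Each item is then pulled through every stage, one at a time, before the next item is produced.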
## Exercises
For this exercise the `stocksim.py` program should still be running in the background.
You're going to use the `follow()` function you wrote in the previous exercise.
### (a) Setting up a simple pipeline
Let's see the pipelining idea in action. Write the following
function:
```python
>>> def filematch(lines, substr):
        for line in lines:
            if substr in line:
                yield line

>>>
```
This function is almost exactly the same as the first generator
example in the previous exercise except that it's no longer
opening a file--it merely operates on a sequence of lines given
to it as an argument. Now, try this:
```
>>> lines = follow('Data/stocklog.csv')
>>> ibm = filematch(lines, 'IBM')
>>> for line in ibm:
        print(line)

... wait for output ...
```
It might take a while for output to appear, but eventually you
should see some lines containing data for IBM.
### (b) Setting up a more complex pipeline
Take the pipelining idea a few steps further by performing
more actions.
```
>>> from follow import follow
>>> import csv
>>> lines = follow('Data/stocklog.csv')
>>> rows = csv.reader(lines)
>>> for row in rows:
        print(row)

['BA', '98.35', '6/11/2007', '09:41.07', '0.16', '98.25', '98.35', '98.31', '158148']
['AA', '39.63', '6/11/2007', '09:41.07', '-0.03', '39.67', '39.63', '39.31', '270224']
['XOM', '82.45', '6/11/2007', '09:41.07', '-0.23', '82.68', '82.64', '82.41', '748062']
['PG', '62.95', '6/11/2007', '09:41.08', '-0.12', '62.80', '62.97', '62.61', '454327']
...
```
Well, that's interesting. What you're seeing here is that the output of the
`follow()` function has been piped into the `csv.reader()` function and we're
now getting a sequence of split rows.
### (c) Making more pipeline components
Let's extend the whole idea into a larger pipeline. In a separate file `ticker.py`,
start by creating a function that reads a CSV file as you did above:
```python
# ticker.py

from follow import follow
import csv

def parse_stock_data(lines):
    rows = csv.reader(lines)
    return rows

if __name__ == '__main__':
    lines = follow('Data/stocklog.csv')
    rows = parse_stock_data(lines)
    for row in rows:
        print(row)
```
Write a new function that selects specific columns:
```
# ticker.py
...
def select_columns(rows, indices):
    for row in rows:
        yield [row[index] for index in indices]
...
def parse_stock_data(lines):
    rows = csv.reader(lines)
    rows = select_columns(rows, [0, 1, 4])
    return rows
```
Run your program again. You should see output narrowed down like this:
```
['BA', '98.35', '0.16']
['AA', '39.63', '-0.03']
['XOM', '82.45', '-0.23']
['PG', '62.95', '-0.12']
...
```
Write generator functions that convert data types and build dictionaries.
For example:
```python
# ticker.py
...

def convert_types(rows, types):
    for row in rows:
        yield [func(val) for func, val in zip(types, row)]

def make_dicts(rows, headers):
    for row in rows:
        yield dict(zip(headers, row))
...
def parse_stock_data(lines):
    rows = csv.reader(lines)
    rows = select_columns(rows, [0, 1, 4])
    rows = convert_types(rows, [str, float, float])
    rows = make_dicts(rows, ['name', 'price', 'change'])
    return rows
...
```
Run your program again. You should now see a stream of dictionaries like this:
```
{ 'name':'BA', 'price':98.35, 'change':0.16 }
{ 'name':'AA', 'price':39.63, 'change':-0.03 }
{ 'name':'XOM', 'price':82.45, 'change': -0.23 }
{ 'name':'PG', 'price':62.95, 'change':-0.12 }
...
```
### (d) Filtering data
Write a function that filters data. For example:
```python
# ticker.py
...

def filter_symbols(rows, names):
    for row in rows:
        if row['name'] in names:
            yield row
```
Use this to filter stocks to just those in your portfolio:
```python
import report
portfolio = report.read_portfolio('Data/portfolio.csv')
rows = parse_stock_data(follow('Data/stocklog.csv'))
rows = filter_symbols(rows, portfolio)
for row in rows:
    print(row)
```
### (e) Putting it all together
In the `ticker.py` program, write a function `ticker(portfile, logfile, fmt)`
that creates a real-time stock ticker from a given portfolio, logfile,
and table format. For example:
```python
>>> from ticker import ticker
>>> ticker('Data/portfolio.csv', 'Data/stocklog.csv', 'txt')
      Name      Price     Change
---------- ---------- ----------
        GE      37.14      -0.18
      MSFT      29.96      -0.09
       CAT      78.03      -0.49
        AA      39.34      -0.32
...
>>> ticker('Data/portfolio.csv', 'Data/stocklog.csv', 'csv')
Name,Price,Change
IBM,102.79,-0.28
CAT,78.04,-0.48
AA,39.35,-0.31
CAT,78.05,-0.47
...
```
### Discussion
Some lessons learned: You can create various generator functions and
chain them together to perform processing involving data-flow
pipelines. In addition, you can create functions that package a
series of pipeline stages into a single function call (for example,
the `parse_stock_data()` function).
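The packaged pipeline can also be tried without the simulator running. This is a rough sketch that feeds `parse_stock_data()` a plain list of strings in place of `follow()`. The two sample lines are an assumption about how the raw rows in `stocklog.csv` are quoted, based on the rows printed in part (b).

```python
import csv

def select_columns(rows, indices):
    for row in rows:
        yield [row[index] for index in indices]

def convert_types(rows, types):
    for row in rows:
        yield [func(val) for func, val in zip(types, row)]

def make_dicts(rows, headers):
    for row in rows:
        yield dict(zip(headers, row))

def parse_stock_data(lines):
    # Packages four pipeline stages behind a single call
    rows = csv.reader(lines)
    rows = select_columns(rows, [0, 1, 4])
    rows = convert_types(rows, [str, float, float])
    rows = make_dicts(rows, ['name', 'price', 'change'])
    return rows

# Hypothetical raw lines standing in for the output of follow()
lines = [
    '"BA",98.35,"6/11/2007","09:41.07",0.16,98.25,98.35,98.31,158148\n',
    '"AA",39.63,"6/11/2007","09:41.07",-0.03,39.67,39.63,39.31,270224\n',
]
records = list(parse_stock_data(lines))
print(records)
```

Because every stage only needs something it can iterate over, the same function works on a list, an open file, or a live generator.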
[Next](04_More_generators)

View File

@@ -0,0 +1,179 @@
# 6.4 More Generators
This section introduces a few additional generator related topics including
generator expressions and the itertools module.
### Generator Expressions
A generator version of a list comprehension.
```python
>>> a = [1,2,3,4]
>>> b = (2*x for x in a)
>>> b
<generator object at 0x58760>
>>> for i in b:
...     print(i, end=' ')
...
2 4 6 8
>>>
```
Differences with List Comprehensions.
* Does not construct a list.
* Only useful purpose is iteration.
* Once consumed, can't be reused.
General syntax.
```python
(<expression> for i in s if <conditional>)
```
It can also serve as a function argument.
```python
sum(x*x for x in a)
```
It can be applied to any iterable.
```python
>>> a = [1,2,3,4]
>>> b = (x*x for x in a)
>>> c = (-x for x in b)
>>> for i in c:
...     print(i, end=' ')
...
-1 -4 -9 -16
>>>
```
The main use of generator expressions is in code that performs some
calculation on a sequence, but only uses the result once. For
example, strip all comments from a file.
```python
f = open('somefile.txt')
lines = (line for line in f if not line.startswith('#'))
for line in lines:
    ...
f.close()
```
With generators, the code often runs faster and uses far less memory because no intermediate list is built. It's like a filter applied to a stream.
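The memory difference is easy to observe. The sketch below compares the size of a list comprehension with the equivalent generator expression. The input size of 100,000 is arbitrary, and `sys.getsizeof()` reports only the size of the container object itself, not the integers inside it.

```python
import sys

nums = range(100_000)

squares_list = [x * x for x in nums]    # Builds every value up front
squares_gen = (x * x for x in nums)     # Builds nothing until iterated

list_size = sys.getsizeof(squares_list)  # Grows with the number of items
gen_size = sys.getsizeof(squares_gen)    # Small, regardless of input size

print(list_size, gen_size)

# Both forms produce the same values when consumed
same_total = sum(squares_gen) == sum(squares_list)
print(same_total)
```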
### Why Generators
* Many problems are much more clearly expressed in terms of iteration.
* Looping over a collection of items and performing some kind of operation (searching, replacing, modifying, etc.).
* Processing pipelines can be applied to a wide range of data processing problems.
* Better memory efficiency.
* Only produce values when needed.
* Contrast to constructing giant lists.
* Can operate on streaming data
* Generators encourage code reuse
* Separates the *iteration* from code that uses the iteration
* You can build a toolbox of interesting iteration functions and *mix-n-match*.
### `itertools` module
The `itertools` module is a library module with various functions designed to help with iterators and generators.
```python
itertools.chain(s1,s2)
itertools.count(n)
itertools.cycle(s)
itertools.dropwhile(predicate, s)
itertools.groupby(s)
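filter(predicate, s)             # Built-in in Python 3 (was itertools.ifilter)
map(function, s1, ... sN)        # Built-in in Python 3 (was itertools.imap)
itertools.repeat(s, n)
itertools.tee(s, ncopies)
zip(s1, ... , sN)                # Built-in in Python 3 (was itertools.izip)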
```
All functions process data iteratively.
They implement various kinds of iteration patterns.
More information at [Generator Tricks for Systems Programmers](http://www.dabeaz.com/generators/) tutorial from PyCon '08.
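A few of these functions in action, as a small sketch. The sample data is made up, and `itertools.islice()` is not in the list above; it is used here only to take a finite slice from the endless `count()` iterator.

```python
import itertools

# chain: iterate over several sequences as if they were one
combined = list(itertools.chain([1, 2], [3, 4]))

# count + islice: take the first few items of an endless counter
first_three = list(itertools.islice(itertools.count(10), 3))

# groupby: group *consecutive* items that share the same key
names = ['AA', 'AA', 'IBM', 'CAT', 'CAT']
groups = [(key, len(list(items))) for key, items in itertools.groupby(names)]

# repeat: produce the same value a fixed number of times
repeated = list(itertools.repeat('x', 3))

print(combined, first_three, groups, repeated)
```

Note that `groupby()` only groups adjacent items, so data usually needs to be sorted by the same key first.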
## Exercises
In the previous exercises, you wrote some code that followed lines being written to a log file and parsed them into a sequence of rows.
This exercise continues to build upon that. Make sure `Data/stocksim.py` is still running.
### (a) Generator Expressions
Generator expressions are a generator version of a list comprehension.
For example:
```python
>>> nums = [1, 2, 3, 4, 5]
>>> squares = (x*x for x in nums)
>>> squares
<generator object <genexpr> at 0x109207e60>
>>> for n in squares:
...     print(n)
...
1
4
9
16
25
```
Unlike a list comprehension, a generator expression can only be used once.
Thus, if you try another for-loop, you get nothing:
```python
>>> for n in squares:
...     print(n)
...
>>>
```
### (b) Generator Expressions in Function Arguments
Generator expressions are sometimes placed into function arguments.
It looks a little weird at first, but try this experiment:
```python
>>> nums = [1,2,3,4,5]
>>> sum([x*x for x in nums]) # A list comprehension
55
>>> sum(x*x for x in nums) # A generator expression
55
>>>
```
In the above example, the second version using generators would
use significantly less memory if a large list was being manipulated.
In your `portfolio.py` file, you performed a few calculations
involving list comprehensions. Try replacing these with
generator expressions.
### (c) Code simplification
Generator expressions are often a useful replacement for
small generator functions. For example, instead of writing a
function like this:
```python
def filter_symbols(rows, names):
    for row in rows:
        if row['name'] in names:
            yield row
```
You could write something like this:
```python
rows = (row for row in rows if row['name'] in names)
```
Modify the `ticker.py` program to use generator expressions
as appropriate.

View File

@@ -0,0 +1,9 @@
# Overview
In this section we will take a look at more Python features you may encounter.
* Variable argument functions
* Anonymous functions and lambda
* Returning functions and closures
* Function decorators
* Static and class methods

View File

@@ -0,0 +1,214 @@
# 7.1 Variable Arguments
### Positional variable arguments (*args)
A function that accepts *any number* of arguments is said to use variable arguments.
For example:
```python
def foo(x, *args):
    ...
```
Function call.
```python
foo(1,2,3,4,5)
```
The extra arguments get passed as a tuple.
```python
def foo(x, *args):
    # x -> 1
    # args -> (2,3,4,5)
```
### Keyword variable arguments (**kwargs)
A function can also accept any number of keyword arguments.
For example:
```python
def foo(x, y, **kwargs):
    ...
```
Function call.
```python
foo(2,3,flag=True,mode='fast',header='debug')
```
The extra keywords are passed in a dictionary.
```python
def foo(x, y, **kwargs):
    # x -> 2
    # y -> 3
    # kwargs -> { 'flag': True, 'mode': 'fast', 'header': 'debug' }
```
### Combining both
A function can also accept any number of variable keyword and non-keyword arguments.
Function definition.
```python
def foo(*args, **kwargs):
    ...
```
This function takes any combination of positional or keyword arguments.
It is sometimes used when writing wrappers or when you want to pass arguments through to another function.
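As a small sketch of the pass-through idea, the wrapper below forwards whatever it receives to another function. The names `report()` and `shout()` are hypothetical and exist only for this illustration.

```python
def report(name, shares, price=0.0, currency='USD'):
    return f'{name}: {shares} shares at {price:.2f} {currency}'

def shout(func, *args, **kwargs):
    # Accepts any arguments and forwards them unchanged to `func`
    result = func(*args, **kwargs)
    return result.upper()

text = shout(report, 'goog', 100, price=490.1)
print(text)
```

The wrapper never needs to know the signature of the function it calls, which is what makes this pattern useful for generic helpers.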
### Passing Tuples and Dicts
Tuples can be expanded into variable arguments.
```python
numbers = (2,3,4)
foo(1, *numbers)      # Same as foo(1,2,3,4)
```
Dictionaries can also be expanded into keyword arguments.
```python
options = {
    'color' : 'red',
    'delimiter' : ',',
    'width' : 400
}
foo(data, **options)
# Same as foo(data, color='red', delimiter=',', width=400)
```
These are not commonly used except when writing library functions.
## Exercises
### (a) A simple example of variable arguments
Try defining the following function:
```python
>>> def avg(x,*more):
        return float(x+sum(more))/(1+len(more))

>>> avg(10,11)
10.5
>>> avg(3,4,5)
4.0
>>> avg(1,2,3,4,5,6)
3.5
>>>
```
Notice how the parameter `*more` collects all of the extra arguments.
### (b) Passing tuple and dicts as arguments
Suppose you read some data from a file and obtained a tuple such as
this:
```
>>> data = ('GOOG', 100, 490.1)
>>>
```
Now, suppose you wanted to create a `Stock` object from this
data. If you try to pass `data` directly, it doesn't work:
```
>>> from stock import Stock
>>> s = Stock(data)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __init__() missing 2 required positional arguments: 'shares' and 'price'
>>>
```
This is easily fixed using `*data` instead. Try this:
```python
>>> s = Stock(*data)
>>> s
Stock('GOOG', 100, 490.1)
>>>
```
If you have a dictionary, you can use `**` instead. For example:
```python
>>> data = { 'name': 'GOOG', 'shares': 100, 'price': 490.1 }
>>> s = Stock(**data)
>>> s
Stock('GOOG', 100, 490.1)
>>>
```
### (c) Creating a list of instances
In your `report.py` program, you created a list of instances
using code like this:
```python
def read_portfolio(filename):
    '''
    Read a stock portfolio file into a list of dictionaries with keys
    name, shares, and price.
    '''
    with open(filename) as lines:
        portdicts = fileparse.parse_csv(lines,
                                        select=['name','shares','price'],
                                        types=[str,int,float])

    portfolio = [ Stock(d['name'], d['shares'], d['price'])
                  for d in portdicts ]
    return Portfolio(portfolio)
```
You can simplify that code using `Stock(**d)` instead. Make that change.
### (d) Argument pass-through
The `fileparse.parse_csv()` function has some options for changing the
file delimiter and for error reporting. Maybe you'd like to expose those
options to the `read_portfolio()` function above. Make this change:
```
def read_portfolio(filename, **opts):
    '''
    Read a stock portfolio file into a list of dictionaries with keys
    name, shares, and price.
    '''
    with open(filename) as lines:
        portdicts = fileparse.parse_csv(lines,
                                        select=['name','shares','price'],
                                        types=[str,int,float],
                                        **opts)

    portfolio = [ Stock(**d) for d in portdicts ]
    return Portfolio(portfolio)
```
Once you've made the change, try reading a file with some errors:
```python
>>> import report
>>> port = report.read_portfolio('Data/missing.csv')
Row 4: Couldn't convert ['MSFT', '', '51.23']
Row 4: Reason invalid literal for int() with base 10: ''
Row 7: Couldn't convert ['IBM', '', '70.44']
Row 7: Reason invalid literal for int() with base 10: ''
>>>
```
Now, try silencing the errors:
```python
>>> import report
>>> port = report.read_portfolio('Data/missing.csv', silence_errors=True)
>>>
```
[Next](02_Anonymous_function)

View File

@@ -0,0 +1,161 @@
# 7.2 Anonymous Functions and Lambda
### List Sorting Revisited
Lists can be sorted *in-place* using the `sort` method.
```python
s = [10,1,7,3]
s.sort() # s = [1,3,7,10]
```
You can sort in reverse order.
```python
s = [10,1,7,3]
s.sort(reverse=True) # s = [10,7,3,1]
```
It seems simple enough. However, how do we sort a list of dicts?
```python
[{'name': 'AA', 'price': 32.2, 'shares': 100},
{'name': 'IBM', 'price': 91.1, 'shares': 50},
{'name': 'CAT', 'price': 83.44, 'shares': 150},
{'name': 'MSFT', 'price': 51.23, 'shares': 200},
{'name': 'GE', 'price': 40.37, 'shares': 95},
{'name': 'MSFT', 'price': 65.1, 'shares': 50},
{'name': 'IBM', 'price': 70.44, 'shares': 100}]
```
By what criteria?
You can guide the sorting by using a *key function*. The *key function* is a function that receives the dictionary and returns the value of interest for sorting.
```python
def stock_name(s):
    return s['name']

portfolio.sort(key=stock_name)
```
The value returned by the *key function* determines the sorting.
```python
# Check how the dictionaries are sorted by the `name` key
[
    {'name': 'AA', 'price': 32.2, 'shares': 100},
    {'name': 'CAT', 'price': 83.44, 'shares': 150},
    {'name': 'GE', 'price': 40.37, 'shares': 95},
    {'name': 'IBM', 'price': 91.1, 'shares': 50},
    {'name': 'IBM', 'price': 70.44, 'shares': 100},
    {'name': 'MSFT', 'price': 51.23, 'shares': 200},
    {'name': 'MSFT', 'price': 65.1, 'shares': 50}
]
```
### Callback Functions
Callback functions are often short one-line functions that are only used for that one operation, such as the key function in the previous sorting example.
Programmers often ask for a shortcut: is there a shorter way to specify custom processing for `sort()`?
### Lambda: Anonymous Functions
Use a lambda instead of creating the function.
In our previous sorting example.
```python
portfolio.sort(key=lambda s: s['name'])
```
This creates an *unnamed* function that evaluates a *single* expression.
The above code is much shorter than the initial code.
```python
def stock_name(s):
    return s['name']

portfolio.sort(key=stock_name)

# vs lambda
portfolio.sort(key=lambda s: s['name'])
```
### Using lambda
* lambda is highly restricted.
* Only a single expression is allowed.
* No statements like `if`, `while`, etc.
* Most common use is with functions like `sort()`.
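The single-expression restriction is less limiting than it sounds. The sketch below, using a made-up three-entry portfolio, shows a tuple used as a multi-field sort key and a conditional *expression* standing in for an `if` statement.

```python
portfolio = [
    {'name': 'IBM', 'price': 91.1, 'shares': 50},
    {'name': 'AA', 'price': 32.2, 'shares': 100},
    {'name': 'IBM', 'price': 70.44, 'shares': 100},
]

# A tuple is a single expression, so it can serve as a multi-field sort key
portfolio.sort(key=lambda s: (s['name'], s['price']))

# A conditional expression is allowed; an `if` statement is not
label = lambda s: 'large' if s['shares'] >= 100 else 'small'
labels = [label(s) for s in portfolio]

print([(s['name'], s['price']) for s in portfolio])
print(labels)
```

Anything that needs several statements is usually clearer as a regular named function.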
## Exercises
Read some stock portfolio data and convert it into a list:
```python
>>> import report
>>> portfolio = list(report.read_portfolio('Data/portfolio.csv'))
>>> for s in portfolio:
        print(s)

Stock('AA', 100, 32.2)
Stock('IBM', 50, 91.1)
Stock('CAT', 150, 83.44)
Stock('MSFT', 200, 51.23)
Stock('GE', 95, 40.37)
Stock('MSFT', 50, 65.1)
Stock('IBM', 100, 70.44)
>>>
```
### (a) Sorting on a field
Try the following statements which sort the portfolio data
alphabetically by stock name.
```python
>>> def stock_name(s):
        return s.name

>>> portfolio.sort(key=stock_name)
>>> for s in portfolio:
        print(s)

... inspect the result ...
>>>
```
In this part, the `stock_name()` function extracts the name of a stock from
a single entry in the `portfolio` list. `sort()` uses the result of
this function to do the comparison.
### (b) Sorting on a field with lambda
Try sorting the portfolio according to the number of shares using a
`lambda` expression:
```python
>>> portfolio.sort(key=lambda s: s.shares)
>>> for s in portfolio:
        print(s)

... inspect the result ...
>>>
```
Try sorting the portfolio according to the price of each stock:
```python
>>> portfolio.sort(key=lambda s: s.price)
>>> for s in portfolio:
        print(s)

... inspect the result ...
>>>
```
Note: `lambda` is a useful shortcut because it allows you to
define a special processing function directly in the call to `sort()` as
opposed to having to define a separate function first (as in part a).
[Next](03_Returning_functions)

View File

@@ -0,0 +1,234 @@
# 7.3 Returning Functions
This section introduces the idea of closures.
### Introduction
Consider the following function.
```python
def add(x, y):
    def do_add():
        print('Adding', x, y)
        return x + y
    return do_add
```
This is a function that returns another function.
```python
>>> a = add(3,4)
>>> a
<function do_add at 0x6a670>
>>> a()
Adding 3 4
7
```
### Local Variables
Observe how the inner function refers to variables defined by the outer function.
```python
def add(x, y):
    def do_add():
        # `x` and `y` are defined above `add(x, y)`
        print('Adding', x, y)
        return x + y
    return do_add
```
Further observe that those variables are somehow kept alive after `add()` has finished.
```python
>>> a = add(3,4)
>>> a
<function do_add at 0x6a670>
>>> a()
Adding 3 4 # Where are these values coming from?
7
```
### Closures
When an inner function is returned as a result, the inner function is known as a *closure*.
```python
def add(x, y):
    # `do_add` is a closure
    def do_add():
        print('Adding', x, y)
        return x + y
    return do_add
```
*Essential feature: A closure retains the values of all variables needed for the function to run properly later on.*
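You can peek at what a closure has retained. The sketch below relies on the `__closure__` and `__code__.co_freevars` attributes of function objects, and the variable names `free_names` and `captured` are chosen only for this illustration. Strictly speaking, the closure keeps references to the outer variables in "cells", not copies of the values.

```python
def add(x, y):
    def do_add():
        return x + y
    return do_add

a = add(3, 4)

# Names of the outer variables that the inner function depends on
free_names = a.__code__.co_freevars

# Each cell holds the value captured for the corresponding name
captured = dict(zip(free_names, (cell.cell_contents for cell in a.__closure__)))

print(captured)
print(a())
```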
### Using Closures
Closures are an essential feature of Python. However, their use is often subtle.
Common applications:
* Use in callback functions.
* Delayed evaluation.
* Decorator functions (later).
### Delayed Evaluation
Consider a function like this:
```python
import time

def after(seconds, func):
    time.sleep(seconds)
    func()
```
Usage example:
```python
def greeting():
    print('Hello Guido')

after(30, greeting)
```
`after` executes the supplied function... later.
Closures carry extra information around.
```python
import time

def add(x, y):
    def do_add():
        print('Adding %s + %s -> %s' % (x, y, x + y))
    return do_add

def after(seconds, func):
    time.sleep(seconds)
    func()

after(30, add(2, 3))
# `do_add` has the references x -> 2 and y -> 3
```
A function can have its own little environment.
### Code Repetition
Closures can also be used as a technique for avoiding excessive code repetition.
You can write functions that make code.
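As a small sketch of "functions that make code", the factory below builds conversion functions that differ only in the type they apply and the label used in error messages. The names `make_converter`, `to_shares`, and `to_price` are hypothetical.

```python
def make_converter(func, label):
    # Returns a new function that remembers `func` and `label`
    def convert(text):
        try:
            return func(text)
        except ValueError:
            raise ValueError(f'Bad {label}: {text!r}') from None
    return convert

# Two functions generated from one template instead of written out twice
to_shares = make_converter(int, 'shares')
to_price = make_converter(float, 'price')

print(to_shares('100'), to_price('490.1'))
```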
## Exercises
### (a) Using Closures to Avoid Repetition
One of the more powerful features of closures is their use in
generating repetitive code. If you refer back to exercise 5.2
recall the code for defining a property with type checking.
```python
class Stock(object):
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price
    ...
    @property
    def shares(self):
        return self._shares

    @shares.setter
    def shares(self, value):
        if not isinstance(value, int):
            raise TypeError('Expected int')
        self._shares = value
    ...
```
Instead of typing that code over and over again, you can
automatically create it using a closure.
Make a file `typedproperty.py` and put the following code in
it:
```python
# typedproperty.py

def typedproperty(name, expected_type):
    private_name = '_' + name
    @property
    def prop(self):
        return getattr(self, private_name)

    @prop.setter
    def prop(self, value):
        if not isinstance(value, expected_type):
            raise TypeError(f'Expected {expected_type}')
        setattr(self, private_name, value)

    return prop
```
Now, try it out by defining a class like this:
```python
from typedproperty import typedproperty

class Stock(object):
    name = typedproperty('name', str)
    shares = typedproperty('shares', int)
    price = typedproperty('price', float)

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price
```
Try creating an instance and verifying that type-checking works.
```python
>>> s = Stock('IBM', 50, 91.1)
>>> s.name
'IBM'
>>> s.shares = '100'
... should get a TypeError ...
>>>
```
### (b) Simplifying Function Calls
In the above example, users might find calls such as
`typedproperty('shares', int)` a bit verbose to type--especially if
they're repeated a lot. Add the following definitions to the
`typedproperty.py` file:
```python
String = lambda name: typedproperty(name, str)
Integer = lambda name: typedproperty(name, int)
Float = lambda name: typedproperty(name, float)
```
Now, rewrite the `Stock` class to use these functions instead:
```python
class Stock(object):
    name = String('name')
    shares = Integer('shares')
    price = Float('price')

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price
```
Ah, that's a bit better. The main takeaway here is that closures and `lambda`
can often be used to simplify code and eliminate annoying repetition. This
is often good.
### (c) Putting it into practice
Rewrite the `Stock` class in the file `stock.py` so that it uses typed properties
as shown.
[Next](04_Function_decorators)

View File

@@ -0,0 +1,152 @@
# 7.4 Function Decorators
This section introduces the concept of a decorator. This is an advanced
topic for which we only scratch the surface.
### Logging Example
Consider a function.
```python
def add(x, y):
    return x + y
```
Now, consider the function with some logging.
```python
def add(x, y):
    print('Calling add')
    return x + y
```
Now a second function also with some logging.
```python
def sub(x, y):
    print('Calling sub')
    return x - y
```
### Observation
*Observation: It's kind of repetitive.*
Writing programs where there is a lot of code replication is often really annoying.
They are tedious to write and hard to maintain.
Especially if you decide that you want to change how it works (e.g., a different kind of logging perhaps).
### Example continuation
Perhaps you can make *logging wrappers*.
```python
def logged(func):
    def wrapper(*args, **kwargs):
        print('Calling', func.__name__)
        return func(*args, **kwargs)
    return wrapper
```
Now use it.
```python
def add(x, y):
    return x + y

logged_add = logged(add)
```
What happens when you call the function returned by `logged`?
```python
logged_add(3, 4) # You see the logging message appear
```
This example illustrates the process of creating a so-called *wrapper function*.
**A wrapper is a function that wraps another function with some extra bits of processing.**
```python
>>> logged_add(3, 4)
Calling add # Extra output. Added by the wrapper
7
>>>
```
*Note: The `logged()` function creates the wrapper and returns it as a result.*
## Decorators
Putting wrappers around functions is extremely common in Python.
So common, there is a special syntax for it.
```python
def add(x, y):
    return x + y
add = logged(add)

# Special syntax
@logged
def add(x, y):
    return x + y
```
The special syntax performs the same exact steps as shown above. A decorator is just new syntax.
It is said to *decorate* the function.
### Commentary
There are many more subtle details to decorators than what has been presented here.
For example, using them in classes. Or using multiple decorators with a function.
However, the previous example is a good illustration of how their use tends to arise.
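One such detail, sketched below: a plain wrapper replaces the original function's metadata, so `add.__name__` would report `'wrapper'`. The standard library's `functools.wraps` copies the name and docstring onto the wrapper. This variation on `logged()` is an illustration, not part of the exercises that follow.

```python
from functools import wraps

def logged(func):
    @wraps(func)                 # Copy __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        print('Calling', func.__name__)
        return func(*args, **kwargs)
    return wrapper

@logged
def add(x, y):
    '''Add two numbers.'''
    return x + y

name = add.__name__              # Reports the original name, not 'wrapper'
result = add(3, 4)
print(name, result)
```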
## Exercises
### (a) A decorator for timing
If you define a function, its name and module are stored in the
`__name__` and `__module__` attributes. For example:
```python
>>> def add(x,y):
        return x+y

>>> add.__name__
'add'
>>> add.__module__
'__main__'
>>>
```
In a file `timethis.py`, write a decorator function `timethis(func)`
that wraps a function with an extra layer of logic that prints out how
long it takes for a function to execute. To do this, you'll surround
the function with timing calls like this:
```python
start = time.time()
r = func(*args,**kwargs)
end = time.time()
print('%s.%s: %f' % (func.__module__, func.__name__, end-start))
```
Here is an example of how your decorator should work:
```python
>>> from timethis import timethis
>>> @timethis
    def countdown(n):
        while n > 0:
            n -= 1

>>> countdown(10000000)
__main__.countdown: 0.076562
>>>
```
Discussion: This `@timethis` decorator can be placed in front of any
function definition. Thus, you might use it as a diagnostic tool for
performance tuning.
[Next](05_Decorated_methods)

View File

@@ -0,0 +1,260 @@
# 7.5 Decorated Methods
This section discusses a few common decorators that are used in
combination with method definitions.
### Predefined Decorators
There are predefined decorators used to specify special kinds of methods in class definitions.
```python
class Foo(object):
    def bar(self,a):
        ...

    @staticmethod
    def spam(a):
        ...

    @classmethod
    def grok(cls,a):
        ...

    @property
    def name(self):
        ...
```
Let's go one by one.
### Static Methods
`@staticmethod` is used to define so-called *static* class methods (from C++/Java).
A static method is a function that is part of the class, but which does *not* operate on instances.
```python
class Foo(object):
    @staticmethod
    def bar(x):
        print('x =', x)

>>> Foo.bar(2)
x = 2
>>>
```
Static methods are sometimes used to implement internal supporting code for a class.
For example, code to help manage created instances (memory management, system resources, persistence, locking, etc).
They're also used by certain design patterns (not discussed here).
### Class Methods
`@classmethod` is used to define class methods.
A class method is a method that receives the *class* object as the first parameter instead of the instance.
```python
class Foo(object):
    def bar(self):
        print(self)

    @classmethod
    def spam(cls):
        print(cls)

>>> f = Foo()
>>> f.bar()
<__main__.Foo object at 0x971690> # The instance `f`
>>> Foo.spam()
<class '__main__.Foo'> # The class `Foo`
>>>
```
Class methods are most often used as a tool for defining alternate constructors.
```python
import time

class Date(object):
    def __init__(self,year,month,day):
        self.year = year
        self.month = month
        self.day = day

    @classmethod
    def today(cls):
        # Notice how the class is passed as an argument
        tm = time.localtime()
        # And used to create a new instance
        return cls(tm.tm_year, tm.tm_mon, tm.tm_mday)

d = Date.today()
```
Class methods solve some tricky problems with features like inheritance.
```python
class Date(object):
    ...
    @classmethod
    def today(cls):
        # Gets the correct class (e.g. `NewDate`)
        tm = time.localtime()
        return cls(tm.tm_year, tm.tm_mon, tm.tm_mday)

class NewDate(Date):
    ...

d = NewDate.today()
```
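To see what goes wrong without `cls`, the sketch below adds a hypothetical `today_fixed()` static method that hard-codes `Date`. That method name is made up for the comparison; only `today()` comes from the text above.

```python
import time

class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

    @staticmethod
    def today_fixed():
        # Hard-codes Date, so a subclass still gets a plain Date back
        tm = time.localtime()
        return Date(tm.tm_year, tm.tm_mon, tm.tm_mday)

    @classmethod
    def today(cls):
        # cls is whichever class the method was called on
        tm = time.localtime()
        return cls(tm.tm_year, tm.tm_mon, tm.tm_mday)

class NewDate(Date):
    pass

print(type(NewDate.today_fixed()).__name__)   # Date
print(type(NewDate.today()).__name__)         # NewDate
```

Only the class method returns an instance of the subclass, which is why alternate constructors are usually written with `@classmethod`.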
## Exercises
Start this exercise by defining a `Date` class. For example:
```
>>> class Date(object):
        def __init__(self,year,month,day):
            self.year = year
            self.month = month
            self.day = day

>>> d = Date(2010, 4, 13)
>>> d.year, d.month, d.day
(2010, 4, 13)
>>>
```
### (a) Class Methods
A common use of class methods is to provide alternate constructors
(especially since Python doesn't support overloaded methods). Modify
the `Date` class to have a class method `today()` that creates a date
from today's date.
```python
>>> import time
>>> class Date(object):
        def __init__(self,year,month,day):
            self.year = year
            self.month = month
            self.day = day

        @classmethod
        def today(cls):
            t = time.localtime()
            return cls(t.tm_year, t.tm_mon, t.tm_mday)

>>> d = Date.today()
>>> d.year, d.month, d.day
... output varies. Should be today ...
>>>
```
One reason you should use class methods for this is that they work
with inheritance. For example, try this:
```python
>>> class CustomDate(Date):
        def yow(self):
            print('Yow!')

>>> d = CustomDate.today()
>>> d
<__main__.CustomDate object at 0x10923d400>
>>> d.yow()
Yow!
>>>
```
### (b) Class Methods in Practice
In your `report.py` and `portfolio.py` files, the creation of a `Portfolio`
object is a bit muddled. For example, the `report.py` program has code like this:
```python
def read_portfolio(filename, **opts):
    '''
    Read a stock portfolio file into a list of dictionaries with keys
    name, shares, and price.
    '''
    with open(filename) as lines:
        portdicts = fileparse.parse_csv(lines,
                                        select=['name','shares','price'],
                                        types=[str,int,float],
                                        **opts)

    portfolio = [ Stock(**d) for d in portdicts ]
    return Portfolio(portfolio)
```
and the `portfolio.py` file defines `Portfolio()` with an odd initializer
like this:
```python
class Portfolio(object):
    def __init__(self, holdings):
        self.holdings = holdings
    ...
```
Frankly, the chain of responsibility is all a bit confusing because the
code is scattered. If a `Portfolio` class is supposed to contain
a list of `Stock` instances, maybe you should change the class to be a bit more clear.
Like this:
```python
# portfolio.py
import stock
class Portfolio(object):
    def __init__(self):
        self.holdings = []

    def append(self, holding):
        if not isinstance(holding, stock.Stock):
            raise TypeError('Expected a Stock instance')
        self.holdings.append(holding)
    ...
```
If you want to read a portfolio from a CSV file, maybe you should make a
class method for it:
```python
# portfolio.py
import fileparse
import stock
class Portfolio(object):
    def __init__(self):
        self.holdings = []

    def append(self, holding):
        if not isinstance(holding, stock.Stock):
            raise TypeError('Expected a Stock instance')
        self.holdings.append(holding)

    @classmethod
    def from_csv(cls, lines, **opts):
        self = cls()
        portdicts = fileparse.parse_csv(lines,
                                        select=['name','shares','price'],
                                        types=[str,int,float],
                                        **opts)

        for d in portdicts:
            self.append(stock.Stock(**d))

        return self
```
To use this new Portfolio class, you can now write code like this:
```
>>> from portfolio import Portfolio
>>> with open('Data/portfolio.csv') as lines:
...     port = Portfolio.from_csv(lines)
...
>>>
```
Make these changes to the `Portfolio` class and modify the `report.py`
code to use the class method.

View File

@@ -0,0 +1,9 @@
# Overview
In this section we will cover the basics of:
* Testing
* Logging, error handling and diagnostics
* Debugging
Using Python.

View File

@@ -0,0 +1,264 @@
# 8.1 Testing
## Testing Rocks, Debugging Sucks
The dynamic nature of Python makes testing critically important to most applications.
There is no compiler to find your bugs. The only way to find bugs is to run the code and make sure you try out all of its features.
## Assertions
The assertion statement is an internal check for the program.
If an expression is not true, it raises an `AssertionError` exception.
`assert` statement syntax.
```python
assert <expression> [, 'Diagnostic message']
```
For example.
```python
assert isinstance(10, int), 'Expected int'
```
It shouldn't be used to check user input.
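One reason is that assertions are removed when Python runs with the `-O` option, so a check written with `assert` can silently disappear. A minimal sketch of the distinction, using a hypothetical `parse_shares()` function: data from the user gets an explicit exception, while `assert` guards a condition that should always hold internally.

```python
def parse_shares(text):
    # User input: validate explicitly and raise an ordinary exception
    try:
        shares = int(text)
    except ValueError:
        raise ValueError(f'Invalid share count: {text!r}') from None
    if shares < 0:
        raise ValueError('Share count must be non-negative')

    # Internal check: documents an invariant and may be stripped by -O
    assert isinstance(shares, int), 'Expected int'
    return shares

print(parse_shares('100'))    # 100
```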
### Contract Programming
Also known as Design By Contract, liberal use of assertions is an approach for designing
software. It prescribes that software designers should define precise
interface specifications for the components of the software.
For example, you might put assertions on all inputs and outputs.
```python
def add(x, y):
    assert isinstance(x, int), 'Expected int'
    assert isinstance(y, int), 'Expected int'
    return x + y
```
Checking inputs will immediately catch callers who aren't using appropriate arguments.
```python
>>> add(2, 3)
5
>>> add('2', '3')
Traceback (most recent call last):
...
AssertionError: Expected int
>>>
```
### Inline Tests
Assertions can also be used for simple tests.
```python
def add(x, y):
    return x + y

assert add(2,2) == 4
```
This way you are including the test in the same module as your code.
*Benefit: If the code is obviously broken, attempts to import the module will crash.*
This is not recommended for exhaustive testing.
### `unittest` Module
Suppose you have some code.
```python
# simple.py
def add(x, y):
    return x + y
```
You can create a separate testing file. For example:
```python
# testsimple.py
import simple
import unittest
```
Then define a testing class.
```python
# testsimple.py
import simple
import unittest
# Notice that it inherits from unittest.TestCase
class TestAdd(unittest.TestCase):
    ...
```
The testing class must inherit from `unittest.TestCase`.
In the testing class, you define the testing methods.
```python
# testsimple.py
import simple
import unittest
# Notice that it inherits from unittest.TestCase
class TestAdd(unittest.TestCase):
    def test_simple(self):
        # Test with simple integer arguments
        r = simple.add(2, 2)
        self.assertEqual(r, 5)

    def test_str(self):
        # Test with strings
        r = simple.add('hello', 'world')
        self.assertEqual(r, 'helloworld')
```
*Important: Each method must start with `test`.*
### Using `unittest`
There are several built in assertions that come with `unittest`. Each of them asserts a different thing.
```python
# Assert that expr is True
self.assertTrue(expr)
# Assert that x == y
self.assertEqual(x,y)
# Assert that x != y
self.assertNotEqual(x,y)
# Assert that x is near y
self.assertAlmostEqual(x,y,places)
# Assert that callable(arg1,arg2,...) raises exc
self.assertRaises(exc, callable, arg1, arg2, ...)
```
This is not an exhaustive list. There are other assertions in the module.
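As a rough illustration of two of these assertions in use, the sketch below tests a hypothetical `divide()` function. The function and the test class are assumptions made for the example; the suite is run programmatically here so the result can be inspected.

```python
import unittest

def divide(x, y):
    return x / y

class TestDivide(unittest.TestCase):
    def test_almost_equal(self):
        # 0.1 + 0.2 is not exactly 0.3 in floating point
        self.assertNotEqual(0.1 + 0.2, 0.3)
        self.assertAlmostEqual(0.1 + 0.2, 0.3, places=7)

    def test_raises(self):
        # Callable form: exception type, then the function and its arguments
        self.assertRaises(ZeroDivisionError, divide, 1, 0)

    def test_raises_context(self):
        # Context-manager form of the same check
        with self.assertRaises(ZeroDivisionError):
            divide(1, 0)

# Run the tests programmatically instead of through unittest.main()
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDivide)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())    # True
```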
### Running `unittest`
To run the tests, turn the code into a script.
```python
# testsimple.py
...
if __name__ == '__main__':
    unittest.main()
```
Then run Python on the test file.
```bash
bash % python3 testsimple.py
F.
========================================================
FAIL: test_simple (__main__.TestAdd)
--------------------------------------------------------
Traceback (most recent call last):
  File "testsimple.py", line 8, in test_simple
    self.assertEqual(r, 5)
AssertionError: 4 != 5
--------------------------------------------------------
Ran 2 tests in 0.000s
FAILED (failures=1)
```
### Commentary
Effective unit testing is an art and it can grow to be quite complicated for large applications.
The `unittest` module has a huge number of options related to test
runners, collection of results and other aspects of testing. Consult
the documentation for details.
### Third Party Test Tools
We won't cover any third party test tools in this course.
However, there are a few popular alternatives and complements to
`unittest`.
* [pytest](https://pytest.org) - A popular alternative.
* [coverage](http://coverage.readthedocs.io) - Code coverage.
## Exercises
In this exercise, you will explore the basic mechanics of using
Python's `unittest` module.
In earlier exercises, you wrote a file `stock.py` that contained a `Stock`
class. For this exercise, it is assumed that you're using the code written
for Exercise 7.3. If, for some reason, that's not working,
you might want to copy the solution from `Solutions/7_3` to your working
directory.
### (a) Writing Unit Tests
In a separate file `test_stock.py`, write a set of unit tests
for the `Stock` class. To get you started, here is a small
fragment of code that tests instance creation:
```python
# test_stock.py
import unittest
import stock
class TestStock(unittest.TestCase):
    def test_create(self):
        s = stock.Stock('GOOG', 100, 490.1)
        self.assertEqual(s.name, 'GOOG')
        self.assertEqual(s.shares, 100)
        self.assertEqual(s.price, 490.1)

if __name__ == '__main__':
    unittest.main()
```
Run your unit tests. You should get some output that looks like this:
```
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
```
Once you're satisfied that it works, write additional unit tests that
check for the following:
- Make sure the `s.cost` property returns the correct value (49010.0)
- Make sure the `s.sell()` method works correctly. It should
decrement the value of `s.shares` accordingly.
- Make sure that the `s.shares` attribute can't be set to a non-integer value.
For the last part, you're going to need to check that an exception is raised.
An easy way to do that is with code like this:
```python
class TestStock(unittest.TestCase):
    ...
    def test_bad_shares(self):
        s = stock.Stock('GOOG', 100, 490.1)
        with self.assertRaises(TypeError):
            s.shares = '100'
```
[Next](02_Logging)

View File

@@ -0,0 +1,305 @@
# 8.2 Logging
This section briefly introduces the logging module.
### `logging` Module
The `logging` module is a standard library module for recording diagnostic information.
It's also a very large module with a lot of sophisticated functionality.
We will show a simple example to illustrate its usefulness.
### Exceptions Revisited
In the exercises, we wrote a function `parse()` that looked something like this:
```python
# fileparse.py
def parse(f, types=None, names=None, delimiter=None):
    records = []
    for line in f:
        line = line.strip()
        if not line: continue
        try:
            records.append(split(line,types,names,delimiter))
        except ValueError as e:
            print("Couldn't parse :", line)
            print("Reason :", e)
    return records
```
Focus on the `try-except` statement. What should you do in the `except` block?
Should you print a warning message?
```python
try:
    records.append(split(line,types,names,delimiter))
except ValueError as e:
    print("Couldn't parse :", line)
    print("Reason :", e)
```
Or do you silently ignore it?
```python
try:
    records.append(split(line,types,names,delimiter))
except ValueError as e:
    pass
```
Neither solution is satisfactory because you often want *both* behaviors (user selectable).
### Using `logging`
The `logging` module can address this.
```python
# fileparse.py
import logging
log = logging.getLogger(__name__)
def parse(f,types=None,names=None,delimiter=None):
    ...
    try:
        records.append(split(line,types,names,delimiter))
    except ValueError as e:
        log.warning("Couldn't parse : %s", line)
        log.debug("Reason : %s", e)
```
The code is modified to issue warning messages on a special `Logger`
object, the one created with `logging.getLogger(__name__)`.
### Logging Basics
Create a logger object.
```python
log = logging.getLogger(name) # name is a string
```
Issuing log messages.
```python
log.critical(message [, args])
log.error(message [, args])
log.warning(message [, args])
log.info(message [, args])
log.debug(message [, args])
```
*Each method represents a different level of severity.*
All of them create a formatted log message. `args` is used for the `%` operator.
```python
logmsg = message % args # Written to the log
```
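The following sketch shows the severity levels and `%`-style arguments working together. The `ListHandler` class and the logger name `demo.levels` are assumptions made for this example; the handler simply collects messages in a list so they can be inspected.

```python
import logging

class ListHandler(logging.Handler):
    # Hypothetical helper: stores each message instead of printing it
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        # record.getMessage() performs the `message % args` step
        self.messages.append(f'{record.levelname}:{record.getMessage()}')

log = logging.getLogger('demo.levels')
log.propagate = False            # Keep the demo output away from the root logger
handler = ListHandler()
log.addHandler(handler)
log.setLevel(logging.INFO)       # Messages below INFO are discarded

log.warning("Couldn't parse : %s", 'bad,line')
log.info('Read %d records from %s', 3, 'data.csv')
log.debug('Reason : %s', 'not shown')     # Below INFO, so it is dropped

print(handler.messages)
```

Passing the arguments separately, rather than formatting the string yourself, means the formatting work is skipped for messages that are filtered out by level.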
### Logging Configuration
The logging behavior is configured separately.
```python
# main.py
...
if __name__ == '__main__':
    import logging
    logging.basicConfig(
        filename = 'app.log',      # Log output file
        level    = logging.INFO,   # Output level
    )
```
Typically, this is a one-time configuration at program startup.
The configuration is separate from the code that makes the logging calls.
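A small sketch of that separation, under a few assumptions: the logger name `fileparse_demo` and the `work()` function are made up, output goes to an in-memory buffer rather than `app.log`, and `force=True` (Python 3.8+) is used so the example replaces any earlier configuration.

```python
import io
import logging

log = logging.getLogger('fileparse_demo')   # Hypothetical module-level logger

def work():
    # Library-style code: issues messages and knows nothing about the output
    log.info('Starting')
    log.debug('Details that are usually hidden')
    log.warning('Something looks wrong')

# Application-style configuration, done once at startup
buffer = io.StringIO()
logging.basicConfig(
    stream=buffer,                                  # Write here instead of a file
    level=logging.INFO,                             # Output level
    format='%(levelname)s:%(name)s:%(message)s',    # Message layout
    force=True,                                     # Replace earlier configuration
)

work()
print(buffer.getvalue())
```

Changing `level` or `format` in the configuration changes what is recorded without touching `work()` at all.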
### Comments
Logging is highly configurable.
You can adjust every aspect of it: output files, levels, message formats, etc.
However, the code that uses logging doesn't have to worry about that.
## Exercises
### (a) Adding logging to a module
In Exercise 3.3, you added some error handling to the
`fileparse.parse_csv()` function. It looked like this:
```python
# fileparse.py
import csv
def parse_csv(lines, select=None, types=None, has_headers=True, delimiter=',', silence_errors=False):
    '''
    Parse a CSV file into a list of records with type conversion.
    '''
    if select and not has_headers:
        raise RuntimeError('select requires column headers')

    rows = csv.reader(lines, delimiter=delimiter)

    # Read the file headers (if any)
    headers = next(rows) if has_headers else []

    # If specific columns have been selected, make indices for filtering and set output columns
    if select:
        indices = [ headers.index(colname) for colname in select ]
        headers = select

    records = []
    for rowno, row in enumerate(rows, 1):
        if not row:     # Skip rows with no data
            continue

        # If specific column indices are selected, pick them out
        if select:
            row = [ row[index] for index in indices]

        # Apply type conversion to the row
        if types:
            try:
                row = [func(val) for func, val in zip(types, row)]
            except ValueError as e:
                if not silence_errors:
                    print(f"Row {rowno}: Couldn't convert {row}")
                    print(f"Row {rowno}: Reason {e}")
                continue

        # Make a dictionary or a tuple
        if headers:
            record = dict(zip(headers, row))
        else:
            record = tuple(row)
        records.append(record)

    return records
```
Notice the print statements that issue diagnostic messages. Replacing those
prints with logging operations is relatively simple. Change the code like this:
```python
# fileparse.py
import csv
import logging
log = logging.getLogger(__name__)
def parse_csv(lines, select=None, types=None, has_headers=True, delimiter=',', silence_errors=False):
    '''
    Parse a CSV file into a list of records with type conversion.
    '''
    if select and not has_headers:
        raise RuntimeError('select requires column headers')

    rows = csv.reader(lines, delimiter=delimiter)

    # Read the file headers (if any)
    headers = next(rows) if has_headers else []

    # If specific columns have been selected, make indices for filtering and set output columns
    if select:
        indices = [ headers.index(colname) for colname in select ]
        headers = select

    records = []
    for rowno, row in enumerate(rows, 1):
        if not row:     # Skip rows with no data
            continue

        # If specific column indices are selected, pick them out
        if select:
            row = [ row[index] for index in indices]

        # Apply type conversion to the row
        if types:
            try:
                row = [func(val) for func, val in zip(types, row)]
            except ValueError as e:
                if not silence_errors:
                    log.warning("Row %d: Couldn't convert %s", rowno, row)
                    log.debug("Row %d: Reason %s", rowno, e)
                continue

        # Make a dictionary or a tuple
        if headers:
            record = dict(zip(headers, row))
        else:
            record = tuple(row)
        records.append(record)

    return records
```
Now that you've made these changes, try using some of your code on
bad data.
```python
>>> import report
>>> a = report.read_portfolio('Data/missing.csv')
Row 4: Couldn't convert ['MSFT', '', '51.23']
Row 7: Couldn't convert ['IBM', '', '70.44']
>>>
```
If you do nothing, you'll only get logging messages for the `WARNING`
level and above. The output will look like simple print statements.
However, if you configure the logging module, you'll get additional
information about the logging levels, module, and more. Type these
steps to see that:
```python
>>> import logging
>>> logging.basicConfig()
>>> a = report.read_portfolio('Data/missing.csv')
WARNING:fileparse:Row 4: Couldn't convert ['MSFT', '', '51.23']
WARNING:fileparse:Row 7: Couldn't convert ['IBM', '', '70.44']
>>>
```
You will notice that you don't see the output from the `log.debug()`
operation. Type this to change the level.
```
>>> logging.getLogger('fileparse').setLevel(logging.DEBUG)
>>> a = report.read_portfolio('Data/missing.csv')
WARNING:fileparse:Row 4: Couldn't convert ['MSFT', '', '51.23']
DEBUG:fileparse:Row 4: Reason invalid literal for int() with base 10: ''
WARNING:fileparse:Row 7: Couldn't convert ['IBM', '', '70.44']
DEBUG:fileparse:Row 7: Reason invalid literal for int() with base 10: ''
>>>
```
Turn off all but the most critical logging messages:
```
>>> logging.getLogger('fileparse').setLevel(logging.CRITICAL)
>>> a = report.read_portfolio('Data/missing.csv')
>>>
```
### (b) Adding Logging to a Program
To add logging to an application, you need to have some mechanism to
initialize the logging module in the main module. One way to
do this is to include some setup code that looks like this:
```
# This file sets up basic configuration of the logging module.
# Change settings here to adjust logging output as needed.
import logging
logging.basicConfig(
    filename = 'app.log',            # Name of the log file (omit to use stderr)
    filemode = 'w',                  # File mode (use 'a' to append)
    level    = logging.WARNING,      # Logging level (DEBUG, INFO, WARNING, ERROR, or CRITICAL)
)
```
Again, you'd need to put this someplace in the startup steps of your
program.
[Next](03_Debugging)

View File

@@ -0,0 +1,147 @@
# 8.3 Debugging
### Debugging Tips
So, your program has crashed...
```bash
bash % python3 blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in <module>
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", line 4, in spam
    x.append(3)
AttributeError: 'int' object has no attribute 'append'
```
Now what?!
### Reading Tracebacks
The last line is the specific cause of the crash.
```bash
bash % python3 blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in <module>
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", line 4, in spam
    x.append(3)
# Cause of the crash
AttributeError: 'int' object has no attribute 'append'
```
However, it's not always easy to read or understand.
*PRO TIP: Paste the whole traceback into Google.*
### Using the REPL
Use the option `-i` to keep Python alive when executing a script.
```bash
bash % python3 -i blah.py
Traceback (most recent call last):
  File "blah.py", line 13, in <module>
    foo()
  File "blah.py", line 10, in foo
    bar()
  File "blah.py", line 7, in bar
    spam()
  File "blah.py", line 4, in spam
    x.append(3)
AttributeError: 'int' object has no attribute 'append'
>>>
```
It preserves the interpreter state. That means that you can go poking around after the crash, checking variable values and other state.
### Debugging with Print
`print()` debugging is quite common.
*Tip: Make sure you use `repr()`*
```python
def spam(x):
    print('DEBUG:', repr(x))
    ...
```
`repr()` shows you an accurate representation of a value, not the *nice* printing output.
```python
>>> from decimal import Decimal
>>> x = Decimal('3.4')
# NO `repr`
>>> print(x)
3.4
# WITH `repr`
>>> print(repr(x))
Decimal('3.4')
>>>
```
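The difference matters most for values that look alike when printed. A short sketch with made-up values shows how `repr()` exposes stray whitespace and distinguishes a string from a number:

```python
line = 'GOOG,100\n'
name = ' IBM'

# repr() makes the trailing newline and the leading space visible
print('DEBUG:', repr(line))    # DEBUG: 'GOOG,100\n'
print('DEBUG:', repr(name))    # DEBUG: ' IBM'

# A string and a number can print identically
print('42', 42)                # 42 42
print(repr('42'), repr(42))    # '42' 42
```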
### The Python Debugger
You can manually launch the debugger inside a program.
```python
def some_function():
    ...
    breakpoint()      # Enter the debugger (Python 3.7+)
    ...
```
This starts the debugger at the `breakpoint()` call.
For earlier Python versions:
```python
import pdb
...
pdb.set_trace() # Instead of `breakpoint()`
...
```
### Run under debugger
You can also run an entire program under debugger.
```bash
bash % python3 -m pdb someprogram.py
```
It will automatically enter the debugger before the first statement, allowing you to set breakpoints and change the configuration.
Common debugger commands:
```code
(Pdb) help # Get help
(Pdb) w(here) # Print stack trace
(Pdb) d(own) # Move down one stack level
(Pdb) u(p) # Move up one stack level
(Pdb) b(reak) loc # Set a breakpoint
(Pdb) s(tep) # Execute one instruction
(Pdb) c(ontinue) # Continue execution
(Pdb) l(ist) # List source code
(Pdb) a(rgs) # Print args of current function
(Pdb) !statement # Execute statement
```
For breakpoints, the location is one of the following.
```code
(Pdb) b 45 # Line 45 in current file
(Pdb) b file.py:45 # Line 45 in file.py
(Pdb) b foo # Function foo() in current file
(Pdb) b module.foo # Function foo() in a module
```

View File

@@ -0,0 +1,7 @@
# Overview
In this section we will cover more details on:
* Packages.
* Third Party Modules.
* How to structure an application.

View File

@@ -0,0 +1,415 @@
# 9.1 Packages
This section introduces the concept of a package.
### Modules
Any Python source file is a module.
```python
# foo.py
def grok(a):
    ...

def spam(b):
    ...
```
An `import` statement loads and *executes* a module.
```python
# program.py
import foo
a = foo.grok(2)
b = foo.spam('Hello')
...
```
### Packages vs Modules
For larger collections of code, it is common to organize modules into a package.
```code
# From this
pcost.py
report.py
fileparse.py
# To this
porty/
    __init__.py
    pcost.py
    report.py
    fileparse.py
```
You pick a name and make a top-level directory. `porty` in the example above.
Add an `__init__.py` file. It may be empty.
Put your source files into it.
### Using a Package
A package serves as a namespace for imports.
This means that there are multilevel imports.
```python
import porty.report
port = porty.report.read_portfolio('port.csv')
```
There are other variations of import statements.
```python
from porty import report
port = report.read_portfolio('port.csv')
from porty.report import read_portfolio
port = read_portfolio('port.csv')
```
### Two problems
There are two main problems with this approach.
* imports between files in the same package.
* Main scripts placed inside the package.
Both break.
### Problem: Imports
Imports between files in the same package *must include the package name in the import*.
Remember the structure.
```code
porty/
    __init__.py
    pcost.py
    report.py
    fileparse.py
```
Import example.
```python
# report.py
from porty import fileparse
def read_portfolio(filename):
    return fileparse.parse_csv(...)
```
All imports are *absolute*, not relative.
```python
# report.py
import fileparse # BREAKS. fileparse not found
...
```
### Relative Imports
However, you can use `.` to refer to the current package instead of the package name.
```python
# report.py
from . import fileparse
def read_portfolio(filename):
    return fileparse.parse_csv(...)
```
Syntax:
```python
from . import modname
```
This makes it easy to rename the package.
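To make the renaming claim concrete, the sketch below builds a throwaway package in a temporary directory, imports it, renames the directory, and imports it again. The package names `demo_porty` and `demo_renamed` and the tiny module bodies are assumptions made purely for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (all names are made up for this demo)
root = tempfile.mkdtemp()
pkgdir = os.path.join(root, 'demo_porty')
os.mkdir(pkgdir)

files = {
    '__init__.py': '',
    'fileparse.py': 'def parse_csv(text):\n    return text.split(",")\n',
    # The leading dot means "this package", whatever it is called
    'report.py': ('from . import fileparse\n'
                  'def read_portfolio(text):\n'
                  '    return fileparse.parse_csv(text)\n'),
}
for name, source in files.items():
    with open(os.path.join(pkgdir, name), 'w') as f:
        f.write(source)

sys.path.insert(0, root)               # Make the package importable
report = importlib.import_module('demo_porty.report')
print(report.read_portfolio('AA,100,32.20'))

# Renaming the package directory requires no change to report.py
os.rename(pkgdir, os.path.join(root, 'demo_renamed'))
importlib.invalidate_caches()          # Forget the old directory listing
renamed = importlib.import_module('demo_renamed.report')
print(renamed.read_portfolio('IBM,50,91.10'))
```

Had `report.py` used `from demo_porty import fileparse`, the second import would fail after the rename.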
### Problem: Main Scripts
Running a submodule as a main script breaks.
```bash
bash % python3 porty/pcost.py # BREAKS
...
```
*Reason: You are running Python on a single file and Python doesn't see the rest of the package structure correctly (`sys.path` is wrong).*
All imports break.
### `__init__.py` files
The primary purpose of these files is to stitch modules together.
Example: consolidating functions
```python
# porty/__init__.py
from .pcost import portfolio_cost
from .report import portfolio_report
```
Makes names appear at the *top-level* when importing.
```python
from porty import portfolio_cost
portfolio_cost('portfolio.csv')
```
Instead of using the multilevel imports.
```python
from porty import pcost
pcost.portfolio_cost('portfolio.csv')
```
### Solution for scripts
Use `-m package.module` option.
```bash
bash % python3 -m porty.pcost portfolio.csv
```
It will run the code in a proper package environment.
There is another alternative: Write a new top-level script.
```python
#!/usr/bin/env python3
# pcost.py
import porty.pcost
import sys
porty.pcost.main(sys.argv)
```
This script lives *outside* the package.
### Application Structure
Code organization and file structure is key to the maintainability of an application.
One recommended structure is the following.
```code
porty-app/
    README.txt
    script.py         # SCRIPT
    porty/
        # LIBRARY CODE
        __init__.py
        pcost.py
        report.py
        fileparse.py
```
Top-level scripts need to exist outside the code package. One level up.
```python
#!/usr/bin/env python3
# script.py
import sys
import porty.report
porty.report.main(sys.argv)
```
## Exercises
At this point, you have a directory with several programs:
```
pcost.py # computes portfolio cost
report.py # Makes a report
ticker.py # Produce a real-time stock ticker
```
There are a variety of supporting modules with other functionality:
```
stock.py # Stock class
portfolio.py # Portfolio class
fileparse.py # CSV parsing
tableformat.py # Formatted tables
follow.py # Follow a log file
typedproperty.py # Typed class properties
```
In this exercise, we're going to clean up the code and put it into
a common package.
### (a) Making a simple package
Make a directory called `porty/` and put all of the above Python
files into it. Additionally create an empty `__init__.py` file and
put it in the directory. You should have a directory of files
like this:
```
porty/
    __init__.py
    fileparse.py
    follow.py
    pcost.py
    portfolio.py
    report.py
    stock.py
    tableformat.py
    ticker.py
    typedproperty.py
```
Remove the `__pycache__` directory that's sitting in your working directory. This
contains pre-compiled Python modules from before. We want to start
fresh.
Try importing some of the package modules:
```python
>>> import porty.report
>>> import porty.pcost
>>> import porty.ticker
```
If these imports fail, go into the appropriate file and fix the
module imports to include a package-relative import. For example,
a statement such as `import fileparse` might change to the
following:
```
# report.py
from . import fileparse
...
```
If you have a statement such as `from fileparse import parse_csv`, change
the code to the following:
```
# report.py
from .fileparse import parse_csv
...
```
### (b) Making an application directory
Putting all of your code into a "package" is often not enough for an
application. Sometimes there are supporting files, documentation,
scripts, and other things. These files need to exist OUTSIDE of the
`porty/` directory you made above.
Create a new directory called `porty-app`. Move the `porty` directory
you created in part (a) into that directory. Copy the
`Data/portfolio.csv` and `Data/prices.csv` test files into this
directory. Additionally create a `README.txt` file with some
information about yourself. Your code should now be organized as
follows:
```
porty-app/
    portfolio.csv
    prices.csv
    README.txt
    porty/
        __init__.py
        fileparse.py
        follow.py
        pcost.py
        portfolio.py
        report.py
        stock.py
        tableformat.py
        ticker.py
        typedproperty.py
```
To run your code, you need to make sure you are working in the top-level `porty-app/`
directory. For example, from the terminal:
```python
shell % cd porty-app
shell % python3
>>> import porty.report
>>>
```
Try running some of your prior scripts as a main program:
```python
shell % cd porty-app
shell % python3 -m porty.report portfolio.csv prices.csv txt
Name Shares Price Change
---------- ---------- ---------- ----------
AA 100 9.22 -22.98
IBM 50 106.28 15.18
CAT 150 35.46 -47.98
MSFT 200 20.89 -30.34
GE 95 13.48 -26.89
MSFT 50 20.89 -44.21
IBM 100 106.28 35.84
shell %
```
### (c) Top-level Scripts
Using the `python -m` command is often a bit weird. You may want to
write a top level script that simply deals with the oddities of packages.
Create a script `print-report.py` that produces the above report:
```python
#!/usr/bin/env python3
# print-report.py
import sys
from porty.report import main
main(sys.argv)
```
Put this script in the top-level `porty-app/` directory. Make sure you
can run it in that location:
```
shell % cd porty-app
shell % python3 print-report.py portfolio.csv prices.csv txt
Name Shares Price Change
---------- ---------- ---------- ----------
AA 100 9.22 -22.98
IBM 50 106.28 15.18
CAT 150 35.46 -47.98
MSFT 200 20.89 -30.34
GE 95 13.48 -26.89
MSFT 50 20.89 -44.21
IBM 100 106.28 35.84
shell %
```
Your final code should now be structured something like this:
```
porty-app/
    portfolio.csv
    prices.csv
    print-report.py
    README.txt
    porty/
        __init__.py
        fileparse.py
        follow.py
        pcost.py
        portfolio.py
        report.py
        stock.py
        tableformat.py
        ticker.py
        typedproperty.py
```
[Next](02_Third_party)

View File

@@ -0,0 +1,45 @@
# 9.2 Third Party Modules
### Introduction
Python has a large library of built-in modules (*batteries included*).
There are even more third party modules. Check them in the [Python Package Index](https://pypi.org/) (PyPI), or just do a Google search for a topic.
### Some Notable Modules
* `requests`: Accessing web services.
* `numpy`, `scipy`: Arrays and vector mathematics.
* `pandas`: Stats and data analysis.
* `django`, `flask`: Web programming.
* `sqlalchemy`: Databases and ORM.
* `ipython`: Alternative interactive shell.
### Installing Modules
Most common classic technique: `pip`.
```bash
bash % python3 -m pip install packagename
```
This command will download the package and install it globally in your Python folder. Somewhere like:
```code
/usr/local/lib/python3.6/site-packages
```
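The exact location varies by platform, Python version, and installation. A quick way to check it for the interpreter you are running, sketched with the standard `sysconfig` module:

```python
import sys
import sysconfig

# Directory where pure-Python packages are installed for this interpreter
print(sysconfig.get_paths()['purelib'])

# The interpreter that owns that directory
print(sys.executable)
```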
### Problems
* You may be using an installation of Python that you don't directly control.
  * A corporate approved installation
  * The Python version that comes with the OS.
* You might not have permission to install global packages on the computer.
* Your program might have unusual dependencies.
### Talk about environments...
## Exercises
(rewrite)

3
Notes/Contents.md Normal file
View File

@@ -0,0 +1,3 @@
# Table of Contents
This is contents