3.6 Design Discussion

In this section we consider some design decisions made in code so far.

Filenames versus Iterables

Compare these two programs that return the same output.

# Provide a filename
def read_data(filename):
    records = []
    with open(filename) as f:
        for line in f:
            ...
            records.append(r)
    return records

d = read_data('file.csv')

# Provide lines
def read_data(lines):
    records = []
    for line in lines:
        ...
        records.append(r)
    return records

with open('file.csv') as f:
    d = read_data(f)

Which of these functions do you prefer? Why?
Which of these functions is more flexible?

Deep Idea: "Duck Typing"

Duck Typing is a computer programming concept to determine whether an object can be used for a particular purpose. It is an application of the duck test.

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

In our previous example that reads the lines, our read_data expects any iterable object. Not just the lines of a file.

def read_data(lines):
    records = []
    for line in lines:
        ...
        records.append(r)
    return records

This means that we can use it with other lines.

# A CSV file
lines = open('data.csv')
data = read_data(lines)

# A zipped file
lines = gzip.open('data.csv.gz','rt')
data = read_data(lines)

# The Standard Input
lines = sys.stdin
data = read_data(lines)

# A list of strings
lines = ['ACME,50,91.1','IBM,75,123.45', ... ]
data = read_data(lines)

There is considerable flexibility with this design.

Question: Shall we embrace or fight this flexibility?

Library Design Best Practices

Code libraries are often better served by embracing flexibility. Don't restrict your options. With great flexibility comes great power.

Exercise

(a)From filenames to file-like objects

In this section, you worked on a file fileparse.py that contained a function parse_csv(). The function worked like this:

>>> import fileparse
>>> portfolio = fileparse.parse_csv('Data/portfolio.csv', types=[str,int,float])
>>>

Right now, the function expects to be passed a filename. However, you can make the code more flexible. Modify the function so that it works with any file-like/iterable object. For example:

>>> import fileparse
>>> import gzip
>>> with gzip.open('Data/portfolio.csv.gz', 'rt') as f:
...      port = fileparse.parse_csv(f, types=[str,int,float])
...
>>> lines = ['name,shares,price', 'AA,34.23,100', 'IBM,50,91.1', 'HPE,75,45.1']
>>> port = fileparse.parse_csv(lines, types=[str,int,float])
>>>

In this new code, what happens if you pass a filename as before?

>>> port = fileparse.parse_csv('Data/portfolio.csv', types=[str,int,float])
>>> port
... look at output (it should be crazy) ...
>>>

With flexibility comes power and with power comes responsibility. Sometimes you'll need to be careful.

(b) Fixing existing functions

Fix the read_portfolio() and read_prices() functions in the report.py file so that they work with the modified version of parse_csv(). This should only involve a minor modification. Afterwards, your report.py and pcost.py programs should work the same way they always did.

3.3 KiB Raw Blame History