diff --git a/Notes/02_Working_with_data/01_Datatypes.md b/Notes/02_Working_with_data/01_Datatypes.md index 93f5fca..c07bfd3 100644 --- a/Notes/02_Working_with_data/01_Datatypes.md +++ b/Notes/02_Working_with_data/01_Datatypes.md @@ -1,5 +1,9 @@ +[Contents](../Contents) \| [Previous (1.6 Files)](../01_Introduction/06_Files) \| [Next (2.2 Containers)](02_Containers) + # 2.1 Datatypes and Data structures +This section introduces data structures in the form of tuples and dictionaries. + ### Primitive Datatypes Python has a few primitive types of data: @@ -16,7 +20,8 @@ We learned about these in the introduction. email_address = None ``` -This type is often used as a placeholder for optional or missing value. +`None` is often used as a placeholder for optional or missing value. It +evaluates as `False` in conditionals. ```python if email_address: @@ -25,8 +30,7 @@ if email_address: ### Data Structures -Real programs have more complex data than the ones that can be easily represented by the datatypes learned so far. -For example information about a stock: +Real programs have more complex data. For example information about a stock holding: ```code 100 shares of GOOG at $490.10 @@ -61,7 +65,7 @@ t = () # An empty tuple w = ('GOOG', ) # A 1-item tuple ``` -Tuples are usually used to represent *simple* records or structures. +Tuples are often used to represent *simple* records or structures. Typically, it is a single *object* of multiple parts. A good analogy: *A tuple is like a single row in a database table.* Tuple contents are ordered (like an array). @@ -73,9 +77,9 @@ shares = s[1] # 100 price = s[2] # 490.1 ``` -However, th contents can't be modified. +However, the contents can't be modified. -```pycon +```python >>> s[1] = 75 TypeError: object does not support item assignment ``` @@ -88,7 +92,7 @@ s = (s[0], 75, s[2]) ### Tuple Packing -Tuples are focused more on packing related items together into a single *entity*. 
+Tuples are more about packing related items together into a single *entity*. ```python s = ('GOOG', 100, 490.1) @@ -105,7 +109,7 @@ name, shares, price = s print('Cost', shares * price) ``` -The number of variables must match the tuple structure. +The number of variables on the left must match the tuple structure. ```python name, shares = s # ERROR @@ -116,19 +120,20 @@ ValueError: too many values to unpack ### Tuples vs. Lists -Tuples are NOT just read-only lists. Tuples are most ofter used for a *single item* consisting of multiple parts. -Lists are usually a collection of distinct items, usually all of the same type. +Tuples look like read-only lists. However, tuples are most often used +for a *single item* consisting of multiple parts. Lists are usually a +collection of distinct items, usually all of the same type. ```python -record = ('GOOG', 100, 490.1) # A tuple representing a stock in a portfolio +record = ('GOOG', 100, 490.1) # A tuple representing a record in a portfolio symbols = [ 'GOOG', 'AAPL', 'IBM' ] # A List representing three stock symbols ``` ### Dictionaries -A dictionary is a hash table or associative array. -It is a collection of values indexed by *keys*. These keys serve as field names. +A dictionary is a mapping of keys to values. It's also sometimes called a hash table or +associative array. The keys serve as indices for accessing values. ```python s = { @@ -140,9 +145,9 @@ s = { ### Common operations -To read values from a dictionary use the key names. +To get values from a dictionary use the key names. -```pycon +```python >>> print(s['name'], s['shares']) GOOG 100 >>> s['price'] @@ -152,7 +157,7 @@ GOOG 100 To add or modify values assign using the key names. -```pycon +```python >>> s['shares'] = 75 >>> s['date'] = '6/6/2007' >>> @@ -160,7 +165,7 @@ To add or modify values assign using the key names. To delete a value use the `del` statement. 
-```pycon +```python >>> del s['date'] >>> ``` @@ -178,11 +183,11 @@ s[2] ## Exercises -### Note +In the last few exercises, you wrote a program that read a datafile +`Data/portfolio.csv`. Using the `csv` module, it is easy to read the +file row-by-row. -In the last few exercises, you wrote a program that read a datafile `Data/portfolio.csv`. Using the `csv` module, it is easy to read the file row-by-row. - -```pycon +```python >>> import csv >>> f = open('Data/portfolio.csv') >>> rows = csv.reader(f) @@ -194,11 +199,13 @@ In the last few exercises, you wrote a program that read a datafile `Data/portfo >>> ``` -Although reading the file is easy, you often want to do more with the data than read it. -For instance, perhaps you want to store it and start performing some calculations on it. -Unfortunately, a raw "row" of data doesn’t give you enough to work with. For example, even a simple math calculation doesn’t work: +Although reading the file is easy, you often want to do more with the +data than read it. For instance, perhaps you want to store it and +start performing some calculations on it. Unfortunately, a raw "row" +of data doesn’t give you enough to work with. For example, even a +simple math calculation doesn’t work: -```pycon +```python >>> row = ['AA', '100', '32.20'] >>> cost = row[1] * row[2] Traceback (most recent call last): @@ -207,8 +214,9 @@ TypeError: can't multiply sequence by non-int of type 'str' >>> ``` -To do more, you typically want to interpret the raw data in some way and turn it into a more useful kind of object so that you can work with it later. -Two simple options are tuples or dictionaries. +To do more, you typically want to interpret the raw data in some way +and turn it into a more useful kind of object so that you can work +with it later. Two simple options are tuples or dictionaries. 
### Exercise 2.1: Tuples @@ -216,16 +224,17 @@ At the interactive prompt, create the following tuple that represents the above row, but with the numeric columns converted to proper numbers: -```pycon +```python >>> t = (row[0], int(row[1]), float(row[2])) >>> t ('AA', 100, 32.2) >>> ``` -Using this, you can now calculate the total cost by multiplying the shares and the price: +Using this, you can now calculate the total cost by multiplying the +shares and the price: -```pycon +```python >>> cost = t[1] * t[2] >>> cost 3220.0000000000005 @@ -244,15 +253,16 @@ surprising if you haven’t seen it before. This happens in all programming languages that use floating point decimals, but it often gets hidden when printing. For example: -```pycon +```python >>> print(f'{cost:0.2f}') 3220.00 >>> ``` -Tuples are read-only. Verify this by trying to change the number of shares to 75. +Tuples are read-only. Verify this by trying to change the number of +shares to 75. -```pycon +```python >>> t[1] = 75 Traceback (most recent call last): File "", line 1, in @@ -260,9 +270,10 @@ TypeError: 'tuple' object does not support item assignment >>> ``` -Although you can’t change tuple contents, you can always create a completely new tuple that replaces the old one. +Although you can’t change tuple contents, you can always create a +completely new tuple that replaces the old one. -```pycon +```python >>> t = (t[0], 75, t[2]) >>> t ('AA', 75, 32.2) @@ -274,9 +285,10 @@ value is discarded. Although the above assignment might look like you are modifying the tuple, you are actually creating a new tuple and throwing the old one away. -Tuples are often used to pack and unpack values into variables. Try the following: +Tuples are often used to pack and unpack values into variables. Try +the following: -```pycon +```python >>> name, shares, price = t >>> name 'AA' @@ -289,7 +301,7 @@ Tuples are often used to pack and unpack values into variables. 
Try the followin Take the above variables and pack them back into a tuple -```pycon +```python >>> t = (name, 2*shares, price) >>> t ('AA', 150, 32.2) @@ -300,7 +312,7 @@ Take the above variables and pack them back into a tuple An alternative to a tuple is to create a dictionary instead. -```pycon +```python >>> d = { 'name' : row[0], 'shares' : int(row[1]), @@ -313,25 +325,27 @@ An alternative to a tuple is to create a dictionary instead. Calculate the total cost of this holding: -```pycon +```python >>> cost = d['shares'] * d['price'] >>> cost 3220.0000000000005 >>> ``` -Compare this example with the same calculation involving tuples above. Change the number of shares to 75. +Compare this example with the same calculation involving tuples +above. Change the number of shares to 75. -```pycon +```python >>> d['shares'] = 75 >>> d {'name': 'AA', 'shares': 75, 'price': 75} >>> ``` -Unlike tuples, dictionaries can be freely modified. Add some attributes: +Unlike tuples, dictionaries can be freely modified. Add some +attributes: -```pycon +```python >>> d['date'] = (6, 11, 2007) >>> d['account'] = 12345 >>> d @@ -343,15 +357,16 @@ Unlike tuples, dictionaries can be freely modified. 
Add some attributes: If you turn a dictionary into a list, you’ll get all of its keys: -```pycon +```python >>> list(d) ['name', 'shares', 'price', 'date', 'account'] >>> ``` -Similarly, if you use the `for` statement to iterate on a dictionary, you will get the keys: +Similarly, if you use the `for` statement to iterate on a dictionary, +you will get the keys: -```pycon +```python >>> for k in d: print('k =', k) @@ -365,7 +380,7 @@ k = account Try this variant that performs a lookup at the same time: -```pycon +```python >>> for k in d: print(k, '=', d[k]) @@ -379,7 +394,7 @@ account = 12345 You can also obtain all of the keys using the `keys()` method: -```pycon +```python >>> keys = d.keys() >>> keys dict_keys(['name', 'shares', 'price', 'date', 'account']) @@ -388,20 +403,24 @@ dict_keys(['name', 'shares', 'price', 'date', 'account']) `keys()` is a bit unusual in that it returns a special `dict_keys` object. -This is an overlay on the original dictionary that always gives you the current keys—even if the dictionary changes. For example, try this: +This is an overlay on the original dictionary that always gives you +the current keys—even if the dictionary changes. For example, try +this: -```pycon +```python >>> del d['account'] >>> keys dict_keys(['name', 'shares', 'price', 'date']) >>> ``` -Carefully notice that the `'account'` disappeared from `keys` even though you didn’t call `d.keys()` again. +Carefully notice that the `'account'` disappeared from `keys` even +though you didn’t call `d.keys()` again. -A more elegant way to work with keys and values together is to use the `items()` method. This gives you `(key, value)` tuples: +A more elegant way to work with keys and values together is to use the +`items()` method. 
This gives you `(key, value)` tuples: -```pycon +```python >>> items = d.items() >>> items dict_items([('name', 'AA'), ('shares', 75), ('price', 32.2), ('date', (6, 11, 2007))]) @@ -415,9 +434,10 @@ date = (6, 11, 2007) >>> ``` -If you have tuples such as `items`, you can create a dictionary using the `dict()` function. Try it: +If you have tuples such as `items`, you can create a dictionary using +the `dict()` function. Try it: -```pycon +```python >>> items dict_items([('name', 'AA'), ('shares', 75), ('price', 32.2), ('date', (6, 11, 2007))]) >>> d = dict(items) diff --git a/Notes/02_Working_with_data/02_Containers.md b/Notes/02_Working_with_data/02_Containers.md index 5d04d36..99d38d6 100644 --- a/Notes/02_Working_with_data/02_Containers.md +++ b/Notes/02_Working_with_data/02_Containers.md @@ -1,5 +1,9 @@ +[Contents](../Contents) \| [Previous (2.1 Datatypes)](01_Datatypes) \| [Next (2.3 Formatting)](03_Formatting) + # 2.2 Containers +This section discusses lists, dictionaries, and sets. + ### Overview Programs often have to work with many objects. @@ -11,11 +15,11 @@ There are three main choices to use. * Lists. Ordered data. * Dictionaries. Unordered data. -* Sets. Unordered collection +* Sets. Unordered collection of unique items. ### Lists as a Container -Use a list when the order of the data matters. Remember that lists can hold any kind of objects. +Use a list when the order of the data matters. Remember that lists can hold any kind of object. For example, a list of tuples. ```python @@ -47,7 +51,7 @@ An example when reading records from a file. ```python records = [] # Initial empty list -with open('portfolio.csv', 'rt') as f: +with open('Data/portfolio.csv', 'rt') as f: for line in f: row = line.split(',') records.append((row[0], int(row[1])), float(row[2])) @@ -69,7 +73,7 @@ prices = { Here are some simple lookups: -```pycon +```python >>> prices['IBM'] 93.37 >>> prices['GOOG'] @@ -95,7 +99,7 @@ An example populating the dict from the contents of a file. 
```python prices = {} # Initial empty dict -with open('prices.csv', 'rt') as f: +with open('Data/prices.csv', 'rt') as f: for line in f: row = line.split(',') prices[row[0]] = float(row[1]) @@ -143,12 +147,13 @@ holidays = { Then to access: -```pycon ->>> holidays[3, 14] 'Pi day' +```python +>>> holidays[3, 14] +'Pi day' >>> ``` -*Neither a list nor another dictionary can serve as a dictionary key, because lists and dictionaries are mutable.* +*Neither a list, a set, nor another dictionary can serve as a dictionary key, because lists, sets, and dictionaries are mutable.* ### Sets @@ -162,7 +167,7 @@ tech_stocks = set(['IBM', 'AAPL', 'MSFT']) Sets are useful for membership tests. -```pycon +```python >>> tech_stocks set(['AAPL', 'IBM', 'MSFT']) >>> 'IBM' in tech_stocks @@ -194,6 +199,9 @@ s1 - s2 # Set difference ## Exercises +In these exercises, you start building one of the major programs used +for the rest of this course. Do your work in the file `Work/report.py`. + ### Exercise 2.4: A list of tuples The file `Data/portfolio.csv` contains a list of stocks in a @@ -227,7 +235,8 @@ that file, define a function `read_portfolio(filename)` that opens a given portfolio file and reads it into a list of tuples. To do this, you’re going to make a few minor modifications to the above code. -First, instead of defining `total_cost = 0`, you’ll make a variable that’s initially set to an empty list. For example: +First, instead of defining `total_cost = 0`, you’ll make a variable +that’s initially set to an empty list. 
For example: ```python portfolio = [] @@ -251,7 +260,7 @@ interpreter): *Hint: Use `-i` when executing the file in the terminal* -```pycon +```python >>> portfolio = read_portfolio('Data/portfolio.csv') >>> portfolio [('AA', 100, 32.2), ('IBM', 50, 91.1), ('CAT', 150, 83.44), ('MSFT', 200, 51.23), @@ -291,14 +300,15 @@ That said, you can also rewrite the last for-loop using a statement like this: ### Exercise 2.5: List of Dictionaries -Take the function you wrote in part (a) and modify to represent each +Take the function you wrote in Exercise 2.4 and modify to represent each stock in the portfolio with a dictionary instead of a tuple. In this dictionary use the field names of "name", "shares", and "price" to represent the different columns in the input file. -Experiment with this new function in the same manner as you did in Exercise 2.4. +Experiment with this new function in the same manner as you did in +Exercise 2.4. -```pycon +```python >>> portfolio = read_portfolio('portfolio.csv') >>> portfolio [{'name': 'AA', 'shares': 100, 'price': 32.2}, {'name': 'IBM', 'shares': 50, 'price': 91.1}, @@ -327,7 +337,7 @@ often preferred because the resulting code is easier to read later. Viewing large dictionaries and lists can be messy. To clean up the output for debugging, considering using the `pprint` function. -```pycon +```python >>> from pprint import pprint >>> pprint(portfolio) [{'name': 'AA', 'price': 32.2, 'shares': 100}, @@ -346,7 +356,7 @@ A dictionary is a useful way to keep track of items where you want to look up items using an index other than an integer. In the Python shell, try playing with a dictionary: -```pycon +```python >>> prices = { } >>> prices['IBM'] = 92.45 >>> prices['MSFT'] = 45.12 @@ -388,7 +398,7 @@ A few little tips that you’ll need for this part. First, make sure you use the `csv` module just as you did before—there’s no need to reinvent the wheel here. 
-```pycon +```python >>> import csv >>> f = open('Data/prices.csv', 'r') >>> rows = csv.reader(f) @@ -409,7 +419,8 @@ an empty list—meaning no data was present on that line. There’s a possibility that this could cause your program to die with an exception. Use the `try` and `except` statements to catch this as -appropriate. +appropriate. Thought: would it be better to guard against bad data with +an `if`-statement instead? Once you have written your `read_prices()` function, test it interactively to make sure it works: @@ -425,10 +436,11 @@ interactively to make sure it works: ### Exercise 2.7: Finding out if you can retire -Tie all of this work together by adding the statements to your -`report.py` program. It takes the list of stocks in Exercise 2.5 and -the dictionary of prices in Exercise 2.6 and computes the current -value of the portfolio along with the gain/loss. +Tie all of this work together by adding a few additional statements to +your `report.py` program that compute gain/loss. These statements +should take the list of stocks in Exercise 2.5 and the dictionary of +prices in Exercise 2.6 and computes the current value of the portfolio +along with the gain/loss. [Contents](../Contents) \| [Previous (2.1 Datatypes)](01_Datatypes) \| [Next (2.3 Formatting)](03_Formatting) diff --git a/Notes/02_Working_with_data/03_Formatting.md b/Notes/02_Working_with_data/03_Formatting.md index 97ab2cd..a468ac1 100644 --- a/Notes/02_Working_with_data/03_Formatting.md +++ b/Notes/02_Working_with_data/03_Formatting.md @@ -1,7 +1,9 @@ +[Contents](../Contents) \| [Previous (2.2 Containers)](02_Containers) \| [Next (2.4 Sequences)](04_Sequences) + # 2.3 Formatting -This is a slight digression, but when you work with data, you often want to -produce structured output (tables, etc.). For example: +This section is a slight digression, but when you work with data, you +often want to produce structured output (tables, etc.). 
For example: ```code Name Shares Price @@ -61,7 +63,7 @@ Common modifiers adjust the field width and decimal precision. This is a partia ### Dictionary Formatting -You can use the `format_map()` method on strings. +You can use the `format_map()` method to apply string formatting to a dictionary of values: ```python >>> s = { @@ -74,7 +76,23 @@ You can use the `format_map()` method on strings. >>> ``` -It uses the same `f-strings` but takes the values from the supplied dictionary. +It uses the same codes as `f-strings` but takes the values from the +supplied dictionary. + +### format() method + +There is a method `format()` that can apply formatting to arguments or +keyword arguments. + +```python +>>> '{name:>10s} {shares:10d} {price:10.2f}'.format(name='IBM', shares=100, price=91.1) +' IBM 100 91.10' +>>> '{:10s} {:10d} {:10.2f}'.format('IBM', 100, 91.1) +' IBM 100 91.10' +>>> +``` + +Frankly, `format()` is a bit verbose. I prefer f-strings. ### C-Style Formatting @@ -89,7 +107,8 @@ You can also use the formatting operator `%`. '3.14' ``` -This requires a single item or a tuple on the right. Format codes are modeled after the C `printf()` as well. +This requires a single item or a tuple on the right. Format codes are +modeled after the C `printf()` as well. *Note: This is the only formatting available on byte strings.* @@ -101,25 +120,6 @@ b'Dave has 37 messages' ## Exercises -In Exercise 2.7, you wrote a program called `report.py` that computed the gain/loss of a -stock portfolio. In this exercise, you're going to modify it to produce a table like this: - -```code - Name Shares Price Change - ---------- ---------- ---------- ---------- - AA 100 9.22 -22.98 - IBM 50 106.28 15.18 - CAT 150 35.46 -47.98 - MSFT 200 20.89 -30.34 - GE 95 13.48 -26.89 - MSFT 50 20.89 -44.21 - IBM 100 106.28 35.84 -``` - -In this report, "Price" is the current share price of the stock and -"Change" is the change in the share price from the initial purchase -price. 
- ### Exercise 2.8: How to format numbers A common problem with printing numbers is specifying the number of @@ -145,7 +145,7 @@ Full documentation on the formatting codes used f-strings can be found [here](https://docs.python.org/3/library/string.html#format-specification-mini-language). Formatting is also sometimes performed using the `%` operator of strings. -```pycon +```python >>> print('%0.4f' % value) 42863.1000 >>> print('%16.2f' % value) @@ -159,7 +159,7 @@ Documentation on various codes used with `%` can be found Although it’s commonly used with `print`, string formatting is not tied to printing. If you want to save a formatted string. Just assign it to a variable. -```pycon +```python >>> f = '%0.4f' % value >>> f '42863.1000' @@ -168,6 +168,26 @@ If you want to save a formatted string. Just assign it to a variable. ### Exercise 2.9: Collecting Data +In Exercise 2.7, you wrote a program called `report.py` that computed the gain/loss of a +stock portfolio. In this exercise, you're going to start modifying it to produce a table like this: + +``` + Name Shares Price Change +---------- ---------- ---------- ---------- + AA 100 9.22 -22.98 + IBM 50 106.28 15.18 + CAT 150 35.46 -47.98 + MSFT 200 20.89 -30.34 + GE 95 13.48 -26.89 + MSFT 50 20.89 -44.21 + IBM 100 106.28 35.84 +``` + +In this report, "Price" is the current share price of the stock and +"Change" is the change in the share price from the initial purchase +price. + + In order to generate the above report, you’ll first want to collect all of the data shown in the table. Write a function `make_report()` that takes a list of stocks and dictionary of prices as input and @@ -176,7 +196,7 @@ returns a list of tuples containing the rows of the above table. Add this function to your `report.py` file. 
Here’s how it should work if you try it interactively: -```pycon +```python >>> portfolio = read_portfolio('Data/portfolio.csv') >>> prices = read_prices('Data/prices.csv') >>> report = make_report(portfolio, prices) @@ -197,7 +217,7 @@ if you try it interactively: Redo the for-loop in Exercise 2.9, but change the print statement to format the tuples. -```pycon +```python >>> for r in report: print('%10s %10d %10.2f %10.2f' % r) @@ -211,7 +231,7 @@ format the tuples. You can also expand the values and use f-strings. For example: -```pycon +```python >>> for name, shares, price, change in report: print(f'{name:>10s} {shares:>10d} {price:>10.2f} {change:>10.2f}') @@ -251,32 +271,32 @@ This string is just a bunch of "-" characters under each field name. For example When you’re done, your program should produce the table shown at the top of this exercise. -```code - Name Shares Price Change - ---------- ---------- ---------- ---------- - AA 100 9.22 -22.98 - IBM 50 106.28 15.18 - CAT 150 35.46 -47.98 - MSFT 200 20.89 -30.34 - GE 95 13.48 -26.89 - MSFT 50 20.89 -44.21 - IBM 100 106.28 35.84 +``` + Name Shares Price Change +---------- ---------- ---------- ---------- + AA 100 9.22 -22.98 + IBM 50 106.28 15.18 + CAT 150 35.46 -47.98 + MSFT 200 20.89 -30.34 + GE 95 13.48 -26.89 + MSFT 50 20.89 -44.21 + IBM 100 106.28 35.84 ``` ### Exercise 2.12: Formatting Challenge How would you modify your code so that the price includes the currency symbol ($) and the output looks like this: -```code - Name Shares Price Change - ---------- ---------- ---------- ---------- - AA 100 $9.22 -22.98 - IBM 50 $106.28 15.18 - CAT 150 $35.46 -47.98 - MSFT 200 $20.89 -30.34 - GE 95 $13.48 -26.89 - MSFT 50 $20.89 -44.21 - IBM 100 $106.28 35.84 +``` + Name Shares Price Change +---------- ---------- ---------- ---------- + AA 100 $9.22 -22.98 + IBM 50 $106.28 15.18 + CAT 150 $35.46 -47.98 + MSFT 200 $20.89 -30.34 + GE 95 $13.48 -26.89 + MSFT 50 $20.89 -44.21 + IBM 100 $106.28 35.84 ``` 
[Contents](../Contents) \| [Previous (2.2 Containers)](02_Containers) \| [Next (2.4 Sequences)](04_Sequences) diff --git a/Notes/02_Working_with_data/04_Sequences.md b/Notes/02_Working_with_data/04_Sequences.md index cb53e8b..df2dd3c 100644 --- a/Notes/02_Working_with_data/04_Sequences.md +++ b/Notes/02_Working_with_data/04_Sequences.md @@ -1,14 +1,16 @@ +[Contents](../Contents) \| [Previous (2.3 Formatting)](03_Formatting) \| [Next (2.5 Collections)](05_Collections) + # 2.4 Sequences ### Sequence Datatypes Python has three *sequence* datatypes. -* String: `'Hello'`. A string is considered a sequence of characters. +* String: `'Hello'`. A string is a sequence of characters. * List: `[1, 4, 5]`. * Tuple: `('GOOG', 100, 490.1)`. -All sequences are ordered and have length. +All sequences are ordered, indexed by integers, and have a length. ```python a = 'Hello' # String @@ -28,7 +30,7 @@ len(c) # 3 Sequences can be replicated: `s * n`. -```pycon +```python >>> a = 'Hello' >>> a * 3 'HelloHelloHello' @@ -40,7 +42,7 @@ Sequences can be replicated: `s * n`. Sequences of the same type can be concatenated: `s + t`. -```pycon +```python >>> a = (1, 2, 3) >>> b = (4, 5) >>> a + b @@ -56,7 +58,7 @@ TypeError: can only concatenate tuple (not "list") to tuple ### Slicing Slicing means to take a subsequence from a sequence. -The syntax used is `s[start:end]`. Where `start` and `end` are the indexes of the subsequence you want. +The syntax is `s[start:end]`. Where `start` and `end` are the indexes of the subsequence you want. ```python a = [0,1,2,3,4,5,6,7,8] @@ -67,12 +69,12 @@ a[:3] # [0,1,2] ``` * Indices `start` and `end` must be integers. -* Slices do *not* include the end value. +* Slices do *not* include the end value. It is like a half-open interval from math. * If indices are omitted, they default to the beginning or end of the list. ### Slice re-assignment -Slices can also be reassigned and deleted. +On lists, slices can be reassigned and deleted. 
```python # Reassignment @@ -90,9 +92,9 @@ del a[2:4] # [0,1,4,5,6,7,8] ### Sequence Reductions -There are some functions to reduce a sequence to a single value. +There are some common functions to reduce a sequence to a single value. -```pycon +```python >>> s = [1, 2, 3, 4] >>> sum(s) 10 @@ -106,9 +108,9 @@ There are some functions to reduce a sequence to a single value. ### Iteration over a sequence -The for-loop iterates over the elements in the sequence. +The for-loop iterates over the elements in a sequence. -```pycon +```python >>> s = [1, 4, 9, 16] >>> for i in s: ... print(i) @@ -121,7 +123,7 @@ The for-loop iterates over the elements in the sequence. ``` On each iteration of the loop, you get a new item to work with. -This new value is placed into an iteration variable. In this example, the +This new value is placed into the iteration variable. In this example, the iteration variable is `x`: ```python @@ -129,12 +131,12 @@ for x in s: # `x` is an iteration variable ...statements ``` -In each iteration, it overwrites the previous value (if any). +On each iteration, the previous value of the iteration variable is overwritten (if any). After the loop finishes, the variable retains the last value. -### `break` statement +### break statement -You can use the `break` statement to break out of a loop before it finishes iterating all of the elements. +You can use the `break` statement to break out of a loop early. ```python for name in namelist: @@ -145,14 +147,14 @@ for name in namelist: statements ``` -When the `break` statement is executed, it will exit the loop and move +When the `break` statement executes, it exits the loop and moves on the next `statements`. The `break` statement only applies to the inner-most loop. If this loop is within another loop, it will not break the outer loop. -### `continue` statement +### continue statement -To skip one element and move to the next one you use the `continue` statement. 
+To skip one element and move to the next one, use the `continue` statement. ```python for line in lines: @@ -188,10 +190,11 @@ for k in range(10,50,2): * The ending value is never included. It mirrors the behavior of slices. * `start` is optional. Default `0`. * `step` is optional. Default `1`. +* `range()` computes values as needed. It does not actually store a large range of numbers. -### `enumerate()` function +### enumerate() function -The `enumerate` function provides a loop with an extra counter value. +The `enumerate` function adds an extra counter value to iteration. ```python names = ['Elwood', 'Jake', 'Curtis'] @@ -201,7 +204,7 @@ for i, name in enumerate(names): # i = 2, name = 'Curtis' ``` -How to use enumerate: `enumerate(sequence [, start = 0])`. `start` is optional. +The general form is `enumerate(sequence [, start = 0])`. `start` is optional. A good example of using `enumerate()` is tracking line numbers while reading a file: ```python @@ -223,7 +226,7 @@ Using `enumerate` is less typing and runs slightly faster. ### For and tuples -You can loop with multiple iteration variables. +You can iterate with multiple iteration variables. ```python points = [ @@ -236,11 +239,12 @@ for x, y in points: # ... ``` -When using multiple variables, each tuple will be *unpacked* into a set of iteration variables. +When using multiple variables, each tuple is *unpacked* into a set of iteration variables. +The number of variables must match the number of items in each tuple. -### `zip()` function +### zip() function -The `zip` function takes sequences and makes an iterator that combines them. +The `zip` function takes multiple sequences and makes an iterator that combines them. ```python columns = ['name', 'shares', 'price'] @@ -268,7 +272,7 @@ d = dict(zip(columns, values)) Try some basic counting examples: -```pycon +```python >>> for n in range(10): # Count 0 ... 
9 print(n, end=' ') @@ -288,7 +292,7 @@ Try some basic counting examples: Interactively experiment with some of the sequence reduction operations. -```pycon +```python >>> data = [4, 9, 1, 25, 16, 100, 49] >>> min(data) 1 @@ -301,7 +305,7 @@ Interactively experiment with some of the sequence reduction operations. Try looping over the data. -```pycon +```python >>> for x in data: print(x) @@ -322,7 +326,7 @@ Sometimes the `for` statement, `len()`, and `range()` get used by novices in some kind of horrible code fragment that looks like it emerged from the depths of a rusty C program. -```pycon +```python >>> for n in range(len(data)): print(data[n]) @@ -338,10 +342,10 @@ it’s inefficient with memory and it runs a lot slower. Just use a normal `for` loop if you want to iterate over data. Use `enumerate()` if you happen to need the index for some reason. -### Exercise 2.15: A practical `enumerate()` example +### Exercise 2.15: A practical enumerate() example Recall that the file `Data/missing.csv` contains data for a stock -portfolio, but has some rows with missing data. Using `enumerate()` +portfolio, but has some rows with missing data. Using `enumerate()`, modify your `pcost.py` program so that it prints a line number with the warning message when it encounters bad input. @@ -352,7 +356,7 @@ Row 7: Couldn't convert: ['IBM', '', '70.44'] >>> ``` -To do this, you’ll need to change just a few parts of your code. +To do this, you’ll need to change a few parts of your code. ```python ... @@ -363,12 +367,12 @@ for rowno, row in enumerate(rows, start=1): print(f'Row {rowno}: Bad row: {row}') ``` -### Exercise 2.16: Using the `zip()` function +### Exercise 2.16: Using the zip() function -In the file `portfolio.csv`, the first line contains column +In the file `Data/portfolio.csv`, the first line contains column headers. In all previous code, we’ve been discarding them. 
-```pycon +```python >>> f = open('Data/portfolio.csv') >>> rows = csv.reader(f) >>> headers = next(rows) @@ -381,7 +385,7 @@ However, what if you could use the headers for something useful? This is where the `zip()` function enters the picture. First try this to pair the file headers with a row of data: -```pycon +```python >>> row = next(rows) >>> row ['AA', '100', '32.20'] @@ -395,10 +399,10 @@ We’ve used `list()` here to turn the result into a list so that you can see it. Normally, `zip()` creates an iterator that must be consumed by a for-loop. -This pairing is just an intermediate step to building a +This pairing is an intermediate step to building a dictionary. Now try this: -```pycon +```python >>> record = dict(zip(headers, row)) >>> record {'price': '32.20', 'name': 'AA', 'shares': '100'} @@ -462,14 +466,14 @@ out of it. As long as the file has the required columns, the code will work. Modify the `report.py` program you wrote in Section 2.3 that it uses the same technique to pick out column headers. -Try running the `report.py` program on the `Data/portfoliodate.csv` file and see that it -produces the same answer as before. +Try running the `report.py` program on the `Data/portfoliodate.csv` +file and see that it produces the same answer as before. ### Exercise 2.17: Inverting a dictionary A dictionary maps keys to values. For example, a dictionary of stock prices. -```pycon +```python >>> prices = { 'GOOG' : 490.1, 'AA' : 23.45, @@ -481,7 +485,7 @@ A dictionary maps keys to values. For example, a dictionary of stock prices. If you use the `items()` method, you can get `(key,value)` pairs: -```pycon +```python >>> prices.items() dict_items([('GOOG', 490.1), ('AA', 23.45), ('IBM', 91.1), ('MSFT', 34.23)]) >>> @@ -490,7 +494,7 @@ dict_items([('GOOG', 490.1), ('AA', 23.45), ('IBM', 91.1), ('MSFT', 34.23)]) However, what if you wanted to get a list of `(value, key)` pairs instead? 
*Hint: use `zip()`.* -```pycon +```python >>> pricelist = list(zip(prices.values(),prices.keys())) >>> pricelist [(490.1, 'GOOG'), (23.45, 'AA'), (91.1, 'IBM'), (34.23, 'MSFT')] @@ -500,7 +504,7 @@ However, what if you wanted to get a list of `(value, key)` pairs instead? Why would you do this? For one, it allows you to perform certain kinds of data processing on the dictionary data. -```pycon +```python >>> min(pricelist) (23.45, 'AA') >>> max(pricelist) @@ -523,7 +527,7 @@ values. Note that `zip()` is not limited to pairs. For example, you can use it with any number of input lists: -```pycon +```python >>> a = [1, 2, 3, 4] >>> b = ['w', 'x', 'y', 'z'] >>> c = [0.2, 0.4, 0.6, 0.8] @@ -534,7 +538,7 @@ with any number of input lists: Also, be aware that `zip()` stops once the shortest input sequence is exhausted. -```pycon +```python >>> a = [1, 2, 3, 4, 5, 6] >>> b = ['x', 'y', 'z'] >>> list(zip(a,b)) diff --git a/Notes/02_Working_with_data/05_Collections.md b/Notes/02_Working_with_data/05_Collections.md index c2c1a21..183ad11 100644 --- a/Notes/02_Working_with_data/05_Collections.md +++ b/Notes/02_Working_with_data/05_Collections.md @@ -1,3 +1,5 @@ +[Contents](../Contents) \| [Previous (2.4 Sequences)](04_Sequences) \| [Next (2.6 List Comprehensions)](06_List_comprehension) + # 2.5 collections module The `collections` module provides a number of useful objects for data handling. @@ -20,6 +22,8 @@ portfolio = [ There are two `IBM` entries and two `GOOG` entries in this list. The shares need to be combined together somehow. +### Counters + Solution: Use a `Counter`. ```python @@ -94,7 +98,7 @@ bash % python3 -i report.py Suppose you wanted to tabulate the total number of shares of each stock. This is easy using `Counter` objects. 
Try it: -```pycon +```python >>> portfolio = read_portfolio('Data/portfolio.csv') >>> from collections import Counter >>> holdings = Counter() @@ -129,7 +133,7 @@ If you want to rank the values, do this: Let’s grab another portfolio of stocks and make a new Counter: -```pycon +```python >>> portfolio2 = read_portfolio('Data/portfolio2.csv') >>> holdings2 = Counter() >>> for s in portfolio2: @@ -142,7 +146,7 @@ Counter({'HPQ': 250, 'GE': 125, 'AA': 50, 'MSFT': 25}) Finally, let’s combine all of the holdings doing one simple operation: -```pycon +```python >>> holdings Counter({'MSFT': 250, 'IBM': 150, 'CAT': 150, 'AA': 100, 'GE': 95}) >>> holdings2 @@ -157,4 +161,11 @@ This is only a small taste of what counters provide. However, if you ever find yourself needing to tabulate values, you should consider using one. +### Commentary: collections module + +The `collections` module is one of the most useful library modules +in all of Python. In fact, we could do an extended tutorial on just +that. However, doing so now would also be a distraction. For now, +put `collections` on your list of bedtime reading for later. + [Contents](../Contents) \| [Previous (2.4 Sequences)](04_Sequences) \| [Next (2.6 List Comprehensions)](06_List_comprehension) diff --git a/Notes/02_Working_with_data/06_List_comprehension.md b/Notes/02_Working_with_data/06_List_comprehension.md index e93689c..81d0578 100644 --- a/Notes/02_Working_with_data/06_List_comprehension.md +++ b/Notes/02_Working_with_data/06_List_comprehension.md @@ -1,13 +1,16 @@ +[Contents](../Contents) \| [Previous (2.5 Collections)](05_Collections) \| [Next (2.7 Object Model)](07_Objects) + # 2.6 List Comprehensions A common task is processing items in a list. This section introduces list comprehensions, -a useful tool for doing just that. +a powerful tool for doing just that. ### Creating new lists -A list comprehension creates a new list by applying an operation to each element of a sequence. 
+A list comprehension creates a new list by applying an operation to +each element of a sequence. -```pycon +```python >>> a = [1, 2, 3, 4, 5] >>> b = [2*x for x in a ] >>> b @@ -17,7 +20,7 @@ A list comprehension creates a new list by applying an operation to each element Another example: -```pycon +```python >>> names = ['Elwood', 'Jake'] >>> a = [name.lower() for name in names] >>> a @@ -31,7 +34,7 @@ The general syntax is: `[ for in ]`. You can also filter during the list comprehension. -```pycon +```python >>> a = [1, -5, 4, 2, -2, 10] >>> b = [2*x for x in a if x > 0 ] >>> b @@ -42,7 +45,7 @@ You can also filter during the list comprehension. ### Use cases List comprehensions are hugely useful. For example, you can collect values of a specific -record field: +dictionary field: ```python stocknames = [s['name'] for s in stocks] @@ -91,20 +94,22 @@ it's fine to view it as a cool list shortcut. ## Exercises -Start by running your `report.py` program so that you have the portfolio of stocks loaded in the interactive mode. +Start by running your `report.py` program so that you have the +portfolio of stocks loaded in the interactive mode. ```bash bash % python3 -i report.py ``` -Now, at the Python interactive prompt, type statements to perform the operations described below. -These operations perform various kinds of data reductions, transforms, and queries on the portfolio data. +Now, at the Python interactive prompt, type statements to perform the +operations described below. These operations perform various kinds of +data reductions, transforms, and queries on the portfolio data. ### Exercise 2.19: List comprehensions Try a few simple list comprehensions just to become familiar with the syntax. -```pycon +```python >>> nums = [1,2,3,4] >>> squares = [ x * x for x in nums ] >>> squares @@ -115,31 +120,35 @@ Try a few simple list comprehensions just to become familiar with the syntax.
>>> ``` -Notice how the list comprehensions are creating a new list with the data suitably transformed or filtered. +Notice how the list comprehensions are creating a new list with the +data suitably transformed or filtered. ### Exercise 2.20: Sequence Reductions Compute the total cost of the portfolio using a single Python statement. -```pycon +```python +>>> portfolio = read_portfolio('Data/portfolio.csv') >>> cost = sum([ s['shares'] * s['price'] for s in portfolio ]) >>> cost 44671.15 >>> ``` -After you have done that, show how you can compute the current value of the portfolio using a single statement. +After you have done that, show how you can compute the current value +of the portfolio using a single statement. -```pycon +```python >>> value = sum([ s['shares'] * prices[s['name']] for s in portfolio ]) >>> value 28686.1 >>> ``` -Both of the above operations are an example of a map-reduction. The list comprehension is mapping an operation across the list. +Both of the above operations are an example of a map-reduction. The +list comprehension is mapping an operation across the list. -```pycon +```python >>> [ s['shares'] * s['price'] for s in portfolio ] [3220.0000000000005, 4555.0, 12516.0, 10246.0, 3835.1499999999996, 3254.9999999999995, 7044.0] >>> @@ -161,7 +170,7 @@ Try the following examples of various data queries. First, a list of all portfolio holdings with more than 100 shares. -```pycon +```python >>> more100 = [ s for s in portfolio if s['shares'] > 100 ] >>> more100 [{'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 51.23, 'name': 'MSFT', 'shares': 200}] @@ -170,7 +179,7 @@ First, a list of all portfolio holdings with more than 100 shares. All portfolio holdings for MSFT and IBM stocks. 
-```pycon +```python >>> msftibm = [ s for s in portfolio if s['name'] in {'MSFT','IBM'} ] >>> msftibm [{'price': 91.1, 'name': 'IBM', 'shares': 50}, {'price': 51.23, 'name': 'MSFT', 'shares': 200}, @@ -180,7 +189,7 @@ All portfolio holdings for MSFT and IBM stocks. A list of all portfolio holdings that cost more than $10000. -```pycon +```python >>> cost10k = [ s for s in portfolio if s['shares'] * s['price'] > 10000 ] >>> cost10k [{'price': 83.44, 'name': 'CAT', 'shares': 150}, {'price': 51.23, 'name': 'MSFT', 'shares': 200}] @@ -191,7 +200,7 @@ A list of all portfolio holdings that cost more than $10000. Show how you could build a list of tuples `(name, shares)` where `name` and `shares` are taken from `portfolio`. -```pycon +```python >>> name_shares =[ (s['name'], s['shares']) for s in portfolio ] >>> name_shares [('AA', 100), ('IBM', 50), ('CAT', 150), ('MSFT', 200), ('GE', 95), ('MSFT', 50), ('IBM', 100)] @@ -201,9 +210,9 @@ Show how you could build a list of tuples `(name, shares)` where `name` and `sha If you change the square brackets (`[`,`]`) to curly braces (`{`, `}`), you get something known as a set comprehension. This gives you unique or distinct values. -For example, this determines the set of stock names that appear in `portfolio`: +For example, this determines the set of unique stock names that appear in `portfolio`: -```pycon +```python >>> names = { s['name'] for s in portfolio } >>> names { 'AA', 'GE', 'IBM', 'MSFT', 'CAT' } @@ -213,7 +222,7 @@ For example, this determines the set of stock names that appear in `portfolio`: If you specify `key:value` pairs, you can build a dictionary. For example, make a dictionary that maps the name of a stock to the total number
-```pycon +```python >>> holdings = { name: 0 for name in names } >>> holdings {'AA': 0, 'GE': 0, 'IBM': 0, 'MSFT': 0, 'CAT': 0} @@ -222,7 +231,7 @@ For example, make a dictionary that maps the name of a stock to the total number This latter feature is known as a **dictionary comprehension**. Let’s tabulate: -```pycon +```python >>> for s in portfolio: holdings[s['name']] += s['shares'] @@ -231,9 +240,10 @@ This latter feature is known as a **dictionary comprehension**. Let’s tabulate >>> ``` -Try this example that filters the `prices` dictionary down to only those names that appear in the portfolio: +Try this example that filters the `prices` dictionary down to only +those names that appear in the portfolio: -```pycon +```python >>> portfolio_prices = { name: prices[name] for name in names } >>> portfolio_prices {'AA': 9.22, 'GE': 13.48, 'IBM': 106.28, 'MSFT': 20.89, 'CAT': 35.46} @@ -242,12 +252,14 @@ Try this example that filters the `prices` dictionary down to only those names t ### Exercise 2.23: Extracting Data From CSV Files -Knowing how to use various combinations of list, set, and dictionary comprehensions can be useful in various forms of data processing. -Here’s an example that shows how to extract selected columns from a CSV file. +Knowing how to use various combinations of list, set, and dictionary +comprehensions can be useful in various forms of data processing. +Here’s an example that shows how to extract selected columns from a +CSV file. 
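For readers who do not have the `Data/` files at hand, the column-selection idea can be sketched in a self-contained form. The inline CSV text below is made up for illustration and only assumes the same column layout as `Data/portfoliodate.csv` (`name`, `date`, `time`, `shares`, `price`); it is a sketch, not the exercise solution.

```python
import csv
import io

# Hypothetical inline data standing in for a file like Data/portfoliodate.csv
text = """name,date,time,shares,price
AA,6/11/2007,9:50am,100,32.20
IBM,5/13/2007,4:20pm,50,91.10
"""

rows = csv.reader(io.StringIO(text))
headers = next(rows)                      # first row holds the column names

# Columns we care about, and where they sit in the file
select = ['name', 'shares', 'price']
indices = [headers.index(colname) for colname in select]

# One dictionary per remaining row, keeping only the selected columns
portfolio = [
    {colname: row[index] for colname, index in zip(select, indices)}
    for row in rows
]

print(indices)
print(portfolio)
```

Note that every value is still a string at this point; converting types is a separate step.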
First, read a row of header information from a CSV file: -```pycon +```python >>> import csv >>> f = open('Data/portfoliodate.csv') >>> rows = csv.reader(f) @@ -259,23 +271,24 @@ First, read a row of header information from a CSV file: Next, define a variable that lists the columns that you actually care about: -```pycon +```python >>> select = ['name', 'shares', 'price'] >>> ``` Now, locate the indices of the above columns in the source CSV file: -```pycon +```python >>> indices = [ headers.index(colname) for colname in select ] >>> indices [0, 3, 4] >>> ``` -Finally, read a row of data and turn it into a dictionary using a dictionary comprehension: +Finally, read a row of data and turn it into a dictionary using a +dictionary comprehension: -```pycon +```python >>> row = next(rows) >>> record = { colname: row[index] for colname, index in zip(select, indices) } # dict-comprehension >>> record @@ -286,7 +299,7 @@ Finally, read a row of data and turn it into a dictionary using a dictionary com If you’re feeling comfortable with what just happened, read the rest of the file: -```pycon +```python >>> portfolio = [ { colname: row[index] for colname, index in zip(select, indices) } for row in rows ] >>> portfolio [{'price': '91.10', 'name': 'IBM', 'shares': '50'}, {'price': '83.44', 'name': 'CAT', 'shares': '150'}, diff --git a/Notes/02_Working_with_data/07_Objects.md b/Notes/02_Working_with_data/07_Objects.md index 2818ff4..72d1cf8 100644 --- a/Notes/02_Working_with_data/07_Objects.md +++ b/Notes/02_Working_with_data/07_Objects.md @@ -1,3 +1,5 @@ +[Contents](../Contents) \| [Previous (2.6 List Comprehensions)](06_List_comprehension) \| [Next (3 Program Organization)](../03_Program_organization/00_Overview) + # 2.7 Objects This section introduces more details about Python's internal object model and @@ -31,9 +33,11 @@ A picture of the underlying memory operations. In this example, there is only one list object `[1,2,3]`, but there are four different references to it. 
+![References](references.png) + This means that modifying a value affects *all* references. -```pycon +```python >>> a.append(999) >>> a [1,2,3,999] @@ -44,14 +48,15 @@ This means that modifying a value affects *all* references. >>> ``` -Notice how a change in the original list shows up everywhere else (yikes!). -This is because no copies were ever made. Everything is pointing to the same thing. +Notice how a change in the original list shows up everywhere else +(yikes!). This is because no copies were ever made. Everything is +pointing to the same thing. ### Reassigning values Reassigning a value *never* overwrites the memory used by the previous value. -```pycon +```python a = [1,2,3] b = a a = [4,5,6] @@ -69,13 +74,14 @@ foot at some point. Typical scenario. You modify some data thinking that it's your own private copy and it accidentally corrupts some data in some other part of the program. -*Comment: This is one of the reasons why the primitive datatypes (int, float, string) are immutable (read-only).* +*Comment: This is one of the reasons why the primitive datatypes (int, + float, string) are immutable (read-only).* ### Identity and References -Use ths `is` operator to check if two values are exactly the same object. +Use the `is` operator to check if two values are exactly the same object. -```pycon +```python >>> a = [1,2,3] >>> b = a >>> a is b @@ -86,7 +92,7 @@ True `is` compares the object identity (an integer). The identity can be obtained using `id()`. -```pycon +```python >>> id(a) 3588944 >>> id(b) @@ -94,11 +100,27 @@ obtained using `id()`. >>> ``` +Note: It is almost always better to use `==` for checking objects. The behavior +of `is` is often unexpected: + +```python +>>> a = [1,2,3] +>>> b = a +>>> c = [1,2,3] +>>> a is b +True +>>> a is c +False +>>> a == c +True +>>> +``` + ### Shallow copies Lists and dicts have methods for copying. 
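As a minimal sketch of those copying methods, the snippet below uses `list.copy()` and `dict.copy()`, which behave like `list(a)` and `dict(d)`. The `holdings` dictionary is made-up data for illustration.

```python
a = [2, 3, [100, 101], 4]
b = a.copy()                 # same effect as list(a) or a[:]

holdings = {'GOOG': [100, 50], 'AA': [75]}   # hypothetical data
h2 = holdings.copy()         # same effect as dict(holdings)

# The outer containers are new objects ...
b.append(5)
h2['IBM'] = [10]
print(a)                     # the original list is unchanged
print('IBM' in holdings)     # the original dict has no 'IBM' key

# ... but the items inside are still shared with the original
b[2].append(102)
h2['AA'].append(25)
print(a[2])                  # the inner list shows the change
print(holdings['AA'])        # so does the list stored in the dict
```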
-```pycon +```python >>> a = [2,3,[100,101],4] >>> b = list(a) # Make a copy >>> a is b @@ -118,14 +140,16 @@ True ``` For example, the inner list `[100, 101]` is being shared. -This is knows as a shallow copy. +This is known as a shallow copy. Here is a picture. + +![Shallow copy](shallow.png) ### Deep copies Sometimes you need to make a copy of an object and all the objects contained within it. You can use the `copy` module for this: -```pycon +```python >>> a = [2,3,[100,101],4] >>> import copy >>> b = copy.deepcopy(a) @@ -142,7 +166,7 @@ False Variable names do not have a *type*. It's only a name. However, values *do* have an underlying type. -```pycon +```python >>> a = 42 >>> b = 'Hello World' >>> type(a) @@ -151,7 +175,7 @@ However, values *do* have an underlying type. ``` -`type()` will tell you what it is. The type name is usually a function +`type()` will tell you what it is. The type name is usually used as a function that creates or converts a value to that type. ### Type Checking @@ -159,18 +183,21 @@ that creates or converts a value to that type. How to tell if an object is a specific type. ```python -if isinstance(a,list): +if isinstance(a, list): print('a is a list') ``` -Checking for one of many types. +Checking for one of many possible types. ```python if isinstance(a, (list,tuple)): print('a is a list or tuple') ``` -*Caution: Don't go overboard with type checking. It can lead to excessive complexity.* +*Caution: Don't go overboard with type checking. It can lead to +excessive code complexity. Usually you'd only do it if doing +so would prevent common mistakes made by others using your code. +* ### Everything is an object @@ -182,7 +209,7 @@ is said that all objects are "first-class". A simple example: -```pycon +```python >>> import math >>> items = [abs, math, ValueError ] >>> items @@ -201,8 +228,9 @@ Failed! >>> ``` -Here, `items` is a list containing a function, a module and an exception.
-You can use the items in the list in place of the original names: +Here, `items` is a list containing a function, a module and an +exception. You can directly use the items in the list in place of the +original names: ```python items[0](-45) # abs @@ -210,6 +238,8 @@ items[1].sqrt(2) # math except items[2]: # ValueError ``` +With great power comes responsibility. Just because you can do that doesn't mean you should. + ## Exercises In this set of exercises, we look at some of the power that comes from first-class @@ -242,7 +272,7 @@ using some list basic operations. Make a Python list that contains the names of the conversion functions you would use to convert each column into the appropriate type: -```pycon +```python >>> types = [str, int, float] >>> ``` @@ -254,7 +284,7 @@ a value `x` into a given type (e.g., `str(x)`, `int(x)`, `float(x)`). Now, read a row of data from the above file: -```pycon +```python >>> import csv >>> f = open('Data/portfolio.csv') >>> rows = csv.reader(f) @@ -265,9 +295,10 @@ Now, read a row of data from the above file: >>> ``` -As noted, this row isn’t enough to do calculations because the types are wrong. For example: +As noted, this row isn’t enough to do calculations because the types +are wrong. For example: -```pycon +```python >>> row[1] * row[2] Traceback (most recent call last): File "", line 1, in @@ -275,9 +306,10 @@ TypeError: can't multiply sequence by non-int of type 'str' >>> ``` -However, maybe the data can be paired up with the types you specified in `types`. For example: +However, maybe the data can be paired up with the types you specified +in `types`.
For example: -```pycon +```python >>> types[1] >>> row[1] @@ -287,7 +319,7 @@ However, maybe the data can be paired up with the types you specified in `types` Try converting one of the values: -```pycon +```python >>> types[1](row[1]) # Same as int(row[1]) 100 >>> @@ -295,7 +327,7 @@ Try converting one of the values: Try converting a different value: -```pycon +```python >>> types[2](row[2]) # Same as float(row[2]) 32.2 >>> @@ -303,7 +335,7 @@ Try converting a different value: Try the calculation with converted values: -```pycon +```python >>> types[1](row[1])*types[2](row[2]) 3220.0000000000005 >>> @@ -311,7 +343,7 @@ Try the calculation with converted values: Zip the column types with the fields and look at the result: -```pycon +```python >>> r = list(zip(types, row)) >>> r [(, 'AA'), (, '100'), (,'32.20')] @@ -321,10 +353,10 @@ Zip the column types with the fields and look at the result: You will notice that this has paired a type conversion with a value. For example, `int` is paired with the value `'100'`. -The zipped list is useful if you want to perform conversions on all of the values, one -after the other. Try this: +The zipped list is useful if you want to perform conversions on all of +the values, one after the other. Try this: -```pycon +```python >>> converted = [] >>> for func, val in zip(types, row): converted.append(func(val)) @@ -336,14 +368,15 @@ after the other. Try this: >>> ``` -Make sure you understand what’s happening in the above code. -In the loop, the `func` variable is one of the type conversion functions (e.g., -`str`, `int`, etc.) and the `val` variable is one of the values like -`'AA'`, `'100'`. The expression `func(val)` is converting a value (kind of like a type cast). +Make sure you understand what’s happening in the above code. In the +loop, the `func` variable is one of the type conversion functions +(e.g., `str`, `int`, etc.) and the `val` variable is one of the values +like `'AA'`, `'100'`. 
The expression `func(val)` is converting a +value (kind of like a type cast). The above code can be compressed into a single list comprehension. -```pycon +```python >>> converted = [func(val) for func, val in zip(types, row)] >>> converted ['AA', 100, 32.2] @@ -352,10 +385,11 @@ The above code can be compressed into a single list comprehension. ### Exercise 2.25: Making dictionaries -Remember how the `dict()` function can easily make a dictionary if you have a sequence of key names and values? -Let’s make a dictionary from the column headers: +Remember how the `dict()` function can easily make a dictionary if you +have a sequence of key names and values? Let’s make a dictionary from +the column headers: -```pycon +```python >>> headers ['name', 'shares', 'price'] >>> converted @@ -365,9 +399,10 @@ Let’s make a dictionary from the column headers: >>> ``` -Of course, if you’re up on your list-comprehension fu, you can do the whole conversion in a single shot using a dict-comprehension: +Of course, if you’re up on your list-comprehension fu, you can do the +whole conversion in a single step using a dict-comprehension: -```pycon +```python >>> { name: func(val) for name, func, val in zip(headers, types, row) } {'price': 32.2, 'name': 'AA', 'shares': 100} >>> @@ -375,11 +410,13 @@ Of course, if you’re up on your list-comprehension fu, you can do the whole co ### Exercise 2.26: The Big Picture -Using the techniques in this exercise, you could write statements that easily convert fields from just about any column-oriented datafile into a Python dictionary. +Using the techniques in this exercise, you could write statements that +easily convert fields from just about any column-oriented datafile +into a Python dictionary. 
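One way to package that idea is a small helper that zips headers, conversion functions, and row values together. This is only a sketch: the name `read_records` and the inline sample data are hypothetical, and it assumes the first row of the input holds the column names and that `types` lists one conversion function per column.

```python
import csv
import io

def read_records(lines, types):
    """Turn CSV lines into a list of dicts, converting each column.

    `lines` is any iterable of text lines (an open file works);
    `types` holds one conversion function per column, in order.
    """
    rows = csv.reader(lines)
    headers = next(rows)          # column names from the first row
    return [
        {name: func(val) for name, func, val in zip(headers, types, row)}
        for row in rows
    ]

# Hypothetical inline data in place of a real file
sample = io.StringIO("name,shares,price\nAA,100,32.20\nIBM,50,91.10\n")
records = read_records(sample, [str, int, float])
print(records)
```

Because `zip()` stops at the shortest input, a `types` list that is too short silently drops the trailing columns, so it is worth checking that its length matches the headers.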
Just to illustrate, suppose you read data from a different datafile like this: -```pycon +```python >>> f = open('Data/dowstocks.csv') >>> rows = csv.reader(f) >>> headers = next(rows) @@ -393,7 +430,7 @@ Just to illustrate, suppose you read data from a different datafile like this: Let’s convert the fields using a similar trick: -```pycon +```python >>> types = [str, float, str, str, float, float, float, float, int] >>> converted = [func(val) for func, val in zip(types, row)] >>> record = dict(zip(headers, converted)) @@ -408,6 +445,10 @@ Let’s convert the fields using a similar trick: >>> ``` -Spend some time to ponder what you’ve done in this exercise. We’ll revisit these ideas a little later. +Bonus: How would you modify this example to additionally parse the +`date` entry into a tuple such as `(6, 11, 2007)`? + +Spend some time to ponder what you’ve done in this exercise. We’ll +revisit these ideas a little later. [Contents](../Contents) \| [Previous (2.6 List Comprehensions)](06_List_comprehension) \| [Next (3 Program Organization)](../03_Program_organization/00_Overview) diff --git a/Notes/02_Working_with_data/references.png b/Notes/02_Working_with_data/references.png new file mode 100644 index 0000000..7204bd0 Binary files /dev/null and b/Notes/02_Working_with_data/references.png differ diff --git a/Notes/02_Working_with_data/shallow.png b/Notes/02_Working_with_data/shallow.png new file mode 100644 index 0000000..bdfa56d Binary files /dev/null and b/Notes/02_Working_with_data/shallow.png differ