This article is drawn from the official Python documentation: https://docs.python.org/2/tutorial/index.html
- 0 Coding Style
- 1 Python Interpreter
- 2 Control Flow Statements
- 3 Defining Functions
- Math
- Strings
- Data Structures
0 Coding Style
The most important points:
- Use 4-space indentation, and no tabs.
- Wrap lines so that they don’t exceed 79 characters.
- Use blank lines to separate functions and classes, and larger blocks of code inside functions.
- When possible, put comments (#) on a line of their own.
- Use docstrings.
- Use spaces around operators and after commas, but not directly inside bracketing constructs: a = f(1, 2) + g(3, 4).
- Name your classes and functions consistently; the convention is to use CamelCase for classes and lower_case_with_underscores for functions and methods. Always use self as the name for the first method argument (see A First Look at Classes for more on classes and methods).
- Don’t use fancy encodings if your code is meant to be used in international environments. Plain ASCII works best in any case.
More detailed information is in PEP 8
1 Python Interpreter
1.1 Invoking the Interpreter
The interpreter is usually installed as /usr/local/bin/python on Linux; on Windows the default installation directory is C:\Python27.
- start:
python
python [-i] <source file>
python -c command [arg] ...
python -m module [arg] ...
- Argument Passing: When known to the interpreter, the script name and additional arguments thereafter are turned into a list of strings and assigned to the argv variable in the sys module. You can access this list by executing import sys. The length of the list is at least one; when no script and no arguments are given, sys.argv[0] is an empty string. When the script name is given as '-' (meaning standard input), sys.argv[0] is set to '-'. When -c command is used, sys.argv[0] is set to '-c'. When -m module is used, sys.argv[0] is set to the full name of the located module. Options found after -c command or -m module are not consumed by the Python interpreter's option processing but left in sys.argv for the command or module to handle.
- Source Code Encoding:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
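As a minimal sketch of the argument-passing rules above, the following script classifies the contents of sys.argv. The describe_argv helper is a hypothetical name introduced for illustration, and the parenthesized print is an assumption made so the code should run under both Python 2.7 and Python 3.

```python
import sys


def describe_argv(argv):
    """Return a short description of how the interpreter was invoked.

    describe_argv is a hypothetical helper, not part of the standard library.
    """
    script = argv[0]      # always present; may be '', '-' or '-c'
    extra = argv[1:]      # arguments left for the script to handle
    if script == '':
        kind = 'interactive session'
    elif script == '-':
        kind = 'script read from standard input'
    elif script == '-c':
        kind = 'command passed with -c'
    else:
        kind = 'script or module ' + script
    return '%s with %d extra argument(s)' % (kind, len(extra))


if __name__ == '__main__':
    print(describe_argv(sys.argv))
```

Passing the list in as a parameter, rather than reading sys.argv inside the function, keeps the helper easy to try out with hand-written lists.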
1.2 Interactive Mode
In this mode it prompts for the next command with the primary prompt, usually three greater-than signs (>>>); for continuation lines it prompts with the secondary prompt, by default three dots (...). The last printed expression is assigned to the variable _.
>>> tax = 12.5 / 100
>>> price = 100.50
>>> price * tax
12.5625
>>> price + _
113.0625
>>> round(_, 2)
113.06
- start: python
- quit:
  - type the end-of-file character (Control-D on Unix, Control-Z on Windows)
  - call quit()
- GNU readline library: the interpreter may be built with support for the GNU readline library, which adds more elaborate interactive editing and history features. The current line can be edited using the conventional Emacs control characters.
  - Line Editing
    - C-A (Control-A): moves the cursor to the beginning of the line
    - C-E: moves it to the end
    - C-B: moves it one position to the left
    - C-F: moves it one position to the right
    - Backspace: erases the character to the left of the cursor
    - C-D: erases the character to its right
    - C-K: kills (erases) the rest of the line to the right of the cursor
    - C-Y: yanks back the last killed string
    - C-_: undoes the last change you made; it can be repeated for cumulative effect
  - History Substitution
    - C-P: moves one line up (back) in the history buffer
    - C-N: moves one line down
    - C-R: starts an incremental reverse search
    - C-S: starts a forward search
  - Key Bindings (in ~/.inputrc)
    - bind key: key-name: function-name or "string": function-name
    - set options: set option-name value
    - examples:
# ~/.inputrc
# set vi-style editing:
set editing-mode vi
# Edit using a single line:
set horizontal-scroll-mode On
# Rebind some keys:
Meta-h: backward-kill-word
"\C-u": universal-argument
"\C-x\C-r": re-read-init-file
# make Tab be used for complete
Tab: complete
- startup file: Python will execute the contents of a file identified by the PYTHONSTARTUP environment variable when you start an interactive interpreter.
# Add auto-completion and a stored history file of commands to your Python
# interactive interpreter. Requires Python 2.0+, readline. Autocomplete is
# bound to the Esc key by default (you can change it - see readline docs).
#
# Store the file in ~/.pystartup, and set an environment variable to point
# to it: "export PYTHONSTARTUP=~/.pystartup" in bash.

import atexit
import os
import readline
import rlcompleter

historyPath = os.path.expanduser("~/.pyhistory")

def save_history(historyPath=historyPath):
    import readline
    readline.write_history_file(historyPath)

if os.path.exists(historyPath):
    readline.read_history_file(historyPath)

atexit.register(save_history)
# use <Control-O> for complete
# readline.parse_and_bind('"\C-O": complete')
del os, atexit, readline, rlcompleter, save_history, historyPath
- Alternatives: One alternative enhanced interactive interpreter that has been around for quite some time is IPython, which features tab completion, object exploration and advanced history management. It can also be thoroughly customized and embedded into other applications. Another similar enhanced interactive environment is bpython.
2 Control Flow Statements
- what is True: like in C, any non-zero integer value is true; zero is false. The condition may also be any sequence: anything with a non-zero length is true, empty sequences are false.
- multiple assignment:
a, b = 0, 1
a, b = b, a+b
  The expressions on the right-hand side are all evaluated before any of the assignments take place, and they are evaluated from left to right.
- indentation: indentation is Python’s way of grouping statements. At the interactive prompt, you have to type a tab or space(s) for each indented line. When a compound statement is entered interactively, it must be followed by a blank line to indicate completion. Note that each line within a basic block must be indented by the same amount.
- standard comparison operators: < (less than), > (greater than), == (equal to), <= (less than or equal to), >= (greater than or equal to) and != (not equal to)
- in, not in: check whether a value occurs (does not occur) in a sequence
- is, is not: compare whether two objects are really the same object
- a < b == c: chained comparison, tests whether a is less than b and moreover b equals c.
- and, or, not: among them, not has the highest priority and or the lowest. The Boolean operators and and or are so-called short-circuit operators: their arguments are evaluated from left to right, and evaluation stops as soon as the outcome is determined. When used as a general value and not as a Boolean, the return value of a short-circuit operator is the last evaluated argument. It is possible to assign the result of a comparison or other Boolean expression to a variable. For example,
>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
>>> non_null = string1 or string2 or string3
>>> non_null
'Trondheim'
  Note that in Python, unlike C, assignment cannot occur inside condition expressions.
- Comparing Sequences and Other Types: Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical (dictionary) ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively. If all items of two sequences compare equal, the sequences are considered equal. If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII ordering for individual characters. Some examples of comparisons between sequences of the same type:
(1, 2, 3) < (1, 2, 4)
[1, 2, 3] < [1, 2, 4]
'ABC' < 'C' < 'Pascal' < 'Python'
(1, 2, 3, 4) < (1, 2, 4)
(1, 2) < (1, 2, -1)
(1, 2, 3) == (1.0, 2.0, 3.0)
(1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4)
Note that comparing objects of different types is legal. The outcome is deterministic but arbitrary: the types are ordered by their name. Thus, a list is always smaller than a string, a string is always smaller than a tuple, etc. Mixed numeric types are compared according to their numeric value, so 0 equals 0.0, etc.
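The evaluation rules in this list can be checked with a small experiment. The sketch below is illustrative only: noisy is a hypothetical helper that records each value it is asked to evaluate, and the parenthesized print is an assumption made so the code should run under both Python 2.7 and Python 3.

```python
def noisy(value, log):
    """Record that value was evaluated, then return it unchanged."""
    log.append(value)
    return value


log = []
# 'or' stops at the first true operand and returns that operand itself.
first_true = (noisy('', log) or noisy('Trondheim', log)
              or noisy('Hammer Dance', log))
# At this point log is ['', 'Trondheim']; 'Hammer Dance' was never evaluated.

# 'and' stops at the first false operand and returns it.
first_false = noisy(1, log) and noisy(0, log) and noisy(2, log)

# The right-hand side is evaluated in full before any name is rebound,
# so this swaps a and b without a temporary variable.
a, b = 1, 2
a, b = b, a

# Chained comparison: equivalent to (1 < 2) and (2 == 2).
chained = 1 < 2 == 2

# Sequences of the same type compare item by item; an initial
# sub-sequence is smaller than the longer sequence.
shorter_is_smaller = (1, 2) < (1, 2, -1)

print(first_true)
print(log)
```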
2.1 if..elif..else.. Statements
>>> x = int(raw_input("Please enter an integer: "))
Please enter an integer: 42
>>> if x < 0:
...     x = 0
...     print 'Negative changed to zero'
... elif x == 0:
...     print 'Zero'
... elif x == 1:
...     print 'Single'
... else:
...     print 'More'
...
More
An if … elif … elif … sequence is a substitute for the switch or case statements found in other languages.
2.2 for..in.. Statements
Python’s for statement iterates over the items of any sequence (a list or a string), in the order that they appear in the sequence:
>>> # Measure some strings:
... words = ['cat', 'window', 'defenestrate']
>>> for w in words:
...     print w, len(w)
...
cat 3
window 6
defenestrate 12
If you need to modify the sequence you are iterating over while inside the loop, it is recommended that you first make a copy. Iterating over a sequence does not implicitly make a copy:
>>> for w in words[:]:  # Loop over a slice copy of the entire list.
...     if len(w) > 6:
...         words.insert(0, w)
...
>>> words
['defenestrate', 'cat', 'window', 'defenestrate']
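The same copy-then-modify pattern can be wrapped in a function. This is a sketch under stated assumptions: duplicate_long_words and its min_length parameter are hypothetical names, and the parenthesized print is used so the code should run under both Python 2.7 and Python 3.

```python
def duplicate_long_words(words, min_length=7):
    """Insert a copy of every long word at the front of words, in place.

    duplicate_long_words is a hypothetical helper name.
    """
    # words[:] is a shallow copy, so inserting into words does not
    # disturb the sequence the loop is walking over.
    for w in words[:]:
        if len(w) >= min_length:
            words.insert(0, w)
    return words


words = ['cat', 'window', 'defenestrate']
duplicate_long_words(words)
print(words)
```

Looping over words itself instead of words[:] would keep re-inserting 'defenestrate' and never terminate, which is why the copy matters here.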
2.3 while Statements
>>> a, b = 0, 1
>>> while b < 1000:
...     print b,  # A trailing comma avoids the newline after the output
...     a, b = b, a+b
...
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
2.4 break, continue and else in Loops
- break: like in C, breaks out of the innermost enclosing for or while loop.
- continue: like in C, continues with the next iteration of the loop.
- else: the code in this block is executed in the following situations:
  - the loop terminates through exhaustion of the list (with for)
  - the condition becomes false (with while)
  but not when the loop is terminated by a break statement.
>>> for n in range(2, 10):
...     for x in range(2, n):
...         if n % x == 0:
...             print n, 'equals', x, '*', n/x
...             break
...     else:  # the else clause belongs to the for loop, not the if statement
...         # loop fell through without finding a factor
...         print n, 'is a prime number'
...
2 is a prime number
3 is a prime number
4 equals 2 * 2
5 is a prime number
6 equals 2 * 3
7 is a prime number
8 equals 2 * 4
9 equals 3 * 3
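The loop else clause is easiest to see in a search that either finds something (break) or runs out of candidates (else). The following sketch assumes a hypothetical helper named smallest_factor and an input of at least 2.

```python
def smallest_factor(n):
    """Return the smallest factor of n in 2..n-1, or None if n is prime.

    smallest_factor is a hypothetical helper; n is assumed to be >= 2.
    """
    for x in range(2, n):
        if n % x == 0:
            result = x
            break          # leaving via break skips the else clause
    else:
        result = None      # the loop ran to completion without a break
    return result


print(smallest_factor(9))
print(smallest_factor(7))
```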
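2.5 pass Statements
The pass statement does nothing. It can be used when a statement is required syntactically but the program requires no action.
>>> def initlog(*args):
...     pass   # Remember to implement this!
...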
3 Defining Functions
>>> def fib2(n):  # return Fibonacci series up to n
...     """Return a list containing the Fibonacci series up to n."""
...     result = []
...     a, b = 0, 1
...     while a < n:
...         result.append(a)    # see below
...         a, b = b, a+b
...     return result
...
>>> f100 = fib2(100)    # call it
>>> f100                # write the result
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
3.1 Default Argument Values
def ask_ok(prompt, retries=4, complaint='Yes or no, please!'):
    while True:
        ok = raw_input(prompt)
        if ok in ('y', 'ye', 'yes'):
            return True
        if ok in ('n', 'no', 'nop', 'nope'):
            return False
        retries = retries - 1
        if retries < 0:
            raise IOError('refusenik user')
        print complaint
Important warning: The default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes. For example, the following function accumulates the arguments passed to it on subsequent calls:
def f(a, L=[]):
    L.append(a)
    return L

print f(1)
print f(2)
print f(3)
This will print
[1]
[1, 2]
[1, 2, 3]
If you don’t want the default to be shared between subsequent calls, you can write the function like this instead:
def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L
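The difference between the two versions comes down to object identity: the shared default is one list object that every call reuses. The sketch below makes that visible; append_shared and append_fresh are hypothetical names for the two styles.

```python
def append_shared(item, bucket=[]):
    """The default list is created once, when the function is defined."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """A new list is created on every call that omits bucket."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


shared_first = append_shared(1)
shared_second = append_shared(2)
# Both names refer to the single default list object.
same_object = shared_first is shared_second

fresh_first = append_fresh(1)
fresh_second = append_fresh(2)

print(same_object)
print(fresh_first is fresh_second)
```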
3.2 Keyword Arguments
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    print "-- This parrot wouldn't", action,
    print "if you put", voltage, "volts through it."
    print "-- Lovely plumage, the", type
    print "-- It's", state, "!"
The function above accepts one required argument (voltage) and three optional arguments (state, action, and type). It can be called in any of the following ways:
parrot(1000) # 1 positional argument
parrot(voltage=1000) # 1 keyword argument
parrot(voltage=1000000, action='VOOOOOM') # 2 keyword arguments
parrot(action='VOOOOOM', voltage=1000000) # 2 keyword arguments
parrot('a million', 'bereft of life', 'jump') # 3 positional arguments
parrot('a thousand', state='pushing up the daisies') # 1 positional, 1 keyword
When a final formal parameter of the form **name is present, it receives a dictionary containing all keyword arguments except for those corresponding to a formal parameter. This may be combined with a formal parameter of the form *name which receives a tuple containing the positional arguments beyond the formal parameter list. (*name must occur before **name.) For example, if we define a function like this:
def cheeseshop(kind, *arguments, **keywords):
    print "-- Do you have any", kind, "?"
    print "-- I'm sorry, we're all out of", kind
    for arg in arguments:
        print arg
    print "-" * 40
    keys = sorted(keywords.keys())
    for kw in keys:
        print kw, ":", keywords[kw]
It could be called like this:
cheeseshop("Limburger", "It's very runny, sir.",
"It's really very, VERY runny, sir.",
shopkeeper='Michael Palin',
client="John Cleese",
sketch="Cheese Shop Sketch")
and of course it would print:
-- Do you have any Limburger ?
-- I'm sorry, we're all out of Limburger
It's very runny, sir.
It's really very, VERY runny, sir.
----------------------------------------
client : John Cleese
shopkeeper : Michael Palin
sketch : Cheese Shop Sketch
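To see exactly what lands in *name and **name, it can help to return the collected values instead of printing them. This is a minimal sketch: describe_call is a hypothetical function, and the parenthesized print is an assumption made so the code should run under both Python 2.7 and Python 3.

```python
def describe_call(kind, *arguments, **keywords):
    """Return the pieces Python collects for each kind of parameter."""
    # arguments is a tuple of the extra positional arguments;
    # keywords is a dict of the keyword arguments not matched by name.
    return kind, arguments, sorted(keywords.items())


result = describe_call("Limburger", "very runny", client="John Cleese",
                       shopkeeper="Michael Palin")
print(result)
```

Sorting the keyword items gives a stable order for display, which mirrors the sorted(keywords.keys()) call in cheeseshop.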
3.3 Arbitrary Argument Lists
Finally, the least frequently used option is to specify that a function can be called with an arbitrary number of arguments. These arguments will be wrapped up in a tuple. Before the variable number of arguments, zero or more normal arguments may occur.
def write_multiple_items(file, separator, *args):
    file.write(separator.join(args))
3.4 Unpacking Argument Lists
Write the function call with the *-operator to unpack the arguments out of a list or tuple:
>>> range(3, 6) # normal call with separate arguments
[3, 4, 5]
>>> args = [3, 6]
>>> range(*args) # call with arguments unpacked from a list
[3, 4, 5]
In the same fashion, dictionaries can deliver keyword arguments with the **-operator:
>>> def parrot(voltage, state='a stiff', action='voom'):
...     print "-- This parrot wouldn't", action,
...     print "if you put", voltage, "volts through it.",
...     print "E's", state, "!"
...
>>> d = {"voltage": "four million", "state": "bleedin' demised", "action": "VOOM"}
>>> parrot(**d)
-- This parrot wouldn't VOOM if you put four million volts through it. E's bleedin' demised !
3.5 Lambda Expressions
Small anonymous functions can be created with the lambda keyword.
>>> def make_incrementor(n):
...     return lambda x: x + n
...
>>> f = make_incrementor(42)
>>> f(0)
42
>>> f(1)
43
The above example uses a lambda expression to return a function. Another use is to pass a small function as an argument:
>>> pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]
>>> pairs.sort(key=lambda pair: pair[1])
>>> pairs
[(4, 'four'), (1, 'one'), (3, 'three'), (2, 'two')]
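A key function does not have to be a lambda. The sketch below sorts the same pairs with sorted(), once with a lambda and once with a named function; name_length is a hypothetical helper introduced for illustration.

```python
pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

# sorted() returns a new list; the key function is called once per item
# and the items are ordered by the values it returns.
by_name = sorted(pairs, key=lambda pair: pair[1])


# The same idea with a named function instead of a lambda.
def name_length(pair):
    return len(pair[1])


# The sort is stable, so pairs with equal key values keep their
# original relative order.
by_length = sorted(pairs, key=name_length)

print(by_name)
print(by_length)
```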
3.6 Documentation Strings
There are emerging conventions about the content and formatting of documentation strings.
- The first line should always be a short, concise summary of the object’s purpose. This line should begin with a capital letter and end with a period.
- If there are more lines in the documentation string, the second line should be blank, visually separating the summary from the rest of the description.
- The following lines should be one or more paragraphs describing the object’s calling conventions, its side effects, etc.
The first non-blank line after the first line of the string determines the amount of indentation for the entire documentation string.
Here is an example of a multi-line docstring:
>>> def my_function():
...     """Do nothing, but document it.
...
...     No, really, it doesn't do anything.
...     """
...     pass
...
>>> print my_function.__doc__
Do nothing, but document it.

    No, really, it doesn't do anything.

Math
- operators:
  - fundamental operations: +, -, *, /
  - %: remainder
  - //: explicit floor division discards the fractional part
  - **: power
>>> 17 / 3  # int / int -> int
5
>>> 17 / 3.0  # int / float -> float
5.666666666666667
>>> 17 // 3.0  # explicit floor division discards the fractional part
5.0
>>> 17 % 3  # the % operator returns the remainder of the division
2
>>> 5 * 3 + 2  # result * divisor + remainder
17
- type of numbers: int, float, Decimal, Fraction, complex
Strings
- "" and '': the only difference between double and single quotes is that within single quotes you don’t need to escape " (but you have to escape \') and vice versa.
- default output and print: In the interactive interpreter, the output string is enclosed in quotes and special characters are escaped with backslashes. While this might sometimes look different from the input (the enclosing quotes could change), the two strings are equivalent. The string is enclosed in double quotes if the string contains a single quote and no double quotes, otherwise it is enclosed in single quotes. The print statement produces a more readable output, by omitting the enclosing quotes and by printing escaped and special characters:
>>> '"Isn\'t," they said.'
'"Isn\'t," they said.'
>>> print '"Isn\'t," they said.'
"Isn't," they said.
>>> s = 'First line.\nSecond line.'  # \n means newline
>>> s  # without print, \n is included in the output
'First line.\nSecond line.'
>>> print s  # with print, \n produces a new line
First line.
Second line.
- prefixes before string literals:
  - raw strings: r'C:\some\name' (here \n does not mean newline)
>>> print 'C:\some\name'  # here \n means newline!
C:\some
ame
>>> print r'C:\some\name'  # note the r before the quote
C:\some\name
  - Unicode strings: u'This is a Unicode\u0020string'. The raw form ur'...' applies the \uXXXX conversion only if there is an uneven number of backslashes in front of the small 'u'. Raw mode is most useful when you have to enter lots of backslashes, as can be necessary in regular expressions.
>>> ur'Hello\u0020World !'
u'Hello World !'
>>> ur'Hello\\u0020World !'
u'Hello\\\\u0020World !'
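- multi-line strings: use triple quotes. End-of-line characters are included in the string automatically; in the following example, the \ right after the opening """ suppresses the first one (i.e. the leading \n):
print """\
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
"""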
- string operators:
  - +: concatenates strings (glues them together). Two or more string literals next to each other are automatically concatenated.
  - *: repeats strings.
>>> 3 * 'un' + 'ium'
'unununium'
>>> # This feature is particularly useful when you want to break long strings
>>> text = ('Put several strings within parentheses '
...         'to have them joined together.')
- indexing: obtains individual characters, with the first character having index 0. There is no separate character type; a character is simply a string of size one:
>>> word = 'Python'
>>> word[0]  # character in position 0
'P'
>>> word[5]  # character in position 5
'n'
  Indices may also be negative numbers, to start counting from the right:
>>> word[-1]  # last character
'n'
>>> word[-2]  # second-last character
'o'
>>> word[-6]
'P'
- slicing: obtains a substring
>>> word[:]  # the value of word
'Python'
>>> word[0:2]  # characters from position 0 (included) to 2 (excluded)
'Py'
>>> word[:2] + word[2:]  # s[:i] + s[i:] is always equal to s
'Python'
>>> word[:2]  # characters from the beginning to position 2 (excluded)
'Py'
>>> word[-2:]  # characters from the second-last (included) to the end
'on'
>>> word[4:42]  # out of range slice indexes are handled gracefully when used for slicing
'on'
- Immutable: Python strings cannot be changed
>>> word[0] = 'J'
...
TypeError: 'str' object does not support item assignment
>>> word[2:] = 'py'
...
TypeError: 'str' object does not support item assignment
- Unicode String: put u before the string:
>>> u"abc"
u'abc'
  - When a Unicode string is printed, written to a file, or converted with str(), conversion takes place using the default encoding, which is normally ASCII.
>>> str(u"abc")
'abc'
>>> str(u"äöü")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
  - To convert a Unicode string into an 8-bit string using a specific encoding, Unicode objects provide an encode() method that takes one argument, the name of the encoding. Lowercase names for encodings are preferred.
>>> u"äöü".encode('utf-8')
'\xc3\xa4\xc3\xb6\xc3\xbc'
  - If you have data in a specific encoding and want to produce a corresponding Unicode string from it, you can use the unicode() function with the encoding name as the second argument.
>>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
u'\xe4\xf6\xfc'
Data Structures
Sequences
Sequences include str, unicode, list, tuple, bytearray, buffer, xrange.
Lists
For details, see The Python Standard Library.
- define:
>>> squares = [1, 4, 9, 16, 25]
>>> squares
[1, 4, 9, 16, 25]
- indexing:
>>> squares[0]  # indexing returns the item
1
>>> squares[-1]
25
- slicing:
>>> squares[:]
[1, 4, 9, 16, 25]
>>> squares[-3:]  # slicing returns a new list
[9, 16, 25]
- concatenation:
>>> squares + [36, 49, 64, 81, 100]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
- mutable:
>>> cubes = [1, 8, 27, 65, 125]  # something's wrong here
>>> 4 ** 3  # the cube of 4 is 64, not 65!
64
>>> cubes[3] = 64  # replace the wrong value
>>> cubes
[1, 8, 27, 64, 125]
>>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> letters
['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> # replace some values
>>> letters[2:5] = ['C', 'D', 'E']
>>> letters
['a', 'b', 'C', 'D', 'E', 'f', 'g']
>>> # now remove them
>>> letters[2:5] = []
>>> letters
['a', 'b', 'f', 'g']
>>> # clear the list by replacing all the elements with an empty list
>>> letters[:] = []
>>> letters
[]
- nested lists:
>>> a = ['a', 'b', 'c']
>>> n = [1, 2, 3]
>>> x = [a, n]
>>> x
[['a', 'b', 'c'], [1, 2, 3]]
>>> x[0]
['a', 'b', 'c']
>>> x[0][1]
'b'
- methods (list them all with print dir(list)):
>>> a = [66.25, 333, 333, 1, 1234.5]
>>> len(a)
5
>>> print a.count(333), a.count(66.25), a.count('x')
2 1 0
>>> a.insert(2, -1)
>>> a.append(333)
>>> a
[66.25, 333, -1, 333, 1, 1234.5, 333]
>>> a.index(333)
1
>>> a.remove(333)
>>> a
[66.25, -1, 333, 1, 1234.5, 333]
>>> a.reverse()
>>> a
[333, 1234.5, 1, 333, -1, 66.25]
>>> a.sort()
>>> a
[-1, 1, 66.25, 333, 333, 1234.5]
>>> a.pop()
1234.5
>>> a
[-1, 1, 66.25, 333, 333]
Using Lists as Stacks
Push with list.append(), pop with list.pop():
>>> stack = [3, 4, 5]
>>> stack.append(6)
>>> stack.append(7)
>>> stack
[3, 4, 5, 6, 7]
>>> stack.pop()
7
>>> stack
[3, 4, 5, 6]
>>> stack.pop()
6
>>> stack.pop()
5
>>> stack
[3, 4]
Using Lists as Queues
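Lists are not efficient as queues: appends and pops at the end of a list are fast, but inserts or pops at the beginning are slow because all of the other elements have to be shifted by one. To implement a queue, use collections.deque (a double-ended queue), which was designed to have fast appends and pops from both ends. Enqueue with collections.deque.append(), dequeue with collections.deque.popleft():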
>>> from collections import deque
>>> queue = deque(["Eric", "John", "Michael"])
>>> queue.append("Terry") # Terry arrives
>>> queue.append("Graham") # Graham arrives
>>> queue.popleft() # The first to arrive now leaves
'Eric'
>>> queue.popleft() # The second to arrive now leaves
'John'
>>> queue # Remaining queue in order of arrival
deque(['Michael', 'Terry', 'Graham'])
filter(), map() and reduce()
- filter(function, sequence): Returns a sequence consisting of those items from the sequence for which function(item) is true. If sequence is a str, unicode or tuple, the result will be of the same type; otherwise, it is always a list. For example:
>>> def f(x): return x % 3 == 0 or x % 5 == 0
...
>>> filter(f, range(2, 25))
[3, 5, 6, 9, 10, 12, 15, 18, 20, 21, 24]
- map(function, sequence): Calls function(item) for each of the sequence’s items and returns a list of the return values:
>>> def cube(x): return x*x*x
...
>>> map(cube, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
  More than one sequence may be passed; the function must then have as many arguments as there are sequences and is called with the corresponding item from each sequence (or None if some sequence is shorter than another). For example:
>>> seq = range(8)
>>> def add(x, y): return x+y
...
>>> map(add, seq, seq)
[0, 2, 4, 6, 8, 10, 12, 14]
- reduce(function, sequence): Returns a single value constructed by calling the binary function function on the first two items of the sequence, then on the result and the next item, and so on:
>>> def add(x,y): return x+y
...
>>> reduce(add, range(1, 11))
55
  If there’s only one item in the sequence, its value is returned; if the sequence is empty, an exception is raised.
  A third argument can be passed to indicate the starting value. In this case the starting value is returned for an empty sequence, and the function is first applied to the starting value and the first sequence item, then to the result and the next item, and so on:
>>> def sum2(seq):
...     def add(x,y): return x+y
...     return reduce(add, seq, 0)
...
>>> sum2(range(1, 11))
55
>>> sum2([])
0
List Comprehensions
>>> squares = []
>>> for x in range(10):
...     squares.append(x**2)
...
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Equal to:
squares = [x**2 for x in range(10)]
Also equal to:
squares = map(lambda x: x**2, range(10))
And a more complex example:
>>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
And it’s equivalent to:
>>> combs = []
>>> for x in [1,2,3]:
...     for y in [3,1,4]:
...         if x != y:
...             combs.append((x, y))
...
>>> combs
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]
If the expression is a tuple (e.g. the (x, y) in the previous example), it must be parenthesized.
>>> vec = [-4, -2, 0, 2, 4]
>>> # create a new list with the values doubled
>>> [x*2 for x in vec]
[-8, -4, 0, 4, 8]
>>> # filter the list to exclude negative numbers
>>> [x for x in vec if x >= 0]
[0, 2, 4]
>>> # apply a function to all the elements
>>> [abs(x) for x in vec]
[4, 2, 0, 2, 4]
>>> # call a method on each element
>>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
>>> [weapon.strip() for weapon in freshfruit]
['banana', 'loganberry', 'passion fruit']
>>> # create a list of 2-tuples like (number, square)
>>> [(x, x**2) for x in range(6)]
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
>>> # the tuple must be parenthesized, otherwise an error is raised
>>> [x, x**2 for x in range(6)]
File "<stdin>", line 1, in <module>
[x, x**2 for x in range(6)]
^
SyntaxError: invalid syntax
>>> # flatten a list using a listcomp with two 'for'
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> [num for elem in vec for num in elem]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
List comprehensions can contain complex expressions and nested functions:
>>> from math import pi
>>> [str(round(pi, i)) for i in range(1, 6)]
['3.1', '3.14', '3.142', '3.1416', '3.14159']
The initial expression in a list comprehension can be any arbitrary expression, including another list comprehension.
Consider the following example of a 3x4 matrix implemented as a list of 3 lists of length 4:
>>> matrix = [
... [1, 2, 3, 4],
... [5, 6, 7, 8],
... [9, 10, 11, 12],
... ]
>>> [[row[i] for row in matrix] for i in range(4)]
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
Equal to:
>>> transposed = []
>>> for i in range(4):
...     transposed.append([row[i] for row in matrix])
...
>>> transposed
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
Also equal to:
>>> transposed = []
>>> for i in range(4):
...     # the following 3 lines implement the nested listcomp
...     transposed_row = []
...     for row in matrix:
...         transposed_row.append(row[i])
...     transposed.append(transposed_row)
...
>>> transposed
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
And almost equal to:
>>> zip(*matrix)
[(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)]
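The result is only "almost" equal because zip yields tuples rather than lists. The sketch below shows the conversion; the list() wrapper around zip is an assumption added so the code should also run under Python 3, where zip returns an iterator.

```python
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# zip(*matrix) passes each row as a separate argument, so zip pairs up
# the first items of every row, then the second items, and so on.
columns_as_tuples = list(zip(*matrix))

# Converting each tuple gives the same result as the nested listcomp.
columns_as_lists = [list(column) for column in columns_as_tuples]

nested = [[row[i] for row in matrix] for i in range(4)]
print(columns_as_lists == nested)
```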
Tuples
A tuple consists of a number of values separated by commas, for instance:
>>> t = 12345, 54321, 'hello!'
>>> t[0]
12345
>>> t
(12345, 54321, 'hello!')
>>> # Tuples may be nested:
... u = t, (1, 2, 3, 4, 5)
>>> u
((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
>>> # Tuples are immutable:
... t[0] = 88888
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> # but they can contain mutable objects:
... v = ([1, 2, 3], [3, 2, 1])
>>> v
([1, 2, 3], [3, 2, 1])
Tuples can contain mutable objects, such as lists(like above).
Compare with Lists:
- Tuples are immutable, and usually contain a heterogeneous sequence of elements that are accessed via unpacking or indexing (or even by attribute in the case of namedtuples).
- Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.
Take care when constructing tuples containing 0 or 1 items: an empty tuple is written as an empty pair of parentheses, and a tuple with one item is written as a value followed by a comma:
>>> empty = ()
>>> singleton = 'hello', # <-- note trailing comma
>>> len(empty)
0
>>> len(singleton)
1
>>> singleton
('hello',)
tuple packing and sequence unpacking:
>>> t = 12345, 54321, 'hello!'
>>> x, y, z = t # t can also be replaced with [12345, 54321, 'hello!']
Note that multiple assignment is really just a combination of tuple packing and sequence unpacking.
Unordered and Unique
Sets
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
Curly braces or the set() function can be used to create sets. Note: to create an empty set you have to use set(), not {} (the latter creates an empty dictionary).
>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> fruit = set(basket) # create a set without duplicates
>>> fruit
set(['orange', 'pear', 'apple', 'banana'])
>>> 'orange' in fruit # fast membership testing
True
>>> 'crabgrass' in fruit
False
>>> # Demonstrate set operations on unique letters from two words
...
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> a - b # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b # letters in both a and b
set(['a', 'c'])
>>> a ^ b # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])
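Each set operator also has a method form on the set type. The sketch below repeats the operations on the same two words and sorts the results before printing, since sets have no defined order; the parenthesized print is an assumption made so the code should run under both Python 2.7 and Python 3.

```python
a = set('abracadabra')
b = set('alacazam')

difference = a - b            # same as a.difference(b)
union = a | b                 # same as a.union(b)
intersection = a & b          # same as a.intersection(b)
symmetric = a ^ b             # same as a.symmetric_difference(b)

# Sets are unordered, so sort before printing for a stable display.
print(sorted(difference))
print(sorted(symmetric))
```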
Similarly to list comprehensions, set comprehensions are also supported:
>>> a = {x for x in 'abracadabra' if x not in 'abc'}
>>> a
set(['r', 'd'])
Dictionaries
Dictionaries are indexed by keys, which can be any immutable type (strings, numbers, and tuples that don’t contain any mutable object).
It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique. A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key:value pairs within the braces adds initial key:value pairs to the dictionary; this is also the way dictionaries are written on output.
The keys() method of a dictionary object returns a list of all the keys used in the dictionary, in arbitrary order (if you want it sorted, just apply the sorted() function to it).
The values() method of a dictionary object returns a list of all the values used in the dictionary.
Here is a small example using a dictionary:
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> sorted(tel.keys())
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
>>> tel.values()
[4127, 4127, 4098]
The dict() constructor builds dictionaries directly from sequences of key-value pairs:
>>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
{'sape': 4139, 'jack': 4098, 'guido': 4127}
In addition, dict comprehensions can be used to create dictionaries from arbitrary key and value expressions:
>>> {x: x**2 for x in (2, 4, 6)}
{2: 4, 4: 16, 6: 36}
When the keys are simple strings, it is sometimes easier to specify pairs using keyword arguments:
>>> dict(sape=4139, guido=4127, jack=4098)
{'sape': 4139, 'jack': 4098, 'guido': 4127}
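The three construction styles above produce equal dictionaries when given the same data. A minimal sketch, with the parenthesized print assumed so the code should run under both Python 2.7 and Python 3:

```python
pairs = [('sape', 4139), ('guido', 4127), ('jack', 4098)]

from_pairs = dict(pairs)
from_keywords = dict(sape=4139, guido=4127, jack=4098)
from_comprehension = {name: number for name, number in pairs}

# Dictionaries compare equal when they hold the same key: value pairs,
# regardless of how they were built.
all_equal = from_pairs == from_keywords == from_comprehension

print(all_equal)
print(sorted(from_pairs.keys()))
```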
Looping Techniques
When looping through a sequence, the position index and corresponding value can be retrieved at the same time using the enumerate() function.
>>> for i, v in enumerate(['tic', 'tac', 'toe']):
...     print i, v
...
0 tic
1 tac
2 toe
To loop over two or more sequences at the same time, the entries can be paired with the zip() function.
>>> questions = ['name', 'quest', 'favorite color']
>>> answers = ['lancelot', 'the holy grail', 'blue']
>>> for q, a in zip(questions, answers):
...     print 'What is your {0}? It is {1}.'.format(q, a)
...
What is your name? It is lancelot.
What is your quest? It is the holy grail.
What is your favorite color? It is blue.
To loop over a sequence in reverse, first specify the sequence in a forward direction and then call the reversed() function.
>>> for i in reversed(xrange(1,10,2)):
...     print i
...
9
7
5
3
1
To loop over a sequence in sorted order, use the sorted() function which returns a new sorted list while leaving the source unaltered.
>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> for f in sorted(set(basket)):
...     print f
...
apple
banana
orange
pear
When looping through dictionaries, the key and corresponding value can be retrieved at the same time using the iteritems() method.
>>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
>>> for k, v in knights.iteritems():
...     print k, v
...
gallahad the pure
robin the brave
It is sometimes tempting to change a list while you are looping over it; however, it is often simpler and safer to create a new list instead.
>>> import math
>>> raw_data = [56.2, float('NaN'), 51.7, 55.3, 52.5, float('NaN'), 47.8]
>>> filtered_data = []
>>> for value in raw_data:
...     if not math.isnan(value):
...         filtered_data.append(value)
...
>>> filtered_data
[56.2, 51.7, 55.3, 52.5, 47.8]
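The same filter can be written as a list comprehension, which also builds a new list and leaves the source untouched. A minimal sketch, with the parenthesized print assumed so the code should run under both Python 2.7 and Python 3:

```python
import math

raw_data = [56.2, float('NaN'), 51.7, 55.3, 52.5, float('NaN'), 47.8]

# Build a new list rather than deleting from raw_data while looping over it.
filtered_data = [value for value in raw_data if not math.isnan(value)]

print(filtered_data)
print(len(raw_data))
```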
The del statement
There is a way to remove an item from a list given its index instead of its value: the del statement. The del statement can also be used to remove slices from a list or clear the entire list. For example:
>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.25, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.25, 1234.5]
>>> del a[:]
>>> a
[]
del can also be used to delete entire variables:
>>> del a