Python_19_Udacity_Evans_Intro to CS_3_How to manage data

总目录


课程页面:https://www.udacity.com/course/intro-to-computer-science--cs101
授课教师:Dave Evans https://www.cs.virginia.edu/~evans/
如下内容包含课程笔记和自己的扩展折腾

Aliasing

这个例子很神奇!从前从来没有想到会这样。

a = ['H', 'E', 'L', 'L', 'O']
b = a
a[0] = 'Y'
print a, b

结果完全出乎意料:
['Y', 'E', 'L', 'L', 'O'] ['Y', 'E', 'L', 'L', 'O']

Mutation

依然是很神奇:

def replace_list(list):
    list[0] += 1

a = [0, 1]
replace_list(a)
print a

结果居然是[1, 1], list果然是神奇的生物。

对比string:

def replace_string(string):
    string = string + "1"

a = "0"
replace_string(a)
print a

不用说,string的结果当然是"0"

子集改变

p = [1, 2]
q = [3, 4]
p.append(q)
print p
q[1] = 2
print q
print p

结果:
[1, 2, [3, 4]]
[3, 2]
[1, 2, [3, 2]]

但是如果重新赋值q,就不会同时改变p

p = [1, 2]
q = [3, 4]
p.append(q)
print p
q = [3, 2]
print q
print p

结果:
[1, 2, [3, 4]]
[3, 2]
[1, 2, [3, 4]]

Index

  • .index()
  • 要注意的是,如果没找到,会 return an error
    • 解决办法是用in & not in
  • 如果list有多个需要找的value,只会给第一个找到的index

pop

  • .pop(): "mutates the list by removing and returning its last element"

for 可以一次性赋值好几个:

list = [[1,2,3], [4,5,6]]
for a, b, c in list:
  print a
  print b
  print c
  print "..."

Output:
1
2
3
...
4
5
6
...

web crawler with max depth

# -*- coding: utf-8 -*-

#
# This question explores a different way (from the previous question)
# to limit the pages that it can crawl.
#
#######

# THREE GOLD STARS #
# Yes, we really mean it!  This is really tough (but doable) unless
# you have some previous experience before this course.


# Modify the crawl_web procedure to take a second parameter,
# max_depth, that limits the depth of the search.  We can
# define the depth of a page as the number of links that must
# be followed to reach that page starting from the seed page,
# that is, the length of the shortest path from the seed to
# the page.  No pages whose depth exceeds max_depth should be
# included in the crawl.
#
# For example, if max_depth is 0, the only page that should
# be crawled is the seed page. If max_depth is 1, the pages
# that should be crawled are the seed page and every page that
# it links to directly. If max_depth is 2, the crawl should
# also include all pages that are linked to by these pages.
#
# Note that the pages in the crawl may be in any order.
#
# The following definition of get_page provides an interface
# to the website found at http://www.udacity.com/cs101x/index.html

# The function output order does not affect grading.

def get_page(url):
    try:
        if url == "http://www.udacity.com/cs101x/index.html":
            return ('  This is a test page for learning to crawl! '
            '

It is a good idea to ' 'learn to ' 'crawl before you try to ' 'walk ' 'or fly. ' '

') elif url == "http://www.udacity.com/cs101x/crawling.html": return (' I have not learned to crawl yet, but I ' 'am quite good at ' 'kicking.' ' ') elif url == "http://www.udacity.com/cs101x/walking.html": return (' I cant get enough ' 'crawling! ' ' ') elif url == "http://www.udacity.com/cs101x/flying.html": return (' The magic words are Squeamish Ossifrage! ' ' ') elif url == "http://top.contributors/velak.html": return ('' '') elif url == "http://top.contributors/jesyspa.html": return ('' '') elif url == "http://top.contributors/forbiddenvoid.html": return ('' '' '') elif url == "http://top.contributors/kilaws.html": return ('' '') elif url == "http://top.contributors/graemeblake.html": return ('' '') elif url == "A1": return ' ' elif url == "B1": return '' elif url == "C1": return '' elif url == "D1": return ' ' elif url == "E1": return ' ' except: return "" return "" def get_next_target(page): start_link = page.find('>> ['http://www.udacity.com/cs101x/index.html'] #print crawl_web("http://www.udacity.com/cs101x/index.html",1) #>>> ['http://www.udacity.com/cs101x/index.html', #>>> 'http://www.udacity.com/cs101x/flying.html', #>>> 'http://www.udacity.com/cs101x/walking.html', #>>> 'http://www.udacity.com/cs101x/crawling.html'] print crawl_web("http://www.udacity.com/cs101x/index.html",50) #>>> ['http://www.udacity.com/cs101x/index.html', #>>> 'http://www.udacity.com/cs101x/flying.html', #>>> 'http://www.udacity.com/cs101x/walking.html', #>>> 'http://www.udacity.com/cs101x/crawling.html', #>>> 'http://www.udacity.com/cs101x/kicking.html'] #print crawl_web("http://top.contributors/forbiddenvoid.html",2) #>>> ['http://top.contributors/forbiddenvoid.html', #>>> 'http://top.contributors/graemeblake.html', #>>> 'http://top.contributors/angel.html', #>>> 'http://top.contributors/dreyescat.html', #>>> 'http://top.contributors/johang.html', #>>> 'http://top.contributors/charlzz.html'] #print crawl_web("A1",3) #>>> ['A1', 'C1', 'B1', 'E1', 'D1', 'F1'] # (May be in any order)

Output:

tocrawl []
next_depth ['http://www.udacity.com/cs101x/crawling.html', 'http://www.udacity.com/cs101x/walking.html', 'http://www.udacity.com/cs101x/flying.html']
crawled ['http://www.udacity.com/cs101x/index.html']
depth 0

depth + 1 = 1

tocrawl ['http://www.udacity.com/cs101x/crawling.html', 'http://www.udacity.com/cs101x/walking.html']
next_depth []
crawled ['http://www.udacity.com/cs101x/index.html', 'http://www.udacity.com/cs101x/flying.html']
depth 1

tocrawl ['http://www.udacity.com/cs101x/crawling.html']
next_depth ['http://www.udacity.com/cs101x/index.html']
crawled ['http://www.udacity.com/cs101x/index.html', 'http://www.udacity.com/cs101x/flying.html', 'http://www.udacity.com/cs101x/walking.html']
depth 1

tocrawl []
next_depth ['http://www.udacity.com/cs101x/index.html', 'http://www.udacity.com/cs101x/kicking.html']
crawled ['http://www.udacity.com/cs101x/index.html', 'http://www.udacity.com/cs101x/flying.html', 'http://www.udacity.com/cs101x/walking.html', 'http://www.udacity.com/cs101x/crawling.html']
depth 1

depth + 1 = 2

tocrawl ['http://www.udacity.com/cs101x/index.html']
next_depth []
crawled ['http://www.udacity.com/cs101x/index.html', 'http://www.udacity.com/cs101x/flying.html', 'http://www.udacity.com/cs101x/walking.html', 'http://www.udacity.com/cs101x/crawling.html', 'http://www.udacity.com/cs101x/kicking.html']
depth 2

tocrawl []
depth + 1 = 3

['http://www.udacity.com/cs101x/index.html', 'http://www.udacity.com/cs101x/flying.html', 'http://www.udacity.com/cs101x/walking.html', 'http://www.udacity.com/cs101x/crawling.html', 'http://www.udacity.com/cs101x/kicking.html']

Sudoku

  1. 下面的算法是我的思路。就是直接用.sort对比list。分为row和column两个部分
  2. Udacity的算法和我的思路截然不同。
# -*- coding: utf-8 -*-

# THREE GOLD STARS

# Sudoku [http://en.wikipedia.org/wiki/Sudoku]
# is a logic puzzle where a game
# is defined by a partially filled
# 9 x 9 square of digits where each square
# contains one of the digits 1,2,3,4,5,6,7,8,9.
# For this question we will generalize
# and simplify the game.

# Define a procedure, check_sudoku,
# that takes as input a square list
# of lists representing an n x n
# sudoku puzzle solution and returns the boolean
# True if the input is a valid
# sudoku square and returns the boolean False
# otherwise.

# A valid sudoku square satisfies these
# two properties:

#   1. Each column of the square contains
#       each of the whole numbers from 1 to n exactly once.

#   2. Each row of the square contains each
#       of the whole numbers from 1 to n exactly once.

# You may assume the the input is square and contains at
# least one row and column.

correct = [[1,2,3],
           [2,3,1],
           [3,1,2]]

incorrect = [[1,2,3,4],
             [2,3,1,3],
             [3,1,2,3],
             [4,4,4,4]]

incorrect2 = [[1,2,3,4],
             [2,3,1,4],
             [4,1,2,3],
             [3,4,1,2]]

incorrect3 = [[1,2,3,4,5],
              [2,3,1,5,6],
              [4,5,2,1,3],
              [3,4,5,2,1],
              [5,6,4,3,2]]

incorrect4 = [['a','b','c'],
              ['b','c','a'],
              ['c','a','b']]

incorrect5 = [ [1, 1.5],
               [1.5, 1]]

"""
下面是我的思路和代码:
check_row 把每个小list排序,从小到大
check_column 把column变成row,再用check_row
"""
from copy import deepcopy

def check_sudoku(sudoku):
    a = deepcopy(sudoku)
    b = deepcopy(sudoku)
    return check_row(a) and check_column(b)


def check_row(sudoku):
    correct_row_sorted = range(1, len(sudoku)+1)
    for item in sudoku:
        item.sort()
        if item != correct_row_sorted:
            return False
    return True

def check_column(sudoku):
    new_sudoku = []
    n = len(sudoku)
    for column in range(n):
        new_row = []
        for i in range(n):
            new_row.append(sudoku[i][column])
        new_sudoku.append(new_row)
    return check_row(new_sudoku)



print check_sudoku(incorrect)
#>>> False


print check_sudoku(correct)
#>>> True

print check_sudoku(incorrect2)
#>>> False

print check_sudoku(incorrect3)
#>>> False

print check_sudoku(incorrect4)
#>>> False

print check_sudoku(incorrect5)
#>>> False

比较list的concatenate和append

# Investigating adding and appending to lists
# If you run the following four lines of codes, what are list1 and list2?

list1 = [1,2,3,4]
list2 = [1,2,3,4]

list1 = list1 + [5, 6]
list2.append([5, 6])

# to check, you can print them out using the print statements below.

print "showing list1 and list2:"
print list1
print list2
"""
Console:
showing list1 and list2:
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, [5, 6]]
"""

Symmetric Square

# -*- coding: utf-8 -*-

# A list is symmetric if the first row is the same as the first column,
# the second row is the same as the second column and so on. Write a
# procedure, symmetric, which takes a list as input, and returns the
# boolean True if the list is symmetric and False if it is not.

"""
例子:
3*3的
row: [list[0][0], list[0][1], list[0][2]] == column: [list[0][0], list[1][0], list[2][0]]
row: [list[1][0], list[1][1], list[1][2]] == [list[0][1], list[1][1], list[2][1]]
...
以此类推
算法:
for num in range(len(list))
即: row: list[row_num][n] == column: list[n][column_num]
注: row_number = column_num = num,
"""


def symmetric(list):
    if list == []:
        return True
    if len(list) != len(list[0]):
        return False
    for num in range(len(list)):
        row = []
        column = []
        for n in range(len(list)):
            row.append(list[num][n])
            column.append(list[n][num])
        print "round %s : %s, %s" % (num, row, column)
        if row != column:
            return False
    return True

print symmetric([])


print symmetric([[1, 2, 3],
                [2, 3, 4],
                [3, 4, 1]])
#>>> True

print symmetric([["cat", "dog", "fish"],
                ["dog", "dog", "fish"],
                ["fish", "fish", "cat"]])
#>>> True

print symmetric([["cat", "dog", "fish"],
                ["dog", "dog", "dog"],
                ["fish","fish","cat"]])
#>>> False

print symmetric([[1, 2],
                [2, 1]])
#>>> True

print symmetric([[1, 2, 3, 4],
                [2, 3, 4, 5],
                [3, 4, 5, 6]])
#>>> False


print symmetric([[1,2,3],
                 [2,3,1]])
#>>> False

Antisymmetric

# -*- coding: utf-8 -*-

# By Dimitris_GR from forums
# Modify Problem Set 31's (Optional) Symmetric Square to return True
# if the given square is antisymmetric and False otherwise.
# An nxn square is called antisymmetric if A[i][j]=-A[j][i]
# for each i=0,1,...,n-1 and for each j=0,1,...,n-1.

def antisymmetric(list):
    if list == []:
        return True
    if len(list) != len(list[0]):
        return False
    for num in range(len(list)):
        row = []
        column = []
        for n in range(len(list)):
            row.append(-list[num][n])
            column.append(list[n][num])
        # print "round %s : %s, %s" % (num, row, column)
        if row != column:
            return False
    return True
# Test Cases:

print antisymmetric([[0, 1, 2],
                     [-1, 0, 3],
                     [-2, -3, 0]])
#>>> True

print antisymmetric([[0, 0, 0],
                     [0, 0, 0],
                     [0, 0, 0]])
#>>> True

print antisymmetric([[0, 1, 2],
                     [-1, 0, -2],
                     [2, 2,  3]])
#>>> False

print antisymmetric([[1, 2, 5],
                     [0, 1, -9],
                     [0, 0, 1]])
#>>> False

Recognise Identity Matrix

# -*- coding: utf-8 -*-

# By Ashwath from forums
# Given a list of lists representing a n * n matrix as input,
# define a  procedure that returns True if the input is an identity matrix
# and False otherwise.

# An IDENTITY matrix is a square matrix in which all the elements
# on the principal/main diagonal are 1 and all the elements outside
# the principal diagonal are 0.
# (A square matrix is a matrix in which the number of rows
# is equal to the number of columns)

def is_identity_matrix(matrix):
    if matrix == []:
        return False
    if len(matrix) != len(matrix[0]):
        return False
    for n in range(len(matrix)):
        if matrix[n] != [0]*n + [1] + (len(matrix)-n-1)*[0]:
            return False
    return True



# Test Cases:

matrix1 = [[1,0,0,0],
           [0,1,0,0],
           [0,0,1,0],
           [0,0,0,1]]
print is_identity_matrix(matrix1)
#>>>True

matrix2 = [[1,0,0],
           [0,1,0],
           [0,0,0]]

print is_identity_matrix(matrix2)
#>>>False

matrix3 = [[2,0,0],
           [0,2,0],
           [0,0,2]]

print is_identity_matrix(matrix3)
#>>>False

matrix4 = [[1,0,0,0],
           [0,1,1,0],
           [0,0,0,1]]

print is_identity_matrix(matrix4)
#>>>False

matrix5 = [[1,0,0,0,0,0,0,0,0]]

print is_identity_matrix(matrix5)
#>>>False

matrix6 = [[1,0,0,0],
           [0,1,0,1],
           [0,0,1,0],
           [0,0,0,1]]

print is_identity_matrix(matrix6)
#>>>False

matrix7 = [[1, -1, 1],
           [0, 1, 0],
           [0, 0, 1]]
print is_identity_matrix(matrix7)
#>>>False

Numbers in Lists

# -*- coding: utf-8 -*-

# Numbers in lists by SeanMc from forums
# define a procedure that takes in a string of numbers from 1-9 and
# outputs a list with the following parameters:
# Every number in the string should be inserted into the list.
# If a number x in the string is less than or equal
# to the preceding number y, the number x should be inserted
# into a sublist. Continue adding the following numbers to the
# sublist until reaching a number z that
# is greater than the number y.
# Then add this number z to the normal list and continue.

#Hint - "int()" turns a string's element into a number

"""
思路:
看起来挺复杂的,其实很简单.
就是弄一个r1,比y小的时候就把数字塞到r1这个list里面.
遇到比y大的数字了,就先把r1放到y,再把这个大数字放到y里面.
"""

def numbers_in_lists(string):
    if len(string) <= 1:
        return [string]
    r1 = []
    r = [int(string[0])]
    y = int(string[0])
    for i in range(1, len(string)):
        num = int(str(string[i]))
        if num > y:
            if r1 != []:
                r.append(r1)
            r.append(num)
            y = num
            r1 = []
        else:
            r1.append(num)
    if r1 != []:
        r.append(r1)
    # print r
    return r


#testcases
string = '543987'
result = [5,[4,3],9,[8,7]]
print repr(string), numbers_in_lists(string) == result


string= '987654321'
result = [9,[8,7,6,5,4,3,2,1]]
print repr(string), numbers_in_lists(string) == result

string = '455532123266'
result = [4, 5, [5, 5, 3, 2, 1, 2, 3, 2], 6, [6]]
print repr(string), numbers_in_lists(string) == result

string = '123456789'
result = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print repr(string), numbers_in_lists(string) == result

Frequency Analysis

# -*- coding: utf-8 -*-

# Crypto Analysis: Frequency Analysis
#
# To analyze encrypted messages, to find out information about the possible
# algorithm or even language of the clear text message, one could perform
# frequency analysis. This process could be described as simply counting
# the number of times a certain symbol occurs in the given text.
# For example:
# For the text "test" the frequency of 'e' is 1, 's' is 1 and 't' is 2.
#
# The input to the function will be an encrypted body of text that only contains
# the lowercase letters a-z.
# As output you should return a list of the normalized frequency
# for each of the letters a-z.
# The normalized frequency is simply the number of occurrences, i,
# divided by the total number of characters in the message, n.

def freq_analysis(message):
    alphabet="abcdefghijklmnopqrstuvwxyz"
    freq_list = [0.0] * 26
    for n in message:
        freq_list[alphabet.find(n)] += 1.0
    for i in range(len(freq_list)):
        freq_list[i] /= len(message)
    return freq_list



#Tests

print freq_analysis("abcd")
#>>> [0.25, 0.25, 0.25, 0.25, 0.0, ..., 0.0]

print freq_analysis("adca")
#>>> [0.5, 0.0, 0.25, 0.25, 0.0, ..., 0.0]

print freq_analysis('bewarethebunnies')
#>>> [0.0625, 0.125, 0.0, 0.0, ..., 0.0]


你可能感兴趣的:(Python_19_Udacity_Evans_Intro to CS_3_How to manage data)