Python Tutorial

This is one of the chapters of the Bioinformatics Basics course at Tsinghua University. The material comes mainly from my dear colleague Binbin Shi.

A related Jupyter notebook file accompanies this tutorial.


Install Anaconda Python

URL: (https://www.anaconda.com/download/)

  • Easy install of data science packages (binary distribution)
  • Package management with conda


Install Python packages using conda:

conda install h5py

Update a package to the latest version:

conda update h5py

Install Python packages using pip:

pip install h5py

Update a package using pip:

pip install --upgrade h5py

Python language tips

Compatibility between Python 3.x and Python 2.x

Biggest difference: print is a function rather than a statement in Python 3

This does not work in Python 3

print 1, 2, 3

Solution: use the __future__ module

from __future__ import print_function
# this works both in Python 2 and Python 3
print(1, 2, 3)

Second biggest difference: some package/function names in the standard library are changed

Python 2 => Python 3

cStringIO => io.StringIO
Queue => queue
cPickle => pickle
ConfigParser => configparser
HTMLParser => html.parser
SocketServer => socketserver
SimpleHTTPServer => http.server

Solution: use the six module

  • Refer to (https://docs.python.org/3/library/__future__.html) for usage of the __future__ module.
  • Refer to (https://pythonhosted.org/six/) for usage of the six module.
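
For example, six.moves exposes the renamed standard-library modules under import paths that work in both Python 2 and Python 3 (a minimal sketch, assuming the six package is installed):

from six.moves import queue
from six.moves import configparser
# the same import works under Python 2 (Queue) and Python 3 (queue)
q = queue.Queue()
q.put('hello')
print(q.get())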

Get away from IndentationError

Python forces you to indent code with tabs or spaces:

# use a tab
for i in range(3):
    print(i)
# use 2 spaces
for i in range(3):
  print(i)
# use 4 spaces
for i in range(3):
    print(i)

Best practice: always use 4 spaces. You can configure your editor to use either spaces (soft tabs) or tabs for indentation.

In the vim editor, use :set list to make tabs and spaces visible so you can spot an incorrect number of spaces/tabs.

Add Shebang and encoding at the beginning of executable scripts

Create a file named welcome.py

#! /usr/bin/env python
# -*- coding: UTF-8 -*-
print('welcome to python!')

Then set the python script as executable:

chmod +x welcome.py

Now you can run the script without specifying the Python interpreter:

./welcome.py

All variables, functions and classes are dynamic objects

class MyClass():
    def __init__(self, name):
        self.name = name
# assign an integer to a
a = 1
print(type(a))
# assign a string to a
a = 'abc'
print(type(a))
# assign a function to a
a = range
print(type(a))
print(a(10))
# assign a class to a
a = MyClass
print(type(a))
b = a('myclass')
print(b.name)
# assign an instance of a class to a
a = MyClass('myclass')
print(a.name)
# get type of a
print(type(a))

All Python variables are pointers/references

a = [1, 2, 3]
print('a = ', a)
# this adds another reference to the list
b = a
print('b = ', b)
# this will change contents of both a and b
b[2] = 4
print('a = ', a)
print('b = ', b)

Use deepcopy if you really want to COPY a variable

from copy import deepcopy
a = {'A': [1], 'B': [2], 'C': [3]}
print(a)
# shallow copy
b = dict(a)
# modifying elements of b will change the contents of a
b['A'].append(2)
print('a = ', a)
print('b = ', b)
# this is also a shallow copy and does not work
c = {k: v for k, v in a.items()}
c['A'].append(3)
print('a = ', a)
print('c = ', c)
# recursively copy every object in a
d = deepcopy(a)
# modifying elements of d will not change the contents of a
d['A'].append(2)
print('a = ', a)
print('d = ', d)

What if I accidentally overwrite my builtin functions?

You can refer to (https://docs.python.org/2/library/functions.html) for builtin functions in the standard library.

A = [1, 2, 3, 4]
# Oops! the builtin function sum is overwritten by a number
sum = sum(A)
# this will raise an error because sum is not a function now
print(sum(A))
# recover the builtin function into the current environment
from __builtin__ import sum
# this works because sum is a function
print(sum(A))

Note: in Python 3, you should import from builtins rather than __builtin__

from builtins import sum

int is of arbitrary precision in Python!

In Python:

print(2**10000)

In R:

print(2^10000)

Easiest way to swap values of two variables

In C/C++:

int a = 1, b = 2, t;
t = a;
a = b;
b = t;

In Python:

a = 1
b = 2
b, a = a, b
print(a, b)

List comprehension

Use for-loops:

a = []
for i in range(10):
    a.append(i + 10)
print(a)

Use list comprehension

a = [i + 10 for i in range(10)]
print(a)

Dict comprehension

Use for-loops:

a = {}
for i in range(10):
    a[i] = chr(ord('A') + i) 
print(a)

Use dict comprehension:

a = {i:chr(ord('A') + i) for i in range(10)}
print(a)

For the one-liners

Use ‘;’ instead of ‘\n’:

# print the first column of each line
python -c 'import sys; print("\n".join(line.split("\t")[0] for line in sys.stdin))'

For more examples of one-liners, please refer to (https://wiki.python.org/moin/Powerful%20Python%20One-Liners).

Read from standard input

import sys
# read line by line
for line in sys.stdin:
    print(line)

The order of dict keys is NOT what you might expect

a = {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': 5, 'F': 6}
# not in lexicographical order
print([key for key in a])
# now in lexicographical order
print([key for key in sorted(a)])

Use enumerate() to add a number during iteration

A = ['a', 'b', 'c', 'd']
for i, a in enumerate(A):
    print(i, a)

Reverse a list

# a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
a = list(range(10))
print(a)
print(a[::-1])

Strings are immutable in Python

a = 'ABCDF'
# will raise an error because strings are immutable
a[4] = 'E'
# convert str to bytearray (in Python 3, an encoding must be given)
b = bytearray(a, 'ascii')
# bytearrays are mutable; assign the byte value of 'E'
b[4] = ord('E')
# convert the bytearray back to str
print(b.decode('ascii'))

tuples are hashable while lists are not hashable

# create dict using tuples as keys
d = {
    ('chr1', 1000, 2000): 'featureA',
    ('chr1', 2000, 3000): 'featureB',
    ('chr1', 3000, 4000): 'featureC',
    ('chr1', 4000, 5000): 'featureD',
    ('chr1', 5000, 6000): 'featureE',
    ('chr1', 6000, 7000): 'featureF'
}
# query the dict using tuples
print(d[('chr1', 3000, 4000)])
print(d[('chr1', 6000, 7000)])
# will raise an error
d = {['chr1', 1000, 2000]: 'featureA'}

Use itertools

Nested loops in a more concise way:

A = [1, 2, 3]
B = ['a', 'b', 'c']
C = ['i', 'j', 'k']
D = ['x', 'y', 'z']
# Use nested for-loops
for a in A:
    for b in B:
        for c in C:
            for d in D:
                print(a, b, c, d)
# Use itertools.product
import itertools
for a, b, c, d in itertools.product(A, B, C, D):
    print(a, b, c, d)

Get all combinations of a list:

A = ['A', 'B', 'C', 'D']
# Use itertools.combinations
import itertools
for a, b, c in itertools.combinations(A, 3):
    print(a, b, c)

Convert iterables to lists

import itertools
A = [1, 2, 3]
B = ['a', 'b', 'c']
a = itertools.product(A, B)
# a is an iterator rather than a list
print(a)
# a is a list now
a = list(a)
print(a)

Use the zip() function to transpose nested lists/tuples/iterables

records = [
    ('chr1', 1000, 2000),
    ('chr1', 2000, 3000),
    ('chr1', 3000, 4000),
    ('chr1', 4000, 5000),
    ('chr1', 5000, 6000),
    ('chr1', 6000, 7000)
]
# iterate by rows
for chrom, start, end in records:
    print(chrom, start, end)
# extract columns
chroms, starts, ends = zip(*records)
# build records from columns
# now records2 is the same as records
records2 = list(zip(chroms, starts, ends))
print(records2)

Global and local variables

# a is global
a = 1
def print_local():
    # a is local
    a = 2
    print(a)

def print_global():
    # a is global
    global a
    a = 2
    print(a)

# print global variable
print(a)
# print local variable from function
print_local()
# a is unchanged
print(a)
# change and print global from function
print_global()
# a is changed
print(a)

Use defaultdict

Use dict:

d = {}
d['a'] = []
d['b'] = []
d['c'] = []
# extend list with new elements
d['a'] += [1, 2]
d['b'] += [3, 4, 5]
d['c'] += [6]
for key, val in d.items():
    print(key, val)

Use defaultdict:

from collections import defaultdict
# a new list is created automatically when new elements are added
d = defaultdict(list)
# extend list with new elements
d['a'] += [1, 2]
d['b'] += [3, 4, 5]
d['c'] += [6]
for key, val in d.items():
    print(key, val)

Use generators

Example: read a large FASTA file

def append_extra_line(f):
    """Yield an empty line after the last line in the file
    """
    for line in f:
        yield line
    yield ''

def read_fasta(filename):
    with open(filename, 'r') as f:
        name = None
        seq = ''
        for line in append_extra_line(f):
            if line.startswith('>') or (len(line) == 0):
                if (len(seq) > 0) and (name is not None):
                    yield (name, seq)
                if line.startswith('>'):
                    name = line.strip()[1:].split()[0]
                    seq = ''
            else:
                if name is None:
                    raise ValueError('the first line does not start with ">"')
                seq += line.strip()
# print the name and length of each sequence
for name, seq in read_fasta('test.fa'):
    print(name, len(seq))

Turn off annoying KeyboardInterrupt and BrokenPipe errors

Without exception handling (press Ctrl+C):

import time
time.sleep(300)

With exception handling (press Ctrl+C):

import sys
import time
import errno

try:
    time.sleep(300)
except KeyboardInterrupt:
    sys.exit(1)
except OSError as e:
    if e.errno == errno.EPIPE:
        sys.exit(-e.errno)

Class and instance variables

class MyClass():
    name = 'class_name'
    def __init__(self, name):
        self.name = name

    def change_name(self, name):
        self.name = name

# print class variable
print(MyClass.name)
# create an instance from MyClass
a = MyClass('instance_name')
# print instance name
print(a.name)
# change instance name
a.change_name('instance_new_name')
print(a.name)
print(MyClass.name)
# change class name
MyClass.name = 'class_new_name'
print(a.name)
print(MyClass.name)

Useful Python packages for data analysis

Browser-based interactive programming in Python: jupyter

URL: (http://jupyter.org/)

Start jupyter notebook

jupyter notebook --no-browser

Features of the Jupyter interface (shown as screenshots in the original post):

  • notebooks manager
  • process manager
  • notebook editor
  • integration with matplotlib
  • browser-based text editor
  • browser-based terminal
  • display of images
  • display of dataframes
  • display of audio
  • embedded markdown

Python packages for scientific computing


Vector arithmetic: numpy

URL: (http://www.numpy.org/)

Example:

import numpy as np
# create a matrix of zeros with shape (5, 4)
X = np.zeros((5, 4), dtype=np.int32)
# create an array of length 5: [0, 1, 2, 3, 4]
y = np.arange(5)
# create an array of length 4: [0, 1, 2, 3]
z = np.arange(4)
# set Row 1 to [0, 1, 2, 3]
X[0] = np.arange(4)
# set Row 2 to [1, 1, 1, 1]
X[1] = 1
# add 1 to all elements
X += 1
# add y to each row of X
X += y.reshape((-1, 1))
# add z to each column of X
X += z.reshape((1, -1))
# get row sums
row_sums = X.sum(axis=1)
# get column sums
col_sums = X.sum(axis=0)
# matrix multiplication
A = X.dot(X.T)
# save matrix to text file
np.savetxt('data.txt', A)

Numerical analysis (probability distributions, signal processing, etc.): scipy

URL: (https://www.scipy.org/)

scipy.stats contains a large number of probability distributions, and all of them share a unified interface.
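
A minimal sketch of that unified interface, using the normal distribution as an example; every distribution object provides rvs, pdf, cdf and ppf methods:

from scipy import stats
# create a frozen normal distribution with mean 0 and standard deviation 1
dist = stats.norm(loc=0.0, scale=1.0)
# draw 5 random samples
print(dist.rvs(size=5))
# probability density at x = 0
print(dist.pdf(0.0))
# cumulative probability at x = 1.96
print(dist.cdf(1.96))
# quantile (inverse CDF) at probability 0.975
print(dist.ppf(0.975))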

Just-in-time (JIT) compiler for vector arithmetic: numba

URL: (https://numba.pydata.org/)

Compile Python for-loops to native code to achieve performance similar to C/C++.

Example:

from numba import jit
from numpy import arange

# jit decorator tells Numba to compile this function.
# The argument types will be inferred by Numba when function is called.
@jit
def sum2d(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i,j]
    return result

a = arange(9).reshape(3,3)
print(sum2d(a))

Library for symbolic computation: sympy

URL: (http://www.sympy.org/en/index.html)

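A minimal sketch of symbolic computation with sympy (the original screenshot showed similar symbolic manipulations):

import sympy as sp
# declare a symbolic variable
x = sp.symbols('x')
expr = sp.sin(x) * sp.exp(x)
# differentiate with respect to x
print(sp.diff(expr, x))
# integrate with respect to x
print(sp.integrate(expr, x))
# compute the limit of sin(x)/x as x approaches 0
print(sp.limit(sp.sin(x) / x, x, 0))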

Operation on data frames: pandas

URL: (http://pandas.pydata.org/pandas-docs/stable/)

Example:

import pandas as pd
# read a bed file
genes = pd.read_table('gene.bed', header=None, sep='\t',
                     names=('chrom', 'start', 'end', 'gene_id', 'score', 'strand', 'biotype'))
# get all gene IDs
gene_ids = genes['gene_id']
# set gene_id as index
genes.index = genes['gene_id']
# get row with given gene_id
gene = genes.loc['ENSG00000212325.1']
# get rows with biotype = 'protein_coding'
genes_selected = genes[genes['biotype'] == 'protein_coding']
# get protein coding genes in chr1
genes_selected = genes.query('(biotype == "protein_coding") and (chrom == "chr1")')
# count genes for each biotype
biotype_counts = genes.groupby('biotype')['gene_id'].count()
# add a column for gene length
genes['length'] = genes['end'] - genes['start']
# calculate average gene length for each chromosome and biotype
length_table = genes.pivot_table(values='length', index='biotype', columns='chrom')
# save DataFrame to Excel file
length_table.to_excel('length_table.xlsx')

Basic graphics and plotting: matplotlib

URL: (https://matplotlib.org/contents.html)

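A minimal plotting sketch (the original screenshots showed example figures):

import numpy as np
import matplotlib.pyplot as plt
# plot a sine curve
x = np.linspace(0, 2 * np.pi, 100)
plt.plot(x, np.sin(x), label='sin(x)')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
# save the figure to a file
plt.savefig('sine.pdf')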

Statistical data visualization: seaborn

URL: (https://seaborn.pydata.org/)

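A minimal sketch using one of seaborn's built-in example datasets (loading 'tips' downloads it the first time):

import seaborn as sns
import matplotlib.pyplot as plt
# load an example dataset bundled with seaborn
tips = sns.load_dataset('tips')
# draw a boxplot of total bill grouped by day
sns.boxplot(x='day', y='total_bill', data=tips)
plt.savefig('tips_boxplot.pdf')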

Interactive programming in Python: ipython

URL: (http://ipython.org/ipython-doc/stable/index.html)


Statistical tests: statsmodels

URL: (https://www.statsmodels.org/stable/index.html)

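A minimal sketch of an ordinary least squares (OLS) fit on synthetic data:

import numpy as np
import statsmodels.api as sm
# synthetic data: y = 2*x + 1 + noise
x = np.random.rand(100)
y = 2.0 * x + 1.0 + 0.1 * np.random.randn(100)
# add an intercept term to the design matrix
X = sm.add_constant(x)
model = sm.OLS(y, X).fit()
# coefficient estimates, p-values, confidence intervals, etc.
print(model.summary())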

Machine learning algorithms: scikit-learn

URL: (http://scikit-learn.org/)


Example:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score

# generate random data
X, y = make_classification(n_samples=1000, n_classes=2, n_features=10)
# split dataset into training and test dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# create a classifier object
model = LogisticRegression()
# train the classifier
model.fit(X_train, y_train)
# predict outcomes on the test dataset
y_pred = model.predict(X_test)
# evaluate the classification performance
print('roc_auc_score = %f'%roc_auc_score(y_test, y_pred))
print('accuracy_score = %f'%accuracy_score(y_test, y_pred))

Natural language analysis: gensim

URL: (https://radimrehurek.com/gensim/)
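
A minimal sketch of training word vectors with gensim on a tiny toy corpus (the vector_size/window/min_count parameter names follow the gensim 4.x Word2Vec API):

from gensim.models import Word2Vec
# a toy corpus: each sentence is a list of tokens
sentences = [
    ['the', 'exon', 'is', 'spliced'],
    ['the', 'intron', 'is', 'removed'],
]
model = Word2Vec(sentences, vector_size=10, window=2, min_count=1)
# look up the learned vector for a word
print(model.wv['exon'])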

HTTP library: requests

URL: (http://docs.python-requests.org/en/master/)

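A minimal sketch of an HTTP GET request (the URL is only an illustration; httpbin.org is a public echo service):

import requests
# send a GET request with query parameters
r = requests.get('https://httpbin.org/get', params={'gene': 'TP53'})
# status code and parsed JSON response
print(r.status_code)
print(r.json())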

Lightweight Web framework: flask

URL: (http://flask.pocoo.org/)

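A minimal sketch of a flask application with a single route:

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'welcome to flask!'

if __name__ == '__main__':
    # start the development server on http://127.0.0.1:5000
    app.run(host='127.0.0.1', port=5000)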

Deep learning framework: tensorflow

URL: (http://tensorflow.org/)

High-level deep learning framework: keras

URL: (https://keras.io/)
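
A minimal sketch of a small feed-forward network on random data (assuming the Keras 2-style Sequential API):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
# random binary classification data
X = np.random.rand(1000, 10)
y = (X.sum(axis=1) > 5).astype('int32')
# a two-layer fully connected network
model = Sequential([
    Dense(16, activation='relu', input_shape=(10,)),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=5, batch_size=32)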

Operation on sequence and alignment formats: biopython

URL: (http://biopython.org/)

from Bio import SeqIO
for record in SeqIO.parse('test.fa', 'fasta'):
    print(record.id, len(record.seq))

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
sequences = [
    SeqRecord(Seq('ACCGGTATCTATATCCCCGAGAGGAATGGGTCAGACATGGACCTAC'), id='A', description=''),
    SeqRecord(Seq('TTACAATGTGGCAGTGAACGCGTGACAATCCTCCCCGTTGGACAT'), id='B', description=''),
    SeqRecord(Seq('CAAAGCTGCATCGAATTGTCGAGACAACACTAGATTTAAGCGCA'), id='C', description=''),
    SeqRecord(Seq('CGCCCGCGAGGGCAATCAGGACGGATTTACGGAT'), id='D', description=''),
    SeqRecord(Seq('CCGCCCACGCTCCCGTTTTCTTCCATACCTGTCC'), id='E', description='')
]
with open('test_out.fa', 'w') as f:
    SeqIO.write(sequences, f, 'fasta')

Operation on genomic formats (BigWig, etc.): bx-python

Operation on HDF5 files: h5py

URL: (https://www.h5py.org/)

Save data to an HDF5 file

import h5py
import numpy as np
# generate data
chroms = ['chr1', 'chr2', 'chr3']
chrom_sizes = {
    'chr1': 15000,
    'chr2': 12000,
    'chr3': 11000
}
coverage = {}
counts = {}
for chrom, size in chrom_sizes.items():
    coverage[chrom] = np.random.randint(10, 1000, size=size)
    counts[chrom] = np.random.randint(1000, size=size)%coverage[chrom]
# save data to an HDF5 file
with h5py.File('dataset.h5', 'w') as f:
    for chrom in chrom_sizes:
        g = f.create_group(chrom)
        g.create_dataset('coverage', data=coverage[chrom])
        g.create_dataset('counts', data=counts[chrom])
List the contents of the HDF5 file with h5ls:

h5ls -r dataset.h5
/                        Group
/chr1                    Group
/chr1/counts             Dataset {15000}
/chr1/coverage           Dataset {15000}
/chr2                    Group
/chr2/counts             Dataset {12000}
/chr2/coverage           Dataset {12000}
/chr3                    Group
/chr3/counts             Dataset {11000}
/chr3/coverage           Dataset {11000}

Read data from an HDF5 file:

import h5py
# read data from an HDF5 file
with h5py.File('dataset.h5', 'r') as f:
    coverage = {}
    counts = {}
    for chrom in f.keys():
        coverage[chrom] = f[chrom + '/coverage'][:]
        counts[chrom] = f[chrom + '/counts'][:]

Mixed C/C++ and Python programming: cython

URL: (http://cython.org/)

import numpy as np
cimport numpy as np
cimport cython
from cython.parallel import prange
from cython.parallel cimport parallel
cimport openmp

@cython.boundscheck(False) # turn off bounds-checking for entire function
@cython.wraparound(False)  # turn off negative index wrapping for entire function
def compute_mse_grad_linear_ard(np.ndarray[np.float64_t, ndim=1] w,
        np.ndarray[np.float64_t, ndim=2] X1,
        np.ndarray[np.float64_t, ndim=2] X2,
        np.ndarray[np.float64_t, ndim=2] Kinv1,
        np.ndarray[np.float64_t, ndim=2] K2,
        np.ndarray[np.float64_t, ndim=2] a,
        np.ndarray[np.float64_t, ndim=2] err,
        np.ndarray[np.float64_t, ndim=2] mask=None):
    '''Compute the gradients of MSE on the test samples with respect to relevance vector w.
    :param w: 1D array of shape [n_features]
    :return: gradients of MSE wrt. w, 1D array of shape [n_features]
    '''
    cdef np.int64_t N1, N2, p
    cdef np.int64_t k, i, j, m
    N1 = X1.shape[0]
    N2 = X2.shape[0]
    p = X2.shape[1]

    cdef np.ndarray[np.float64_t, ndim=2] K2Kinv1 = K2.dot(Kinv1)
    cdef np.ndarray[np.float64_t, ndim=1] mse_grad = np.zeros_like(w)

    #cdef np.ndarray[np.float64_t, ndim=3] K1_grad = np.zeros((p, N1, N1), dtype=np.float64)
    #cdef np.ndarray[np.float64_t, ndim=3] K2_grad = np.zeros((p, N2, N1), dtype=np.float64)
    #cdef np.ndarray[np.float64_t, ndim=3] K_grad =  np.zeros((p, N2, N1), dtype=np.float64)
    cdef np.int64_t max_n_threads = openmp.omp_get_max_threads()
    cdef np.ndarray[np.float64_t, ndim=3] K1_grad = np.zeros((max_n_threads, N1, N1), dtype=np.float64)
    cdef np.ndarray[np.float64_t, ndim=3] K2_grad = np.zeros((max_n_threads, N2, N1), dtype=np.float64)
    cdef np.ndarray[np.float64_t, ndim=3] K_grad  = np.zeros((max_n_threads, N2, N1), dtype=np.float64)

    cdef np.int64_t thread_id
    with nogil, parallel():
        for k in prange(p):
            thread_id = openmp.omp_get_thread_num()
            # compute K1_grad
            for i in range(N1):
                for j in range(N1):
                    K1_grad[thread_id, i, j] = 2.0*w[k]*X1[i, k]*X1[j, k]
            # compute K2_grad
            for i in range(N2):
                for j in range(N1):
                    K2_grad[thread_id, i, j] = 2.0*w[k]*X2[i, k]*X1[j, k]
            # compute K_grad
            for i in range(N2):
                for j in range(N1):
                    K_grad[thread_id, i, j] = K2_grad[thread_id, i, j]
                    for m in range(N1):
                        K_grad[thread_id, i, j] += K2Kinv1[i, m]*K1_grad[thread_id, m, j]
            # compute mse_grad
            for i in range(N2):
                for j in range(N1):
                    mse_grad[k] += err[i, 0]*K_grad[thread_id, i, j]*a[j, 0]
    return mse_grad, K_grad

Progress bar: tqdm

URL: (https://pypi.python.org/pypi/tqdm)
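
A minimal sketch: wrapping any iterable in tqdm() prints a progress bar as the loop runs:

import time
from tqdm import tqdm
# the progress bar updates as the loop advances
for i in tqdm(range(100)):
    time.sleep(0.01)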

Markdown


Example Python scripts

View a table in a pretty way

The original table is ugly:

head -n 15 metadata.tsv

Output:

File accession	File format	Output type	Experiment accession	Assay	Biosample term id
ENCFF983DFB	fastq	reads	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF590TBW	fastq	reads	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF258RWG	bam	unfiltered alignments	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF468LRV	bam	unfiltered alignments	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF216EBS	bam	alignments	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF232QFN	bam	unfiltered alignments	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF682NGE	bam	alignments	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF328UKA	bam	unfiltered alignments	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF165COO	bam	alignments	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF466OLG	bam	alignments	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF595HIY	bigBed narrowPeak	peaks	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF494CKB	bigWig	fold change over control	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF308BXW	bigWig	fold change over control	ENCSR429XTR	ChIP-seq	EFO:0002067
ENCFF368IHM	bed narrowPeak	peaks	ENCSR429XTR	ChIP-seq	EFO:0002067

Now display the table more clearly:

head -n 15 metadata.tsv | tvi -d $'\t' -j center

Output:

File accession    File format          Output type        Experiment accession  Assay   Biosample term id
 ENCFF983DFB         fastq                reads               ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF590TBW         fastq                reads               ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF258RWG          bam         unfiltered alignments       ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF468LRV          bam         unfiltered alignments       ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF216EBS          bam               alignments            ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF232QFN          bam         unfiltered alignments       ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF682NGE          bam               alignments            ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF328UKA          bam         unfiltered alignments       ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF165COO          bam               alignments            ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF466OLG          bam               alignments            ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF595HIY   bigBed narrowPeak          peaks               ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF494CKB        bigWig       fold change over control     ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF308BXW        bigWig       fold change over control     ENCSR429XTR      ChIP-seq    EFO:0002067
 ENCFF368IHM    bed narrowPeak            peaks               ENCSR429XTR      ChIP-seq    EFO:0002067

You can also get some help by typing tvi -h:

usage: tvi [-h] [-d DELIMITER] [-j {left,right,center}] [-s SEPARATOR]
           [infile]

Print tables pretty

positional arguments:
  infile                input file, default is stdin

optional arguments:
  -h, --help            show this help message and exit
  -d DELIMITER          delimiter of fields of input. Default is white space.
  -j {left,right,center}
                        justification, either left, right or center. Default
                        is left
  -s SEPARATOR          separator of fields in output

tvi.py

#! /usr/bin/env python

import sys
import argparse
import os
from cStringIO import StringIO

def main():
	parser = argparse.ArgumentParser(description='Print tables pretty')
	parser.add_argument('infile', type=str, nargs='?',
					 help='input file, default is stdin')
	parser.add_argument('-d', dest='delimiter', type=str,
					 required=False,
					 help='delimiter of fields of input. Default is white space.')
	parser.add_argument('-j', dest='justify', type=str,
					 required=False, default='left',
					 choices=['left', 'right', 'center'],
					 help='justification, either left, right or center. Default is left')
	parser.add_argument('-s', dest='separator', type=str,
					 required=False, default=' ',
					 help='separator of fields in output')
	args = parser.parse_args()

	table = []
	maxwidth = []

	# default is to read from stdin
	fin = sys.stdin
	if args.infile:
		try:
			fin = open(args.infile, 'rt')
		except IOError as e:
			sys.stderr.write('Error: %s: %s\n'%(e.strerror, args.infile))
			sys.exit(e.errno)

	for line in fin:
		fields = None
		# split line by delimiter
		if args.delimiter:
			fields = line.strip().split(args.delimiter)
		else:
			fields = line.strip().split()
		for i in xrange(len(fields)):
			width = len(fields[i])
			if (i+1) > len(maxwidth):
				maxwidth.append(width)
			else:
				if width > maxwidth[i]:
					maxwidth[i] = width
		table.append(fields)
	fin.close()

	try:
		for fields in table:
			line = StringIO()
			for i in xrange(len(fields)):
				# format field with different justification
				nSpace = maxwidth[i] - len(fields[i])
				if args.justify == 'left':
					line.write(fields[i])
					for j in xrange(nSpace):
						line.write(' ')
				elif args.justify == 'right':
					for j in xrange(nSpace):
						line.write(' ')
					line.write(fields[i])
				elif args.justify == 'center':
					for j in xrange(nSpace/2):
						line.write(' ')
					line.write(fields[i])
					for j in xrange(nSpace - nSpace/2):
						line.write(' ')

				line.write(args.separator)
			print line.getvalue()
			line.close()
	except IOError:
		sys.exit(-1)

if __name__ == '__main__':
	main()

Generate a random FASTA file

seqgen.py

#! /usr/bin/env python

import sys
import argparse
import textwrap
import random

def write_fasta(fout, seq, name='seq', description=None):
	if description:
		fout.write('>' + name + ' ' + description + '\n')
	else:
		fout.write('>' + name + '\n')
	fout.write(textwrap.fill(seq) + '\n')

def main():
	parser = argparse.ArgumentParser(description='Generate sequences and output in various formats')
	parser.add_argument('-n', '--number', dest='number', type=int, required=False,
					 default=10, help='Number of sequences to generate')
	parser.add_argument('--min-length', dest='min_length', type=int, required=False,
					 default=30, help='Minimal length')
	parser.add_argument('--max-length', dest='max_length', type=int, required=False,
					 default=50, help='Maximal length')
	parser.add_argument('-l', '--length', type=int, required=False,
					 help='Fixed length. If specified, --min-length and --max-length will be ignored.')
	parser.add_argument('-a', '--alphabet', type=str, required=False,
					 default='ATGC', help='Letters to used in the sequences')
	parser.add_argument('-f', '--format', type=str, required=False,
					 choices=['fasta', 'text'], default='fasta', help='Output formats')
	parser.add_argument('-o', '--outfile', type=argparse.FileType('w'), required=False,
					 default=sys.stdout, help='Output file name')
	parser.add_argument('-p', '--prefix', type=str, required=False,
					 default='RN_', help='Prefix of sequence names for fasta format')
	args = parser.parse_args()

	rand = random.Random()
	for i in xrange(args.number):
		if args.length:
			length = args.length
		else:
			length = rand.randint(args.min_length, args.max_length)
		seq = bytearray(length)
		for j in xrange(length):
			seq[j] = rand.choice(args.alphabet)
		if args.format == 'fasta':
			write_fasta(args.outfile, str(seq), args.prefix + '%08d'%i)
		else:
			args.outfile.write(seq + '\n')
	args.outfile.close()

if __name__ == '__main__':
	main()

Weekly tasks

All files you need for completing the tasks can be found at: weekly_tasks.zip

Task 1: run examples (Python tips, numpy, pandas) in this tutorial

Install Anaconda on your PC. Try to understand the example code and run it in Jupyter or IPython.

Task 2: write a Python program to convert a GTF file to BED12 format

  • Please refer to (https://genome.ucsc.edu/FAQ/FAQformat.html#format1) for BED12 format and refer to (https://www.ensembl.org/info/website/upload/gff.html) for GTF format.
    GTF example:
chr1	HAVANA	gene	29554	31109	.	+	.	gene_id "ENSG00000243485.5"; gene_type "lincRNA"; gene_name "MIR1302-2HG"; level 2; tag "ncRNA_host"; havana_gene "OTTHUMG00000000959.2";
chr1	HAVANA	transcript	29554	31097	.	+	.	gene_id "ENSG00000243485.5"; transcript_id "ENST00000473358.1"; gene_type "lincRNA"; gene_name "MIR1302-2HG"; transcript_type "lincRNA"; transcript_name "MIR1302-2HG-202"; level 2; transcript_support_level "5"; tag "not_best_in_genome_evidence"; tag "dotter_confirmed"; tag "basic"; havana_gene "OTTHUMG00000000959.2"; havana_transcript "OTTHUMT00000002840.1";
chr1	HAVANA	exon	29554	30039	.	+	.	gene_id "ENSG00000243485.5"; transcript_id "ENST00000473358.1"; gene_type "lincRNA"; gene_name "MIR1302-2HG"; transcript_type "lincRNA"; transcript_name "MIR1302-2HG-202"; exon_number 1; exon_id "ENSE00001947070.1"; level 2; transcript_support_level "5"; tag "not_best_in_genome_evidence"; tag "dotter_confirmed"; tag "basic"; havana_gene "OTTHUMG00000000959.2"; havana_transcript "OTTHUMT00000002840.1";

BED12 example:

chr1	67522353	67532326	ENST00000230113	0	+	0	0	0	5	45,60,97,64,221,	0,5024,7299,7961,9752,
chr1	39249837	39257649	ENST00000289890	0	-	0	0	0	3	365,78,115,	0,4304,7697,
chr1	144245237	144250279	ENST00000294715	0	-	0	0	0	3	78,135,55,	0,448,4987,
chr1	15111814	15152464	ENST00000310916	0	-	0	0	0	6	5993,578,121,88,146,174,	0,6512,8762,9157,12413,40476,
chr1	34975698	34978706	ENST00000311990	0	-	0	0	0	3	1704,154,29,	0,2232,2979,
  • The GTF file is weekly_tasks/gencode.v27.long_noncoding_RNAs.gtf.
  • Each line in the output file is a transcript, with the 4th column being the transcript ID
  • The version number of the transcript ID should be stripped (e.g. ENST00000473358.1 => ENST00000473358).
  • The output file is sorted first by transcript IDs and then by chromosome in lexicographical order.
  • Columns 5, 7, 8, and 9 in the BED12 file should be set to 0.
  • Please do NOT use any external tools (e.g. sort, awk, etc.) in your program other than Python.
  • An example output can be found in weekly_tasks/transcripts.bed.

Hint: use dict, list, tuple, str.split, re.match, sorted.

Task 3: write a Python program to add a prefix to all directories

  • Each prefix consists of a two-digit number starting from 00, followed by ‘-’. If the number is less than 10, it should be padded with a leading ‘0’.
  • The files/directories should be numbered according to the lexicographical order.
    For example, if the original directory structure is:
.
├── A
│   ├── A
│   │   ├── A
│   │   ├── B
│   │   └── C
│   ├── B
│   │   └── A
│   └── C
│       └── A
├── B
│   ├── A
│   └── B
└── C
    ├── A
    └── B
        └── A

then you should get the following directory structure after renaming:

.
├── 00-A
│   ├── 00-A
│   │   ├── 00-A
│   │   ├── 01-B
│   │   └── 02-C
│   ├── 01-B
│   │   └── 00-A
│   └── 02-C
│       └── 00-A
├── 01-B
│   ├── 00-A
│   └── 01-B
└── 02-C
    ├── 00-A
    └── 01-B
        └── 00-A
  • The original directories can be found in weekly_tasks/original_dirs.
  • The root directory (i.e. original_dirs) should not be renamed.
  • You can use the tree command to display the directory structure as shown above.
  • An example result can be found in weekly_tasks/renamed_dirs.
    Hint: use os.listdir, os.rename, str.format, sorted, yield.
