pipisorry

SciPy Tutorial - the sparse module (sparse matrices)

http://blog.csdn.net/pipisorry/article/details/41762945

The sparse module provides one class for each sparse matrix storage format:

bsr_matrix(arg1[, shape, dtype, copy, blocksize]) Block Sparse Row matrix

coo_matrix(arg1[, shape, dtype, copy]) A sparse matrix in COOrdinate format.

csc_matrix(arg1[, shape, dtype, copy]) Compressed Sparse Column matrix

csr_matrix(arg1[, shape, dtype, copy]) Compressed Sparse Row matrix

dia_matrix(arg1[, shape, dtype, copy]) Sparse matrix with DIAgonal storage

dok_matrix(arg1[, shape, dtype, copy]) Dictionary Of Keys based sparse matrix.

lil_matrix(arg1[, shape, dtype, copy]) Row-based linked list sparse matrix

Overview of the sparse matrix formats and their trade-offs
The scipy.sparse library provides several formats for representing sparse matrices, each suited to a different purpose. Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.
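As a minimal sketch of the arithmetic support mentioned above, the snippet below combines two small CSR matrices. The matrices A and B are made-up example data, not taken from the original text.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Two small example matrices (hypothetical data)
A = csr_matrix(np.array([[1, 0, 2], [0, 3, 0], [4, 0, 5]]))
B = csr_matrix(np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]]))

total = A + B      # element-wise sum, the result stays sparse
diff = A - B       # element-wise difference
product = A @ B    # matrix product
halved = A / 2     # division by a scalar

print(total.toarray())
print(product.toarray())
```

The results are converted with toarray() only for printing; in real use they would normally stay sparse.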
bsr_matrix(arg1, shape=None, dtype=None, copy=False, blocksize=None) Block Sparse Row matrix
The Block Compressed Row (BSR) format is very similar to the Compressed Sparse Row (CSR) format. BSR is appropriate for sparse matrices with dense sub-matrices. Block matrices often arise in vector-valued finite element discretizations. In such cases, BSR is considerably more efficient than CSR and CSC for many sparse arithmetic operations.
csc_matrix(arg1, shape=None, dtype=None, copy=False) Compressed Sparse Column matrix

Advantages of the CSC format
•efficient arithmetic operations CSC + CSC, CSC * CSC, etc.
•efficient column slicing
•fast matrix vector products (CSR, BSR may be faster!)
Disadvantages of the CSC format
•slow row slicing operations (consider CSR)
•changes to the sparsity structure are expensive (consider LIL or DOK)

csr_matrix(arg1, shape=None, dtype=None, copy=False) Compressed Sparse Row matrix

Advantages of the CSR format
•efficient arithmetic operations CSR + CSR, CSR * CSR, etc.
•efficient row slicing
•fast matrix vector products
Disadvantages of the CSR format
•slow column slicing operations (consider CSC)
•changes to the sparsity structure are expensive (consider LIL or DOK)

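coo_matrix(arg1, shape=None, dtype=None, copy=False): a sparse matrix in coordinate form. It uses three arrays, row, col, and data, to hold the non-zero elements. The three arrays have the same length: row holds each element's row index, col holds its column index, and data holds its value.

coo_matrix does not support element access, insertion, or deletion. Once created, it supports almost no direct manipulation or matrix arithmetic other than conversion to another format.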

Advantages of the COO format
•facilitates fast conversion among sparse formats
•permits duplicate entries (see example)
•very fast conversion to and from CSR/CSC formats

In short: fast conversion to and from the CSR/CSC formats, and duplicate entries are permitted.

Disadvantages of the COO format
•does not directly support:
–arithmetic operations
–slicing

In short: arithmetic and slicing are not supported directly.

Most commonly used methods:

tocsc()	Return a copy of this matrix in Compressed Sparse Column format
tocsr()	Return a copy of this matrix in Compressed Sparse Row format
todense([order, out])	Return a dense matrix representation of this matrix

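ps: Many sparse matrix data sets are stored in files in this form. For example, a CSV file might have three columns: user ID, item ID, rating. After loading the data with numpy.loadtxt or pandas.read_csv, coo_matrix can quickly turn it into a sparse matrix in which each row corresponds to a user, each column to an item, and each element is that user's rating of the item.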
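The conversion from rating triples can be sketched as follows. The ratings array is hypothetical data standing in for what numpy.loadtxt would return, and the IDs are assumed to already be zero-based integer indices.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical (user_id, item_id, rating) rows, as if loaded from a CSV file
ratings = np.array([[0, 0, 5],
                    [0, 2, 3],
                    [1, 1, 4],
                    [2, 0, 1]])
users, items, values = ratings[:, 0], ratings[:, 1], ratings[:, 2]

# One row per user, one column per item; convert to CSR for later computation
user_item = coo_matrix((values, (users, items)), shape=(3, 3)).tocsr()
print(user_item.toarray())
```

If the real IDs are arbitrary labels rather than 0..n-1, they would first need to be mapped to consecutive indices.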

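dia_matrix(arg1, shape=None, dtype=None, copy=False) Sparse matrix with DIAgonal storage

dok_matrix(arg1, shape=None, dtype=None, copy=False) Dictionary Of Keys based sparse matrix.

This is an efficient structure for constructing sparse matrices incrementally. Allows for efficient O(1) access of individual elements. Duplicates are not allowed. Can be efficiently converted to a coo_matrix once constructed.

dok_matrix inherits from dict and stores the non-zero elements of the matrix in a dictionary: each key is a (row, column) tuple, and the corresponding value is the element at that (row, column) position. A dictionary-based sparse matrix is therefore well suited to adding, deleting, and accessing individual elements. It is typically used to add non-zero elements one at a time and is then converted to another format that supports fast arithmetic.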
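A small sketch of this incremental workflow with dok_matrix; the shape and values are arbitrary illustration data.

```python
import numpy as np
from scipy.sparse import dok_matrix

D = dok_matrix((3, 3), dtype=np.int64)
D[0, 1] = 7        # insert a single element
D[2, 2] = 9
D[0, 1] = 8        # assigning again overwrites; duplicates are not kept

print(D[0, 1])     # single-element lookup
csr = D.tocsr()    # convert once construction is finished
print(csr.toarray())
```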

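lil_matrix(arg1, shape=None, dtype=None, copy=False) Row-based linked list sparse matrix
This is an efficient structure for constructing sparse matrices incrementally.

lil_matrix stores the non-zero elements in two lists: data holds the non-zero values of each row, and rows holds the column indices of those values. This format is also well suited to adding elements one at a time, and it gives fast access to row-wise data.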

Advantages of the LIL format
•supports flexible slicing
•changes to the matrix sparsity structure are efficient
Disadvantages of the LIL format
•arithmetic operations LIL + LIL are slow (consider CSR or CSC)
•slow column slicing (consider CSC)
•slow matrix vector products (consider CSR or CSC)
Intended Usage
•LIL is a convenient format for constructing sparse matrices
•once a matrix has been constructed, convert to CSR or CSC format for fast arithmetic and matrix vector operations
•consider using the COO format when constructing large matrices

Note: {dok_matrix and lil_matrix are well suited to adding elements incrementally}
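The intended LIL usage (build element by element, then convert) might look like the sketch below. The entries are arbitrary example values.

```python
from scipy.sparse import lil_matrix

L = lil_matrix((3, 4))
L[0, 1] = 1.0
L[1, 1] = 2.0
L[1, 3] = 3.0
L[2, 3] = 4.0

print(L.rows)    # per-row lists of column indices
print(L.data)    # per-row lists of the matching values

A = L.tocsr()    # switch to CSR for arithmetic and matrix-vector products
print(A.toarray())
```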

[scipy-ref-0.14.0 - Sparse matrices (scipy.sparse)]

[Scientific Computing with Python (用Python做科学计算), 2nd edition - SciPy: numerical computing library - Sparse matrices - sparse]

皮皮blog

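Sparse Matrix Storage Formats

For a sparse matrix in which most elements are zero, storing only the non-zero elements makes matrix operations more efficient.

Many storage schemes for sparse matrices exist, but most share the same basic technique: store all non-zero elements of the matrix in a linear array, and provide auxiliary arrays that describe where those non-zero elements sit in the original matrix.

1. Coordinate Format (COO)

The main advantages of this storage format are flexibility and simplicity. Only the non-zero elements and the coordinates of each non-zero element are stored.

Three arrays are used: values, rows, and columns.

values: real or complex data containing the non-zero elements of the matrix, in any order.
rows: the row index of each element.
columns: the column index of each element.

Parameter: nnz, the number of non-zero elements in the matrix; all three arrays have length nnz.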
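In SciPy, the three COO arrays are exposed as the data, row, and col attributes of a coo_matrix (zero-based indices). A quick look at a made-up 3x3 matrix:

```python
import numpy as np
from scipy.sparse import coo_matrix

A = coo_matrix(np.array([[1, 0, 2],
                         [0, 0, 3],
                         [4, 0, 0]]))

print(A.data)   # values
print(A.row)    # rows
print(A.col)    # columns
print(A.nnz)    # number of stored entries
```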

2. Diagonal Storage Format (DIA)

If the sparse matrix has diagonals containing only zero elements, then the diagonal storage format can be used to reduce the amount of information needed to locate the non-zero elements. This storage format is particularly useful in many applications where the matrix arises from a finite element or finite difference discretization.

The Intel MKL diagonal storage format is specified by two arrays: values and distance, and two parameters: ndiag, which is the number of non-empty diagonals, and lval, which is the declared leading dimension in the calling (sub)programs.

values: A real or complex two-dimensional array is dimensioned as lval by ndiag. Each column of it contains the non-zero elements of certain diagonal of A. The key point of the storage is that each element in values retains the row number of the original matrix. To achieve this diagonals in the lower triangular part of the matrix are padded from the top, and those in the upper triangular part are padded from the bottom. Note that the value of distance(i) is the number of elements to be padded for diagonal i.
distance: An integer array with dimension ndiag. Element i of the array distance is the distance between i-diagonal and the main diagonal. The distance is positive if the diagonal is above the main diagonal, and negative if the diagonal is below the main diagonal. The main diagonal has a distance equal to zero.
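For comparison, SciPy's dia_matrix uses an offsets array that plays the role of distance, plus a two-dimensional data array. Note that SciPy's layout is not the same as the MKL one described above: in SciPy, data[k, j] holds the element of diagonal offsets[k] that lies in column j, so padding falls on different ends. A sketch with a made-up tridiagonal matrix:

```python
import numpy as np
from scipy.sparse import dia_matrix

offsets = np.array([-1, 0, 1])          # sub-, main, and super-diagonal
data = np.array([[-1, -1, -1,  0],      # offset -1: last slot is padding
                 [ 2,  2,  2,  2],      # offset  0: main diagonal
                 [ 0, -1, -1, -1]])     # offset +1: first slot is padding
A = dia_matrix((data, offsets), shape=(4, 4))
print(A.toarray())
```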

3. Compressed Sparse Row Format (CSR)

The Intel MKL compressed sparse row (CSR) format is specified by four arrays: the values, columns, pointerB, and pointerE. The following table describes the arrays in terms of the values, row, and column positions of the non-zero elements in a sparse matrix A.

values: A real or complex array that contains the non-zero elements of A. Values of the non-zero elements of A are mapped into the values array using the row-major storage mapping described above.
columns: Element i of the integer array columns is the number of the column in A that contains the i-th value in the values array.
pointerB: Element j of this integer array gives the index of the element in the values array that is first non-zero element in a row j of A. Note that this index is equal to pointerB(j) - pointerB(1) + 1.
pointerE: An integer array that contains row indices, such that pointerE(j) - pointerB(1) is the index of the element in the values array that is last non-zero element in a row j of A.
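SciPy's csr_matrix uses the three-array variant of this layout with zero-based indices: data and indices correspond to values and columns, and a single indptr array covers both pointers. Reading indptr[:-1] as pointerB and indptr[1:] as pointerE is my interpretation of the correspondence, shown here on a made-up matrix:

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix(np.array([[1, 0, 2],
                         [0, 0, 3],
                         [4, 5, 6]]))

print(A.data)      # values, row by row
print(A.indices)   # column index of each value
print(A.indptr)    # row j occupies data[indptr[j]:indptr[j + 1]]

pointerB = A.indptr[:-1]   # start of each row (zero-based)
pointerE = A.indptr[1:]    # one past the end of each row (zero-based)
row2 = A.data[pointerB[2]:pointerE[2]]
print(row2)
```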

4. Compressed Sparse Column Format (CSC)

The compressed sparse column format (CSC) is similar to the CSR format, but the columns are used instead of the rows. In other words, the CSC format is identical to the CSR format for the transposed matrix. The CSC format is specified by four arrays: values, rows, pointerB, and pointerE. The following table describes the arrays in terms of the values, row, and column positions of the non-zero elements in a sparse matrix A.

values: A real or complex array that contains the non-zero elements of A. Values of the non-zero elements of A are mapped into the values array using the column-major storage mapping.
rows: Element i of the integer array rows is the number of the row in A that contains the i-th value in the values array.
pointerB: Element j of this integer array gives the index of the element in the values array that is first non-zero element in a column j of A. Note that this index is equal to pointerB(j) - pointerB(1) + 1.
pointerE: An integer array that contains column indices, such that pointerE(j) - pointerB(1) is the index of the element in the values array that is last non-zero element in a column j of A.
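The statement that CSC is the CSR format of the transposed matrix can be checked directly in SciPy. This is a small sketch on a made-up matrix; as with CSR, SciPy keeps one zero-based indptr array instead of separate pointerB and pointerE arrays.

```python
import numpy as np
from scipy.sparse import csc_matrix, csr_matrix

dense = np.array([[1, 0, 2],
                  [0, 0, 3],
                  [4, 5, 6]])
A_csc = csc_matrix(dense)
At_csr = csr_matrix(dense.T)

print(A_csc.data)      # values, column by column
print(A_csc.indices)   # row index of each value
print(A_csc.indptr)    # column j occupies data[indptr[j]:indptr[j + 1]]

# CSC arrays of A match the CSR arrays of A transposed
same = (np.array_equal(A_csc.data, At_csr.data)
        and np.array_equal(A_csc.indices, At_csr.indices)
        and np.array_equal(A_csc.indptr, At_csr.indptr))
print(same)
```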

5. Skyline Storage Format

The skyline storage format is important for the direct sparse solvers, and it is well suited for Cholesky or LU decomposition when no pivoting is required.

The skyline storage format accepted in Intel MKL can store only triangular matrix or triangular part of a matrix. This format is specified by two arrays: values and pointers. The following table describes these arrays:

values: A scalar array. For a lower triangular matrix it contains the set of elements from each row of the matrix starting from the first non-zero element to and including the diagonal element. For an upper triangular matrix it contains the set of elements from each column of the matrix starting with the first non-zero element down to and including the diagonal element. Encountered zero elements are included in the sets.
pointers: An integer array with dimension (m+1), where m is the number of rows for lower triangle (columns for the upper triangle). pointers(i) - pointers(1) + 1 gives the index of element in values that is first non-zero element in row (column) i. The value of pointers(m+1) is set to nnz + pointers(1), where nnz is the number of elements in the array values.
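scipy.sparse has no skyline class, so the following is only an illustrative NumPy sketch of the lower-triangular case described above. The helper name skyline_lower is hypothetical, the indices are zero-based (unlike the one-based MKL description), and storing just the diagonal for an all-zero row is my own assumption.

```python
import numpy as np

def skyline_lower(A):
    """Build zero-based skyline (values, pointers) for the lower triangle of A."""
    values, pointers = [], [0]
    for i, row in enumerate(A):
        nz = np.flatnonzero(row[:i + 1])
        # Start at the first non-zero; fall back to the diagonal for an empty row
        first = nz[0] if nz.size else i
        values.extend(row[first:i + 1].tolist())   # zeros inside the span are kept
        pointers.append(len(values))
    return np.array(values), np.array(pointers)

A = np.array([[1, 0, 0, 0],
              [2, 3, 0, 0],
              [0, 0, 4, 0],
              [5, 0, 6, 7]])
values, pointers = skyline_lower(A)
print(values)
print(pointers)
```

In the last row the zero between 5 and 6 is stored, which matches the remark that encountered zero elements are included.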

6. Block Compressed Sparse Row Format (BSR)

The Intel MKL block compressed sparse row (BSR) format for sparse matrices is specified by four arrays: values, columns, pointerB, and pointerE. The following table describes these arrays.

values: A real array that contains the elements of the non-zero blocks of a sparse matrix. The elements are stored block-by-block in row-major order. A non-zero block is the block that contains at least one non-zero element. All elements of non-zero blocks are stored, even if some of them are equal to zero. Within each non-zero block elements are stored in column-major order in the case of one-based indexing, and in row-major order in the case of the zero-based indexing.
columns: Element i of the integer array columns is the number of the column in the block matrix that contains the i-th non-zero block.
pointerB: Element j of this integer array gives the index of the element in the columns array that is first non-zero block in a row j of the block matrix.
pointerE: Element j of this integer array gives the index of the element in the columns array that contains the last non-zero block in a row j of the block matrix plus 1.
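SciPy's bsr_matrix can be built from the analogous arrays: a three-dimensional data array (one dense block per entry), block-column indices, and a block-row indptr. The sketch below uses zero-based indexing and made-up 2x2 blocks; the second block deliberately contains a zero to show that whole blocks are stored.

```python
import numpy as np
from scipy.sparse import bsr_matrix

indptr = np.array([0, 1, 2])           # block row j owns blocks indptr[j]:indptr[j + 1]
indices = np.array([0, 1])             # block-column index of each stored block
data = np.array([[[1, 2], [3, 4]],     # block at block position (0, 0)
                 [[5, 0], [7, 8]]])    # block at block position (1, 1), includes a zero
B = bsr_matrix((data, indices, indptr), shape=(4, 4))

print(B.blocksize)
print(B.toarray())
print(B.data.size)   # every element of each stored block is kept
```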

7. ELLPACK (ELL)

8. Hybrid (HYB)

A combination of the ELL and COO formats.
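scipy.sparse does not provide an ELL class, so here is a rough NumPy sketch under the usual ELLPACK assumption: every row is padded to the same number of slots K (the largest per-row non-zero count), giving two dense n-by-K arrays of values and column indices. The helper names to_ell and ell_matvec are hypothetical.

```python
import numpy as np

def to_ell(A):
    """Pack a dense 2-D array into ELL-style (values, cols) arrays."""
    n = A.shape[0]
    k = max(int(np.count_nonzero(A, axis=1).max()), 1)
    values = np.zeros((n, k), dtype=A.dtype)
    cols = np.zeros((n, k), dtype=np.int64)   # padding slots point at column 0
    for i in range(n):
        nz = np.flatnonzero(A[i])
        values[i, :nz.size] = A[i, nz]
        cols[i, :nz.size] = nz
    return values, cols

def ell_matvec(values, cols, x):
    """y = A @ x; padded slots hold value 0, so they contribute nothing."""
    return (values * x[cols]).sum(axis=1)

A = np.array([[1, 0, 2],
              [0, 3, 0],
              [4, 5, 6]])
x = np.array([1, 2, 3])
values, cols = to_ell(A)
y = ell_matvec(values, cols, x)
print(y)
```

The regular n-by-K shape is what makes ELL fast for matrix-vector products, and also why a few long rows waste space; HYB moves those overflow entries into a COO part.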


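Some rules of thumb for choosing a sparse matrix storage format:

1. DIA and ELL are the most efficient formats for sparse matrix-vector products, so they are the fastest formats for solving sparse linear systems with iterative methods such as conjugate gradient;

2. COO and CSR are more flexible and easier to manipulate than DIA and ELL;

3. ELL is fast and COO is flexible; the HYB format, which combines the two, is a good sparse matrix representation;

4. According to Nathan Bell's work:

CSR has the most stable bytes per non-zero entry when storing a sparse matrix (about 8.5 for float, about 12.5 for double).

For DIA, the bytes per non-zero entry depend strongly on the matrix type. DIA suits sparse matrices from structured meshes (about 4.05 for float, about 8.10 for double).

For unstructured meshes and random matrices, DIA uses more than ten times as many bytes as CSR;

5. In linear algebra libraries, COO is commonly used for reading and writing sparse matrices from files (Matrix Market, for example, uses COO), while CSR is commonly used for sparse matrix computation once the data has been loaded.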
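The CSR figures in item 4 can be roughly reproduced by adding up the sizes of the three CSR arrays. The sketch below uses a made-up banded matrix with about 8 non-zeros per row and assumes SciPy chooses 32-bit index arrays, which it normally does for a matrix this small; the helper name csr_bytes_per_nonzero is hypothetical.

```python
import numpy as np
from scipy.sparse import diags

n = 1000
# Banded test matrix with 8 non-zero diagonals (hypothetical example data)
A = diags([1.0] * 8, offsets=list(range(-4, 4)), shape=(n, n), format='csr')

def csr_bytes_per_nonzero(m):
    """Total bytes of the three CSR arrays divided by the number of stored entries."""
    return (m.data.nbytes + m.indices.nbytes + m.indptr.nbytes) / m.nnz

single = csr_bytes_per_nonzero(A.astype(np.float32))
double = csr_bytes_per_nonzero(A.astype(np.float64))
print(single, double)
```

Each entry costs one value plus one 4-byte column index, and the row pointer adds roughly 4 / 8 = 0.5 bytes per entry here, so the results should land close to 8.5 and 12.5.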

[Sparse Matrix Representations & Iterative Solvers, Lesson 1 by Nathan Bell]

[Sparse Linear Systems]

[Sparse matrix formats used in the Intel MKL library]


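Operations on sparse matrices
Creating a sparse matrix
Using coo_matrix as an example:

1 Convert a dense matrix directly into a sparse matrix

from scipy.sparse import coo_matrix
A = coo_matrix([[1,2],[3,4]])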
print(A)
  (0, 0)    1
  (0, 1)    2
  (1, 0)    3
  (1, 1)    4

2 Build the matrix from the arrays that the storage format requires:

from numpy import array
row  = array([0,0,0,0,1,3,1])
col  = array([0,0,0,2,1,3,1])
data = array([1,1,1,8,1,1,1])
matrix = coo_matrix((data, (row,col)), shape=(4,4))
print(matrix)
print(matrix.todense())
  (0, 0)    1
  (0, 0)    1
  (0, 0)    1
  (0, 2)    8
  (1, 1)    1
  (3, 3)    1
  (1, 1)    1
[[3 0 8 0]
 [0 2 0 0]
 [0 0 0 0]
 [0 0 0 1]]

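Note: csr_matrix always returns a sparse matrix, never a one-dimensional vector. Even csr_matrix([2,3]) returns a matrix.
Size of a sparse matrix

from scipy.sparse import csr_matrix
csr = csr_matrix([[1, 5], [4, 0], [1, 3]])
print(csr.todense())    #todense() returns a <class 'numpy.matrixlib.defmatrix.matrix'>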
print(csr.shape)
print(csr.shape[1])
[[1 5]
 [4 0]
 [1 3]]
(3, 2)
2

Indexing and slicing a sparse matrix

print(csr)
  (0, 0)    1
  (0, 1)    5
  (1, 0)    4
  (2, 0)    1
  (2, 1)    3
print(csr[0]) #<class 'scipy.sparse.csr.csr_matrix'>
  (0, 0)    1
  (0, 1)    5
print(csr[1,1])
0
print(csr[0,0])
1
for c in csr:    #each iteration yields one row of csr; type(c) is <class 'scipy.sparse.csr.csr_matrix'>
    print(c)
    break
  (0, 0)    1
  (0, 1)    5

csr_mat = csr_matrix([1, 5, 0])
print(csr_mat.todense())
# print(type(csr_mat.nonzero()))  #<class 'tuple'>
# nonzero() returns a tuple (row_indices, col_indices); zip pairs them up per element
for row, col in zip(*csr_mat.nonzero()):
    print(row, col, csr_mat[row, col])
[[1 5 0]]
0 0 1
0 1 5

Stacking sparse matrices horizontally or vertically

from scipy.sparse import csr_matrix, vstack

csr = csr_matrix([[1, 5, 5], [4, 0, 6], [1, 3, 7]])
print(csr.todense())
[[1 5 5]
 [4 0 6]
 [1 3 7]]
csr2 = csr_matrix([[3, 0, 9]])
print(csr2.todense())
[[3 0 9]]
print(vstack([csr, csr2]).todense())
[[1 5 5]
 [4 0 6]
 [1 3 7]
 [3 0 9]]

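Note: matrices with incompatible shapes cannot be merged. vstack needs every matrix to have the same number of columns, and hstack needs the same number of rows.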
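The example above only stacks vertically. A horizontal counterpart with hstack, plus what happens when the shapes do not line up, could look like this (the matrices are made-up data):

```python
from scipy.sparse import csr_matrix, hstack

left = csr_matrix([[1, 5], [4, 0]])
right = csr_matrix([[7], [0]])

combined = hstack([left, right], format='csr')   # 2x2 next to 2x1 gives 2x3
print(combined.toarray())

mismatch_rejected = False
try:
    hstack([left, csr_matrix([[1, 2, 3]])])      # 2 rows next to 1 row
except ValueError as err:
    mismatch_rejected = True
    print("cannot stack:", err)
```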

The diags function builds a sparse diagonal matrix
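A short sketch of scipy.sparse.diags, since no example is given here. The values are arbitrary; the second call relies on scalar broadcasting, which requires shape to be passed.

```python
from scipy.sparse import diags

# Main diagonal and first superdiagonal given explicitly
M = diags([[1, 2, 3, 4], [5, 6, 7]], offsets=[0, 1], format='csr')
print(M.toarray())

# Scalars are broadcast along each diagonal when shape is given
T = diags([-1, 2, -1], offsets=[-1, 0, 1], shape=(4, 4))
print(T.toarray())
```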

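Reading from a sparse matrix
Elements can be read by index just as with a regular matrix. You can also use getrow(i) and getcol(i) to read a specific row or column, and nonzero() to get the positions of the non-zero elements.

Several sparse storage formats, such as CSC and CSR, support slicing; COO does not. They also support arithmetic operations (matrix addition, subtraction, multiplication, and division), which are fast.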
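A minimal sketch of the three access methods named above, using the same small matrix as the earlier csr examples:

```python
from scipy.sparse import csr_matrix

csr = csr_matrix([[1, 5], [4, 0], [1, 3]])

row1 = csr.getrow(1)         # 1x2 sparse matrix holding row 1
col1 = csr.getcol(1)         # 3x1 sparse matrix holding column 1
rows, cols = csr.nonzero()   # positions of the non-zero elements

print(row1.toarray())
print(col1.toarray())
print(list(zip(rows.tolist(), cols.tolist())))
```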

Selecting specific columns of a matrix

sub = matrix.getcol(1)    #'coo_matrix' object does not support indexing, so matrix[1] cannot be used
print(sub)
  (1, 0)    2
sub = matrix.todense()[:,[1,2]]    #select the given columns from a regular (dense) matrix
print(sub)
[[0 8]
 [2 0]
 [0 0]
 [0 0]]

Dot product of sparse matrices

A = csr_matrix([[1, 2, 0], [0, 0, 3]])
print(A.todense())
[[1 2 0]
 [0 0 3]]
v = A.T
print(v.todense())
[[1 0]
 [2 0]
 [0 3]]
d = A.dot(v)
print(d)
  (0, 0)    5
  (1, 1)    9

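import datetime
from numpy import array
from scipy.sparse import lil_matrix

A = lil_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])    #swap in another format class to compare
v = array([1, 0, -1])
s = datetime.datetime.now()
for i in range(100000):
    d = A.dot(v)    #here v is an ndarray
print(datetime.datetime.now() - s)

Elapsed time for each format, in seconds:
bsr:1.67
coo:1.04
csc:0.93
csr:0.90
dia:1.06
dok:1.57
lil:11.37
So csr is recommended for computing dot products.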

csr_mat1 = csr_matrix([1, 2, 0])
csr_mat2 = csr_matrix([1, 0, -1])
similar = (csr_mat1.dot(csr_mat2.transpose()))   #here csr_mat2 is also a csr_matrix
print(type(similar))
print(similar)
print(similar[0, 0])
<class 'scipy.sparse.csr.csr_matrix'>
  (0, 0)    1
1

Reading and saving SciPy sparse matrices to files

mmwrite(target, a[, comment, field, precision]) Writes the sparse or dense array a to a Matrix Market formatted file.

mmread(source) Reads the contents of a Matrix Market file ‘filename’ into a matrix. For a sparse file the result is a <class 'scipy.sparse.coo.coo_matrix'>.

mminfo(source) Queries the contents of the Matrix Market file ‘filename’ to extract size and storage.

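from numpy import random
from scipy.io import mmread, mmwrite, mminfo
from scipy.sparse import csr_matrix

def save_csr_mat(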
        item_item_sparse_mat_filename=r'.\datasets\lastfm-dataset-1K\item_item_csr_mat.mtx'):
    random.seed(10)
    raw_user_item_mat = random.randint(0, 6, (3, 2))
    d = csr_matrix(raw_user_item_mat)
    print(d.todense())
    print(d)
    mmwrite(item_item_sparse_mat_filename, d)
    print("item_item_sparse_mat_file information: ")
    print(mminfo(item_item_sparse_mat_filename))
    k = mmread(item_item_sparse_mat_filename)
    print(k.todense())

[[1 5]
 [4 0]
 [1 3]]  

  (0, 0)    1
  (0, 1)    5
  (1, 0)    4
  (2, 0)    1
  (2, 1)    3

item_item_sparse_mat_file information: 
(3, 2, 5, 'coordinate', 'integer', 'general')

[[1 5]
 [4 0]
 [1 3]]

Contents of the saved file:
%%MatrixMarket matrix coordinate integer general
%
3 2 5
1 1 1
1 2 5
2 1 4
3 1 1
3 2 3

Note: the saved file should use the .mtx extension

[scipy-ref-0.14.0 - Matrix Market files]


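A relatively memory-efficient storage scheme for sparse matrices in Python

    Recommender systems often need to process data of the form user_id, item_id, rating. Mathematically this is a sparse matrix, and scipy provides the sparse module for it.
    But scipy.sparse has several shortcomings for this use: 1. it cannot support fast slicing of data[i, ...], data[..., j], and data[i, j] all at once; 2. because the data lives in memory, it does not handle very large data sets well.
    Fast slicing of data[i, ...] and data[..., j] requires the data for each i or j to be stored together. To hold massive data, part of it also has to live on disk, with memory acting as a buffer. The solution here is fairly simple: store the data in a dict-like object. For a given i (say 9527), its data is kept in dict['i9527']; likewise, for a given j (say 3306), all of its data is kept in dict['j3306']. To fetch data[9527, ...], just read dict['i9527']. dict['i9527'] is itself originally a dict object that maps each j to its value; to save memory, this dict is stored as a binary string.
    Another benefit of a dict-like store is that you can freely use an in-memory dict or any kind of DBM, even the legendary Tokyo Cabinet. [http://blogread.cn/it/article/1229]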
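The scheme above can be sketched roughly as follows. This is not the linked article's implementation: the class name DictSparseStore and its methods are hypothetical, pickle stands in for the unspecified binary-string encoding, and a plain in-memory dict stands in for the DBM backend.

```python
import pickle
from collections import defaultdict

class DictSparseStore:
    """Sparse matrix stored twice: once keyed by row ('i<row>'), once by column ('j<col>')."""

    def __init__(self, backend=None):
        # Any mapping with item assignment and get() could be used here (assumption)
        self.backend = {} if backend is None else backend

    def load(self, triples):
        rows, cols = defaultdict(dict), defaultdict(dict)
        for i, j, value in triples:
            rows[i][j] = value
            cols[j][i] = value
        for i, entries in rows.items():
            self.backend[f"i{i}"] = pickle.dumps(entries)   # row dict as a binary string
        for j, entries in cols.items():
            self.backend[f"j{j}"] = pickle.dumps(entries)   # column dict as a binary string

    def row(self, i):
        blob = self.backend.get(f"i{i}")
        return pickle.loads(blob) if blob is not None else {}

    def col(self, j):
        blob = self.backend.get(f"j{j}")
        return pickle.loads(blob) if blob is not None else {}

    def get(self, i, j, default=0):
        return self.row(i).get(j, default)

store = DictSparseStore()
store.load([(9527, 3306, 4.0), (9527, 1, 2.0), (7, 3306, 5.0)])
print(store.row(9527))        # data[9527, ...]
print(store.col(3306))        # data[..., 3306]
print(store.get(9527, 3306))  # data[9527, 3306]
```

Storing every value under both a row key and a column key doubles the space used, which is the price paid for fast slicing in both directions.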

from: http://blog.csdn.net/pipisorry/article/details/41762945

ref: official documentation of the sparse module

http://blog.sina.com.cn/s/blog_6a90ae320101aavg.html
