Dask Arrays in Python

Python Dask Array

Dask is a parallel computing library for Python. It is mainly used to scale computations across multiple machines, processing data efficiently on a cluster. On a single machine, Dask can also make use of all available cores.

Dask does not need to load a complete data set into memory: the data can stay on disk, and Dask reads and processes it one chunk at a time. To analyze large data sets, Dask builds on pandas DataFrames and NumPy arrays.
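
To make the chunking idea concrete, here is a minimal sketch using `dask.array`. The array contents and the chunk size of 100,000 are illustrative assumptions, not values from the original article, and the data is created in memory only to keep the example short; in practice it would typically be read from disk.

```python
import numpy as np
import dask.array as da

# Illustrative data (an assumption for this sketch): one million floats.
data = np.arange(1_000_000, dtype=np.float64)

# Wrap the NumPy array as a Dask array split into chunks of 100,000 elements.
x = da.from_array(data, chunks=100_000)

# Operations on a Dask array are lazy: this line only builds a task graph.
total = (x * 2).sum()

# compute() executes the graph chunk by chunk and returns the final value.
result = total.compute()

print(x.numblocks)  # number of chunks along each axis
print(result)
```

Each chunk is an ordinary NumPy array, so the per-chunk work can run in parallel, which is how Dask can keep several cores busy.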

Basically,
