The Road to Data Science: Choosing Among Multiple Languages


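Python stands out in data science largely because of packages such as SciPy, NumPy, pandas, scikit-learn, Matplotlib, and TensorFlow. However, Python is widely criticized for being slow, partly because of the GIL (global interpreter lock) in CPython; Java is also often criticized as slow, although it has no GIL. Because of these drawbacks we keep looking for faster alternatives, focusing on three languages: C++, Go, and Julia, with a brief look at Swift and Rust. Java is too heavyweight, Scala is backed by Spark, Clojure's syntax is rather unfriendly, and R is about as fast as Python.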
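To make the GIL point concrete, here is a minimal sketch using only the standard library. The function names (`count_primes`, `run_sequential`, `run_threaded`) are made up for illustration. On a standard CPython build with the GIL, the threaded run of this CPU-bound work is usually no faster than the sequential one; on free-threaded builds or other interpreters the timings may differ, so treat the printed numbers as an observation, not a benchmark.

```python
import threading
import time


def count_primes(limit):
    """Count primes below `limit` with a deliberately naive, CPU-bound loop."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(limit, jobs):
    """Run the same job `jobs` times, one after another."""
    return [count_primes(limit) for _ in range(jobs)]


def run_threaded(limit, jobs):
    """Run the same job in `jobs` threads; under the GIL they take turns."""
    results = [None] * jobs

    def worker(index):
        results[index] = count_primes(limit)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(run_sequential, 50_000, 4)
    _, t_thr = timed(run_threaded, 50_000, 4)
    print(f"sequential: {t_seq:.2f}s, threaded: {t_thr:.2f}s")
```

Libraries such as NumPy sidestep much of this cost by doing the heavy work in compiled code, which is one reason the packages above are so central to Python's data science stack.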



Backend languages (kernels) supported by Jupyter
https://github.com/jupyter/jupyter/wiki/Jupyter-kernels


C++ version
pandas https://github.com/hosseinmoein/DataFrame
numpy https://github.com/QuantStack/xtensor
https://github.com/ndarray/Boost.NumPy

https://github.com/AtsushiSakai/numpycpp
https://github.com/rogersce/cnpy

sklearn https://github.com/davisking/dlib


data science toolkit https://github.com/meta-toolkit/meta/
https://meta-toolkit.org/
matplot https://github.com/lava/matplotlib-cpp


Go version
data science https://github.com/cpmech/gosl
matplot https://github.com/zieckey/gochart
scipy https://github.com/montanaflynn/stats
sklearn https://github.com/sjwhitworth/golearn
https://github.com/pa-m/sklearn
pandas https://github.com/qingtiandalaoye/GoDataframe
finance https://github.com/piquette/finance-go
https://piquette.io/projects/finance-go/
https://github.com/orcaman/financial
numpy https://github.com/ledao/arrgo
tf.go
Python-to-Go transpiler https://github.com/google/grumpy


Julia
https://github.com/JuliaPy/Pandas.jl
https://www.jianshu.com/p/87977f582c27?utm_source=oschina-app
https://github.com/cstjean/ScikitLearn.jl

matplot https://github.com/JuliaPlots/Plots.jl
http://docs.juliaplots.org/latest/
tf.jl
https://github.com/IntelLabs/ParallelAccelerator.jl
https://blog.csdn.net/u014636245/article/details/82216716
https://blog.csdn.net/a_step_further/article/details/79662088
There is also a model converter:
https://github.com/nok/sklearn-porter
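The idea behind a model converter can be sketched in a few lines: take the parameters of a trained model and emit equivalent source code in a faster language. The example below is only an illustration of that idea under simple assumptions (a linear model with known coefficients). The function `export_linear_model_to_c` is hypothetical and is not the sklearn-porter API; see that project's README for its actual usage.

```python
def export_linear_model_to_c(coefficients, intercept, func_name="predict"):
    """Return C source for y = sum(coefficients[i] * x[i]) + intercept.

    Hypothetical helper for illustration only; not part of sklearn-porter.
    """
    # One "coef * x[i]" term per feature, followed by the intercept.
    terms = [f"{float(c)!r} * x[{i}]" for i, c in enumerate(coefficients)]
    terms.append(repr(float(intercept)))
    body = " + ".join(terms)
    return (
        f"double {func_name}(const double *x) {{\n"
        f"    return {body};\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # Coefficients as they might come from a fitted linear model.
    print(export_linear_model_to_c([0.5, -1.25], 0.1))
```

The generated function has no dependency on Python, so it can be compiled into a C or C++ program and called at native speed.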

https://github.com/7125messi


Swift
numpy https://github.com/sonsongithub/numsw
https://github.com/nifty-swift/Nifty
plot https://github.com/i-schuetz/SwiftCharts

https://github.com/apple/coremltools
coreml https://developer.apple.com/documentation/coreml


Rust
https://github.com/rust-numpy/rust-numpy

tf https://github.com/tensorflow/rust
sklearn https://github.com/maciejkula/rustlearn
pandas https://github.com/weld-project/weld
matplot https://github.com/SiegeLord/RustGnuplot
https://github.com/ubnt-intrepid/rustplotlib
https://github.com/milliams/plotlib
https://github.com/coder543/dataplotlib


R
pandas dplyr
https://www.dataquest.io/blog/python-vs-r/
http://www.10tiao.com/html/403/201806/2650629741/1.html
http://mathesaurus.sourceforge.net/r-numpy.html
https://github.com/topepo/caret

1. caret


I have used caret successfully: http://caret.r-forge.r-project.org/

2. MLR

There is also the MLR package: https://cran.r-project.org/web/packages/mlr/index.html


3. H2O
