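Related Resources

I recently started an internship and, sadly, no longer have time to study. I am collecting resources here that look very good but that I have not had time to read, so I can study them later.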


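If the highest level of understanding a technology is being able to explain it in the simplest possible way, then Igor's understanding of CPU caches has certainly reached that level. His blog post, Gallery of Processor Cache Effects http://t.cn/hrXwvb, uses 7 extremely simple code examples to cover important topics such as cache lines, cache size, and false sharing. I am thoroughly impressed.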
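To make the cache-line idea concrete, here is a small sketch of my own (not code from Igor's post). It only counts how many distinct cache lines a strided loop touches, under the assumption of 64-byte cache lines, 4-byte integers, and an array that starts on a line boundary. The names `lines_touched` and `accesses` are made up for this illustration.

```python
# Assumptions (not from the original post): 64-byte cache lines, 4-byte ints,
# and an array that starts on a cache-line boundary. Real hardware may differ.
CACHE_LINE_BYTES = 64
INT_BYTES = 4


def lines_touched(num_elements, stride, elem_bytes=INT_BYTES, line_bytes=CACHE_LINE_BYTES):
    """Count the distinct cache lines hit by `for i in range(0, n, stride)`."""
    touched = {(i * elem_bytes) // line_bytes for i in range(0, num_elements, stride)}
    return len(touched)


def accesses(num_elements, stride):
    """Number of array elements the strided loop actually visits."""
    return len(range(0, num_elements, stride))


if __name__ == "__main__":
    n = 64 * 1024  # hypothetical array size, in elements
    for stride in (1, 2, 4, 8, 16, 32, 64):
        # Columns: stride, element accesses, distinct cache lines touched
        print(stride, accesses(n, stride), lines_touched(n, stride))
```

Under these assumptions, every stride from 1 to 16 touches the same number of cache lines even though the number of element accesses drops by 16x. That is one way to see why a memory-bound loop may not get much faster until the stride exceeds one cache line.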


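Today's NAACL tutorials include a lecture by Stanford's Richard Socher and Christopher Manning on applying deep learning to NLP. From a look at the slides, it adds some new material compared with last year's ACL version, and it can be considered a fairly comprehensive tutorial on deep learning for language technology. "Deep Learning for NLP (without Magic)" slides: http://t.cn/zHHyKUo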



Mahout applications, with a great many worked examples
http://chimpler.wordpress.com/category/mahout/


Tutorials

UBC's Machine Learning 2013 course

It includes lectures on MCMC as well as recent deep learning material

http://www.cs.ubc.ca/~nando/540-2013/lectures.html


Text mining techniques

http://www.icst.pku.edu.cn/course/mining/11-12spring/index.html


RBM code in Java, probably the code that suits my taste best

https://github.com/tjake/rbm-dbn-mnist


The Stanford NLP group has set up a dedicated page for Deep Learning in Natural Language Processing

http://nlp.stanford.edu/projects/DeepLearningInNaturalLanguageProcessing.shtml


The home page of a leading expert

http://alex.smola.org/

This is his teaching page, which has a lot of material

http://alex.smola.org/teaching/


http://www.cs.princeton.edu/courses/archive/spring10/cos424/w/syllabus



The Large Scale Learning class notes

http://cilvr.cs.nyu.edu/doku.php?id=courses:bigdata:slides:start



Algorithm tutorials

The home page of a University of Cambridge professor; the PDF on Gaussian processes is detailed and very good

http://mlg.eng.cam.ac.uk/zoubin/

Variational Bayes tutorial, very nice

http://people.inf.ethz.ch/bkay/talks/Brodersen_2013_03_22.pdf


Hadoop implementations of collaborative filtering and graph mining

https://code.google.com/p/hadoop-network/



 

Processing big data on a single machine: a collection of handy open-source tools


1. LibFM

Project home page: http://www.libfm.org/


2. SVDFeature

Project home page: http://apex.sjtu.edu.cn/apex_wiki/svdfeature


3. LIBSVM and LIBLINEAR

LIBSVM project home page: http://www.csie.ntu.edu.tw/~cjlin/libsvm/

LIBLINEAR project home page: http://www.csie.ntu.edu.tw/~cjlin/liblinear/

Required reading for first-time users: practical guide

Lessons from developing LIBSVM, by Chih-Jen Lin: http://www.csie.ntu.edu.tw/~cjlin/talks/kdd.pdf


4. rt-rank

Project home page: http://research.engineering.wustl.edu/~amohan/

rt-rank implements random forests and gradient boosted decision trees, two methods that are common in recommender systems, and it is very convenient to use.
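As a reminder of what gradient boosting does, here is a minimal sketch of my own. It is not rt-rank's code or API. It assumes a single numeric feature, squared loss, and depth-1 trees (stumps), and all function names are hypothetical. Random forests are not shown.

```python
def fit_stump(xs, residuals):
    """Pick the threshold on a 1-D feature that minimizes squared error."""
    mean_all = sum(residuals) / len(residuals)
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct feature value: fall back to a constant prediction.
        return xs[0], mean_all, mean_all
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gbdt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting loop: each stump is fit to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict_gbdt(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


if __name__ == "__main__":
    model = fit_gbdt([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5])
    print(predict_gbdt(model, 2), predict_gbdt(model, 5))
```

The small learning rate means each stump only corrects part of the remaining error, which is why many rounds are needed even on this toy step function.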


5. Mahout

Project home page: http://mahout.apache.org/


6. MyMediaLite

Project home page: http://www.ismll.uni-hildesheim.de/mymedialite/


7. GraphLab and GraphChi

GraphLab project home page: http://graphlab.org/

GraphChi project home page: http://graphlab.org/graphchi/

GraphChi download: https://code.google.com/p/graphchi/downloads/detail?name=graphchi_src_v0.1.2_toolkits.tar.gz

GraphChi introduction: http://www.technologyreview.com/news/428497/your-laptop-can-now-analyze-big-data/?nlid=nldly&nld=2012-07-17

CF for GraphChi: http://bickson.blogspot.com/2012/08/collaborative-filtering-with-graphchi.html



pylearn2

https://github.com/lisa-lab/pylearn2

It has many features and is updated frequently

  • Training algorithms
    • A “default training algorithm” that asks the model to train itself
    • Stochastic gradient descent, with extensions including
      • Learning rate decay
      • Momentum
      • Polyak averaging
      • Early stopping
      • A simple framework for adding your own extensions
    • Batch gradient descent with line searches
    • Nonlinear conjugate gradient descent (with line searches)
  • Model Estimation Criteria
    • Score Matching
    • Denoising Score Matching
    • Noise-Contrastive Estimation
    • Cross-entropy
    • Log-likelihood
  • Models
    • Autoencoders, including Contractive and Denoising Autoencoders
    • RBMs, including gaussian and ssRBM. Varying levels of integration into the full framework.
    • k-means
    • Local Coordinate Coding
    • Maxout networks
    • PCA
    • Spike-and-Slab Sparse coding
    • SVMs (we provide a wrapper around scikit-learn that makes it easy to train a multiclass svm on dense training data in a memory efficient way, which doesn’t always happen using scikit-learn directly)
    • Partial implementation of DBMs (contact Ian Goodfellow if you would like to complete it)
  • Datasets:
    • MNIST, MNIST with background and rotations
    • STL-10
    • CIFAR-10, CIFAR-100
    • NIPS Workshops 2011 Transfer Learning Challenge
    • UTLC
    • NORB
    • Toronto Faces Dataset
  • Dataset pre-processing
    • Contrast normalization
    • ZCA whitening
    • Patch extraction (for implementing convolution-like algorithms)
    • The Coates+Lee+Ng CIFAR processing pipeline
  • Miscellaneous algorithms and utilities:
    • AIS
    • Weight visualization for single layer networks
    • Can plot learning curves showing how user-configured quantities change during learning
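For my own reference, here is a rough sketch of three of the SGD extensions named in the feature list: learning rate decay, momentum, and Polyak averaging. It is plain Python, not pylearn2's API. The function name `sgd_momentum`, the 1/(1 + decay * epoch) schedule, and the toy linear model are all assumptions chosen for illustration.

```python
import random


def sgd_momentum(data, n_epochs=200, lr0=0.05, decay=0.01, momentum=0.9, seed=0):
    """Fit y ~ w * x + b by SGD on squared error.

    Returns the final parameters and their Polyak (running) average.
    """
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    vw, vb = 0.0, 0.0        # momentum "velocity" for each parameter
    avg_w, avg_b = 0.0, 0.0  # Polyak averages over all iterates
    step = 0
    for epoch in range(n_epochs):
        # Learning rate decay: one common schedule, assumed here.
        lr = lr0 / (1.0 + decay * epoch)
        rng.shuffle(data)
        for x, y in data:
            err = (w * x + b) - y
            grad_w, grad_b = err * x, err  # gradient of 0.5 * err ** 2
            # Momentum: blend the previous update with the new gradient step.
            vw = momentum * vw - lr * grad_w
            vb = momentum * vb - lr * grad_b
            w, b = w + vw, b + vb
            # Polyak averaging: incremental mean of the parameter iterates.
            step += 1
            avg_w += (w - avg_w) / step
            avg_b += (b - avg_b) / step
    return (w, b), (avg_w, avg_b)


if __name__ == "__main__":
    # Hypothetical noise-free data from y = 2x + 1.
    points = [(i / 4.0, 2.0 * (i / 4.0) + 1.0) for i in range(5)]
    final, averaged = sgd_momentum(points)
    print(final, averaged)
```

Because the average here includes the early, inaccurate iterates, it lags behind the final parameters on this noise-free toy problem. Averaging tends to pay off mainly when the gradients are noisy.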

