Anaconda: a large snake. The name presumably comes from the cute little python in the Python logo.
Anaconda is the leading open data science platform powered by Python. The open source version of Anaconda is a high performance distribution of Python and R and includes over 100 of the most popular Python, R and Scala packages for data science. Additionally, you’ll have access to over 720 packages that can easily be installed with conda, our renowned package, dependency and environment manager, that is included in Anaconda. Anaconda is BSD licensed which gives you permission to use Anaconda commercially and for redistribution. See the packages included with Anaconda and the Anaconda changelog.
After you have Eclipse, PyDev, and Anaconda installed, follow these steps to set Anaconda Python as your default by adding it as a new interpreter, and then selecting that new interpreter:
Open the Eclipse Preferences window.
Go to PyDev -> Interpreters -> Python Interpreter.
Click the New button.
In the “Interpreter Name” box, type “Anaconda Python”.
Browse to ~/anaconda/bin/python or wherever your Anaconda Python is installed.
Click the OK button.
In the next window, select all the folders to add to the SYSTEM Python path, then click the OK button again.
The Python Interpreters window will now display Anaconda Python. Click OK.
You are now ready to use Anaconda Python with your Eclipse and PyDev installation.
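One quick way to check that PyDev is really running the interpreter configured in the steps above is to print the interpreter path and version from a small script. This is a minimal sketch; the helper name `describe_interpreter` is made up for illustration, and the path you should expect depends on where Anaconda is installed on your machine.

```python
from __future__ import print_function

import sys


def describe_interpreter():
    """Return the path and version string of the running Python interpreter."""
    return sys.executable, sys.version


if __name__ == "__main__":
    executable, version = describe_interpreter()
    # With the Anaconda interpreter selected, the path should point into the
    # Anaconda install directory; the version string may also mention Anaconda.
    print("executable:", executable)
    print("version:", version)
```

If the printed path still points at the system Python, the run configuration is probably using a different interpreter than the one added above.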
lei.wang ~ $ which python
lei.wang ~ $ python --version
Python 2.7.12 :: Anaconda 4.1.1 (x86_64)
# added by Anaconda2 4.1.1 installer
export PATH="/Users/wanglei/anaconda/bin:$PATH"
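The export line puts Anaconda's bin directory at the front of PATH, which is why the shell finds Anaconda's python before any other. The lookup can be imitated in a few lines of Python. This is only a sketch of the idea, not how the shell is implemented, and `first_on_path` is a hypothetical helper name.

```python
from __future__ import print_function

import os


def first_on_path(program, path=None):
    """Return the first executable named `program` found on `path`, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    # Directories are searched left to right, so earlier entries win.
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # On a setup like the one above this should print a path inside anaconda/bin.
    print(first_on_path("python"))
```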
lei.wang ~ $ conda
usage: conda [-h] [-V] [--debug] command ...
conda is a tool for managing and deploying applications, environments and packages.
positional arguments:
  info         Display information about current conda install.
  help         Displays a list of available conda commands and their help
               strings.
  list         List linked packages in a conda environment.
  search       Search for packages and display their information. The input
               is a Python regular expression. To perform a search with a
               search string that starts with a -, separate the search from
               the options with --, like 'conda search -- -h'. A * in the
               results means that package is installed in the current
               environment. A . means that package is not installed but is
               cached in the pkgs directory.
  create       Create a new conda environment from a list of specified
               packages.
  install      Installs a list of packages into a specified conda
               environment.
  update       Updates conda packages to the latest compatible version. This
               command accepts a list of package names and updates them to
               the latest versions that are compatible with all other
               packages in the environment. Conda attempts to install the
               newest versions of the requested packages. To accomplish
               this, it may update some packages that are already installed,
               or install additional packages. To prevent existing packages
               from updating, use the --no-update-deps option. This may
               force conda to install older versions of the requested
               packages, and it does not prevent additional dependency
               packages from being installed. If you wish to skip dependency
               checking altogether, use the '--force' option. This may
               result in an environment with incompatible packages, so this
               option must be used with great caution.
  upgrade      Alias for conda update. See conda update --help.
  remove       Remove a list of packages from a specified conda environment.
  uninstall    Alias for conda remove. See conda remove --help.
  config       Modify configuration values in .condarc. This is modeled
               after the git config command. Writes to the user .condarc
               file (/Users/lei.wang/.condarc) by default.
  init         Initialize conda into a regular environment (when conda was
               installed as a Python package, e.g. using pip). (DEPRECATED)
  clean        Remove unused packages and caches.
  package      Low-level conda package utility. (EXPERIMENTAL)
  bundle       Create or extract a "bundle package" (EXPERIMENTAL)
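The subcommands listed above can also be driven from a script. The sketch below only assembles the argument list for create, install, update or remove, so it is safe to run without conda present. The helper names and the environment name "ds27" are made up for illustration; `--name` and `--yes` are standard conda options for these subcommands.

```python
from __future__ import print_function

import subprocess


def conda_command(subcommand, packages, env=None, yes=False):
    """Build the argument list for conda create/install/update/remove."""
    command = ["conda", subcommand]
    if env is not None:
        command += ["--name", env]   # target a named environment
    if yes:
        command.append("--yes")      # skip the confirmation prompt
    command += list(packages)
    return command


def run_conda(subcommand, packages, env=None, yes=False):
    """Run the command and return its exit code; requires conda on the PATH."""
    return subprocess.call(conda_command(subcommand, packages, env, yes))


if __name__ == "__main__":
    # Printing only: nothing is created or installed here.
    print(" ".join(conda_command("create", ["python=2.7", "numpy"], env="ds27")))
    print(" ".join(conda_command("install", ["scikit-learn"], env="ds27", yes=True)))
```

Keeping command construction separate from execution makes it easy to inspect what would be run before calling `run_conda`.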
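#!/usr/bin/env python
"""
Created on July 15, 2016

@author: lei.wang
"""
from __future__ import print_function

import numpy as np
from sklearn import metrics
from sklearn import preprocessing
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2.7, the interpreter shown above


def t1():
    # Pima Indians diabetes data: 8 attributes followed by a 0/1 class label
    url = "http://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
    raw_data = urlopen(url)
    dataset = np.loadtxt(raw_data, delimiter=",")
    # 0:7 keeps only the first seven attributes; use 0:8 to include all eight
    X = dataset[:, 0:7]
    y = dataset[:, 8]

    # normalize the data attributes (shown for illustration, not used below)
    normalized_X = preprocessing.normalize(X)
    # standardize the data attributes (shown for illustration, not used below)
    standardized_X = preprocessing.scale(X)

    # fit an extra-trees ensemble on the raw attributes
    model = ExtraTreesClassifier()
    model.fit(X, y)
    # display the relative importance of each attribute
    print(model.feature_importances_)

    # fit a logistic regression model on the same data
    model = LogisticRegression()
    model.fit(X, y)
    # show the fitted model's parameters (this line appears in the output)
    print(model)
    # make predictions on the training data
    expected = y
    predicted = model.predict(X)
    # summarize the fit of the model
    print(metrics.classification_report(expected, predicted))
    print(metrics.confusion_matrix(expected, predicted))


if __name__ == '__main__':
    t1()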
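The script calls `preprocessing.normalize` and `preprocessing.scale` without showing what they do. With default arguments, the first rescales each row (sample) to unit L2 length, and the second shifts and rescales each column (attribute) to zero mean and unit variance. The pure-Python sketch below reproduces both ideas on a made-up toy matrix. It is an illustration of the arithmetic, not scikit-learn's implementation, and the function names are hypothetical.

```python
from __future__ import division, print_function

import math


def normalize_rows(rows):
    """Rescale each row to unit Euclidean (L2) length."""
    result = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        result.append([v / norm if norm else 0.0 for v in row])
    return result


def standardize_columns(rows):
    """Shift and rescale each column to zero mean and unit variance."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # Population standard deviation (divide by n, not n - 1).
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n)
            for col, m in zip(columns, means)]
    return [[(v - m) / s if s else 0.0 for v, m, s in zip(row, means, stds)]
            for row in rows]


if __name__ == "__main__":
    toy = [[3.0, 4.0], [6.0, 8.0], [0.0, 2.0]]  # made-up values
    print(normalize_rows(toy))
    print(standardize_columns(toy))
```

Note that the first two toy rows normalize to the same vector, because normalization keeps only a row's direction. The output of the original script `t1()` follows.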
[ 0.13697671  0.26771573  0.11139943  0.08658428  0.079841    0.16862413
  0.1488587 ]
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
             precision    recall  f1-score   support

        0.0       0.79      0.89      0.84       500
        1.0       0.74      0.55      0.63       268

avg / total       0.77      0.77      0.77       768

[[447  53]
 [120 148]]
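The classification report and the confusion matrix describe the same predictions, so the per-class numbers can be recomputed by hand from the matrix. The sketch below does that in plain Python, assuming the scikit-learn layout where rows are true classes and columns are predicted classes. The function name `per_class_metrics` is made up for illustration.

```python
from __future__ import division, print_function


def per_class_metrics(confusion):
    """Compute (precision, recall, f1, support) per class from a confusion matrix.

    Rows are true classes and columns are predicted classes. The sketch
    assumes every class is predicted at least once (no zero column sums).
    """
    size = len(confusion)
    results = []
    for k in range(size):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(size))  # column sum
        actual_k = sum(confusion[k])                              # row sum
        precision = true_positive / predicted_k
        recall = true_positive / actual_k
        f1 = 2 * precision * recall / (precision + recall)
        results.append((precision, recall, f1, actual_k))
    return results


if __name__ == "__main__":
    confusion = [[447, 53], [120, 148]]  # matrix printed by the script above
    for label, (p, r, f1, support) in enumerate(per_class_metrics(confusion)):
        print("class %d: precision=%.2f recall=%.2f f1=%.2f support=%d"
              % (label, p, r, f1, support))
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    print("accuracy=%.4f" % (correct / total))
```

For class 0 this gives precision 447/567, about 0.79, and recall 447/500, about 0.89, matching the report. Because the model is evaluated on the same data it was trained on, these figures are likely optimistic compared with performance on held-out data.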