seaborn: jointplot

Plotting a single variable or a pair of variables

seaborn.jointplot(x, y, data=None, kind='scatter', stat_func=<function pearsonr>, color=None, size=6, ratio=5, space=0.2, dropna=True, xlim=None, ylim=None, joint_kws=None, marginal_kws=None, annot_kws=None, **kwargs)
Parameters:

  • x, y : strings or vectors
    Data or names of variables in data.
  • data : DataFrame, optional
    DataFrame when x and y are variable names.
  • kind : { "scatter" | "reg" | "resid" | "kde" | "hex" }, optional
    Kind of plot to draw.
  • stat_func : callable or None, optional
    Function used to calculate a statistic about the relationship and annotate the plot. Should map x and y either to a single value or to a (value, p) tuple. Set to None if you don’t want to annotate the plot.
  • color : matplotlib color, optional
    Color used for the plot elements.
  • size : numeric, optional
    Size of the figure (it will be square).
  • ratio : numeric, optional
    Ratio of joint axes size to marginal axes height.
  • space : numeric, optional
    Space between the joint and marginal axes
  • dropna : bool, optional
    If True, remove observations that are missing from x and y.
  • {x, y}lim : two-tuples, optional
    Axis limits to set before plotting.
  • {joint, marginal, annot}_kws : dicts, optional
    Additional keyword arguments for the plot components.
  • kwargs : key, value pairings
    Additional keyword arguments are passed to the function used to draw the plot on the joint Axes, superseding items in the joint_kws dictionary.
Returns:

  • grid : JointGrid
    JointGrid object with the plot on it.

class seaborn.JointGrid(x, y, data=None, size=6, ratio=5, space=0.2, dropna=True, xlim=None, ylim=None)
Parameters:

  • x, y : strings or vectors
    Data or names of variables in data.
  • data : DataFrame, optional
    DataFrame when x and y are variable names.
  • size : numeric
    Size of each side of the figure in inches (it will be square).
  • ratio : numeric
    Ratio of joint axes size to marginal axes height.
  • space : numeric, optional
    Space between the joint and marginal axes
  • dropna : bool, optional
    If True, remove observations that are missing from x and y.
  • {x, y}lim : two-tuples, optional
    Axis limits to set before plotting.
%matplotlib inline
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import scipy.stats as sci
sns.set_style('darkgrid')
sns.set_context('talk')
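  • jointplot draws a plot of two variables together with the distribution of each single variable; it is a convenience wrapper around the JointGrid class
  • x and y are column names in a DataFrame or two sets of data, data is the DataFrame they come from, and kind is the type of plot you want to draw
  • stat_func is the function used to compute a statistic about the relationship between the two variables
  • kind selects the plot type: scatter, reg, resid, kde, or hex
  • All examples below use stock data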
stock=pd.read_csv('sample.csv',index_col=0)
sns.jointplot(x='v_ma5',y='price_change',data=stock,kind='kde')

sns.jointplot(x='v_ma5',y='price_change',data=stock,kind='reg')

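  • space sets the gap between the central (joint) plot and the top and right (marginal) plots
  • color sets the overall color of the figure's plot elements
  • edgecolor sets the edge color of the scatter markers in the central plot
  • linewidth sets the width of the plot's lines, or of the marker edges in a scatter plot
  • marginal_kws holds the parameters for the marginal histograms, passed as a dict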

sns.jointplot(x='v_ma5',y='price_change',data=stock,kind='reg',stat_func=sci.pearsonr,space=0,color='r')

g=sns.jointplot(x='v_ma5',y='price_change',data=stock,edgecolor="g").set_axis_labels("X","Y")

sns.jointplot(x='v_ma5',y='price_change',data=stock,linewidth=6,marginal_kws=dict(bins=20, rug=True))

sns.jointplot(x='v_ma5',y='price_change',data=stock,annot_kws=dict(stat="r"))

# Initialize the class
g=sns.JointGrid(x='v_ma5',y='price_change',data=stock,space=0.5,ratio=5)
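
To see what ratio does to the layout, the axes sizes can be read back from the grid. A sketch with synthetic data; space=0 is an assumption made here so that the gap between the axes does not blur the comparison:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the stock data (assumption: random values).
rng = np.random.RandomState(4)
stock = pd.DataFrame({
    "v_ma5": rng.normal(size=100),
    "price_change": rng.normal(size=100),
})

measured = {}
for ratio in (2, 5):
    g = sns.JointGrid(x="v_ma5", y="price_change", data=stock,
                      ratio=ratio, space=0)
    joint_height = g.ax_joint.get_position().height
    marginal_height = g.ax_marg_x.get_position().height
    # Height of the joint axes relative to the top marginal axes.
    measured[ratio] = joint_height / marginal_height
```

A larger ratio gives the joint axes more room relative to the marginal axes, which shows up as a larger value in measured.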

# Draw the joint plot and the marginal plots separately
g=sns.JointGrid(x='v_ma5',y='price_change',data=stock,space=0.5,ratio=5)
g=g.plot(sns.regplot,sns.distplot)

g=sns.JointGrid(x='v_ma5',y='price_change',data=stock,space=0.5,ratio=5)
g=g.plot_joint(plt.scatter,color='.3',edgecolor='r')
g=g.plot_marginals(sns.distplot,kde=False)

# The marginal plots show different variables from the joint plot
from scipy import stats
g=sns.JointGrid(x='v_ma5',y='price_change',data=stock,space=0.5,ratio=5)
g=g.plot_joint(plt.scatter,color='.3',edgecolor='r')
_=g.ax_marg_x.hist(stock.v_ma10,color='r',alpha=.6,bins=50)
_=g.ax_marg_y.hist(stock.low,color='y',orientation="horizontal",bins=20)
g = g.annotate(stats.pearsonr)
# annotate adds a note with the computed statistic to the joint plot

  • Use a different function to annotate the joint plot
from scipy import stats
g=sns.JointGrid(x='v_ma5',y='price_change',data=stock,space=0.5,ratio=5)
g=g.plot_joint(plt.scatter,color='.3',edgecolor='r')
_=g.ax_marg_x.hist(stock.v_ma10,color='r',alpha=.6,bins=50)
_=g.ax_marg_y.hist(stock.low,color='y',orientation="horizontal",bins=20)
# Squared Pearson correlation, used as the annotation statistic
rsquare=lambda a,b:stats.pearsonr(a,b)[0]**2
g=g.annotate(rsquare,template='{stat}:{val:.2f}',stat='$R^2$',loc='upper left',fontsize=12)
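
JointGrid.annotate and stat_func appear to be gone from seaborn 0.11 onwards. A sketch of the same R² note written with plain matplotlib text on the joint axes, assuming synthetic stand-in data; the position (0.05, 0.95) is an arbitrary choice for the upper left corner:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from scipy import stats

# Synthetic stand-in for the stock data (assumption: random, loosely correlated).
rng = np.random.RandomState(2)
x = rng.normal(size=200)
stock = pd.DataFrame({"v_ma5": x, "price_change": 0.5 * x + rng.normal(size=200)})

def r_squared(a, b):
    """Squared Pearson correlation of two equal-length samples."""
    return stats.pearsonr(a, b)[0] ** 2

g = sns.JointGrid(x="v_ma5", y="price_change", data=stock, space=0.5, ratio=5)
g.plot_joint(plt.scatter, color=".3", edgecolor="r")

# Compute the statistic and write it into the upper left corner of the joint axes.
val = r_squared(stock["v_ma5"], stock["price_change"])
label = "$R^2$: {:.2f}".format(val)
g.ax_joint.text(0.05, 0.95, label, transform=g.ax_joint.transAxes,
                ha="left", va="top", fontsize=12)
```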

  • shade controls whether the area under the density curve is filled in
g=sns.JointGrid(x='v_ma5',y='price_change',data=stock,space=0.5,ratio=5)
g = g.plot_joint(sns.kdeplot, cmap="Reds_d")
g = g.plot_marginals(sns.kdeplot, color="r", shade=True)

Kernel density estimation


KDE stands for kernel density estimate.
kdeplot draws the kernel density estimate of a single variable or of a pair of variables.
seaborn.kdeplot(data, data2=None, shade=False, vertical=False, kernel='gau', bw='scott', gridsize=100, cut=3, clip=None, legend=True, cumulative=False, shade_lowest=True, ax=None, **kwargs)
Parameters:
- data : 1d array-like
Input data.
- data2 : 1d array-like, optional
Second input data. If present, a bivariate KDE will be estimated.
- shade : bool, optional
If True, shade in the area under the KDE curve (or draw with filled contours when data is bivariate).
- vertical : bool, optional
If True, density is on x-axis.
- kernel : { 'gau' | 'cos' | 'biw' | 'epa' | 'tri' | 'triw' }, optional
Code for shape of kernel to fit with. Bivariate KDE can only use gaussian kernel.
- bw : { 'scott' | 'silverman' | scalar | pair of scalars }, optional
Name of reference method to determine kernel size, scalar factor, or scalar for each dimension of the bivariate plot.
- gridsize : int, optional
Number of discrete points in the evaluation grid.
- cut : scalar, optional
Draw the estimate to cut * bw from the extreme data points.
- clip : pair of scalars, or pair of pair of scalars, optional
Lower and upper bounds for datapoints used to fit KDE. Can provide a pair of (low, high) bounds for bivariate plots.
- legend : bool, optional
If True, add a legend or label the axes when possible.
- cumulative : bool, optional
If True, draw the cumulative distribution estimated by the kde.
- shade_lowest : bool, optional
If True, shade the lowest contour of a bivariate KDE plot. Not relevant when drawing a univariate plot or when shade=False. Setting this to False can be useful when you want multiple densities on the same Axes.
- ax : matplotlib axis, optional
Axis to plot on, otherwise uses current axis.
- kwargs : key, value pairings
Other keyword arguments are passed to plt.plot() or plt.contour{f} depending on whether a univariate or bivariate plot is being drawn.
Returns:
ax : matplotlib Axes
Axes with plot.
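
To make bw, gridsize and cut more concrete, here is a sketch of a univariate Gaussian KDE computed by hand with numpy. It assumes Scott's rule in the form scipy.stats.gaussian_kde uses it; the backend seaborn itself relies on may compute the bandwidth somewhat differently, and the sample is made up:

```python
import numpy as np
from scipy import stats

# Hypothetical sample standing in for a price column.
rng = np.random.RandomState(3)
data = rng.normal(loc=10.0, scale=2.0, size=500)

# Scott's rule for 1-D data: bandwidth = n**(-1/5) * sample standard deviation.
n = data.size
bw = n ** (-1 / 5) * data.std(ddof=1)

# Evaluation grid: gridsize points, reaching cut * bw beyond the extreme data points.
gridsize, cut = 100, 3
grid = np.linspace(data.min() - cut * bw, data.max() + cut * bw, gridsize)

# Gaussian KDE: the average of one normal density centred on each observation.
density = stats.norm.pdf((grid[:, None] - data[None, :]) / bw).mean(axis=1) / bw

# Reference result from scipy for comparison.
reference = stats.gaussian_kde(data, bw_method="scott")(grid)

# Trapezoidal estimate of the area under the curve; it should be close to 1.
area = np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid))
```

A smaller bw gives a wigglier curve, a larger one a smoother curve; cut only changes how far past the data the curve is drawn.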

stock=pd.read_csv('sample.csv',index_col=0)
ax=sns.kdeplot(stock.open,shade=True,color='r',cumulative=True)

ax=sns.kdeplot(stock.open,shade=True,color='r',vertical=True)

sns.kdeplot(stock.open,stock.v_ma10,shade=True)

countplot
Parameters:

  • x, y, hue : names of variables in data or vector data, optional
    Inputs for plotting long-form data. See examples for interpretation.
  • data : DataFrame, array, or list of arrays, optional
    Dataset for plotting. If x and y are absent, this is interpreted as wide-form. Otherwise it is expected to be long-form.
  • order, hue_order : lists of strings, optional
    Order to plot the categorical levels in, otherwise the levels are inferred from the data objects.
  • orient : “v” | “h”, optional
    Orientation of the plot (vertical or horizontal). This is usually inferred from the dtype of the input variables, but can be used to specify when the “categorical” variable is a numeric or when plotting wide-form data.
  • color : matplotlib color, optional
    Color for all of the elements, or seed for light_palette() when using hue nesting.
  • palette : seaborn color palette or dict, optional
    Colors to use for the different levels of the hue variable. Should be something that can be interpreted by color_palette(), or a dictionary mapping hue levels to matplotlib colors.
  • saturation : float, optional
    Proportion of the original saturation to draw colors at. Large patches often look better with slightly desaturated colors, but set this to 1 if you want the plot colors to perfectly match the input color spec.
  • ax : matplotlib Axes, optional
    Axes object to draw the plot onto, otherwise uses the current Axes.
  • kwargs : key, value mappings
    Other keyword arguments are passed to plt.bar.
sns.countplot(x='code',data=stock)
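
countplot draws one bar per category, with the bar height equal to the number of rows in that category. A sketch that checks this against value_counts, assuming a small made-up code column (the codes are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up category column standing in for stock codes.
stock = pd.DataFrame({"code": ["AAA", "AAA", "BBB", "AAA", "CCC", "BBB"]})

# Row count per category, most frequent first.
counts = stock["code"].value_counts()

fig, ax = plt.subplots()
# order fixes the left-to-right position of the bars.
sns.countplot(x="code", data=stock, order=list(counts.index), ax=ax)

# Heights of the drawn bars, to compare with the counts.
heights = [bar.get_height() for bar in ax.patches]
```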
