Building Box Plots Using JavaScript: Visualizing World Happiness


Data visualization is an important and sometimes undervalued tool in a data scientist's toolkit. Through exploratory data analysis it helps us build an understanding of, and an intuition for, the data, which in turn informs preprocessing, feature engineering, and the choice of machine learning algorithm. It also helps in evaluating models and can even reveal regions of the data where a model is likely to perform poorly.

Adding interactivity takes data visualization one step further. Interactive elements create a more engaging experience, so that a user 'explores' a visualization instead of just reading it!

In this tutorial, I will cover how to build an interactive data visualization, using a box plot as the example, with JavaScript and a charting library. I will briefly go over the basics of box plots, walk through the steps of building one, and finally apply the technique in a fun example that investigates how happiness is distributed across the regions of the planet, in an attempt to answer the question: 'Where should you live to be happier?'

What is a box plot?

A box plot, also widely called a box-and-whisker plot, is a data visualization technique used to display descriptive statistics of datasets. While this chart type is not as useful as a histogram for understanding a single dataset's distribution, it does well at letting a user compare different datasets.

Box plots visualize the following summary statistics: the median, the first and third quartiles (Q1 and Q3), the low and the high, as well as the outliers. These are displayed as follows:

(Figure: box plot diagram. Diagram by the author.)
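To make these statistics concrete, here is a minimal sketch of how they could be computed from raw values. The function names `quantile` and `boxPlotStats` are hypothetical helpers, not part of any charting library. The quartiles use linear interpolation, which is only one of several common conventions, so other tools may report slightly different values. The whisker ends are taken as the most extreme values within 1.5 × the interquartile range of the box.

```javascript
// Linear-interpolation quantile on an already sorted array.
// This is one common convention; other tools may differ slightly.
function quantile(sorted, p) {
  const pos = (sorted.length - 1) * p;
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

// Summary statistics for one box, assuming a non-empty array of numbers.
function boxPlotStats(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const median = quantile(sorted, 0.5);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;

  // Values beyond these fences are treated as outliers.
  const lowerFence = q1 - 1.5 * iqr;
  const upperFence = q3 + 1.5 * iqr;
  const inside = sorted.filter((v) => v >= lowerFence && v <= upperFence);
  const outliers = sorted.filter((v) => v < lowerFence || v > upperFence);

  return {
    low: inside[0],                  // lower whisker end
    q1: q1,
    median: median,
    q3: q3,
    high: inside[inside.length - 1], // upper whisker end
    outliers: outliers,
  };
}

// Made-up sample values, purely for illustration.
console.log(boxPlotStats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30]));
```

In the made-up sample, 30 lies far above the upper fence and is reported as an outlier, while the whiskers stop at 1 and 9.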

How to create a JavaScript box plot

There are quite a few options for building an interactive data visualization, including a number of JavaScript charting libraries. In this example I will be using AnyChart. I'm going with AnyChart because it supports box-and-whisker plots (among many other chart types), and I think both its documentation and its API work well for beginners and advanced users alike. An alternative that better suits your needs can work too and will follow similar steps.

Step 1: Set up the page

The first step is to set up a page for the box plot visualization. This includes adding the HTML elements, loading the required scripts, and setting up the CSS for our chart.
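The page described in this step can be sketched roughly as follows. This is a reconstruction under assumptions, not the author's original listing: the CDN URL and its `v8` release path are my assumption about which AnyChart build is loaded, and the `margin` and `padding` resets are an addition. The `container` id matches the id passed to `chart.container()` later in the tutorial.

```html
<html>
  <head>
    <!-- Assumed CDN path for the AnyChart base module -->
    <script src="https://cdn.anychart.com/releases/v8/js/anychart-base.min.js"></script>
    <style type="text/css">
      /* Let the chart fill the whole page */
      html, body, #container {
        width: 100%;
        height: 100%;
        margin: 0;
        padding: 0;
      }
    </style>
  </head>
  <body>
    <!-- The chart is drawn into this element -->
    <div id="container"></div>
    <script>
      anychart.onDocumentReady(function () {
        // charting code goes here
      });
    </script>
  </body>
</html>
```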










When using a charting library you need to import the correct script in order to use that library, and in some cases different modules for different chart types. To access AnyChart's box-and-whisker chart, for example, I need the base module.

Once that is sorted, I need to set the CSS properties for my chart element. Here I have given the box chart a width and height of 100%. You can change this depending on your own use case. The CSS width and height properties accept percentages (of the parent element) and various length units (most commonly pixels).

Finally, I have a script tag containing the JavaScript function anychart.onDocumentReady(), which is simply a function triggered when the document has loaded. Placing the charting code within this function ensures that it does not run before the page is ready, which can lead to bad results (read up on asynchronous JavaScript to learn more about this).

Step 2: Load the data

I will be using data sourced from the World Happiness Report, which compiles the results of a global survey that attempts to quantify the happiness of each country's citizens as a value between 0 and 10. I obtained this data from Kaggle, a great place to find fun and interesting datasets. Admittedly, most of them are geared towards machine learning applications, but a few work well for data visualization purposes.

In preparation for drawing box plots, I need to provide the data in a form that is accepted by the chosen charting library. For example, AnyChart JS accepts box plot data in the following form:

{x:"Name", low: value, q1: value, median: value, q3: value, high: value, outliers: array}

Here x is the label, median is the middle value, and q1 and q3 are the first and third quartile values. low and high are the ends of the whiskers: by the usual convention, the smallest and largest data values that still lie within 1.5 × the interquartile range below q1 and above q3 respectively. outliers is an array containing all the values that fall outside that range.

I have conveniently preprocessed the data from the World Happiness Report to produce the following array:

var data = [
{x:"Western Europe", low: 5.03, q1: 6.36, median: 6.91, q3: 7.34, high: 7.53},
{x:"North America", low: 7.10, q1: 7.18, median: 7.25, q3: 7.33, high: 7.40},
{x:"Australia and New Zealand", low: 7.31, q1: 7.32, median: 7.32, q3: 7.33, high: 7.33},
{x:"Middle East and Northern Africa", low: 3.07, q1: 4.78, median: 5.30, q3: 6.30, high: 7.27},
{x:"Latin America and Caribbean", low: 4.87, q1: 5.80, median: 6.13, q3: 6.66, high: 7.09, outliers: [4.03]},
{x:"Southeastern Asia", low: 3.91, q1: 4.88, median: 5.28, q3: 6.01, high: 6.74},
{x:"Central and Eastern Europe", low: 4.22, q1: 5.15, median: 5.49, q3: 5.81, high: 6.60},
{x:"Eastern Asia", low: 4.91, q1: 5.30, median: 5.65, q3: 5.90, high: 6.38},
{x:"Sub-Saharan Africa", low: 2.91, q1: 3.74, median: 4.13, q3: 4.43, high: 5.44, outliers: [5.648]},
{x:"Southern Asia", low: 4.40, q1: 4.41, median: 4.64, q3: 4.96, high: 5.20, outliers: [3.36]}
]
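Before handing an array like this to a charting library, a quick sanity check can catch preprocessing mistakes. The sketch below is a hypothetical helper (`validateBoxPoint` is not part of AnyChart) that assumes the ordering low ≤ q1 ≤ median ≤ q3 ≤ high and assumes that outliers should fall outside the whiskers. The third sample point is deliberately broken for illustration and is not from the dataset.

```javascript
// Returns a list of problems found in one box plot data point.
// An empty list means the point looks consistent.
function validateBoxPoint(point) {
  const problems = [];
  const { low, q1, median, q3, high, outliers = [] } = point;

  // The five summary values must be in non-decreasing order.
  if (!(low <= q1 && q1 <= median && median <= q3 && q3 <= high)) {
    problems.push("expected low <= q1 <= median <= q3 <= high");
  }

  // An outlier should lie below the low whisker or above the high whisker.
  for (const value of outliers) {
    if (value >= low && value <= high) {
      problems.push("outlier " + value + " lies inside the whiskers");
    }
  }
  return problems;
}

const sample = [
  // Two points copied from the array above.
  { x: "Eastern Asia", low: 4.91, q1: 5.30, median: 5.65, q3: 5.90, high: 6.38 },
  { x: "Southern Asia", low: 4.40, q1: 4.41, median: 4.64, q3: 4.96, high: 5.20, outliers: [3.36] },
  // Hypothetical broken point: q1 and q3 are swapped.
  { x: "Broken example", low: 1, q1: 4, median: 3, q3: 2, high: 5 },
];

sample.forEach(function (point) {
  const problems = validateBoxPoint(point);
  console.log(point.x + ": " + (problems.length ? problems.join("; ") : "ok"));
});
```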

Step 3: Draw the box chart

With only these few lines of code I can draw my box plots:

// create a chart
var chart = anychart.box();

// create a box series and set the data
var series = chart.box(data);

// set the container id
chart.container("container");

// initiate drawing the chart
chart.draw();

Putting this all together, with the data from Step 2 and the drawing code above placed inside anychart.onDocumentReady() on the page from Step 1, results in:

Codepen of results.

With these simple steps, I have quickly produced a fully functional interactive box-and-whisker plot that I can now easily embed in any website or app!

While not bad, I think I can do better. Just keep reading.

Step 4: Customize the box-and-whisker plot

Data visualization isn't just processing some data and putting it into a chart. It is about storytelling. It is about making adjustments in order to highlight an insight or to make a visualization more engaging.

All decent charting libraries provide many ways to do this through their API, and you can generally find these options by looking through the documentation. As I am using the AnyChart JS charting library in this example, I will go through a few of the customization options it offers for box plots.

Customize the chart design

It is possible to change many cosmetic and functional aspects of the visualization. To start with, I'll add a custom title, set the title of each axis, and stagger the x-axis labels to prevent them from overlapping:

// set the chart title
var title = chart.title("Happiness Level by Region");

// label the axes
chart.xAxis().title("Regions");
chart.yAxis().title("Happiness Level");

// stagger the x-axis labels
chart.xAxis().staggerMode(true);

For the boxes themselves, for example, I can change how they look in their default state, when hovered over, and when selected. I can make similar changes to the median line, the stems, the whiskers, and the outliers. Typically, I would take advantage of these customization options to make my visualization fit better with the layout or theme of the place where I will be hosting it, or to better suit the data being displayed (e.g. using greens when visualizing environmental data).

For the outliers I can even change the shape by setting the marker type. (AnyChart has a variety of options, which can be seen in the API reference.)

These changes can be made easily with the following code:

// configure visual appearance of series
series.normal().fill("#36558F", 0.2);
series.hovered().fill("#36558F", 0.2);
series.selected().fill("#36558F", 0.6);
series.normal().stroke("#36558F", 1);
series.hovered().stroke("#36558F", 2);
series.selected().stroke("#36558F", 4);

// configure medians
series.normal().medianStroke("#dd2c00", 1);
series.hovered().medianStroke("#dd2c00", 2);
series.selected().medianStroke("#dd2c00", 2);

// configure outliers
series.normal().outlierMarkers({
  fill: "#36558F 0.2",
  stroke: { color: "#36558F", thickness: 1 },
  size: 5,
  type: "star5",
});
series.hovered().outlierMarkers({
  fill: "#36558F 0.2",
  stroke: { color: "#36558F", thickness: 2 },
  size: 5,
  type: "star5",
});
series.selected().outlierMarkers({
  fill: "#36558F 0.6",
  stroke: { color: "#36558F", thickness: 4 },
  size: 7.5,
  type: "star5",
});

// configure stems
series.normal().stemStroke("#36558F", 0.5);
series.hovered().stemStroke("#36558F", 1);
series.selected().stemStroke("#36558F", 2);

// configure whiskers
series.whiskerWidth(50);
series.normal().whiskerStroke("#36558F", 0.5);
series.hovered().whiskerStroke("#36558F", 1);
series.selected().whiskerStroke("#36558F", 2);
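Since the same color is applied across the normal, hovered, and selected states with only the width changing, the repetition could be folded into a small helper. The sketch below is one possible way to do that; `applyStroke` is a hypothetical function of my own, not an AnyChart API, and it only relies on the `normal()`, `hovered()`, and `selected()` state objects and the stroke methods already used above.

```javascript
// Applies one stroke method (e.g. "stemStroke") to all three states of a
// series, using the same color and a per-state width.
function applyStroke(series, method, color, widths) {
  ["normal", "hovered", "selected"].forEach(function (state) {
    series[state]()[method](color, widths[state]);
  });
}

// Intended usage inside the charting code, where `series` is the box series:
//   applyStroke(series, "stemStroke", "#36558F", { normal: 0.5, hovered: 1, selected: 2 });
//   applyStroke(series, "whiskerStroke", "#36558F", { normal: 0.5, hovered: 1, selected: 2 });
```

Whether this is clearer than the explicit calls is a matter of taste; the explicit version is easier to scan, while the helper keeps the color in one place.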

Here the fill calls take a color and an opacity, while the stroke calls take a color and a width. Color arguments can be given in many accepted formats. Here I've gone with the widely used hex codes.

Incorporating all of this results in:

Codepen of results.

Improve the box plot tooltip

As you may have noticed, when hovering over a box you can see all of the summary statistics used to draw these plots, except for the outliers. I'll fix that and add the outlier data to the box plot tooltip as well.

// configure tooltip
chart.tooltip().titleFormat("Region: {%x}");
chart.tooltip().format("Low: {%low} \n High: {%high} \n Quartile 1: {%q1} \n Quartile 3: {%q3} \n Median: {%median} \n Outliers: {%outliers}");
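To see what the format string above is meant to produce, here is a plain JavaScript sketch that builds the same text from one data point. `describeBox` is a hypothetical helper, not an AnyChart function, and it does not hook into the tooltip. Printing "none" for regions without outliers is my own choice, since the source does not say how the library renders a missing outliers field.

```javascript
// Builds the tooltip body text for one box plot data point.
function describeBox(point) {
  const outliers =
    point.outliers && point.outliers.length > 0
      ? point.outliers.join(", ")
      : "none"; // assumption: how to show regions without outliers
  return [
    "Low: " + point.low,
    "High: " + point.high,
    "Quartile 1: " + point.q1,
    "Quartile 3: " + point.q3,
    "Median: " + point.median,
    "Outliers: " + outliers,
  ].join("\n");
}

// A point copied from the data array in Step 2.
const southernAsia = {
  x: "Southern Asia",
  low: 4.40, q1: 4.41, median: 4.64, q3: 4.96, high: 5.20,
  outliers: [3.36],
};

console.log("Region: " + southernAsia.x);
console.log(describeBox(southernAsia));
```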

And if you add that to the previous code, you get the following interactive box-and-whisker chart:

Codepen of results.

Awesome! We've just visualized happiness (sort of)! From the above plot I can clearly see that Sub-Saharan Africa isn't the happiest of places. While Western Europeans and North Americans smile a ton, the happiest place to be is clearly Australia and New Zealand!

*I am from Sub-Saharan Africa and, going by my own anecdotal experience, I am not too confident in these results!
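The same reading of the chart can be checked directly against the numbers. The short sketch below ranks the regions by median happiness score, using the medians copied from the data array in Step 2. Ranking by median is just one reasonable way to compare the boxes; it ignores the spread within each region.

```javascript
// Median happiness score per region, copied from the Step 2 data array.
const regions = [
  { x: "Western Europe", median: 6.91 },
  { x: "North America", median: 7.25 },
  { x: "Australia and New Zealand", median: 7.32 },
  { x: "Middle East and Northern Africa", median: 5.30 },
  { x: "Latin America and Caribbean", median: 6.13 },
  { x: "Southeastern Asia", median: 5.28 },
  { x: "Central and Eastern Europe", median: 5.49 },
  { x: "Eastern Asia", median: 5.65 },
  { x: "Sub-Saharan Africa", median: 4.13 },
  { x: "Southern Asia", median: 4.64 },
];

// Sort a copy from highest to lowest median, leaving the original untouched.
const ranked = [...regions].sort((a, b) => b.median - a.median);

ranked.forEach(function (region, index) {
  console.log(index + 1 + ". " + region.x + " (" + region.median + ")");
});
```

Australia and New Zealand come out on top, followed by North America and Western Europe, with Sub-Saharan Africa last.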

Conclusion

As you can see, making an interactive data visualization is very easy. It doesn't require much knowledge of JavaScript to get started (depending on the charting library you use), and the results are great! Here I created a box plot, but the process is very similar for other chart types, and with the documentation at hand it is easy to switch to one.

This is only the tip of the iceberg with regard to what you can do, whether that means more interesting customizations or different data sources. I hope that this tutorial on box-and-whisker plots can be a great springboard for learning further!

Translated from: https://towardsdatascience.com/building-box-plots-using-javascript-visualizing-world-happiness-ab0dd1d370c5
