GarfieldEr007

Visualizing MNIST: An Exploration of Dimensionality Reduction

At some fundamental level, no one understands machine learning.

It isn’t a matter of things being too complicated. Almost everything we do is fundamentally very simple. Unfortunately, an innate human handicap interferes with us understanding these simple things.

Humans evolved to reason fluidly about two and three dimensions. With some effort, we may think in four dimensions. Machine learning often demands we work with thousands of dimensions – or tens of thousands, or millions! Even very simple things become hard to understand when you do them in very high numbers of dimensions.

Reasoning directly about these high dimensional spaces is just short of hopeless.

As is often the case when humans can’t directly do something, we’ve built tools to help us. There is an entire, well-developed field, called dimensionality reduction, which explores techniques for translating high-dimensional data into lower dimensional data. Much work has also been done on the closely related subject of visualizing high dimensional data.

These techniques are the basic building blocks we will need if we wish to visualize machine learning, and deep learning specifically. My hope is that, through visualization and observing more directly what is actually happening, we can understand neural networks in a much deeper and more direct way.

And so, the first thing on our agenda is to familiarize ourselves with dimensionality reduction. To do that, we’re going to need a dataset to test these techniques on.

MNIST

MNIST is a simple computer vision dataset. It consists of 28x28 pixel images of handwritten digits.

Every MNIST data point, every image, can be thought of as an array of numbers describing how dark each pixel is. For example, we might think of one such image as something like:

$$
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\vdots & & & & & & & & & & & & & \vdots \\
0 & 0 & .6 & .7 & .7 & .5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & .8 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & .9 & .3 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & .4 & .4 & .4 & .7 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & .1 & .1 & 0 & 0 \\
\vdots & & & & & & & & & & & & & \vdots \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
$$

Since each image has 28 by 28 pixels, we get a 28x28 array. We can flatten each array into a 28∗28=784 dimensional vector. Each component of the vector is a value between zero and one describing the intensity of the pixel. Thus, we generally think of MNIST as being a collection of 784-dimensional vectors.
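As a minimal sketch of the flattening step, assuming NumPy and a random array standing in for a real digit (no MNIST loader is used here, and the variable names are only illustrative):

```python
import numpy as np

# Stand-in for one MNIST image: a 28x28 array of intensities in [0, 1].
# (Hypothetical data; a real image would be loaded from the dataset.)
rng = np.random.default_rng(0)
image = rng.random((28, 28))

# Flatten row by row into a single 784-dimensional vector.
vector = image.reshape(-1)

# Pixel (row, col) of the image lands at index row * 28 + col.
row, col = 7, 12
assert vector[row * 28 + col] == image[row, col]
print(vector.shape)  # (784,)
```

Nothing is lost in the process: reshaping the vector back to 28x28 recovers the image.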

Not all vectors in this 784-dimensional space are MNIST digits. Typical points in this space are very different! To get a sense of what a typical point looks like, we can randomly pick a few points and examine them. In a random point – a random 28x28 image – each pixel is randomly black, white or some shade of gray. The result is that random points look like noise.

Images like MNIST digits are very rare. While the MNIST data points are embedded in 784-dimensional space, they live in a very small subspace. With some slightly harder arguments, we can see that they occupy a lower dimensional subspace.

People have lots of theories about what sort of lower dimensional structure MNIST, and similar data, have. One popular theory among machine learning researchers is the manifold hypothesis: MNIST is a low dimensional manifold, sweeping and curving through its high-dimensional embedding space. Another hypothesis, more associated with topological data analysis, is that data like MNIST consists of blobs with tentacle-like protrusions sticking out into the surrounding space.

But no one really knows, so let's explore!

The MNIST Cube

We can think of the MNIST data points as points suspended in a 784-dimensional cube. Each dimension of the cube corresponds to a particular pixel. The data points range from zero to one according to the pixel's intensity. On one side of the dimension, there are images where that pixel is white. On the other side of the dimension, there are images where it is black. In between, there are images where it is gray.

If we think of it this way, a natural question occurs. What does the cube look like if we look at a particular two-dimensional face? Like staring into a snow-globe, we see the data points projected into two dimensions, with one dimension corresponding to the intensity of a particular pixel, and the other corresponding to the intensity of a second pixel. Examining this allows us to explore MNIST in a very raw way.

In this visualization, each dot is an MNIST data point. The dots are colored based on which class of digit the data point belongs to. When your mouse hovers over a dot, the image for that data point is displayed on each axis. Each axis corresponds to the intensity of a particular pixel, as labeled and visualized as a blue dot in the small image beside it. By clicking on the image, you can change which pixel is displayed on that axis.
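Looking at a face of the cube amounts to keeping two coordinates of every flattened image and discarding the other 782. A rough sketch, assuming NumPy, random stand-in data, and that a label like $p_{18,16}$ means row 18, column 16 (the helper name `project_onto_face` is hypothetical):

```python
import numpy as np

def project_onto_face(X, pixel_a, pixel_b, width=28):
    """Return an (n, 2) array holding the intensities of two chosen pixels.

    X is an (n, 784) array of flattened images; pixel_a and pixel_b are
    (row, col) pairs. Reading p_{r,c} as (row, col) is an assumption.
    """
    index_a = pixel_a[0] * width + pixel_a[1]
    index_b = pixel_b[0] * width + pixel_b[1]
    return X[:, [index_a, index_b]]

# Hypothetical stand-in data: 5 random "images" instead of real MNIST digits.
rng = np.random.default_rng(0)
X = rng.random((5, 784))
points_2d = project_onto_face(X, (18, 16), (7, 12))
print(points_2d.shape)  # (5, 2)
```

Each row of `points_2d` is one dot in the scatter plot.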

Exploring this visualization, we can see some glimpses of the structure of MNIST. Looking at the pixels $p_{18,16}$ and $p_{7,12}$, we are able to separate a lot of zeros to the bottom right and a lot of nines to the top left. Looking at pixels $p_{5,6}$ and $p_{7,9}$ we can see a lot of twos at the top right and threes at the bottom right.

Despite minor successes like these, one can’t really understand MNIST this way. The small insights one gains feel very fragile and feel a lot like luck. The truth is, simply, that very little of MNIST’s structure is visible from these perspectives. You can’t understand images by looking at just two pixels at a time.

But there’s lots of other perspectives we could look at MNIST from! In these perspectives, instead of looking at a face straight on, one looks at it from an angle.

The challenge is that we need to choose what perspective we want to use. What angle do we want to look at it from horizontally? What angle do we want to look at it from vertically? Thankfully, there’s a technique called Principal Components Analysis (PCA) that will find the best possible angle for us. By this, we mean that PCA will find the angle that spreads out the points the most (captures the most variance possible).

But, what does it even mean to look at a 784-dimensional cube from an angle? Well, we need to decide which direction every axis of the cube should be tilted: to one side, to the other, or somewhere in between?

To be concrete, the following are pictures of the two angles PCA chooses. Red represents tilting a pixel’s dimension to one side, blue to the other.

If an MNIST digit primarily highlights red, it ends up on one side. If it highlights blue, it ends up on a different side. The first angle – the “first principal component” – will be our horizontal angle, pushing ones (which highlight lots of red and little blue) to the left and zeros (which highlight lots of blue and little red) to the right.
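Concretely, each "angle" is a unit vector with one weight per pixel, and a point's position along it is a weighted sum of its pixel intensities. The sketch below computes the first two principal components with a singular value decomposition. It assumes NumPy and uses synthetic data with one dominant direction in place of MNIST; `pca_2d` is an illustrative name, not code from the post.

```python
import numpy as np

def pca_2d(X):
    """Project the rows of X onto the first two principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are unit-length directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:2]                       # shape (2, n_features)
    return X_centered @ components.T, components

# Synthetic stand-in for MNIST: 200 points in 784 dimensions whose
# variance lies mostly along one hidden direction, plus a little noise.
rng = np.random.default_rng(0)
direction = rng.normal(size=784)
direction /= np.linalg.norm(direction)
X = rng.normal(size=(200, 1)) * 5.0 * direction + 0.1 * rng.normal(size=(200, 784))

Y, components = pca_2d(X)
print(Y.shape)                                # (200, 2)
```

Reshaping `components[0]` to 28x28 gives a per-pixel tilt picture of the kind described above, with positive and negative weights playing the roles of the two colors.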

Now that we know what the best horizontal and vertical angles are, we can try to look at the cube from that perspective.

This visualization is much like the one above, but now the axes are fixed to displaying the first and second ‘principal components,’ basically angles of looking at the data. In the image on each axis, blue and red are used to denote what the ‘tilt’ is for that pixel. Pixel intensity in blue regions pushes a data point to one side, pixel intensity in red regions pushes us to the other.

Visualizing MNIST with PCA

While much better than before, it’s still not terribly good. Unfortunately, even looking at the data from the best angle, MNIST data doesn’t line up nicely for us to look at. It’s a non-trivial high-dimensional structure, and these sorts of linear projections just aren’t going to cut it.

Thankfully, we have some powerful tools for dealing with datasets which are… uncooperative.

Optimization-Based Dimensionality Reduction

What would we consider a success? What would it mean to have the ‘perfect’ visualization of MNIST? What should our goal be?

One really nice property would be if the distances between points in our visualization were the same as the distances between points in the original space. If that was true, we’d be capturing the global geometry of the data.

Let’s be a bit more precise. For any two MNIST data points, $x_i$ and $x_j$, there are two notions of distance between them. One is the distance between them in the original space [1] and one is the distance between them in our visualization. We will use $d^*_{i,j}$ to denote the distance between $x_i$ and $x_j$ in the original space and $d_{i,j}$ to denote the distance between $x_i$ and $x_j$ in our visualization. Now we can define a cost:

$$C = \sum_{i \neq j} \left(d^*_{i,j} - d_{i,j}\right)^2$$

This value describes how bad a visualization is. It basically says: “It’s bad for distances to not be the same. In fact, it’s quadratically bad.” If it’s high, it means that distances are dissimilar to the original space. If it’s small, it means they are similar. If it is zero, we have a ‘perfect’ embedding.

That sounds like an optimization problem! And deep learning researchers know what to do with those! We pick a random starting point and apply gradient descent. [2]
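A minimal sketch of that optimization, assuming NumPy, random vectors standing in for MNIST, and plain gradient descent with a hand-picked step size (the post's own optimizer is a momentum variant, so this is a simplification). The function names are illustrative.

```python
import numpy as np

def pairwise_distances(P):
    """L2 distance between every pair of rows of P."""
    diff = P[:, None, :] - P[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def mds_cost(D_orig, Y):
    """C = sum over i != j of (d*_ij - d_ij)^2; diagonal terms are zero."""
    return ((D_orig - pairwise_distances(Y)) ** 2).sum()

def mds_gradient(D_orig, Y):
    """Gradient of the cost with respect to every embedded point."""
    diff = Y[:, None, :] - Y[None, :, :]
    D_low = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(D_low, 1.0)              # avoid 0/0; diff is zero there anyway
    D_low = np.maximum(D_low, 1e-12)          # guard against coincident points
    coeff = 4.0 * (D_low - D_orig) / D_low    # factor 4: each pair is counted twice
    np.fill_diagonal(coeff, 0.0)
    return (coeff[:, :, None] * diff).sum(axis=1)

rng = np.random.default_rng(0)
X = rng.random((20, 784))                     # hypothetical stand-in for MNIST vectors
D_orig = pairwise_distances(X)

Y = rng.normal(scale=1e-2, size=(20, 2))      # random start near the origin
initial_cost = mds_cost(D_orig, Y)
for _ in range(500):
    Y -= 0.005 * mds_gradient(D_orig, Y)      # plain gradient descent step
final_cost = mds_cost(D_orig, Y)
print(initial_cost, final_cost)
```

The step size 0.005 was chosen for 20 points; since the gradient sums over all other points, a larger dataset would likely need a smaller step.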

Visualizing MNIST with MDS

This technique is called multidimensional scaling (or MDS). If you like, there’s a more physical description of what’s going on. First, we randomly position each point on a plane. Next we connect each pair of points with a spring with the length of the original distance, $d^*_{i,j}$. Then we let the points move freely and allow physics to take its course!

We don’t reach a cost of zero, of course. Generally, high-dimensional structures can’t be embedded in two dimensions in a way that preserves distances perfectly. We’re demanding the impossible! But, even though we don’t get a perfect answer, we do improve a lot on the original random embedding, and come to a decent visualization. We can see the different classes begin to separate, especially the ones.

Sammon’s Mapping

Still, it seems like we should be able to do much better. Perhaps we should consider different cost functions? There’s a huge space of possibilities. To start, there’s a lot of variations on MDS. A common theme is cost functions emphasizing local structure as more important to maintain than global structure. A very simple example of this is Sammon’s Mapping, defined by the cost function:

$$C = \sum_{i \neq j} \frac{\left(d^*_{i,j} - d_{i,j}\right)^2}{d^*_{i,j}}$$

In Sammon’s mapping, we try harder to preserve the distances between nearby points than between those which are far apart. If two points are twice as close in the original space as two others, it is twice as important to maintain the distance between them.
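A toy calculation makes the weighting visible. The sketch below assumes NumPy and made-up distance matrices (not MNIST distances): every embedded distance is wrong by the same amount, yet the close pair is penalized twice as much as the far pairs.

```python
import numpy as np

def sammon_cost(D_orig, D_low):
    """Sum over i != j of (d*_ij - d_ij)^2 / d*_ij."""
    off_diagonal = ~np.eye(len(D_orig), dtype=bool)   # skip the i == j terms
    squared_error = (D_orig - D_low) ** 2
    return (squared_error[off_diagonal] / D_orig[off_diagonal]).sum()

# Toy distances (hypothetical): points 0 and 1 are close, point 2 is farther away.
D_orig = np.array([[0.0, 1.0, 2.0],
                   [1.0, 0.0, 2.0],
                   [2.0, 2.0, 0.0]])
# Suppose the embedding gets every distance wrong by the same amount, 0.5.
D_low = D_orig + 0.5 * (1 - np.eye(3))

# Each unordered pair appears twice in the sum. The close pair (d* = 1)
# contributes 2 * 0.25 / 1 = 0.5; each far pair (d* = 2) contributes
# 2 * 0.25 / 2 = 0.25.
cost = sammon_cost(D_orig, D_low)
print(cost)  # 1.0
```

Note that the division assumes no two distinct points are identical in the original space; duplicates would need to be removed or handled separately.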

Visualizing MNIST with Sammon’s Mapping

For MNIST, the result isn’t that different. The reason has to do with a rather unintuitive property regarding distances in high-dimensional data like MNIST. Let’s consider the distances between some MNIST digits. For example, the distance between two similar ones is

$$d = 4.53$$

On the other hand, the distance between two very different data points is

$$d = 12.0$$

less than three times the distance between the similar ones!

Because there’s so many ways similar points can be slightly different, the average distance between similar points is quite high. Conversely, as you get further away from a point, the amount of volume within that distance increases to an extremely high power, and so you are likely to run into different kinds of points. The result is that, in pixel space, the difference in distances between ‘similar’ and ‘different’ points can be much less than we’d like, even in good cases.
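The effect can be imitated with synthetic data. The sketch below assumes NumPy and uses random vectors rather than MNIST: a "similar" point is a copy with every coordinate nudged slightly, and an "unrelated" point is drawn independently. The noise scale of 0.1 is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_dims = 784

x = rng.random(n_dims)                              # a random point in the cube
similar = x + rng.normal(scale=0.1, size=n_dims)    # every coordinate nudged a little
unrelated = rng.random(n_dims)                      # an independent random point

d_similar = np.linalg.norm(x - similar)
d_unrelated = np.linalg.norm(x - unrelated)
ratio = d_unrelated / d_similar
print(d_similar, d_unrelated, ratio)
```

Because 784 small differences add up, the nudged copy ends up a substantial distance away, and the unrelated point is only a few times farther. The nudged copy is not clipped to the cube here, which keeps the sketch short.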

Graph Based Visualization

Perhaps, if local behavior is what we want our embedding to preserve, we should optimize for that more explicitly.

Consider a nearest neighbor graph of MNIST. For example, consider a graph $(V, E)$ where the nodes are MNIST data points, and each point is connected to the three points that are closest to it in the original space. [3] This graph is a simple way to encode local structure and forget about everything else.
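One way to build such a graph is sketched below, assuming NumPy, brute-force distance computation, and random vectors in place of MNIST. The function name `knn_edges` is hypothetical.

```python
import numpy as np

def knn_edges(X, k=3):
    """Undirected edges linking each row of X to its k nearest rows (L2)."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(D, np.inf)                 # a point is not its own neighbor
    nearest = np.argsort(D, axis=1)[:, :k]      # indices of the k closest points
    edges = set()
    for i, row in enumerate(nearest):
        for j in row:
            edges.add((min(i, int(j)), max(i, int(j))))   # store each edge once
    return edges

rng = np.random.default_rng(0)
X = rng.random((50, 784))                       # hypothetical stand-in for MNIST
edges = knn_edges(X, k=3)

degree = np.zeros(len(X), dtype=int)
for i, j in edges:
    degree[i] += 1
    degree[j] += 1
print(len(edges), degree.min(), degree.max())
```

Every point has at least three edges, and a point that is a nearest neighbor of many others ends up with more. The brute-force distance matrix is fine for a few thousand points; larger datasets would call for a spatial index or an approximate method.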

Given such a graph, we can use standard graph layout algorithms to visualize MNIST. Here, we will use force-directed graph drawing: we pretend that all points are repelling charged particles, and that the edges are springs. This gives us a cost function:

$$C = \sum_{i \neq j} \frac{1}{d_{i,j}} + \frac{1}{2} \sum_{(i,j) \in E} \left(d_{i,j} - d^*_{i,j}\right)^2$$

Which we minimize.
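To make the two terms concrete, here is a small sketch that evaluates the cost for a toy layout. It assumes NumPy, made-up distances, and that each edge in $E$ is counted once in the spring term (the formula leaves this open); the function name is illustrative.

```python
import numpy as np

def graph_layout_cost(Y, edges, D_orig):
    """Repulsion between all pairs plus spring energy along graph edges."""
    diff = Y[:, None, :] - Y[None, :, :]
    D_low = np.sqrt((diff ** 2).sum(axis=-1))
    off_diagonal = ~np.eye(len(Y), dtype=bool)
    repulsion = (1.0 / D_low[off_diagonal]).sum()      # sum over i != j of 1 / d_ij
    # Assumption: each undirected edge contributes one spring term.
    springs = 0.5 * sum((D_low[i, j] - D_orig[i, j]) ** 2 for i, j in edges)
    return repulsion + springs

# Three points on a line, one unit apart (hypothetical layout).
Y = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
# Made-up original distances: edge (0, 1) wants length 2, edge (1, 2) length 1.
D_orig = np.array([[0.0, 2.0, 3.0],
                   [2.0, 0.0, 1.0],
                   [3.0, 1.0, 0.0]])
edges = [(0, 1), (1, 2)]

# Repulsion: 2 * (1/1 + 1/1 + 1/2) = 5.0. Springs: 0.5 * ((1 - 2)^2 + 0) = 0.5.
cost = graph_layout_cost(Y, edges, D_orig)
print(cost)  # 5.5
```

The repulsion term blows up when two points coincide, which is what keeps the layout from collapsing.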

Visualizing MNIST as a Graph

The graph discovers a lot of structure in MNIST. In particular, it seems to find the different MNIST classes. While they overlap, during the graph layout optimization we can see the clusters sliding over each other. They are unable to avoid overlapping when embedded on the plane due to connections between classes, but the cost function is at least trying to separate them.

One nice property of the graph visualization is that it explicitly shows us which points are connected to which other points. In earlier visualizations, if we see a point in a strange place, we are uncertain as to whether it’s just stuck there, or if it should actually be there. The graph structure avoids this. For example, if you look at the red cluster of zeros, you will see a single blue point, a six, among them. You can see from its neighbors that it is supposed to be there, and from looking at it you can see that it is, in fact, a very poorly written six that looks more like a zero.

t-Distributed Stochastic Neighbor Embedding

The final technique I wish to introduce is the t-Distributed Stochastic Neighbor Embedding (t-SNE). This technique is extremely popular in the deep learning community. Unfortunately, t-SNE’s cost function involves some non-trivial mathematical machinery and requires some significant effort to understand.

But, roughly, what t-SNE tries to optimize for is preserving the topology of the data. For every point, it constructs a notion of which other points are its ‘neighbors,’ trying to make all points have the same number of neighbors. Then it tries to embed them so that those points all have the same number of neighbors.

In some ways, t-SNE is a lot like the graph based visualization. But instead of just having points be neighbors (if there’s an edge) or not neighbors (if there isn’t an edge), t-SNE has a continuous spectrum of having points be neighbors to different extents.
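The continuous notion of neighbor can be sketched as a soft assignment: each point spreads one unit of "neighbor-ness" over all other points, with closer points receiving more. The code below assumes NumPy and follows the Gaussian conditional probabilities from the t-SNE paper, with one simplification: a single fixed `sigma` for every point, whereas the actual algorithm tunes a bandwidth per point to hit a target perplexity.

```python
import numpy as np

def neighbor_probabilities(X, sigma=1.0):
    """Row i gives how strongly point i treats each other point as a neighbor."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)                # a point is never its own neighbor
    logits -= logits.max(axis=1, keepdims=True)      # for numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)          # each row sums to one

# Three points on a line (toy data): point 1 is close to point 0, point 2 is far.
X = np.array([[0.0], [1.0], [3.0]])
P = neighbor_probabilities(X)
print(P.round(3))
```

In the first row, point 1 receives far more weight than point 2, but point 2's weight is not zero. That is the difference from the hard edges of the nearest neighbor graph.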

t-SNE is often very successful at revealing clusters and subclusters in data.

Visualizing MNIST with t-SNE

t-SNE does an impressive job finding clusters and subclusters in the data, but is prone to getting stuck in local minima. For example, in the following image we can see two clusters of zeros (red) that fail to come together because a cluster of sixes (blue) gets stuck between them.

A number of tricks can help us avoid these bad local minima. Firstly, using more data helps a lot. Because these visualizations are embedded in a blog post, they only use 1,000 points. Using the full 50,000 MNIST points works a lot better. In addition, it is recommended that one use simulated annealing and carefully select a number of hyperparameters.

Well done t-SNE plots reveal many interesting features of MNIST.

A t-SNE plot of MNIST

An even nicer plot can be found on the page labeled 2590, in the original t-SNE paper, Maaten & Hinton (2008).

It’s not just the classes that t-SNE finds. Let’s look more closely at the ones.

A t-SNE plot of MNIST ones

The ones cluster is stretched horizontally. As we look at digits from left to right, we see a consistent pattern.

They move from forward leaning ones, into straighter ones, and finally to slightly backwards leaning ones. It seems that in MNIST, the primary factor of variation in the ones is tilting. This is likely because MNIST normalizes digits in a number of ways, centering and scaling them. After that, the easiest way to be “far apart” is to rotate and not overlap very much.

Similar structure can be observed in other classes, if you look at the t-SNE plot again.

Visualization in Three Dimensions

Watching these visualizations, there’s sometimes this sense that they’re begging for another dimension. For example, watching the graph visualization optimize, one can see clusters slide over top of each other.

Really, we’re trying to compress this extremely high-dimensional structure into two dimensions. It seems natural to think that there would be very big wins from adding an additional dimension. If nothing else, at least in three dimensions a line connecting two clusters doesn’t divide the plane, precluding other connections between clusters.

In the following visualization, we construct a nearest neighbor graph of MNIST, as before, and optimize the same cost function. The only difference is that there are now three dimensions to lay it out in.

Visualizing MNIST as a Graph in 3D

The three dimensional version, unsurprisingly, works much better. The clusters are quite separated and, while entangled, no longer overlap.

In this visualization, we can begin to see why it is easy to achieve around 95% accuracy classifying MNIST digits, but quickly becomes harder after that. You can make a lot of ground classifying digits by chopping off the colored protrusions above, the clusters of each class sticking out. (This is more or less what a linear Support Vector Machine does. [4]) But there are some much harder entangled sections, especially in the middle, that are difficult to classify.

Of course, we could do any of the above techniques in 3D! Even something as simple as MDS is able to display quite a bit in 3D.

Visualizing MNIST with MDS in 3D

In three dimensions, MDS does a much better job separating the classes than it did with two dimensions.

And, of course, we can do t-SNE in three dimensions.

Visualizing MNIST with t-SNE in 3D

Because t-SNE puts so much space between clusters, it benefits a lot less from the transition to three dimensions. It’s still quite nice, though, and becomes much more so with more points.

If you want to visualize high dimensional data, there are, indeed, significant gains to doing it in three dimensions over two.

Conclusion

Dimensionality reduction is a well developed area, and we’re only scratching the surface here. There are hundreds of techniques and variants that are unmentioned here. I encourage you to explore!

It’s easy to slip into a mindset of thinking one of these techniques is better than the others, but I think they’re all complementary. There’s no way to map high-dimensional data into low dimensions and preserve all the structure. So, an approach must make trade-offs, sacrificing one property to preserve another. PCA tries to preserve linear structure, MDS tries to preserve global geometry, and t-SNE tries to preserve topology (neighborhood structure).

These techniques give us a way to gain traction on understanding high-dimensional data. While directly trying to understand high-dimensional data with the human mind is all but hopeless, with these tools we can begin to make progress.

In the next post, we will explore applying these techniques to some different kinds of data – in particular, to visualizing representations of text. Then, equipped with these techniques, we will shift our focus to understanding neural networks themselves, visualizing how they transform high-dimensional data and building techniques to visualize the space of neural networks. If you’re interested, you can subscribe to my rss feed so that you’ll see these posts when they are published.

(I would be delighted to hear your comments and thoughts: you can comment inline or at the end. For typos, technical errors, or clarifications you would like to see added, you are encouraged to make a pull request on github)

Acknowledgements

I’m grateful for the hospitality of Google’s deep learning research group, which had me as an intern while I wrote this post and did the work it is based on. I’m especially grateful to my internship host, Jeff Dean.

I was greatly helped by the comments, advice, and encouragement of many Googlers, both in the deep learning group and outside of it. These include: Greg Corrado, Jon Shlens, Matthieu Devin, Andrew Dai, Quoc Le, Anelia Angelova, Oriol Vinyals, Ilya Sutskever, Ian Goodfellow, Jutta Degener, and Anna Goldie.

I was strongly influenced by the thoughts, comments and notes of Michael Nielsen, especially his notes on Bret Victor’s work. Michael’s thoughts persuaded me that I should think seriously about interactive visualizations for understanding deep learning.

I was also helped by the support of a number of non-Googler friends, including Yoshua Bengio, Dario Amodei, Eliana Lorch, Taren Stinebrickner-Kauffman, and Laura Ball.

This blog post was made possible by a number of wonderful Javascript libraries, including D3.js, MathJax, jQuery, and three.js. A big thank you to everyone who contributed to these libraries.

[1] We have a number of options for defining distance between these high-dimensional vectors. For this post, we will use L2 distance, $d(x_i, x_j) = \sqrt{\sum_n \left(x_{i,n} - x_{j,n}\right)^2}$.
[2] We initialize the points’ positions by sampling a Gaussian around the origin. Our optimization process isn’t standard gradient descent. Instead, we use a variant of momentum gradient descent. Before adding the gradient to the momentum, we normalize the gradient. This reduces the need for hyper-parameter tuning.
[3] Note that points can end up connected to more, if they are the nearest neighbor of many points.
[4] This isn’t quite true. A linear SVM operates on the original space. This is a non-linear transformation of the original space. That said, this strongly suggests something similar in the original space, and so we’d expect something similar to be true.
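The optimizer described in the note on gradient descent can be sketched roughly as follows. This assumes NumPy; the learning rate, momentum coefficient, step count, and the choice to normalize by the norm of the whole gradient (rather than per point) are all guesses for illustration, not values from the post.

```python
import numpy as np

def normalized_momentum_descent(grad_fn, y0, learning_rate=0.01,
                                momentum=0.9, steps=500):
    """Momentum descent where the gradient is rescaled to unit length
    before it is added to the velocity (hyper-parameters are assumptions)."""
    y = np.array(y0, dtype=float)
    velocity = np.zeros_like(y)
    for _ in range(steps):
        g = grad_fn(y)
        norm = np.linalg.norm(g)
        if norm > 0:
            g = g / norm                      # keep only the direction
        velocity = momentum * velocity - learning_rate * g
        y = y + velocity
    return y

# Toy problem: minimize the squared distance to a target point.
target = np.array([3.0, 4.0])
y_final = normalized_momentum_descent(lambda y: 2.0 * (y - target), np.zeros(2))
print(np.linalg.norm(y_final - target))
```

Because only the direction of the gradient is used, the step length no longer depends on the scale of the cost function, which is one plausible reading of why this reduces the need for tuning. The price is that the iterate hovers near the minimum instead of settling exactly on it.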

from: http://colah.github.io/posts/2014-10-Visualizing-MNIST/

js 获取浏览器型号 cuityang js 浏览器
根据浏览器获取iphone和apk的下载地址 <!DOCTYPE html> <html> <head> <meta charset="utf-8" content="text/html"/> <meta name=
C# socks5详解转 dalan_123 socket C#
http://www.cnblogs.com/zhujiechang/archive/2008/10/21/1316308.html 这里主要讲的是用.NET实现基于Socket5下面的代理协议进行客户端的通讯，Socket4的实现是类似的，注意的事，这里不是讲用C#实现一个代理服务器，因为实现一个代理服务器需要实现很多协议，头大，而且现在市面上有很多现成的代理服务器用，性能又好，
运维 Centos问题汇总 dcj3sjt126com 云主机
一、sh 脚本不执行的原因 sh脚本不执行的原因只有2个 1.权限不够 2.sh脚本里路径没写完整。二、解决You have new mail in /var/spool/mail/root 修改/usr/share/logwatch/default.conf/logwatch.conf配置文件 MailTo = MailFrom 三、查询连接数
Yii防注入攻击笔记 dcj3sjt126com sql WEB安全 yii
网站表单有注入漏洞须对所有用户输入的内容进行个过滤和检查，可以使用正则表达式或者直接输入字符判断，大部分是只允许输入字母和数字的，其它字符度不允许；对于内容复杂表单的内容，应该对html和script的符号进行转义替换：尤其是<,>,',"",&这几个符号这里有个转义对照表： http://blog.csdn.net/xinzhu1990/articl
MongoDB简介[一] eksliang mongodb MongoDB简介
MongoDB简介转载请出自出处：http://eksliang.iteye.com/blog/2173288 1.1易于使用 MongoDB是一个面向文档的数据库，而不是关系型数据库。与关系型数据库相比，面向文档的数据库不再有行的概念，取而代之的是更为灵活的“文档”模型。另外，不
zookeeper windows 入门安装和测试 greemranqq zookeeper 安装分布式
一、序言以下是我对zookeeper 的一些理解： zookeeper 作为一个服务注册信息存储的管理工具，好吧，这样说得很抽象，我们举个“栗子”。栗子1号：假设我是一家KTV的老板，我同时拥有5家KTV，我肯定得时刻监视
Spring之使用事务缘由(2-注解实现) ihuning spring
Spring事务注解实现 1. 依赖包： 1.1 spring包： spring-beans-4.0.0.RELEASE.jar spring-context-4.0.0.
iOS App Launch Option 啸笑天 option
iOS 程序启动时总会调用application:didFinishLaunchingWithOptions:，其中第二个参数launchOptions为NSDictionary类型的对象，里面存储有此程序启动的原因。 launchOptions中的可能键值见UIApplication Class Reference的Launch Options Keys节。 1、若用户直接
jdk与jre的区别（_） macroli java jvm jdk
简单的说JDK是面向开发人员使用的SDK，它提供了Java的开发环境和运行环境。SDK是Software Development Kit 一般指软件开发包，可以包括函数库、编译程序等。 JDK就是Java Development Kit JRE是Java Runtime Enviroment是指Java的运行环境，是面向Java程序的使用者，而不是开发者。如果安装了JDK，会发同你
Updates were rejected because the tip of your current branch is behind qiaolevip 学习永无止境每天进步一点点众观千象 git
$ git push joe prod-2295-1 To [email protected]:joe.le/dr-frontend.git ! [rejected] prod-2295-1 -> prod-2295-1 (non-fast-forward) error: failed to push some refs to '[email protected]
[一起学Hive]之十四-Hive的元数据表结构详解 superlxw1234 hive hive元数据结构
关键字：Hive元数据、Hive元数据表结构之前在 “[一起学Hive]之一–Hive概述，Hive是什么”中介绍过，Hive自己维护了一套元数据，用户通过HQL查询时候，Hive首先需要结合元数据，将HQL翻译成MapReduce去执行。本文介绍一下Hive元数据中重要的一些表结构及用途，以Hive0.13为例。文章最后面，会以一个示例来全面了解一下，
Spring 3.2.14，4.1.7，4.2.RC2发布 wiselyman Spring 3
Spring 3.2.14、4.1.7及4.2.RC2于6月30日发布。其中Spring 3.2.1是一个维护版本(维护周期到2016-12-31截止)，后续会继续根据需求和bug发布维护版本。此时，Spring官方强烈建议升级Spring框架至4.1.7 或者将要发布的4.2 。其中Spring 4.1.7主要包含这些更新内容。


