Why You Can’t Skip Learning Data Visualization

There’s no scarcity of posts online about ‘fancy’ data topics like data modelling and data engineering. But I’ve noticed that their cousin, data visualization, rarely gets the same amount of attention. Among data practitioners in my field, I find there is solid consensus that data viz is an important skill that’s worth devoting time to learn. However, we somehow spend more time learning complex models than figuring out why pie charts are no-nos.

When I was just starting out in my career, the hype was really in modelling. I considered making graphs and visualizations ‘chores’, and I thought that the depth of my fancy data science knowledge was the greatest determinant of how much value I could bring.

As I developed the data viz aspect of my skill set, I picked up valuable lessons that now influence how I approach everything that I do. These lessons have proven extremely useful for me in my career journey, and I’d like to build the case here for why data viz is a core tool in any data person’s skill set.

Learning 1: If you want people to do the right thing, you have to make it easy

A common frustration I hear from data people is that business stakeholders often don’t seem to make decisions that align with what data analysts have found to be optimal. This leads to a lose-lose scenario where stakeholders don’t get the results they’re looking for, while analysts get frustrated with ‘wasted work.’ I was also once stuck in the ‘if only stakeholders listened’ mentality.

Darkhorseanalytics

It was working with User Experience (UX) Product Designers (and, surprisingly, not a data seminar) that led me to rethink this mindset. They drew attention to things I would previously have called ‘trivial’, critiquing how websites and applications were built: ‘this banner is too big’, ‘this button should be colored blue’, and so on, and for good reason. They told me that ‘If we don’t do X, the user will have trouble doing what we want them to do.’

Coming back to the context of data, this insight is exactly why I would argue that data visualization is powerful. Well thought-out visuals, rather than ugly charts, most clearly explain to stakeholders why a data-backed recommendation will be most beneficial. After all, the main role of a data professional is not just to churn out models and analysis, but also to inspire data-driven action.

As data professionals (or professionals of any kind, really), maybe we should think twice the next time ‘slides are a chore’ crosses our minds.

Learning 2: A picture is worth a thousand words AND a thousand minutes.

tjpalanca.com

Full disclosure: I used to get irritated whenever people asked me to edit what seemed like irrelevant details in presentations. ‘Change the y-axis label from revenue to Revenue’, ‘Make this category green instead of red’, and ‘Add another graph’ were common requests in my everyday work. I guess I did not like hearing them because they meant I’d have to re-render a report and spend time on a task I didn’t care about.

This was admittedly ironic, since when doing research, I hated reading papers written as if the author’s goal was to sound as complex as possible. I’d have to spend hours dissecting a set of paragraphs and equations in order to comprehend a single insight.

On the other hand, what I find most enjoyable and engaging to read are papers with extensive visualisations. In those papers, it was clear that the author took time and effort to create data visuals to aid a reader in understanding the point he or she wanted to get across.

Even when trying to understand complex topics in other media, I can’t count how many hours I’ve saved thanks to YouTube channels like 3Blue1Brown or Kurzgesagt, which pair rich explanations with engaging visuals to illustrate their points.

[Embedded video: 3Blue1Brown on Neural Networks]

These materials take time to make, and in analytics, making visuals can sometimes seem optional. However, the 10 minutes you spend making a visual is 5–10 minutes saved for EACH individual trying to understand your point. This becomes especially important in a fast-moving business, where decisions have to be made on the fly and a simple visual can help push through blockers and get to the action.
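
To make that arithmetic concrete, here is a minimal sketch. The function name and the audience size are hypothetical, and it uses the lower bound of 5 minutes saved per person.

```python
def net_minutes_saved(minutes_to_make, minutes_saved_per_reader, readers):
    """Total reader time saved, minus the time spent making the visual."""
    return minutes_saved_per_reader * readers - minutes_to_make


# Hypothetical scenario: a 10-minute chart shown to 8 stakeholders,
# each of whom saves 5 minutes understanding the point.
print(net_minutes_saved(10, 5, 8))  # 5 * 8 - 10 = 30
```

Under these assumptions the visual breaks even with just two readers, and every additional reader is pure savings.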

It can’t be overstated how the effort to make good data visualisations pays in the long run by influencing organizations and contributing to the efficiency of the decision-making process.

Learning 3: Problem solving does not need to be a solitary undertaking!

I found this tweet by Mat Velloso very funny and relatable:

[Embedded tweet by Mat Velloso]

I assume the reason this happens is because companies find it difficult to measure the level of complexity of the problems they’re facing and “AI” sounds like a good catch-all solution. Even as a data practitioner, I’ve also been guilty of this, falling into analysis-paralysis and trying every single model, hoping one will magically solve everything. This is a big waste of the analyst’s (and the stakeholder’s) time.

It’s funny and all, but it goes to show that we can all still work on how we collectively understand complex data problems.

Notice the number of reactions in this post in our Slack Channels (I blurred some details):

On the other hand, check out the reactions (and replies!) on this other post with a data viz attached:

The posts show how people are happy to share thoughts and insights, provided they can relate and easily understand the context.

Modelling is indeed useful when we are sure that it is the most efficient way to address our problem. However, the beauty of data viz lies in communication: it lets us make problem solving a collaborative experience. After all, there is no scarcity of ideas from subject-matter experts whose only barrier is the technical side of data. If a visual makes the problem scope clearer, the people who see it can start ideating solutions to the business problem, and the ripple effect amplifies. Suddenly, instead of one person (you) building a model alone, other people are giving you ideas to further improve your work (and possibly even a solution that doesn’t need a model!).

Learning 4: There is a time and place AND VISUAL for everything

Early in my career, my development goals were bullet points of technical topics that I wanted to learn and apply: recommendation systems, image processing, Bayesian modelling, etc. So of course, I jumped at the first chance to try out these techniques, thinking that they would generate the greatest amount of value in my limited time.

Most of the time, however, a visual can offer the best bang-for-buck. In the model development process after all, there is always the exploration piece which usually involves data viz. In this step, we can already get a lot of insights from different visuals:

  • Dodged Bar Graph: Which brand of milk tea sold the most?
  • Stacked Bar Graph: Which milk tea branch contributes the most to revenue?
  • Line Graph: Did the average cost of milk tea decrease or increase?
  • Scatter Plot: How is age related to the number of milk teas purchased?
  • Density Plot: What is the distribution of the age of milk tea drinkers?
  • Tables: What are the total, mean, and median milk tea purchases per customer across all months?
  • Pie Chart: No.

The Extreme Presentation Method by Dr. Abela

Data viz helps us dissect problems into comprehensible pieces, and potentially equips us to answer business questions without the need to apply complex methodologies. Going deep into visualisation will force you to ask “Are we looking at the right things?” and eventually — “Are we being led to the best strategies?”
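
As a rough illustration of answering a business question without any modelling, the sketch below builds the kind of summary table mentioned in the list above. The purchase records and customer names are made up, and it relies only on Python's standard library.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (customer, month, cups purchased).
purchases = [
    ("ana", "Jan", 2), ("ana", "Feb", 4), ("ana", "Mar", 9),
    ("ben", "Jan", 1), ("ben", "Feb", 1), ("ben", "Mar", 4),
]


def summarize(records):
    """Return {customer: (total, mean, median)} across all months."""
    per_customer = defaultdict(list)
    for customer, _month, cups in records:
        per_customer[customer].append(cups)
    return {
        customer: (sum(cups), mean(cups), median(cups))
        for customer, cups in per_customer.items()
    }


summary = summarize(purchases)

# Print a plain-text table, one row per customer.
print(f"{'customer':<10}{'total':>6}{'mean':>6}{'median':>8}")
for customer, (total, avg, med) in summary.items():
    print(f"{customer:<10}{total:>6}{avg:>6.1f}{med:>8}")
```

In this toy data the mean sits above the median for both customers, which already hints that a single big month is pulling the average up. That is the kind of observation a quick table or plot surfaces before any model is built.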

Data Viz is a practice that cultivates both soft and hard skills at the same time

  • Hard Skills: Data Wrangling (How do we transform data into its pre-graph form?), Grammar of Graphics (How are visualizations constructed?)
  • Soft Skills: Communication (What visual will effectively drive home a point?), Critical Thinking (What visual will best convey a good strategy?)
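
To give the two hard skills a concrete shape, here is a small sketch under stated assumptions: the transactions, branch names, and brand names are invented, and the `chart_spec` dictionary is only an illustration of the grammar-of-graphics idea (data, aesthetic mappings, geometry), not the input format of any particular plotting library.

```python
from collections import Counter

# Hypothetical raw transactions: one row per cup sold.
transactions = [
    {"branch": "Downtown", "brand": "Brand A"},
    {"branch": "Downtown", "brand": "Brand B"},
    {"branch": "Downtown", "brand": "Brand A"},
    {"branch": "Uptown", "brand": "Brand B"},
    {"branch": "Uptown", "brand": "Brand B"},
]


def to_pre_graph_form(rows):
    """Aggregate raw rows into one record per (branch, brand) with a count."""
    counts = Counter((row["branch"], row["brand"]) for row in rows)
    return [
        {"branch": branch, "brand": brand, "cups_sold": n}
        for (branch, brand), n in sorted(counts.items())
    ]


# Data wrangling: raw rows become the aggregated table a chart needs.
tidy = to_pre_graph_form(transactions)

# Grammar of graphics: a dodged bar graph described as
# data + aesthetic mappings + geometry (illustrative structure only).
chart_spec = {
    "data": tidy,
    "aesthetics": {"x": "branch", "y": "cups_sold", "fill": "brand"},
    "geometry": "bar",
    "position": "dodge",
}

for record in tidy:
    print(record)
```

Separating the wrangling step from the chart description means the same aggregated table could feed a stacked bar graph by changing only the `position` entry.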

From these experiences, I believe learning the principles behind data visualisation is beneficial for any data practitioner, and possibly even for people from other professions as well. Data viz is how we can communicate the complexity of data to the visual learners of the world (65% of the population, according to Google). We cannot skip learning this extremely useful and practical skill, and I’ll wait for the day when pie charts only exist in the form of memes.

[Embedded video: Geckoboard on Pie Charts (a short one-minute video)]

This post is mostly a compilation of opinions I formed in my data career. I’m super open to thoughts and comments from fellow practitioners and aspiring ones alike! Please feel free to shoot an email to [email protected] or connect with me via LinkedIn. Shout out to Fernandina Ko who helped me edit this piece, Christian San Jose for inviting me to the UX Fundamentals Class, and TJ Palanca for the minimalist map visual!

Translated from: https://medium.com/swlh/why-you-cant-skip-learning-data-visualization-6314896ccdc0
