2021 Tableau CA EXAM GUIDE WITH TIPS

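I have collected the Tableau Help pages that correspond to each topic in the latest Tableau Associate Certification exam guide, so that you can find a topic quickly while revising and locate it quickly during the exam. Under each topic I have also noted points that often appear as knowledge questions and are easy to overlook in everyday use. I do not recommend reading this when you are first learning Tableau; I strongly recommend skimming it right before the exam.

If you are not yet comfortable with the basics of data analysis, including data structure, chart types and basic statistics, read this article first: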

https://help.tableau.com/current/pro/desktop/en-us/data_structure_for_analysis.htm

1.Data Connections - 17%

● Connect to Tableau Server

Publish a workbook

-Choosing "filter as user" lets you switch to another user and preview the workbook as that user

-If you publish a .tds file to the server, power users can access the data without the credentials needed to access or edit the database directly.

● Describe connection options

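-live: (default; the data refreshes either when the workbook is opened or when you refresh manually, and queries run against the data source)

    does not import your data, so you can get started immediately

    maintains a connection to the data source, so data refreshes in the workbook.

    refresh data: Tableau refreshes the data each time you open the workbook.

-extract (a subset of the data, usable offline, queried with Tableau's internal data engine)

        makes a copy of a portion of your data

        If the data structure changes, for example a column is renamed, the extract cannot be updated; a newly added row or column can be picked up by an update

        When extracting you can aggregate data for visible dimensions, rolling the data up to a chosen level. (This pre-calculates the default aggregation for each measure across all the dimensions in the table, which makes building new views faster because the data is simplified.)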

-refresh data

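    You can choose to refresh all the data or run an incremental refresh

    Salesforce supports extract only, not live

    Benefits of an extract:

    useful when a live connection to a very large database makes the workbook slow and only part of the data is actually needed

    increases workbook performance by using the Tableau data engine to run queries rather than sending a query to the data source.

     extracts provide special functionality:

    keep a copy of the data

    Drawbacks of an extract:

data does not update automatically; you need to refresh manually or schedule a data refresh

   an extract may filter out some data, which can be mistaken for missing data.

save extracted data: .tde has been replaced by .hyper

A .tde file is automatically converted to a .hyper file after it is opened again, and this cannot be reversed

.hyper replaces .tde for better performance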

● Connect to different data source types

spreadsheets, statistical files, relational databases, OLAP cubes, big data, online data sources

two file types: 1. flat-file data 2. server-based data

● Join tables from single and multiple databases

    Default join type is inner join

    Published Tableau data sources cannot be used in joins. 

    some type of data sources don't support certain join types.

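    A join requires every table to have a related field, and the related fields must have the same data type

    cross-database join

    Each database needs its own connection before you pick the tables. The primary database is blue and the secondary is orange (the same as blending)

Join calculations

    You can only join on fields from the original data; fields produced by a split cannot be used for a join.

    Join fields must have the same data type; if they do not, use a function to convert the field type.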

● Prepare Data for Analysis

    convert field to measures or dimensions 

    What characteristics make Tableau automatically read a field as a dimension or as a measure

    edit default properties

    Global change: select a field in the Data pane, right-click, Default Properties; you can set comment, number format, color and aggregation.

    Apply to one worksheet only: drag the field into the viz, then right-click and change its format

split & custom split

    a split field is a calculated field

    cannot join on a split field/calculated field, but can blend on a split or calculated field.

●Blending

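Blending links different data sources so that data from several sources can be shown in one viz. It suits tables whose data structure or level of detail differs.

    requires a shared dimension.

    blending queries each data source and aggregates the results to a common level

    blending works like a left join (the primary data source acts as the left table)

Differences between blending and join

    a join uses data from the same data source

    a blend does not create a row-level join

    the common field of a blend must be a dimension; for a join it need not be.

    blend data without a common field

As long as fields in the two data sources share values (they need not be identical; shared values are enough), you can link them manually

Or manually rename the field in one data source so that it has the same name as the field to link in the other; Tableau can then link them automatically.

If two fields in different data sources refer to the same thing but one uses the full name and the other an abbreviation, you can link them by adding an alias to one of them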

When you use blended fields in a calculation, you must aggregate all data source fields.

some limitations on secondary data sources. 

You may not be able to sort by a field from a secondary data source, and action filters may not work as expected with blended data. For more information, see Other data blending issues. 

Data blending behaves similarly to a left join, which may result in missing data from the secondary data source. Blending can therefore also lead to missing data, but it does not duplicate data.

● Metadata Grid 

    You can change field names and field types; fields are listed as rows; the original data is not changed

● Pivot

● Union

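Tables in a union must come from the same data source and have the same structure, with the same field names and data types. Mismatched fields are kept automatically and have to be merged manually.

connect multiple tables from a single data source (from the same connection)

data must have the same structure (same number of fields, and related fields must have the same names and types)

A union keeps duplicate rows. If data is added regularly, you can use a wildcard union to pick it up.

Union is supported for: Excel, Google Sheets, text, JSON, database tables.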

● Data Interpreter

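Only available for Microsoft Excel, text (.csv) files, PDF files and Google Sheets. For Excel, the data must be in .xls or .xlsx format.

● Explain data extract formats and capabilities

Starting with 10.5, new extracts use the .hyper format instead of the .tde format. Extracts in .hyper format use the improved data engine, which offers the same fast analytical and query performance as the previous data engine but works for larger extracts.

Although Tableau version 2021.1 can still read .tde extracts, it cannot create new .tde extracts. As a consequence of this format change, when a user or Tableau Server performs certain extract tasks (such as an extract refresh or appending data), a .tde extract is automatically upgraded and converted to a .hyper extract.

The upgrade cannot be reversed. An upgraded extract cannot be converted back to a .tde extract.

A .tde extract can be upgraded to a .hyper extract in three ways: 1.) during an extract refresh (full or incremental), 2.) when data is appended to the extract, and 3.) when the extract is upgraded manually with Tableau Desktop 2021.1. After the upgrade, the original .tde extract is automatically removed from Tableau Server if no other workbook references it.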

● Create extracts with multiple tables

● Explain performance considerations between blends, joins, and cross-database joins

Joins combine tables by adding more columns of data across similar row structures. This can cause data loss or duplication if tables are at different levels of detail, and joined data sources must be fixed before analysis can begin.

Blends never truly combine the data. Instead, blends query each data source independently, the results are aggregated to the appropriate level, then the results are presented visually together in the view. Because of this, blends can handle different levels of detail and work with published data sources. Blends are also established individually on every sheet and can never be published, because there is no true “blended data source”, simply blended results from multiple data sources in a visualization.

Data blending simulates a traditional left join. The main difference between the two is when the aggregation is performed. A join combines the data and then aggregates. A blend aggregates and then combines the data. ( Measure values are aggregated based on how the field is aggregated in the view. However, all fields from a secondary data source must be aggregated. )
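The "join combines then aggregates, blend aggregates then combines" distinction can be illustrated outside Tableau. The following is a minimal Python sketch, not Tableau's implementation; the `orders` and `targets` data and both function names are made up for illustration. It shows why a row-level join can duplicate a measure from the coarser table while a blend-style combination does not.

```python
from collections import defaultdict

# Hypothetical data: several order rows per region, one target per region.
orders = [("West", 100), ("West", 50), ("East", 70)]
targets = [("West", 200), ("East", 90)]


def join_then_aggregate(orders, targets):
    """Row-level left join first, then SUM: the target repeats on every order row."""
    targets_by_region = defaultdict(list)
    for region, target in targets:
        targets_by_region[region].append(target)
    totals = defaultdict(lambda: [0, 0])
    for region, sales in orders:
        for target in targets_by_region.get(region, [None]):
            totals[region][0] += sales
            totals[region][1] += target or 0
    return {region: tuple(pair) for region, pair in totals.items()}


def aggregate_then_blend(orders, targets):
    """Aggregate each source to the region level, then match on the primary's regions."""
    sales = defaultdict(int)
    for region, amount in orders:
        sales[region] += amount
    target = defaultdict(int)
    for region, amount in targets:
        target[region] += amount
    # Regions missing from the secondary source come back as None (left-join-like).
    return {region: (sales[region], target.get(region)) for region in sales}


print(join_then_aggregate(orders, targets))
print(aggregate_then_blend(orders, targets))
```

In the joined result the West target is counted once per order row, which is the duplication the text warns about; the blend-style result keeps one target per region.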

● Use Automatic & Custom Split


2.Organizing & Simplifying Data - 10%

● Filter data

Tableau's Order of Operations (help.tableau.com)

Multiple filters are combined with AND, so the view shows their intersection.

Make good use of the Wildcard, Condition and Top tabs in the filter dialog. For date filters, a relative date lets you filter to a range such as the previous few days or months.

A context filter is nested: the context filter runs first and ordinary filters run on its result, which can effectively increase performance

Use case: if you would like your Top N set to change depending on which filter choices are changed, you need to use a context filter. Context filters are applied before the Top N filter is applied.

        eg: a context filter on region =west and set showing top 5 customers with highest sales, the filter on region will first limit your data to show only rows with the west region, and then the set will determine the top 5 customers for those rows.
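The effect of filter order on a Top N result can be sketched in plain Python. This is only an illustration of the ordering described above, under the assumption that without a context filter the Top N is evaluated on all rows; the `rows` data and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical order rows.
rows = [
    {"customer": "Ann", "region": "West", "sales": 50},
    {"customer": "Ann", "region": "East", "sales": 500},
    {"customer": "Bob", "region": "West", "sales": 300},
    {"customer": "Cy", "region": "West", "sales": 200},
    {"customer": "Dee", "region": "East", "sales": 400},
]


def top_n_customers(rows, n):
    """Return the n customers with the highest total sales in the given rows."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["customer"]] += row["sales"]
    return sorted(totals, key=lambda customer: totals[customer], reverse=True)[:n]


def top_n_with_context(rows, region, n):
    """Context filter first: the Top N only sees rows from the chosen region."""
    in_context = [row for row in rows if row["region"] == region]
    return top_n_customers(in_context, n)


def top_n_without_context(rows, region, n):
    """Top N on all rows, then keep only top customers that have rows in the region."""
    top = set(top_n_customers(rows, n))
    return sorted(
        {row["customer"] for row in rows
         if row["region"] == region and row["customer"] in top}
    )


print(top_n_with_context(rows, "West", 2))
print(top_n_without_context(rows, "West", 2))
```

With the region in context, the top customers are ranked by their West sales only; without it, customers are ranked on all regions first, so fewer than N may remain after the region filter.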

● Sort data

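Normal sort: a non-nested sort considers values across panes, and every pane has the same order of values. By default, sorting from a field label performs a non-nested sort.

Nested sort: a nested sort considers the values of each pane independently rather than in aggregate across panes. By default, sorting from an axis performs a nested sort.

Sorting across a whole hierarchy: to order all measure values ascending/descending regardless of hierarchy level, create a combined field from the fields in the hierarchy and sort on it directly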

● Build groups

●Build hierarchies

● Build sets


3. Field & Chart Types - 15%

● Explain the difference between measures and dimensions

Understand Field Type Detection and Naming Improvements (help.tableau.com)

-Field names that Tableau automatically treats as dimensions

--Keywords Code, Key, and ID

-- Keywords Number, Num, and Nbr

--keywords related to dates

--Field names that use all capital letters with non-letter characters are converted to all lower-case letters except for the characters immediately after the non-letter character.

● Explain the difference between discrete and continuous fields

● Explain how to utilize Tableau-generated fields

● Understand how and when to build:

•Histograms

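--Mainly used to show a distribution; one measure is enough. With Show Me the bin is generated automatically; you can also right-click the field and create the bins yourself

--Exam questions mostly focus on how to customize the bin size. 

•Heat maps

--Mainly done by choosing Density from the Marks card drop-down. Mostly used when marks overlap in a scatter plot, or to build a density map that shows how dense a metric is in an area. 

Maps that Show Density or Trends (help.tableau.com)

• Tree maps

--display data in nested rectangles. A good alternative to a pie chart, mainly used to compare the sizes of the parts. 

•Bullet graphs

--A bar chart with a reference line, mainly used to see whether a metric has reached its target.

•Combined axis charts

--Compares two measures at once on a shared axis. Each measure has its own Marks card for color, size and type, and there is also an overall Marks card.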

•Dual axis charts

•Scatter plots

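--For a scatter plot of every data point, go to the Analysis menu and uncheck Aggregate Measures

•Cross tabs

--Make good use of totals and table calculations.

--When Total Using is set to Automatic, the total is computed from the disaggregated underlying data.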

•Bar in bar charts

•Box plots

• Use titles, captions and tooltips effectively

tooltips: only a viz (worksheet) can be placed in a tooltip; dashboards and stories cannot.

Edit axes

Use mark labels and annotations

Color Palettes and Effects


4.Calculations - 18%

● date calculation:

--To build a date field from separate year, month and day fields, use the MAKEDATE() function.

● string function:

● Create quick table calculations

--Moving/window calculation 

Pay close attention to whether the current value is included; there is a check box for it
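To see why the "include current value" check box matters, here is a minimal Python sketch of a moving average over the previous values, with and without the current one. It is an illustration only, not Tableau's WINDOW_AVG implementation; the function name and parameters are hypothetical.

```python
def moving_average(values, previous, include_current=True):
    """Average of up to `previous` earlier values, optionally plus the current one."""
    result = []
    for i in range(len(values)):
        start = max(0, i - previous)
        end = i + 1 if include_current else i
        window = values[start:end]
        # An empty window (first row, current value excluded) has no average.
        result.append(sum(window) / len(window) if window else None)
    return result


sales = [10, 20, 30, 40]
print(moving_average(sales, previous=2, include_current=True))
print(moving_average(sales, previous=2, include_current=False))
```

The two calls give different numbers for every row, so an exam question about a moving calculation can only be answered once the check box setting is known.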

regular calculation vs table calculation

a regular calculation is one where Tableau sends a query to the underlying data source; the computation is handled by the data source itself

a table calculation is a secondary calculation based on what the viz shows; the calculation is done within Tableau.

--calculated in Tableau based only on the information in the view

Transform Values with Table Calculations (help.tableau.com)

● Choosing the Right Calculation for Your Question (www.tableau.com)

Table Calculation Functions (help.tableau.com)

● Use level of detail (LOD) expressions

● Explain different types of LOD expressions

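Table calculations and ATTR cannot be used inside an LOD expression

table scope: {sum([sales])} or {fixed: sum([sales])} --- aggregates over all the data (table scope).

The higher the level of detail, the more aggregated the result.

● Fixed

If you plan to build the LOD with FIXED only, then besides the field you need, add after FIXED every other dimension in the view that affects the level of detail.

For example, to compute the average profit per order for each country with the order in the view, fix [country] in addition to [order id], in case order IDs repeat across countries.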

● Include:

--Calculating At A Lower Level Of Detail

● exclude:

--Calculating At A Higher Level Of Detail

--default to show an EXCLUDE LOD in a view as ATTR

Whether adding detail to the viz affects the LOD result or not:

● Use Ad-hoc calculations

Ad-hoc calculations are supported on the Rows, Columns, Marks, and Measure Values shelves; they are not supported on the Filters or Pages shelves.

Only one ad-hoc calculation can be open at a time.

Ad-hoc calculations are not named, but are saved when you close the workbook.

Ad-hoc calculations are not available when you create groups, sets, hierarchies, or parameters.

Ad-hoc calculations are valid for creating trend lines, forecasts, and reference lines, bands, and distributions.

● Work with aggregation options

aggregate dimensions 

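When you create a calculated field with an LOD or IF statement, you cannot mix aggregate fields with non-aggregated fields, so the dimension has to be aggregated, mainly with MAX, MIN or ATTR

The logic of ATTR is: if MIN = MAX then return that value. If there is only one value it returns that value; if there are several it returns an asterisk. It is a handy tool for checking whether the data contains more than one value.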
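The ATTR rule just described (return the value when MIN equals MAX, otherwise an asterisk) can be written as a small Python sketch. Ignoring nulls and returning `None` for an all-null input are assumptions made for this illustration; the function name is hypothetical.

```python
def attr(values):
    """Return the single shared value, or "*" when more than one distinct value exists."""
    # Assumption for this sketch: nulls (None) are ignored, as MIN and MAX would do.
    present = [value for value in values if value is not None]
    if not present:
        return None
    return present[0] if min(present) == max(present) else "*"


print(attr(["West", "West", "West"]))
print(attr(["West", "East"]))
```

A "*" in the view is therefore a quick signal that the dimension has several values at the current level of detail.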

●Build logic statements

●Build arithmetic calculations

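--To add, subtract, multiply or divide two fields, neither field may contain null; if one does, use the ZN function.

--For "between ... and" conditions, remember to include the equals sign (>= and <=)!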
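The null problem and the ZN fix can be sketched in Python, with `None` standing in for null. This is a minimal illustration, not Tableau code; `zn` and `add_plain` are hypothetical helper names.

```python
def zn(value):
    """Mirror of ZN: return 0 for a null (None), otherwise the value itself."""
    return 0 if value is None else value


def add_plain(a, b):
    """Mimic null propagation: any null operand makes the whole result null."""
    if a is None or b is None:
        return None
    return a + b


profit, discount = 5, None
print(add_plain(profit, discount))   # null propagates
print(zn(profit) + zn(discount))     # ZN replaces the null with 0 first
```

Wrapping each operand in ZN keeps a single missing value from turning the whole arithmetic result into null.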

● Build grand totals and sub-totals

--From the Analysis menu, Totals > Total All Using is set to the default value, Automatic.

--Only Automatic totals are available for table calculations and fields from a secondary data source. Total aggregations cannot be applied to table calculations or fields from a secondary data source.

● Use calculations in join clauses


5.Mapping - 13%

● Navigate maps, including:

● Pan & Zoom

● Filtering

● Map layering

● Custom territories

● Lasso & Radial selection

● Geographic search

● Modify locations within Tableau

deal with unrecognised location?

1. If there is a hierarchy, add the fields from the levels above; Tableau will then resolve some ambiguous locations automatically

2. Match them manually

●Import and manage custom geocoding

Bring in locations that Tableau doesn't recognize

must be in .csv format

can also include a schema.ini file, which tells Tableau to treat numeric fields as text

Tableau only accepts text fields for new geographic roles

● Use a background image map

●  Connect to spatial files

A special data type; if it comes up in the exam, search for "spatial files in tableau"

especially "join spatial files in tableau"

can create point, line, polygon maps

When you connect to spatial data, Tableau creates a geometry field for your point geometries or your polygons. You use the geometry field to create a map with your spatial data.

The geometry field has three kinds of values:

point - point geometries

linestring/ multilinestring - linear geometries

polygon/ multipolygon - polygons.

geometry is a measure by default and is aggregated into a single mark using the COLLECT aggregation when it is added to the view.

 That is, when the view shows countries and we want to look at a single city, either add more level of detail or, if that is not available, uncheck Aggregate Measures in the Analysis menu.

Tableau supports joining two spatial data sources using their spatial features (geography or geometry). You can only create spatial joins between points and polygons. When you see a spatial file with several tables, simply join them.


6.Analytics - 15%

● Reference Lines

● Reference Bands

● Trend Lines

Only continuous fields can be used, not discrete fields.

A continuous field that is not in the view but is on Detail in the Marks card can be used in a reference line

Cannot be used together with sets

● Trend Model

● Forecasting

1. The horizontal axis must be a continuous date field (if it is not, change the format first)

2. By default the forecast does not include the last actual data point.

● To create a forecast, your view must be using at least one date dimension and one measure.

How Forecasting Works in Tableau (help.tableau.com)

Describe Forecast Dialog Box (help.tableau.com)

https://help.tableau.com/current/pro/desktop/en-us/forecast_options.htm

Forecast Field Results (help.tableau.com)

Forecasting When No Date is in the View (help.tableau.com)

● Drag & Drop Analytics

● Box Plot: 

Tableau 201: How to Make a Box-and-Whisker Plot | Evolytics (evolytics.com)

    1. Drag in one dimension and one measure to create a bar chart

    2. Add the distribution that you care about to the Detail Marks Card.

    3. On the Marks card drop-down, choose Circle

    4.right-click on the Y-Axis, and choose “Add Reference Line”. When the add reference line dialog box appears, click on the choice for Box Plot

● add box plot from reference line:

● Reference distributions

● Statistical summary card

● Instant Analytics

● Data Highlighter

    1. In a view: on the Marks card

    2. In a view: on the legend

    3. In a view: from the toolbar drop-down menu

    4. In a dashboard: create a highlight action

    5. In a story: a highlight can be saved to preserve a specific selection by updating the story point

7.Dashboards - 12%

● Build dashboards and stories

What can be inserted into a dashboard?

Text boxes, images, web pages, blank objects.

● Create dashboard actions

1.Filter action

guide analysis by using the data from one view to filter data in another. 

Advanced usage: e.g. a region filter; create buttons so that every view shows the figures for one region.

filter actions send data values from the relevant source fields as filters to the target sheet

create navigation to link other dashboard

2.Highlight action

allow you to call attention to marks of interest by retaining color for specific marks and dimming all others.

Five ways to highlight data:

1. Use the highlight legend

2. Select marks manually

3. 使用highlighter to search for marks in context

4. create an advanced highlight action

5. highlight button on the toolbar

3.Go to Sheet action

Can go to other worksheets, other dashboards or other stories

4.Go to URL action (Tableau Desktop Only)

URL actions create hyperlinks to external resources, such as a web page, email link, or a file.

The link appears in the tooltip

5.Parameter (Tableau Desktop Only)

Parameter actions let your audience change a parameter value through direct interaction with a viz, such as clicking or selecting a mark. 

Lets you show more charts on a single dashboard.

6. Set Action ( Tableau Desktop only)

Set actions let users change the values in a set by directly interacting with marks on a viz.

Set actions then, allow users to dynamically update which members are in a set through interacting with a data point.

● Design dashboards for viewing on devices

● Utilize visual best practices for viewing on devices

● Describe publishing & sharing options



Common LOD patterns:

1.find 1st order date

{fixed [customer id]: min([order date])}

2.find each customer's second order date:

{fixed [customer ID]: MIN( iif ([order date]> [1st purchase],[order date],null))}

3. Which customer category has the highest average number of orders?

{fixed [category]: AVG({fixed [customer ID]: countd([order id])})}

4.Average shipping cost per customer?

sum([shipping cost])/countd([customer id])
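Patterns 1 and 2 above (first order date, then the earliest date after it) can be checked against a small Python sketch. It mirrors the idea of fixing the calculation on the customer, nothing more; the sample `orders` and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (customer id, order date) rows.
orders = [
    ("C1", date(2020, 1, 5)),
    ("C1", date(2020, 3, 1)),
    ("C1", date(2020, 2, 10)),
    ("C2", date(2020, 4, 2)),
]


def first_and_second_order(orders):
    """Per customer: the minimum order date, then the minimum date after it."""
    dates_by_customer = defaultdict(set)
    for customer, order_date in orders:
        dates_by_customer[customer].add(order_date)
    result = {}
    for customer, dates in dates_by_customer.items():
        first = min(dates)
        later = [d for d in dates if d > first]
        # A customer with a single order date has no second order (null in Tableau).
        result[customer] = (first, min(later) if later else None)
    return result


print(first_and_second_order(orders))
```

The second step only works because the first order date is already known per customer, which is why pattern 2 references the field built in pattern 1.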

Type 1: computing a proportion under various conditions:

(1) Using the Players Sheet from Baseball, what percent of players had a batting average above 0.200 and more than 10 runs?

(2)Using the player sheet from the baseball file, what percentage of total runs did Badgers players, who scored between 10 and 15 runs, account for?

The finest grain of this sheet is the player, and player ID is the primary key. So you can simply filter manually to keep the players with 10 to 15 runs, group them, and compare.

Build a calculated field, if [runs]>= 10 and [runs]<=15 then True else False end, then drag it into the view and look at total runs.

(3) using superstore, isolate all customers who placed more than 3 orders in 2013, what was the % increase in total orders for 2013, compared to 2012, for that group?

Type 2: the condition spans several values of a categorical variable -- make good use of COUNTD

(1)using superstore, how many postal codes had orders placed in all years?

{fixed [postal codes]: countd(year([order date]))}

(2) using the “ALL medalists” sheet from Summer_Olympic_medallists_1986-2008, how many athletes won all medals ( Gold, Silver & Bronze) in Tennis (Discipline)?

iif({fixed [Athlete],[Discipline]:countd([medal])}>=3,1,0)
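The COUNTD idea behind Type 2 is: count the distinct values of the category per entity and compare with the number of values required. A minimal Python sketch for the "orders placed in all years" question, using made-up data and a hypothetical function name:

```python
from collections import defaultdict

# Hypothetical (postal code, order year) rows.
orders = [
    ("10001", 2011), ("10001", 2012), ("10001", 2013),
    ("94105", 2011), ("94105", 2013),
    ("60601", 2012), ("60601", 2012),
]


def codes_ordered_every_year(orders):
    """Postal codes whose distinct order years cover every year in the data."""
    all_years = {year for _, year in orders}
    years_by_code = defaultdict(set)
    for code, year in orders:
        years_by_code[code].add(year)
    return sorted(code for code, years in years_by_code.items() if years == all_years)


print(codes_ordered_every_year(orders))
```

The medal question follows the same shape: group by athlete and discipline, count distinct medals, and keep the groups that reach three.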

Type 3: LODs where two conditions intersect

(1)Using superstore, how many customers placed an order in 2013 that also placed at least 1 order before 2012?

First flag the customers who placed an order before 2012:

Ordered before 2012: IIF({fixed [customer ID]:min(year([order date]))}<2012, 1,0)

Then add a filter on the year of order date and set it to 2013

Drag Ordered before 2012 into the view; sum(Ordered before 2012) gives the total, and switching to avg gives the ratio.

(2) What % of customers ordering items in 2011 also ordered items in 2012? 

identify customers ordered in 2012

{fixed [customer ID]: max(if year([order date])=2012 then 1 else 0 end)}

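The IF goes inside and FIXED goes outside (otherwise the IF would need an aggregation and could not be written)

MAX is the best aggregation to use inside the {LOD}, because a customer only needs one order in 2012 to qualify; their other orders may fall in other years.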
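The "flag per customer with MAX, then take the share" approach can be sketched in Python. This is an illustration of the logic under made-up data, not the Tableau calculation itself; the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical (customer id, order year) rows.
orders = [
    ("A", 2011), ("A", 2012),
    ("B", 2011),
    ("C", 2012),
    ("D", 2011), ("D", 2012), ("D", 2013),
]


def share_also_ordered(orders, base_year, other_year):
    """Share of customers with an order in base_year who also ordered in other_year."""
    years_by_customer = defaultdict(set)
    for customer, year in orders:
        years_by_customer[customer].add(year)
    base = [c for c, years in years_by_customer.items() if base_year in years]
    # One matching order is enough, which is what MAX over the 1/0 flag expresses.
    both = [c for c in base if other_year in years_by_customer[c]]
    return len(both) / len(base) if base else None


print(share_also_ordered(orders, 2011, 2012))
```

Customer C ordered in 2012 but not in 2011, so it is excluded from the denominator; only customers from the base year are counted.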

Type 4: a part's share of the whole

(1) From the 1 digit sheet of EMSI JOBchange_UK,find the city that ranked 3rd in overall contribution to jobs in 2014 in UK. How much % did 'education' industry from this city contributed to overall jobs in 2014 in UK?

calculated field:

{fixed [city],[industry]:sum([Jobs 2014])}/{sum([Jobs 2014])}

(2) find the avg sales value for order which include office supplies

sum({fixed [order ID]: max(if [category]='office supplies' then 1 else 0 end)}*[sales])/ countd(if [category]='office supplies' then [order id] end)

(3)for the states that border New York, which state has the lowest average sales per customer?

First compute each customer's total sales:

sum_sales: {fixed [customer ID]: sum([sales])}

Then compute the average sales per customer for each state

sum([sum_sales])/countd([customer ID])

Type 5: existence questions:

How many countries have no female athletes?

{fixed [country]: max(if [gender]='women' then True else False end)}, then drag this field into the view and add countd([country]) to see the count/proportion.
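The existence check works by computing one true/false flag per country and then counting the countries where the flag is false. A minimal Python sketch with made-up rows and a hypothetical function name:

```python
from collections import defaultdict

# Hypothetical (country, gender) athlete rows.
athletes = [
    ("X", "men"), ("X", "women"),
    ("Y", "men"), ("Y", "men"),
    ("Z", "women"),
]


def countries_without(athletes, gender):
    """Countries that have no athlete of the given gender."""
    has_gender = defaultdict(bool)
    for country, athlete_gender in athletes:
        # Equivalent to MAX over a boolean: one match makes the flag true.
        has_gender[country] = has_gender[country] or athlete_gender == gender
    return sorted(country for country, flag in has_gender.items() if not flag)


print(countries_without(athletes, "women"))
```

The length of the returned list is the answer to the "how many countries" form of the question.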

Ratio calculation:

which ship mode was associated with the highest proportion of returned orders?

left join order with return table

drag ship mode to the view

countd([order ID (return)])/countd([order ID ])
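The left join plus the two distinct counts can be sketched in Python as follows. The data and the function name are hypothetical, and the set of returned order IDs stands in for the joined return table.

```python
from collections import defaultdict

# Hypothetical (order id, ship mode) rows; an order can span several rows.
orders = [
    (1, "First Class"), (1, "First Class"),
    (2, "First Class"),
    (3, "Standard"), (4, "Standard"), (5, "Standard"), (6, "Standard"),
]
returned_ids = {1, 4}


def return_rate_by_ship_mode(orders, returned_ids):
    """Distinct returned orders divided by distinct orders, per ship mode."""
    all_ids = defaultdict(set)
    returned = defaultdict(set)
    for order_id, ship_mode in orders:
        all_ids[ship_mode].add(order_id)
        # Left-join behaviour: every order is kept, returns match only some of them.
        if order_id in returned_ids:
            returned[ship_mode].add(order_id)
    return {mode: len(returned[mode]) / len(ids) for mode, ids in all_ids.items()}


print(return_rate_by_ship_mode(orders, returned_ids))
```

Using distinct counts on both sides keeps multi-row orders from being counted more than once.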
