ezbit

Using DBMS_STATS to Gather Statistics

Original article: http://www.idevelopment.info/data/Oracle/DBA_tips/Tuning/TUNING_17.shtml

Overview

Oracle's cost-based optimizer (CBO) uses statistics to calculate the selectivity of predicates (the fraction of rows in a table that a SQL statement's predicate selects) and to estimate the "cost" of each execution plan. The CBO uses the selectivity of a predicate to estimate the cost of a particular access method and to determine the optimal join order.

Statistics quantify the data distribution and storage characteristics of tables, columns, indexes, and partitions. The CBO uses these statistics to estimate how much I/O and memory are required to execute a SQL statement using a particular execution plan. Statistics are stored in the data dictionary, and they can be exported from one database and imported into another. For example, you might transfer production statistics to a test system to simulate the real environment, even though the test system holds only a small sample of the data.

To give the cost-based optimizer the most up-to-date information about schema objects (and the best chance of choosing a good execution plan), all application tables and indexes to be accessed must be analyzed. New statistics should be gathered on schema objects whose statistics are out of date. Loading or deleting large amounts of data obviously changes the number of rows. Other changes, such as updating a large number of rows, do not affect the number of rows but may affect the average row length.

Statistics can be generated with the ANALYZE statement or with the DBMS_STATS package (introduced in Oracle8i). DBMS_STATS is well suited to managing the statistics used by the CBO: it lets the DBA create, modify, view, and delete statistics through a standard, well-defined set of package procedures. Statistics can be gathered on tables, indexes, columns, partitions, and schemas, but note that the package does not generate statistics for clusters.

DBMS_STATS provides a mechanism for you to view and modify optimizer statistics gathered for database objects. The statistics can reside in two different locations:

The dictionary.
A table created in the user's schema for this purpose.

Only statistics stored in the dictionary itself have an impact on the cost-based optimizer.

When you generate statistics for a table, column, or index, if the data dictionary already contains statistics for the object, then Oracle updates the existing statistics. Oracle also invalidates any currently parsed SQL statements that access the object.

The next time such a statement executes, the optimizer automatically chooses a new execution plan based on the new statistics. Distributed statements issued on remote databases that access the analyzed objects use the new statistics the next time Oracle parses them.

When you associate a statistics type with a column or domain index, Oracle calls the statistics collection method in the statistics type whenever you analyze the column or domain index.

Missing statistics

When statistics do not exist on schema objects, the optimizer uses the following default values.

Tables
Statistic	Default Value Used by Optimizer
Cardinality	100 rows
Avg. row len	20 bytes
No. of blocks	100
Remote cardinality	2000 rows
Remote average row length	100 bytes
Indexes
Statistic	Default Value Used by Optimizer
Levels	1
Leaf blocks	25
Leaf blocks/key	1
Data blocks/key	1
Distinct keys	100
Clustering factor	800 (8*no. of blocks)
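
To see which objects would fall back to these defaults, one can look for rows whose LAST_ANALYZED column is still NULL. The following is a minimal sketch, assuming it is run as the owner of the objects (so the USER_% views apply); it only reads the dictionary and changes nothing.

```sql
-- Tables in the current schema that have never been analyzed
SELECT table_name
FROM   user_tables
WHERE  last_analyzed IS NULL
ORDER  BY table_name;

-- Indexes in the current schema that have never been analyzed
SELECT index_name, table_name
FROM   user_indexes
WHERE  last_analyzed IS NULL
ORDER  BY table_name, index_name;
```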

Analyze vs DBMS_STATS

The following is a quick overview of the two.

Analyze
- The only method available for collecting statistics in Oracle 8.0 and lower.
- ANALYZE can only run serially.
- ANALYZE cannot overwrite or delete certain types of statistics that were generated by DBMS_STATS.
- ANALYZE calculates global statistics for partitioned tables and indexes instead of gathering them directly. This can lead to inaccuracies for some statistics, such as the number of distinct values.
  - For partitioned tables and indexes, ANALYZE gathers statistics for the individual partitions and then calculates the global statistics from the partition statistics.
  - For composite partitioning, ANALYZE gathers statistics for the subpartitions and then calculates the partition statistics and global statistics from the subpartition statistics.
- ANALYZE can gather additional information that is not used by the optimizer, such as information about chained rows and the structural integrity of indexes, tables, and clusters. DBMS_STATS does not gather this information.
- No easy way of knowing which tables, or how much data within the tables, have changed. The DBA would generally re-analyze all of their tables on a semi-regular basis.
DBMS_STATS
- Only available for Oracle 8i and higher.
- Statistics can be generated to a statistics table and can then be imported or exported between databases and re-loaded into the data dictionary at any time. This allows the DBA to experiment with various statistics.
- DBMS_STATS routines have the option to run via parallel query or operate serially.
- Can gather statistics for sub-partitions or partitions.
- Certain DDL commands (e.g., CREATE INDEX) automatically generate statistics, eliminating the need to generate statistics explicitly after the DDL command.
- DBMS_STATS does not generate information about chained rows and the structural integrity of segments.
- The DBA can set a particular table, a whole schema, or the entire database to be monitored automatically for modifications. When monitoring is enabled, any change (insert, update, delete, direct load, truncate, etc.) that occurs on a table is tracked in the SGA. The SMON process incorporates this information into the data dictionary at a pre-set interval (every 3 hours in Oracle 8.1.x, and every 15 minutes in Oracle 9i). The information collected by this monitoring can be seen in the DBA_TAB_MODIFICATIONS view. Oracle 9i introduced a new DBMS_STATS procedure, FLUSH_DATABASE_MONITORING_INFO, which the DBA can call to flush the monitored table data more frequently. Oracle 9i also calls this procedure automatically before DBMS_STATS gathers statistics. Note that this procedure is not included with Oracle 8i.
- DBMS_STATS provides a more efficient, scalable solution for statistics gathering and should be used in preference to the traditional ANALYZE command, which does not support features such as parallelism and stale statistics collection.
- Use of table monitoring in conjunction with DBMS_STATS stale object statistics generation is highly recommended for environments with large, random, and/or sporadic data changes. These features allow the database to determine more efficiently which tables should be re-analyzed, rather than the DBA having to force statistics collection for all tables, including those that have not changed enough to merit a re-scan.
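
The monitoring data described above can be inspected directly. The sketch below assumes Oracle 9i or later (for FLUSH_DATABASE_MONITORING_INFO), a user with access to the DBA_% views, and the SCOTT demo schema used later in this article; the schema name is only an illustration.

```sql
-- Push the in-memory (SGA) modification counters into the dictionary
-- so that the view below is current (9i and later only).
BEGIN
  DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;
END;
/

-- DML activity recorded for monitored tables since they were last analyzed
SELECT table_owner, table_name, partition_name,
       inserts, updates, deletes, timestamp
FROM   dba_tab_modifications
WHERE  table_owner = 'SCOTT'   -- assumption: demo schema
ORDER  BY table_name;
```

A table appears in this view only if monitoring is enabled for it and it has been modified since its statistics were last gathered.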

What gets collected?

Table Statistics

Oracle collects the following statistics for a table. Statistics marked with an asterisk are always computed exactly. Table statistics, including the status of domain indexes, appear in the data dictionary views USER_TABLES, ALL_TABLES, and DBA_TABLES in the columns shown in parentheses.

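Number of rows (NUM_ROWS)

* Number of data blocks below the high water mark (that is, the number of data blocks that have been formatted to receive data, regardless whether they currently contain data or are empty) (BLOCKS). Translator's note: with manual segment space management (MSSM), Oracle manages a segment through freelists. When the segment runs short of space, newly allocated blocks are brought below the high water mark, formatted, and placed on the freelist, so every block below the high water mark is formatted even if it is not yet used. With automatic segment space management (ASSM), blocks brought below the high water mark are not formatted until they are first used. ASSM therefore has a second marker, the low high water mark: all blocks below it are formatted, blocks between it and the high water mark may or may not be formatted, and once all of those blocks are formatted the low high water mark moves up to the high water mark.

* Number of data blocks allocated to the table that have never been used (EMPTY_BLOCKS), that is, the blocks above the high water mark

Average available free space in each data block in bytes (AVG_SPACE), averaged over the blocks allocated to the table (BLOCKS + EMPTY_BLOCKS)

Number of chained rows. [Not collected by DBMS_STATS] (CHAIN_CNT)

Average row length, including the row's overhead, in bytes (AVG_ROW_LEN)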

Index Statistics

Oracle collects the following statistics for an index. Statistics marked with an asterisk are always computed exactly. For conventional indexes, the statistics appear in the data dictionary views USER_INDEXES, ALL_INDEXES, and DBA_INDEXES in the columns in parentheses.

* Depth of the index from its root block to its leaf blocks (BLEVEL) (translator's note: counted from 0)

Number of leaf blocks (LEAF_BLOCKS)

Number of distinct index values (DISTINCT_KEYS)

Average number of leaf blocks per index value (AVG_LEAF_BLOCKS_PER_KEY) (translator's note: the number of leaf blocks in which each index value appears, usually 1)

Average number of data blocks per index value (for an index on a table) (AVG_DATA_BLOCKS_PER_KEY) (translator's note: the number of data blocks holding the rows for each index value, usually 1)

Clustering factor (how well ordered the rows are about the indexed values) (CLUSTERING_FACTOR)

Where are the statistics stored?

Statistics are stored in the Oracle data dictionary, in tables owned by SYS. Views are created on these tables to retrieve data more easily.

These views are prefixed with DBA_, ALL_, or USER_. For ease of reading, we will use the DBA_% views, but the ALL_% or USER_% views could be used as well.

Conventions Used

- Statistics available only since 8.0.X rdbms release         : (*)
- Statistics available only since 8.1.X rdbms release         : (**)
- Statistics not available at partition or subpartition level : (G)
- Statistics not available at subpartition level              : (GP)

Table level statistics can be retrieved from:

DBA_ALL_TABLES - (8.X onwards)
DBA_OBJECT_TABLES - (8.X onwards)
DBA_TABLES - (all versions)
DBA_TAB_PARTITIONS - (8.X onwards)

DBA_TAB_SUBPARTITIONS - (8.1 onwards)

Columns to look at are:

  NUM_ROWS                         : Number of rows (always exact even when computed 
                                     with ESTIMATE method) 
  BLOCKS                           : Number of blocks which have been used even  
                                     if they are empty due to delete statements 
  EMPTY_BLOCKS                     : Number of empty blocks (these blocks have  
                                     never been used) 
  AVG_SPACE                        : Average amount of FREE space in bytes in blocks  
                                     allocated to the table : Blocks + Empty Blocks 
  CHAIN_CNT                        : Number of chained or migrated rows     
  AVG_ROW_LEN                      : Average length of rows in bytes 
  AVG_SPACE_FREELIST_BLOCKS (*)(G) : Average free space of blocks in the freelist 
  NUM_FREELIST_BLOCKS       (*)(G) : Number of blocks in the freelist 
  SAMPLE_SIZE                      : Sample defined in ESTIMATE method (0 if COMPUTE) 
  LAST_ANALYZED                    : Timestamp of last analysis 
  GLOBAL_STATS             (**)    : For partitioned tables, YES means statistics  
                                     are collected for the TABLE as a whole 
                                     NO means statistics are estimated from statistics  
                                     on underlying table partitions or subpartitions 
  USER_STATS               (**)    : YES if statistics entered directly by the user

Index level statistics can be retrieved from:

DBA_INDEXES - (all versions )
DBA_IND_PARTITIONS - (8.X onwards)

DBA_IND_SUBPARTITIONS - (8.1 onwards )

Columns to look at are:

  BLEVEL                       : B*Tree level : depth of the index from its root  
                                 block to its leaf blocks (counted from 0)
  LEAF_BLOCKS                  : Number of leaf blocks 
  DISTINCT_KEYS                : Number of distinct keys 
  AVG_LEAF_BLOCKS_PER_KEY      : Average number of leaf blocks in which each 
                                 distinct key appears (1 for a UNIQUE index) 
  AVG_DATA_BLOCKS_PER_KEY      : Average number of data blocks in the table that  
                                 are pointed to by a distinct key 
  CLUSTERING_FACTOR            : - if near the number of blocks, then the table is  
                                   ordered : index entries in a single leaf block  
                                   tend to point to rows in same data block 
                                 - if near the number of rows, the table is  
                                   randomly ordered : index entries in a single  
                                   leaf block are unlikely to point to rows in  
                                   same data block 
  SAMPLE_SIZE                  : Sample defined in ESTIMATE method (0 if COMPUTE) 
  LAST_ANALYZED                : Timestamp of last analysis 
  GLOBAL_STATS            (**) : For partitioned indexes, YES means statistics  
                                 are collected for the INDEX as a whole 
                                 NO means statistics are estimated from statistics  
                                 on underlying index partitions or subpartitions 
  USER_STATS              (**) : YES if statistics entered directly by the user 
  PCT_DIRECT_ACCESS   (**)(GP) : For secondary indexes on IOTs, percentage of  
                                 rows with VALID guess (can be refreshed with
                                 ALTER INDEX index_name UPDATE BLOCK REFERENCES)
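
The CLUSTERING_FACTOR guidance above can be checked by placing the index statistic next to the table's block and row counts. This is a minimal sketch, assuming the EMP demo table used later in this article, that both the table and its indexes have been analyzed, and that the indexes live in the same schema as the table.

```sql
-- Compare each index's clustering factor with the table's size:
-- a value near TABLE_BLOCKS suggests well-ordered rows,
-- a value near TABLE_ROWS suggests randomly ordered rows.
SELECT i.index_name,
       i.clustering_factor,
       t.blocks   AS table_blocks,
       t.num_rows AS table_rows
FROM   user_indexes i,
       user_tables  t
WHERE  t.table_name = i.table_name
AND    i.table_name = 'EMP'   -- assumption: demo table
ORDER  BY i.index_name;
```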

Column level statistics can be retrieved from:

DBA_TAB_COLUMNS - (all versions)
DBA_TAB_COL_STATISTICS - (Version 8.X onwards)
DBA_PART_COL_STATISTICS - (Version 8.X onwards)

DBA_SUBPART_COL_STATISTICS - (Version 8.1 onwards)

The last three views extract statistics data from DBA_TAB_COLUMNS.

Columns to look at are:

  NUM_DISTINCT                 : Number of distinct values 
  LOW_VALUE                    : Lowest value  
  HIGH_VALUE                   : Highest value  
  DENSITY                      : Density 
  NUM_NULLS                    : Number of rows in which the column is NULL 
  AVG_COL_LEN                  : Average length in bytes 
  NUM_BUCKETS                  : Number of buckets in histogram for the column    
  SAMPLE_SIZE                  : Sample defined in ESTIMATE method (0 if COMPUTE) 
  LAST_ANALYZED                : Timestamp of last analysis 
  (**)GLOBAL_STATS             : For partitioned tables, YES means statistics  
                                 are collected for the TABLE as a whole 
                                 NO means statistics are estimated from statistics 
                                 on underlying table partitions or subpartitions 
  (**)USER_STATS               : YES if statistics entered directly by the user
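
As an illustration of the columns listed above, the following query reads the column-level statistics for one table. It is a sketch only, assuming the EMP demo table in the current schema and Oracle 8 or later (for USER_TAB_COL_STATISTICS).

```sql
-- Column-level statistics for a single table
SELECT column_name,
       num_distinct,
       num_nulls,
       density,
       num_buckets,
       last_analyzed
FROM   user_tab_col_statistics
WHERE  table_name = 'EMP'   -- assumption: demo table
ORDER  BY column_name;
```

LOW_VALUE and HIGH_VALUE are omitted here because they are stored in an internal RAW format and are not directly readable.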

Compute statistics vs. Estimate statistics

Both computed and estimated statistics are used by the Oracle optimizer to choose the execution plan for SQL statements that access analyzed objects. These statistics may also be useful to application developers who write such statements.

COMPUTE STATISTICS

COMPUTE STATISTICS instructs Oracle to compute exact statistics about the analyzed object and store them in the data dictionary.
When computing statistics, an entire object is scanned to gather data about the object. This data is used by Oracle to compute exact statistics about the object. Slight variances throughout the object are accounted for in these computed statistics. Because an entire object is scanned to gather information for computed statistics, the larger the size of an object, the more work that is required to gather the necessary information.

To perform an exact computation, Oracle requires enough space to perform a scan and sort of the table. If there is not enough space in memory, then temporary space may be required. For estimations, Oracle requires enough space to perform a scan and sort of only the rows in the requested sample of the table. For indexes, computation does not take up as much time or space, so it is best to perform a full computation.

Some statistics are always computed exactly, such as the number of data blocks currently containing data in a table or the depth of an index from its root block to its leaf blocks.

Use estimation for tables and clusters rather than computation, unless you need exact values. Because estimation rarely sorts, it is often much faster than computation, especially for large tables.

ESTIMATE STATISTICS

ESTIMATE STATISTICS instructs Oracle to estimate statistics about the analyzed object and store them in the data dictionary.
When estimating statistics, Oracle gathers representative information from portions of an object. This subset of information provides reasonable, estimated statistics about the object. The accuracy of estimated statistics depends upon how representative the sampling used by Oracle is. Only parts of an object are scanned to gather information for estimated statistics, so an object can be analyzed quickly. You can optionally specify the number or percentage of rows that Oracle should use in making the estimate.

To estimate statistics, Oracle selects a random sample of data. You can specify the sampling percentage and whether sampling should be based on rows or blocks.

Row sampling reads rows without regard to their physical placement on disk. This provides the most random data for estimates, but it can result in reading more data than necessary. For example, in the worst case a row sample might select one row from each block, requiring a full scan of the table or index.

Block sampling reads a random sample of blocks and uses all of the rows in those blocks for estimates. This reduces the amount of I/O activity for a given sample size, but it can reduce the randomness of the sample if rows are not randomly distributed on disk. Block sampling is not available for index statistics.

Notes on estimating statistics

The default estimate of the ANALYZE command reads approximately the first 1064 rows of the table, so the results often leave a lot to be desired.

The general consensus is that the default of 1064 rows is not sufficient for accurate statistics on tables of any appreciable size. Many reports claim that estimating statistics on 30 percent of the rows produces very accurate results. I personally have been running estimates at 35 percent. This seems to produce very accurate numbers, and it saves a lot of time over full scans.

Note that if an estimate samples 50% or more of a table, Oracle converts the estimate to a full COMPUTE STATISTICS.
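
The compute and estimate variants discussed in this section might look as follows. This is a sketch rather than a recommendation: the EMP table and SCOTT schema are the demo objects used elsewhere in this article, the 35 percent figure simply mirrors the value mentioned above, and GATHER_TABLE_STATS is the table-level counterpart of the schema-level procedure described later.

```sql
-- Exact statistics: scans the whole table
ANALYZE TABLE emp COMPUTE STATISTICS;

-- Estimated statistics from a percentage of the rows
ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 35 PERCENT;

-- Estimated statistics from a fixed number of rows
ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 5000 ROWS;

-- The DBMS_STATS equivalent of a 35 percent row sample
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SCOTT',
    tabname          => 'EMP',
    estimate_percent => 35,
    block_sample     => FALSE,   -- row sampling; TRUE would sample blocks
    cascade          => TRUE);   -- also gather statistics on the indexes
END;
/
```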

DBMS_STATS functions and variable definitions

Most of the DBMS_STATS procedures include the three parameters statown, stattab, and statid. These parameters allow you to store statistics in your own tables (outside of the dictionary), where they do not affect the optimizer. You can therefore maintain and experiment with sets of statistics.

The stattab parameter specifies the name of a table in which to hold statistics. Unless the statown parameter is specified, the table is assumed to reside in the same schema as the object for which statistics are collected. Users may create multiple tables with different stattab identifiers to hold separate sets of statistics.

Additionally, users can maintain different sets of statistics within a single stattab by using the statid parameter, which can help avoid cluttering the user's schema.

For all of the SET or GET procedures, if stattab is not provided (i.e., NULL), then the operation works directly on the dictionary statistics; therefore, users do not need to create these statistics tables if they only plan to modify the dictionary directly. However, if stattab is not NULL, then the SET or GET operation works on the specified user statistics table, and not the dictionary.

Create Stats Table
DBMS_STATS.CREATE_STAT_TABLE (
  ownname  VARCHAR2, 
  stattab  VARCHAR2,
  tblspace VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.

stattab : Name of the table to create. This value should be passed as the stattab parameter to other procedures when the user does not want to modify the dictionary statistics directly.

tblspace : Tablespace in which to create the stat tables. If none is specified, then they are created in the user's default tablespace.

Drop Stats Table
DBMS_STATS.drop_stat_table (
  ownname VARCHAR2, 
  stattab VARCHAR2);
ownname : Name of the schema.

stattab : User stat table identifier.

Gather Schema Stats (translator's note: in my own tests, this procedure still updated the data dictionary even when stattab was specified)
DBMS_STATS.gather_schema_stats (
  ownname          VARCHAR2,
  estimate_percent NUMBER   DEFAULT NULL, 
  block_sample     BOOLEAN  DEFAULT FALSE,
  method_opt       VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1',  -- SIZE 1: no histogram on the column; a value greater than 1 creates one
  degree           NUMBER   DEFAULT NULL,
  granularity      VARCHAR2 DEFAULT 'DEFAULT', 
  cascade          BOOLEAN  DEFAULT FALSE,
  stattab          VARCHAR2 DEFAULT NULL, 
  statid           VARCHAR2 DEFAULT NULL,
  options          VARCHAR2 DEFAULT 'GATHER', 
  objlist     OUT  ObjectTab,
  statown          VARCHAR2 DEFAULT NULL);
ownname : Schema to analyze (NULL means current schema).

estimate_percent : Percentage of rows to estimate (NULL means compute): The valid range is [0.000001,100).

block_sample : Whether or not to use random block sampling instead of random row sampling. Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated. Only pertinent when doing an estimate statistics.

method_opt : Method options of the following format (the phrase 'SIZE 1' is required to ensure gathering statistics in parallel and for use with the phrase hidden):
FOR ALL [INDEXED | HIDDEN] COLUMNS [SIZE integer]

This value is passed to all of the individual tables.

degree : Degree of parallelism (NULL means use table default value).

granularity : Granularity of statistics to collect (only pertinent if the table is partitioned).

DEFAULT: Gather global- and partition-level statistics.

SUBPARTITION: Gather subpartition-level statistics.

PARTITION: Gather partition-level statistics.

GLOBAL: Gather global statistics.

ALL: Gather all (subpartition, partition, and global) statistics.

cascade : Gather statistics on the indexes as well.
Index statistics gathering is not parallelized. Using this option is equivalent to running the gather_index_stats procedure on each of the indexes in the schema in addition to gathering table and column statistics.

stattab : User stat table identifier describing where to save the current statistics.

statid : Identifier (optional) to associate with these statistics within stattab.

options : Further specification of which objects to gather statistics for:

GATHER: Gather statistics on all objects in the schema.

GATHER STALE: Gather statistics on stale objects as determined by looking at the *_tab_modifications views. Also, return a list of objects found to be stale.

GATHER EMPTY: Gather statistics on objects which currently have no statistics. Also, return a list of objects found to have no statistics.

LIST STALE: Return list of stale objects as determined by looking at the *_tab_modifications views.

LIST EMPTY: Return list of objects which currently have no statistics.

objlist : List of objects found to be stale or empty.

statown : Schema containing stattab (if different than ownname).

Export Schema Stats (from the data dictionary into a user table)
DBMS_STATS.export_schema_stats (
  ownname VARCHAR2,
  stattab VARCHAR2, 
  statid  VARCHAR2 DEFAULT NULL,
  statown VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.

stattab : User stat table identifier describing where to store the statistics.

statid : Identifier (optional) to associate with these statistics within stattab.

statown : Schema containing stattab (if different than ownname).

Import Schema Stats (from a user table into the data dictionary)
DBMS_STATS.import_schema_stats (
  ownname VARCHAR2,
  stattab VARCHAR2, 
  statid  VARCHAR2 DEFAULT NULL,
  statown VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.

stattab : User stat table identifier describing from where to retrieve the statistics.

statid : Identifier (optional) to associate with these statistics within stattab.

statown : Schema containing stattab (if different than ownname).
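
Taken together, the create, export, and import procedures support the production-to-test transfer mentioned in the overview. The following is a minimal sketch under stated assumptions: the SCOTT schema exists on both databases, and the statistics table name STATS_BACKUP and the identifier PROD_SNAPSHOT are hypothetical names chosen for illustration. Moving the statistics table between databases (for example with the export/import utilities) is not shown.

```sql
-- On the source (production) database:
-- create a statistics table and copy the dictionary statistics into it.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(
    ownname => 'SCOTT',
    stattab => 'STATS_BACKUP');      -- hypothetical table name

  DBMS_STATS.EXPORT_SCHEMA_STATS(
    ownname => 'SCOTT',
    stattab => 'STATS_BACKUP',
    statid  => 'PROD_SNAPSHOT');     -- hypothetical identifier
END;
/

-- (Transfer SCOTT.STATS_BACKUP to the target database here.)

-- On the target (test) database:
-- load the saved statistics into the data dictionary.
BEGIN
  DBMS_STATS.IMPORT_SCHEMA_STATS(
    ownname => 'SCOTT',
    stattab => 'STATS_BACKUP',
    statid  => 'PROD_SNAPSHOT');
END;
/
```

Only after the import step do the statistics influence the optimizer, since statistics held in a user table are ignored by it.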

Delete Schema Stats
DBMS_STATS.delete_schema_stats (
  ownname VARCHAR2, 
  stattab VARCHAR2 DEFAULT NULL,
  statid  VARCHAR2 DEFAULT NULL,
  statown VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.

stattab : User stat table identifier describing from where to delete the statistics. If stattab is NULL, then the statistics are deleted directly in the dictionary.

statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).

statown : Schema containing stattab (if different than ownname).

Set Table Stats
DBMS_STATS.set_table_stats (
  ownname  VARCHAR2, 
  tabname  VARCHAR2, 
  partname VARCHAR2 DEFAULT NULL,
  stattab  VARCHAR2 DEFAULT NULL, 
  statid   VARCHAR2 DEFAULT NULL,
  numrows  NUMBER   DEFAULT NULL, 
  numblks  NUMBER   DEFAULT NULL,
  avgrlen  NUMBER   DEFAULT NULL, 
  flags    NUMBER   DEFAULT NULL,
  statown  VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.

tabname : Name of the table.

partname : Name of the table partition in which to store the statistics. If the table is partitioned and partname is NULL, then the statistics are stored at the global table level.

stattab : User stat table identifier describing where to store the statistics. If stattab is NULL, then the statistics are stored directly in the dictionary.

statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).

numrows : Number of rows in the table (partition).

numblks : Number of blocks the table (partition) occupies.

avgrlen : Average row length for the table (partition).

flags : For internal Oracle use (should be left as NULL).

statown : Schema containing stattab (if different than ownname).

Get Table Stats
DBMS_STATS.get_table_stats (
  ownname     VARCHAR2, 
  tabname     VARCHAR2, 
  partname    VARCHAR2 DEFAULT NULL,
  stattab     VARCHAR2 DEFAULT NULL, 
  statid      VARCHAR2 DEFAULT NULL,
  numrows OUT NUMBER, 
  numblks OUT NUMBER,
  avgrlen OUT NUMBER,
  statown     VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.

tabname : Name of the table.

partname : Name of the table partition from which to get the statistics. If the table is partitioned and if partname is NULL, then the statistics are retrieved from the global table level.

stattab : User stat table identifier describing from where to retrieve the statistics. If stattab is NULL, then the statistics are retrieved directly from the dictionary.

statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).

numrows : Number of rows in the table (partition).

numblks : Number of blocks the table (partition) occupies.

avgrlen : Average row length for the table (partition).

statown : Schema containing stattab (if different than ownname).
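
The SET and GET procedures can be exercised against a user statistics table without touching the dictionary. The sketch below assumes that a statistics table named STATS_BACKUP (a hypothetical name) has already been created in the SCOTT schema with CREATE_STAT_TABLE; the statid WHAT_IF and the numeric values are likewise made up for illustration.

```sql
SET SERVEROUTPUT ON

DECLARE
  l_numrows NUMBER;
  l_numblks NUMBER;
  l_avgrlen NUMBER;
BEGIN
  -- Store hand-made statistics in the user table, not in the dictionary
  DBMS_STATS.SET_TABLE_STATS(
    ownname => 'SCOTT',
    tabname => 'EMP',
    stattab => 'STATS_BACKUP',   -- hypothetical, must already exist
    statid  => 'WHAT_IF',        -- hypothetical identifier
    numrows => 1000000,
    numblks => 20000,
    avgrlen => 92);

  -- Read the same set of statistics back
  DBMS_STATS.GET_TABLE_STATS(
    ownname => 'SCOTT',
    tabname => 'EMP',
    stattab => 'STATS_BACKUP',
    statid  => 'WHAT_IF',
    numrows => l_numrows,
    numblks => l_numblks,
    avgrlen => l_avgrlen);

  DBMS_OUTPUT.PUT_LINE('rows=' || l_numrows ||
                       ' blocks=' || l_numblks ||
                       ' avg_row_len=' || l_avgrlen);
END;
/
```

Leaving stattab and statid out of both calls would make them operate on the dictionary statistics instead, which does affect the optimizer.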

Get Index Stats
DBMS_STATS.GET_INDEX_STATS (
  ownname      VARCHAR2, 
  indname      VARCHAR2,
  partname     VARCHAR2 DEFAULT NULL,
  stattab      VARCHAR2 DEFAULT NULL, 
  statid       VARCHAR2 DEFAULT NULL,
  numrows  OUT NUMBER, 
  numlblks OUT NUMBER,
  numdist  OUT NUMBER, 
  avglblk  OUT NUMBER,
  avgdblk  OUT NUMBER, 
  clstfct  OUT NUMBER,
  indlevel OUT NUMBER,
  statown      VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.

indname : Name of the index.

partname : Name of the index partition for which to get the statistics. If the index is partitioned and if partname is NULL, then the statistics are retrieved for the global index level.

stattab : User stat table identifier describing from where to retrieve the statistics. If stattab is NULL, then the statistics are retrieved directly from the dictionary.

statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).

numrows : Number of rows in the index (partition).

numlblks : Number of leaf blocks in the index (partition).

numdist : Number of distinct keys in the index (partition).

avglblk : Average integral number of leaf blocks in which each distinct key appears for this index (partition).

avgdblk : Average integral number of data blocks in the table pointed to by a distinct key for this index (partition).

clstfct : Clustering factor for the index (partition).

indlevel : Height of the index (partition).

statown : Schema containing stattab (if different than ownname).
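
A call to GET_INDEX_STATS might look like the sketch below. It assumes an index named PK_EMP in the SCOTT schema (the name is an assumption; substitute any analyzed index) and reads from the dictionary because stattab is left NULL. If the index has no statistics, the call is expected to raise an error rather than return NULLs.

```sql
SET SERVEROUTPUT ON

DECLARE
  l_numrows  NUMBER;
  l_numlblks NUMBER;
  l_numdist  NUMBER;
  l_avglblk  NUMBER;
  l_avgdblk  NUMBER;
  l_clstfct  NUMBER;
  l_indlevel NUMBER;
BEGIN
  DBMS_STATS.GET_INDEX_STATS(
    ownname  => 'SCOTT',
    indname  => 'PK_EMP',        -- assumption: an analyzed index
    numrows  => l_numrows,
    numlblks => l_numlblks,
    numdist  => l_numdist,
    avglblk  => l_avglblk,
    avgdblk  => l_avgdblk,
    clstfct  => l_clstfct,
    indlevel => l_indlevel);

  DBMS_OUTPUT.PUT_LINE('leaf blocks       : ' || l_numlblks);
  DBMS_OUTPUT.PUT_LINE('distinct keys     : ' || l_numdist);
  DBMS_OUTPUT.PUT_LINE('clustering factor : ' || l_clstfct);
  DBMS_OUTPUT.PUT_LINE('index level       : ' || l_indlevel);
END;
/
```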

Automated table monitoring and stale statistics gathering example

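In Oracle 10g, the statistics_level initialization parameter acts as a global setting that controls table monitoring. The alter_schema_tab_monitoring procedure used later in this article is no longer in use; calling these procedures does not raise an error, but nothing happens.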

You can automatically gather statistics or create lists of tables that have stale or no statistics.

To automatically gather statistics, run the DBMS_STATS.GATHER_SCHEMA_STATS and DBMS_STATS.GATHER_DATABASE_STATS procedures with the OPTIONS and objlist parameters. Use the following values for the options parameter:

GATHER STALE : Gathers statistics on tables with stale statistics (determined from the *_TAB_MODIFICATIONS views).

GATHER : Gathers statistics on all tables. (default)

GATHER EMPTY : Gathers statistics only on tables without statistics.

LIST STALE : Creates a list of tables with stale statistics (determined from the *_TAB_MODIFICATIONS views).

LIST EMPTY : Creates a list of tables that do not have statistics.

The objlist parameter identifies an output parameter for the LIST STALE and LIST EMPTY options. The objlist parameter is of type DBMS_STATS.OBJECTTAB.
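
The LIST STALE option can be sketched as follows. This is only an illustration, assuming the SCOTT schema used elsewhere in this article and SQL*Plus with SERVEROUTPUT enabled; the variable name l_stale is made up, and the fields ownname, objtype and objname belong to the DBMS_STATS.OBJECTELEM record that makes up OBJECTTAB.

```sql
SET SERVEROUTPUT ON

DECLARE
  l_stale DBMS_STATS.OBJECTTAB;   -- receives the list of stale objects
BEGIN
  -- With LIST STALE nothing is gathered; the procedure only fills objlist
  DBMS_STATS.GATHER_SCHEMA_STATS (
    ownname => 'scott',
    options => 'LIST STALE',
    objlist => l_stale);

  -- The loop body is skipped when no object is reported as stale
  FOR i IN 1 .. l_stale.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE(l_stale(i).objtype || ' ' ||
                         l_stale(i).ownname || '.' ||
                         l_stale(i).objname);
  END LOOP;
END;
/
```

Switching options to 'LIST EMPTY' would, under the same assumptions, report the objects that have no statistics at all.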

Step 1 : Perform a quick analyze to load in base statistics
BEGIN
DBMS_STATS.GATHER_SCHEMA_STATS (
  ownname           => 'scott',
  estimate_percent  => null,              -- Small table, let's compute
  block_sample      => false,
  method_opt        => 'FOR ALL COLUMNS',
  degree            => null,              -- No parallelism used in this example
  granularity       => 'ALL',
  cascade           => true,              -- Make sure we include indexes
  options           => 'GATHER'           -- Gather mode
  );
END;
/

PL/SQL procedure successfully completed.
Step 2 : Examine the current statistics
SELECT table_name, num_rows, blocks, avg_row_len    
FROM user_tables
WHERE table_name='EMP';

TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
EMP                                  1500         28          92
Step 3 : Turn on Automatic Monitoring
Now turn on automatic monitoring for the emp table. This can be done with the ALTER TABLE statement. Starting with Oracle 9i, you can also enable monitoring at the schema level and at the entire-database level. I provide the syntax for all three methods below.

The changes recorded by monitoring are stored in the *_TAB_MODIFICATIONS views.

Monitor only the EMP table.
alter table emp monitoring;

Table altered.
Monitor all of the tables within Scott's schema. (Oracle 9i and higher)
BEGIN
  DBMS_STATS.alter_schema_tab_monitoring('scott', true);
END;
/

PL/SQL procedure successfully completed.
Monitor all of the tables within the database. (Oracle 9i and higher)
Note: Although the option to collect statistics for SYS tables is available via ALTER_DATABASE_TAB_MONITORING, Oracle continues to recommend against this practice until the next major release after 9i Release 2. Also note that the ALTER_DATABASE_TAB_MONITORING procedure in the DBMS_STATS package only monitors tables; indexes are monitored separately with the ALTER INDEX ... MONITORING USAGE statement, which tracks whether an index is being used. Thanks to Nabil Nawaz for providing this and pointing out an error I made in the previous version of this article.
BEGIN
  DBMS_STATS.alter_database_tab_monitoring (
    monitoring => true,
    sysobjs    => false);      -- Don't set to true, see note above.
END;
/

PL/SQL procedure successfully completed.
Step 4 : Verify that monitoring is turned on.
Note: The results of the following query are from running the alter table ... statement on the emp table only.

The MONITORING column of the *_TABLES views shows whether automatic monitoring is turned on for a table.
SELECT table_name, monitoring
FROM user_tables
ORDER BY monitoring;

TABLE_NAME                     MONITORING
------------------------------ ----------
DEPT                           NO
EMP                            YES
Step 5 : Delete some rows from the database.
SQL> DELETE FROM emp WHERE rownum < 501;

500 rows deleted.

SQL> commit;

Commit complete.
Step 6 : Wait until the monitored data is flushed.
Data can be flushed in several ways.

In Oracle 8i, you can wait it out for 3 hours.

In Oracle 9i and higher, you only need to wait 15 minutes.

In either version, restart the database.

For immediate results in Oracle 9i and higher, use the DBMS_STATS.flush_database_monitoring_info procedure.

OK, I'm impatient...
exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.
Step 7 : Check what has been collected.
As user "scott", check USER_TAB_MODIFICATIONS to see what was collected.
SELECT * FROM user_tab_modifications;

TABLE_NAME PARTITION_NAME SUBPARTITION_NAME INSERTS UPDATES DELETES TIMESTAMP TRUNCATED
---------- -------------- ----------------- ------- ------- ------- --------- ---------
EMP                                               0       0     500 18-SEP-02 NO
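
To relate these counters to the size of the table, a query along the following lines can help. It is a sketch rather than the exact rule Oracle applies: it assumes staleness is judged by the share of rows changed since the last gather (Oracle documents a threshold of roughly 10 percent for GATHER STALE), and it only looks at table-level rows of USER_TAB_MODIFICATIONS. The column aliases are made up for illustration.

```sql
-- Percentage of rows changed since statistics were last gathered
SELECT m.table_name,
       m.inserts + m.updates + m.deletes AS changes,
       t.num_rows,
       ROUND(100 * (m.inserts + m.updates + m.deletes)
                 / NULLIF(t.num_rows, 0), 1) AS pct_changed   -- NULL when num_rows is 0
FROM   user_tab_modifications m
       JOIN user_tables t ON t.table_name = m.table_name
WHERE  m.partition_name IS NULL;                              -- table-level rows only
```

For the EMP table above, 500 deletes against the 1500 rows recorded in Step 2 is about 33 percent, well past a 10 percent threshold, so the table qualifies as stale.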
Step 8 : Execute DBMS_STATS to gather stats on all "stale" tables.
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname           => 'scott',
    estimate_percent  => null,
    block_sample      => false,
    method_opt        => 'FOR ALL COLUMNS',
    degree            => null,
    granularity       => 'ALL',
    cascade           => true,
    options           => 'GATHER STALE');
END;
/

PL/SQL procedure successfully completed.
Step 9 : Verify that the table is no longer listed in USER_TAB_MODIFICATIONS.
SQL> SELECT * FROM user_tab_modifications;

no rows selected.
Step 10 : Examine some of the new statistics collected.
SELECT table_name, num_rows, blocks, avg_row_len    
FROM user_tables where table_name='EMP';

TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
EMP                                  1000         28          92

How to determine if dictionary statistics are RDBMS-generated or user-defined

The following section explains how to determine whether your dictionary statistics are RDBMS-generated or were set by users through one of the DBMS_STATS.SET_xx_STATS procedures.
This is crucial in development environments where the performance of SQL statements is tested with various sets of statistics. The DBA needs to know whether the underlying statistics are RDBMS-generated or user-defined.

RDBMS-generated statistics are generated by the following:

ANALYZE SQL command

DBMS_UTILITY.ANALYZE_SCHEMA procedure

DBMS_UTILITY.ANALYZE_DATABASE procedure

DBMS_DDL.ANALYZE_OBJECT procedure

DBMS_STATS.GATHER_xx_STATS procedures (Oracle 8.1 and higher)

User-defined statistics can only be set through the DBMS_STATS.SET_xx_STATS procedures.
The USER_STATS column of DBA_TABLES, ALL_TABLES and USER_TABLES displays:

YES, when statistics are entered directly by a user.

NO, when statistics are generated by the RDBMS, through an ANALYZE statement or through DBMS_STATS.
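
The difference can be observed with a short experiment, sketched here under the assumption that it is run as SCOTT against the EMP table from the earlier example. The statistic values passed to SET_TABLE_STATS are invented purely for illustration.

```sql
-- Set table statistics by hand (the values are made up for illustration)
BEGIN
  DBMS_STATS.SET_TABLE_STATS (
    ownname => 'scott',
    tabname => 'EMP',
    numrows => 100000,
    numblks => 2000,
    avgrlen => 92);
END;
/

-- USER_STATS should now report YES for EMP
SELECT table_name, num_rows, user_stats, last_analyzed
FROM   user_tables
WHERE  table_name = 'EMP';
```

Gathering statistics again with one of the GATHER_xx_STATS procedures should bring USER_STATS back to NO for the table.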
