1、Flink 1.12.7 / 1.13.5 in detail: local installation, deployment, and verification
2、Flink 1.13.5: two deployment modes (Standalone, Standalone HA) and four job-submission modes (the first two, plus session and per-job), with detailed verification steps
3、Key Flink concepts (API layers, roles, execution flow, execution graphs, and the programming model), introductory DataSet/DataStream examples, and submitting jobs to run on YARN
4、Flink's unified stream/batch processing, the 18 transformation operators in detail, and Flink's Kafka source and sink
5、Detailed examples of Flink sources, transformations, and sinks (part 1)
5、Detailed examples of Flink sources, transformations, and sinks (part 2): source and transformation examples
5、Detailed examples of Flink sources, transformations, and sinks (part 3): sink examples
6、Window, one of Flink's four cornerstones: explanation and detailed examples (part 1)
6、Window, one of Flink's four cornerstones: explanation and detailed examples (part 2)
7、Time and Watermark, two of Flink's four cornerstones: explanation and detailed examples (basic watermark usage, watermarks with Kafka as the source, and handling data that arrives after the maximum allowed lateness)
8、State, one of Flink's four cornerstones: concepts, use cases, persistence, and batch processing, with detailed examples of keyed state, operator state, and broadcast state
9、Checkpoint, one of Flink's four cornerstones: the fault-tolerance mechanism explained with examples (checkpoint configuration, restart strategies, manual recovery from checkpoints and savepoints)
10、Detailed examples of Flink sources, transformations, and sinks (part 2): source and transformation examples (supplementary examples)
11、Detailed description of flink-conf.yaml (HA configuration, checkpoint, web, security, ZooKeeper, history server, workers, zoo.cfg)
12、Detailed examples of ClickHouse as a Flink source and sink
13、Links to the articles in this series on the Flink Table API and SQL
This article walks through unified stream/batch development, the detailed source, transformation, and sink APIs, and Flink's integration with Kafka. It describes the 18 transformation operators and their sample code.
This article focuses on the concepts; for complete examples, see part 5 of this series, "Detailed examples of Flink sources, transformations, and sinks".
Some images and text in this article come from the internet.
The article has two parts: an introduction to unified stream/batch processing, and an introduction to Flink with Kafka.
Data is naturally produced as streams. Whether it is event data from web servers, trades from a stock exchange, or sensor readings from machines on a factory floor, data originates as a stream. When you analyze data, however, you can organize the processing around either bounded or unbounded streams, and the choice of model determines how the program executes and processes the data.
In Flink, applications are composed of streaming dataflows that are transformed by user-defined operators. These dataflows form directed graphs that start with one or more sources and end in one or more sinks.
Often there is a one-to-one correspondence between the transformations in the program code and the operators in the dataflow. Sometimes, however, one transformation may consist of multiple operators, as the figure above illustrates.
Flink applications can consume real-time data from streaming sources such as message queues or distributed logs (for example Apache Kafka or Kinesis), as well as bounded, historical data from a variety of sources. Likewise, the result streams produced by a Flink application can be sent to a wide variety of sinks.
The DataStream API supports both streaming and batch execution modes
Flink's core APIs were originally designed for specific scenarios. Although the Table API / SQL already offers a unified API for stream and batch processing, users working at the lower level still had to choose between two different APIs: the DataSet API for batch and the DataStream API for streaming. Since batch processing is a special case of stream processing, merging the two into a single API brings some clear benefits:
The unified DataStream API supports efficient batch execution (FLIP-134); the DataSet API will be deprecated (FLIP-131) and its functionality will be covered by the DataStream API and the Table API / SQL.
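As a quick illustration of this unification (a minimal sketch, assuming Flink 1.12 or later), the same DataStream program can be switched to batch execution simply by setting the runtime mode:
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// BATCH schedules bounded pipelines like a batch job; STREAMING is the default,
// and AUTOMATIC picks a mode based on the boundedness of the sources.
env.setRuntimeMode(RuntimeExecutionMode.BATCH);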
Official documentation: Apache Flink 1.12 Documentation: Flink DataStream API Programming Guide
Flink programs look like regular programs that transform DataStreams (or DataSets). Each program consists of the same basic parts:
getExecutionEnvironment() (recommended)
createLocalEnvironment()
createRemoteEnvironment(String host, int port, String... jarFiles)
Example:
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
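The other factory methods can be called explicitly in the same way; a minimal sketch of createRemoteEnvironment, where host, port, and jar path are placeholders:
StreamExecutionEnvironment remoteEnv = StreamExecutionEnvironment.createRemoteEnvironment(
        "jobmanager-host",         // JobManager address (placeholder)
        8081,                      // JobManager port (placeholder)
        "/path/to/your-job.jar");  // jar file(s) containing the user code (placeholder)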
The environment can create many kinds of data sources, such as files, sockets, fromElements collections, and so on.
Examples:
DataStream<String> lines = env.fromElements("flink hadoop hive", "flink hadoop hive", "flink hadoop", "flink");
DataStream<String> text = env.readTextFile("file:///path/to/file");
One of Flink's core capabilities is its transformation operations, of which there are many implementations.
Example:
DataStream<Integer> parsed = input.map(new MapFunction<String, Integer>() {
@Override
public Integer map(String value) {
return Integer.parseInt(value);
}
});
A sink can write to many kinds of targets, such as relational databases, message queues, HDFS, Redis, and so on.
Examples (a short usage sketch follows):
writeAsText(String path)
print()
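Applying these to a result stream might look like the following minimal sketch (the output path is a placeholder):
DataStream<String> result = ...; // result of some upstream transformation
result.print();                                    // print to the TaskManagers' stdout
result.writeAsText("file:///tmp/flink-result");    // mainly intended for debugging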
Once you specified the complete program you need to trigger the program execution by calling execute() on the StreamExecutionEnvironment. Depending on the type of the ExecutionEnvironment the execution will be triggered on your local machine or submit your program for execution on a cluster.
The execute() method will wait for the job to finish and then return a JobExecutionResult, this contains execution times and accumulator results.
If you don’t want to wait for the job to finish, you can trigger asynchronous job execution by calling executeAsync() on the StreamExecutionEnvironment. It will return a JobClient with which you can communicate with the job you just submitted. For instance, here is how to implement the semantics of execute() by using executeAsync().
Example:
final JobClient jobClient = env.executeAsync();
final JobExecutionResult jobExecutionResult = jobClient.getJobExecutionResult().get();
That last part about program execution is crucial to understanding when and how Flink operations are executed. All Flink programs are executed lazily: When the program’s main method is executed, the data loading and transformations do not happen directly. Rather, each operation is created and added to a dataflow graph. The operations are actually executed when the execution is explicitly triggered by an execute() call on the execution environment. Whether the program is executed locally or on a cluster depends on the type of execution environment.
The lazy evaluation lets you construct sophisticated programs that Flink executes as one holistically planned unit.
Official documentation: Apache Flink 1.12 Documentation: Flink DataStream API Programming Guide
A source is where data enters the program. You attach a source with StreamExecutionEnvironment.addSource(sourceFunction); the supported sources are file-based, socket-based, collection-based, and custom.
Sources are where your program reads its input from. You can attach a source to your program by using StreamExecutionEnvironment.addSource(sourceFunction). Flink comes with a number of pre-implemented source functions, but you can always write your own custom sources by implementing the SourceFunction for non-parallel sources, or by implementing the ParallelSourceFunction interface or extending the RichParallelSourceFunction for parallel sources.
There are several predefined stream sources accessible from the StreamExecutionEnvironment:
Generally used for testing and production scenarios.
The path can point to a local file (including compressed files), an HDFS file, or a directory.
IMPLEMENTATION:
Under the hood, Flink splits the file reading process into two sub-tasks, namely directory monitoring and data reading. Each of these sub-tasks is implemented by a separate entity. Monitoring is implemented by a single, non-parallel (parallelism = 1) task, while reading is performed by multiple tasks running in parallel. The parallelism of the latter is equal to the job parallelism. The role of the single monitoring task is to scan the directory (periodically or only once depending on the watchType), find the files to be processed, divide them in splits, and assign these splits to the downstream readers. The readers are the ones who will read the actual data. Each split is read by only one reader, while a reader can read multiple splits, one-by-one.
IMPORTANT NOTES:
If the watchType is set to FileProcessingMode.PROCESS_CONTINUOUSLY, when a file is modified, its contents are re-processed entirely. This can break the “exactly-once” semantics, as appending data at the end of a file will lead to all its contents being re-processed.
If the watchType is set to FileProcessingMode.PROCESS_ONCE, the source scans the path once and exits, without waiting for the readers to finish reading the file contents. Of course the readers will continue reading until all file contents are read. Closing the source leads to no more checkpoints after that point. This may lead to slower recovery after a node failure, as the job will resume reading from the last checkpoint.
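A minimal sketch of continuous directory monitoring with readFile, using a hypothetical input directory and a 10-second scan interval:
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

String path = "file:///tmp/input";
TextInputFormat format = new TextInputFormat(new Path(path));
// PROCESS_CONTINUOUSLY rescans the path every 10 seconds; PROCESS_ONCE scans it a single time and exits.
DataStream<String> lines = env.readFile(format, path, FileProcessingMode.PROCESS_CONTINUOUSLY, 10000L);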
Generally used for testing.
socketTextStream - Reads from a socket. Elements can be separated by a delimiter.
Generally used for testing and verification.
Generally used for testing and production scenarios (a custom-source sketch follows).
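For the custom case, a non-parallel source only needs to implement SourceFunction; the sketch below (a hypothetical counter source) emits one value per second until the job is cancelled:
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class CounterSource implements SourceFunction<Long> {
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<Long> ctx) throws Exception {
        long counter = 0L;
        while (running) {
            ctx.collect(counter++); // emit the next value
            Thread.sleep(1000);     // roughly one element per second
        }
    }

    @Override
    public void cancel() {
        running = false;            // invoked when the job is cancelled
    }
}

// usage: DataStream<Long> numbers = env.addSource(new CounterSource());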
Operators transform one or more DataStreams into a new DataStream; an application can combine multiple transformation operators into a complex dataflow topology.
This section describes the basic transformations in the Flink DataStream API, the data partitioning schemes available after a transformation, and the operator chaining strategies.
Broadly speaking, operations on streaming data fall into four categories.
DataStream → DataStream
Applies the function to every element of the stream and returns the transformed result.
Takes one element and produces one element. A map function that doubles the values of the input stream:
DataStream<Integer> dataStream = //...
dataStream.map(new MapFunction<Integer, Integer>() {
@Override
public Integer map(Integer value) throws Exception {
return 2 * value;
}
});
DataStream → DataStream
Turns each element into zero, one, or more elements and returns the flattened result.
Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences to words:
dataStream.flatMap(new FlatMapFunction<String, String>() {
@Override
public void flatMap(String value, Collector<String> out)
throws Exception {
for(String word: value.split(" ")){
out.collect(word);
}
}
});
DataStream → DataStream
Filters the elements of the stream by the given condition, keeping only the elements for which the function returns true.
Evaluates a boolean function for each element and retains those for which the function returns true. A filter that filters out zero values:
dataStream.filter(new FilterFunction<Integer>() {
@Override
public boolean filter(Integer value) throws Exception {
return value != 0;
}
});
4、KeyBy
DataStream → KeyedStream
Partitions the data in the stream by the specified key.
Logically partitions a stream into disjoint partitions. All records with the same key are assigned to the same partition. Internally, keyBy() is implemented with hash partitioning. There are different ways to specify keys.
This transformation returns a KeyedStream, which is, among other things, required to use keyed state.
dataStream.keyBy(value -> value.getSomeKey()) // Key by field "someKey"
dataStream.keyBy(value -> value.f0) // Key by the first element of a Tuple
A type cannot be a key if:
it is a POJO type that does not override the hashCode() method and relies on Object.hashCode(), or
it is an array of any type.
KeyedStream → DataStream
Aggregates the elements within each keyed group.
A “rolling” reduce on a keyed data stream. Combines the current element with the last reduced value and emits the new value.
A reduce function that creates a stream of partial sums:
keyedStream.reduce(new ReduceFunction<Integer>() {
@Override
public Integer reduce(Integer value1, Integer value2)
throws Exception {
return value1 + value2;
}
});
KeyedStream → DataStream
Rolling aggregations on a keyed data stream. The difference between min and minBy is that min returns the minimum value, whereas minBy returns the element that has the minimum value in this field (same for max and maxBy).
keyedStream.sum(0);
keyedStream.sum("key");
keyedStream.min(0);
keyedStream.min("key");
keyedStream.max(0);
keyedStream.max("key");
keyedStream.minBy(0);
keyedStream.minBy("key");
keyedStream.maxBy(0);
keyedStream.maxBy("key");
KeyedStream → WindowedStream
Windows can be defined on already partitioned KeyedStreams. Windows group the data in each key according to some characteristic (e.g., the data that arrived within the last 5 seconds). See windows for a complete description of windows.
dataStream.keyBy(value -> value.f0).window(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
DataStream → AllWindowedStream
Windows can be defined on regular DataStreams. Windows group all the stream events according to some characteristic (e.g., the data that arrived within the last 5 seconds). See windows for a complete description of windows.
WARNING: This is in many cases a non-parallel transformation. All records will be gathered in one task for the windowAll operator.
dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
WindowedStream → DataStream
AllWindowedStream → DataStream
Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.
Note: If you are using a windowAll transformation, you need to use an AllWindowFunction instead.
windowedStream.apply(new WindowFunction<Tuple2<String, Integer>, Integer, Tuple, Window>() {
    public void apply(Tuple tuple,
            Window window,
            Iterable<Tuple2<String, Integer>> values,
            Collector<Integer> out) throws Exception {
        int sum = 0;
        for (Tuple2<String, Integer> t : values) {
            sum += t.f1;
        }
        out.collect(sum);
    }
});
WindowedStream → DataStream
Applies a functional reduce function to the window and returns the reduced value.
windowedStream.reduce (new ReduceFunction<Tuple2<String,Integer>>() {
public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
return new Tuple2<String,Integer>(value1.f0, value1.f1 + value2.f1);
}
});
WindowedStream → DataStream
Aggregates the contents of a window. The difference between min and minBy is that min returns the minimum value, whereas minBy returns the element that has the minimum value in this field (same for max and maxBy).
windowedStream.sum(0);
windowedStream.sum("key");
windowedStream.min(0);
windowedStream.min("key");
windowedStream.max(0);
windowedStream.max("key");
windowedStream.minBy(0);
windowedStream.minBy("key");
windowedStream.maxBy(0);
windowedStream.maxBy("key");
The union operator merges multiple data streams of the same type into one stream of that type, i.e., several DataStream[T] instances can be combined into a new DataStream[T]. Records are merged in first-in-first-out (FIFO) order and are not deduplicated.
DataStream* → DataStream
Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream with itself you will get each element twice in the resulting stream.
dataStream.union(otherStream1, otherStream2, ...);
DataStream,DataStream → DataStream
Join two data streams on a given key and a common window.
dataStream.join(otherStream)
.where(<key selector>).equalTo(<key selector>)
.window(TumblingEventTimeWindows.of(Time.seconds(3)))
.apply (new JoinFunction () {...});
KeyedStream,KeyedStream → DataStream
Join two elements e1 and e2 of two keyed streams with a common key over a given time interval, so that e1.timestamp + lowerBound <= e2.timestamp <= e1.timestamp + upperBound
// this will join the two streams so that
// key1 == key2 && leftTs - 2 < rightTs < leftTs + 2
keyedStream.intervalJoin(otherKeyedStream)
.between(Time.milliseconds(-2), Time.milliseconds(2)) // lower and upper bound
.upperBoundExclusive(true) // optional
.lowerBoundExclusive(true) // optional
.process(new IntervalJoinFunction() {...});
DataStream,DataStream → DataStream
Cogroups two data streams on a given key and a common window.
dataStream.coGroup(otherStream)
.where(0).equalTo(1)
.window(TumblingEventTimeWindows.of(Time.seconds(3)))
.apply (new CoGroupFunction () {...});
connect provides functionality similar to union for combining two data streams, but differs from union in two ways:
connect can only connect two streams, while union can merge more than two;
the two streams joined by connect may have different element types, whereas all streams merged by union must have the same type.
Connecting two DataStreams yields a ConnectedStreams, on which a different processing function can be applied to each of the two streams, and the two streams can share state.
DataStream,DataStream → ConnectedStreams
“Connects” two data streams retaining their types, allowing for shared state between the two streams.
DataStream<Integer> someStream = //...
DataStream<String> otherStream = //...
ConnectedStreams<Integer, String> connectedStreams = someStream.connect(otherStream);
ConnectedStreams → DataStream
Similar to map and flatMap on a connected data stream
connectedStreams.map(new CoMapFunction<Integer, String, Boolean>() {
@Override
public Boolean map1(Integer value) {
return true;
}
@Override
public Boolean map2(String value) {
return false;
}
});
connectedStreams.flatMap(new CoFlatMapFunction<Integer, String, String>() {
@Override
public void flatMap1(Integer value, Collector<String> out) {
out.collect(value.toString());
}
@Override
public void flatMap2(String value, Collector<String> out) {
for (String word: value.split(" ")) {
out.collect(word);
}
}
});
DataStream → IterativeStream → DataStream
Creates a “feedback” loop in the flow, by redirecting the output of one operator to some previous operator. This is especially useful for defining algorithms that continuously update a model. The following code starts with a stream and applies the iteration body continuously. Elements that are greater than 0 are sent back to the feedback channel, and the rest of the elements are forwarded downstream.
IterativeStream<Long> iteration = initialStream.iterate();
DataStream<Long> iterationBody = iteration.map (/*do something*/);
DataStream<Long> feedback = iterationBody.filter(new FilterFunction<Long>(){
@Override
public boolean filter(Long value) throws Exception {
return value > 0;
}
});
iteration.closeWith(feedback);
DataStream<Long> output = iterationBody.filter(new FilterFunction<Long>(){
@Override
public boolean filter(Long value) throws Exception {
return value <= 0;
}
});
Data sinks consume DataStreams and forward them to files, sockets, external systems, or print them. Flink comes with a variety of built-in output formats that are encapsulated behind operations on the DataStreams:
Note that the write*() methods on DataStream are mainly intended for debugging purposes. They are not participating in Flink’s checkpointing, this means these functions usually have at-least-once semantics. The data flushing to the target system depends on the implementation of the OutputFormat. This means that not all elements send to the OutputFormat are immediately showing up in the target system. Also, in failure cases, those records might be lost.
For reliable, exactly-once delivery of a stream into a file system, use the StreamingFileSink. Also, custom implementations through the .addSink(…) method can participate in Flink’s checkpointing for exactly-once semantics.
To build a custom sink, implement a sink interface such as SinkFunction or extend an abstract class such as RichSinkFunction (a sketch follows).
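A sketch of such a custom sink built on RichSinkFunction (the external system is left abstract; open the connection in open() and release it in close()):
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class MySink extends RichSinkFunction<String> {

    @Override
    public void open(Configuration parameters) throws Exception {
        // create the connection to the external system (JDBC, Redis, HTTP, ...)
    }

    @Override
    public void invoke(String value, Context context) throws Exception {
        // write one record to the external system
    }

    @Override
    public void close() throws Exception {
        // release the connection
    }
}

// usage: stream.addSink(new MySink());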
Flink supports many kinds of DataStream connectors; usage examples will be given later in this series.
This connector provides a sink that writes data to a JDBC database.
To use it, add the following dependency to your project (along with your JDBC-driver):
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-jdbc_2.11</artifactId>
    <version>1.12.7</version>
</dependency>
Note that the streaming connectors are currently NOT part of the binary distribution. See how to link with them for cluster execution here.
Created JDBC sink provides at-least-once guarantee. Effectively exactly-once can be achieved using upsert statements or idempotent updates.
Example usage:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env
.fromElements(...)
.addSink(JdbcSink.sink(
"insert into books (id, title, author, price, qty) values (?,?,?,?,?)",
(ps, t) -> {
ps.setInt(1, t.id);
ps.setString(2, t.title);
ps.setString(3, t.author);
ps.setDouble(4, t.price);
ps.setInt(5, t.qty);
},
new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
.withUrl(getDbMetadata().getUrl())
.withDriverName(getDbMetadata().getDriverClass())
.build()));
env.execute();
Official documentation: Kafka | Apache Flink
Flink ships with a number of bundled connectors, for example the Kafka source and sink and the Elasticsearch sink. When reading from or writing to Kafka, Elasticsearch, or RabbitMQ, you can use the corresponding connector API directly. Although these connectors live in the Flink source repository, they are not really part of the Flink engine itself and are not included in the binary distribution. So when submitting a job, make sure the connector classes are packaged into the job's jar; otherwise the submission will fail with class-not-found errors or exceptions while initializing certain classes.
All of the following parameters should be set (a consolidated sketch follows the list):
the topic(s) to subscribe to
the deserialization schema
consumer property: the cluster (bootstrap server) addresses
consumer property: the consumer group id (if not set, a default is used, which is inconvenient to manage)
consumer property: the offset reset rule, e.g. earliest/latest…
dynamic partition discovery (so that Flink notices when the number of Kafka partitions changes or increases)
if checkpointing is not configured, you can enable automatic offset committing; once checkpointing is introduced (covered later), offsets are committed to the checkpoint state and to the default offsets topic whenever a checkpoint is taken
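A consolidated sketch that applies all of the settings above to the KafkaSource builder described later in this article (broker addresses, topic, and group id are placeholders):
KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("broker1:9092,broker2:9092")          // consumer property: cluster address
        .setTopics("input-topic")                                   // subscribed topic
        .setGroupId("my-group")                                     // consumer property: group id
        .setValueOnlyDeserializer(new SimpleStringSchema())         // deserialization schema
        .setStartingOffsets(
                OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST)) // offset reset rule
        .setProperty("partition.discovery.interval.ms", "10000")    // dynamic partition discovery
        .setProperty("enable.auto.commit", "true")                  // auto commit, only relevant
        .setProperty("auto.commit.interval.ms", "2000")             //   when checkpointing is disabled
        .build();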
Apache Flink ships with a universal Kafka connector which attempts to track the latest version of the Kafka client. The version of the client it uses may change between Flink releases. Modern Kafka clients are backwards compatible with broker versions 0.10.0 or later. For details on Kafka compatibility, please refer to the official Kafka documentation.
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka</artifactId>
    <version>1.17.1</version>
</dependency>
Flink’s streaming connectors are not part of the binary distribution.
This part is a more detailed look at using Kafka with Flink, excerpted from the official documentation. Kafka metrics, security, and related topics are not covered below; see the official documentation for more.
Note in particular:
FlinkKafkaConsumer is deprecated and will be removed with Flink 1.17, please use KafkaSource instead.
FlinkKafkaProducer is deprecated and will be removed with Flink 1.15, please use KafkaSink instead.
Kafka source provides a builder class for constructing instance of KafkaSource. The code snippet below shows how to build a KafkaSource to consume messages from the earliest offset of topic “input-topic”, with consumer group “my-group” and deserialize only the value of message as string.
KafkaSource<String> source = KafkaSource.<String>builder()
.setBootstrapServers(brokers)
.setTopics("input-topic")
.setGroupId("my-group")
.setStartingOffsets(OffsetsInitializer.earliest())
.setValueOnlyDeserializer(new SimpleStringSchema())
.build();
env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source");
The following properties are required for building a KafkaSource:
Kafka source provides 3 ways of topic-partition subscription:
KafkaSource.builder().setTopics("topic-a", "topic-b");
KafkaSource.builder().setTopicPattern("topic.*");
final HashSet<TopicPartition> partitionSet = new HashSet<>(Arrays.asList(
new TopicPartition("topic-a", 0), // Partition 0 of topic "topic-a"
new TopicPartition("topic-b", 5))); // Partition 5 of topic "topic-b"
KafkaSource.builder().setPartitions(partitionSet);
A deserializer is required for parsing Kafka messages. Deserializer (Deserialization schema) can be configured by setDeserializer(KafkaRecordDeserializationSchema), where KafkaRecordDeserializationSchema defines how to deserialize a Kafka ConsumerRecord.
If only the value of Kafka ConsumerRecord is needed, you can use setValueOnlyDeserializer(DeserializationSchema) in the builder, where DeserializationSchema defines how to deserialize binaries of Kafka message value.
You can also use a Kafka Deserializer for deserializing Kafka message value. For example using StringDeserializer for deserializing Kafka message value as string:
import org.apache.kafka.common.serialization.StringDeserializer;
KafkaSource.<String>builder()
.setDeserializer(KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class));
Kafka source is able to consume messages starting from different offsets by specifying OffsetsInitializer. Built-in initializers include:
KafkaSource.builder()
// Start from committed offset of the consuming group, without reset strategy
.setStartingOffsets(OffsetsInitializer.committedOffsets())
// Start from committed offset, also use EARLIEST as reset strategy if committed offset doesn't exist
.setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
// Start from the first record whose timestamp is greater than or equals a timestamp (milliseconds)
.setStartingOffsets(OffsetsInitializer.timestamp(1657256176000L))
// Start from earliest offset
.setStartingOffsets(OffsetsInitializer.earliest())
// Start from latest offset
.setStartingOffsets(OffsetsInitializer.latest());
You can also implement a custom offsets initializer if built-in initializers above cannot fulfill your requirement. (Not supported in PyFlink)
If offsets initializer is not specified, OffsetsInitializer.earliest() will be used by default.
Kafka source is designed to support both streaming and batch running mode. By default, the KafkaSource is set to run in streaming manner, thus never stops until Flink job fails or is cancelled. You can use setBounded(OffsetsInitializer) to specify stopping offsets and set the source running in batch mode. When all partitions have reached their stopping offsets, the source will exit.
You can also set KafkaSource running in streaming mode, but still stop at the stopping offset by using setUnbounded(OffsetsInitializer). The source will exit when all partitions reach their specified stopping offset.
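For example, a minimal sketch that reads the topic as a bounded (batch) source, stopping once each partition reaches the latest offset observed at startup:
KafkaSource<String> batchSource = KafkaSource.<String>builder()
        .setBootstrapServers(brokers)
        .setTopics("input-topic")
        .setGroupId("my-group")
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setBounded(OffsetsInitializer.latest())   // batch mode: stop at the latest offsets seen at startup
        .build();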
In addition to properties described above, you can set arbitrary properties for KafkaSource and KafkaConsumer by using setProperties(Properties) and setProperty(String, String). KafkaSource has following options for configuration:
For configurations of KafkaConsumer, you can refer to Apache Kafka documentation for more details.
Please note that the following keys will be overridden by the builder even if it is configured:
In order to handle scenarios like topic scaling-out or topic creation without restarting the Flink job, Kafka source can be configured to periodically discover new partitions under provided topic-partition subscribing pattern. To enable partition discovery, set a non-negative value for property partition.discovery.interval.ms:
//Partition discovery is disabled by default. You need to explicitly set the partition discovery interval to enable this feature.
KafkaSource.builder()
.setProperty("partition.discovery.interval.ms", "10000"); // discover new partitions per 10 seconds
By default, the record will use the timestamp embedded in the Kafka ConsumerRecord as the event time. You can define your own WatermarkStrategy to extract the event time from the record itself and emit watermarks downstream:
env.fromSource(kafkaSource, new CustomWatermarkStrategy(), "Kafka Source With Custom Watermark Strategy");
This documentation describes details about how to define a WatermarkStrategy. (Not supported in PyFlink)
The Kafka Source does not go automatically in an idle state if the parallelism is higher than the number of partitions. You will either need to lower the parallelism or add an idle timeout to the watermark strategy. If no records flow in a partition of a stream for that amount of time, then that partition is considered “idle” and will not hold back the progress of watermarks in downstream operators.
This documentation describes details about how to define a WatermarkStrategy#withIdleness.
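A minimal sketch of such a strategy, assuming bounded out-of-orderness of 5 seconds and a 1-minute idleness timeout:
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

WatermarkStrategy<String> strategy = WatermarkStrategy
        .<String>forBoundedOutOfOrderness(Duration.ofSeconds(5))
        // a partition that produces no records for 1 minute is marked idle
        // and no longer holds back the watermark of downstream operators
        .withIdleness(Duration.ofMinutes(1));

env.fromSource(kafkaSource, strategy, "Kafka Source With Idleness");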
Kafka source commits the current consuming offset when checkpoints are completed, for ensuring the consistency between Flink’s checkpoint state and committed offsets on Kafka brokers.
If checkpointing is not enabled, Kafka source relies on Kafka consumer’s internal automatic periodic offset committing logic, configured by enable.auto.commit and auto.commit.interval.ms in the properties of Kafka consumer.
Note that Kafka source does NOT rely on committed offsets for fault tolerance. Committing offset is only for exposing the progress of consumer and consuming group for monitoring.
KafkaSink allows writing a stream of records to one or more Kafka topics.
Kafka sink provides a builder class to construct an instance of a KafkaSink. The code snippet below shows how to write String records to a Kafka topic with a delivery guarantee of at least once.
DataStream<String> stream = ...;
KafkaSink<String> sink = KafkaSink.<String>builder()
.setBootstrapServers(brokers)
.setRecordSerializer(KafkaRecordSerializationSchema.builder()
.setTopic("topic-name")
.setValueSerializationSchema(new SimpleStringSchema())
.build()
)
.setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
.build();
stream.sinkTo(sink);
The following properties are required to build a KafkaSink:
You always need to supply a KafkaRecordSerializationSchema to transform incoming elements from the data stream to Kafka producer records. Flink offers a schema builder to provide some common building blocks i.e. key/value serialization, topic selection, partitioning. You can also implement the interface on your own to exert more control.
KafkaRecordSerializationSchema.builder()
.setTopicSelector((element) -> {<your-topic-selection-logic>})
.setValueSerializationSchema(new SimpleStringSchema())
.setKeySerializationSchema(new SimpleStringSchema())
.setPartitioner(new FlinkFixedPartitioner())
.build();
It is required to always set a value serialization method and a topic (selection method). Moreover, it is also possible to use Kafka serializers instead of Flink serializer by using setKafkaKeySerializer(Serializer) or setKafkaValueSerializer(Serializer).
Overall the KafkaSink supports three different DeliveryGuarantees. For DeliveryGuarantee.AT_LEAST_ONCE and DeliveryGuarantee.EXACTLY_ONCE Flink’s checkpointing must be enabled. By default the KafkaSink uses DeliveryGuarantee.NONE. Below you can find an explanation of the different guarantees.
DeliveryGuarantee.NONE does not provide any guarantees: messages may be lost in case of issues on the Kafka broker and messages may be duplicated in case of a Flink failure.
DeliveryGuarantee.AT_LEAST_ONCE: The sink will wait for all outstanding records in the Kafka buffers to be acknowledged by the Kafka producer on a checkpoint. No messages will be lost in case of any issue with the Kafka brokers but messages may be duplicated when Flink restarts because Flink reprocesses old input records.
DeliveryGuarantee.EXACTLY_ONCE: In this mode, the KafkaSink will write all messages in a Kafka transaction that will be committed to Kafka on a checkpoint. Thus, if the consumer reads only committed data (see Kafka consumer config isolation.level), no duplicates will be seen in case of a Flink restart. However, this delays record visibility effectively until a checkpoint is written, so adjust the checkpoint duration accordingly. Please ensure that you use unique transactionalIdPrefix across your applications running on the same Kafka cluster such that multiple running jobs do not interfere in their transactions! Additionally, it is highly recommended to tweak Kafka transaction timeout (see Kafka producer transaction.timeout.ms)» maximum checkpoint duration + maximum restart duration or data loss may happen when Kafka expires an uncommitted transaction.
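A sketch of an exactly-once sink configured along these lines (the transactional-id prefix and the 15-minute transaction timeout are placeholder values, and the builder's setProperty shortcut for producer properties is assumed):
KafkaSink<String> exactlyOnceSink = KafkaSink.<String>builder()
        .setBootstrapServers(brokers)
        .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                .setTopic("topic-name")
                .setValueSerializationSchema(new SimpleStringSchema())
                .build())
        .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
        .setTransactionalIdPrefix("my-app-tx-")           // must be unique per application on this cluster
        .setProperty("transaction.timeout.ms", "900000")  // well above checkpoint interval + restart time
        .build();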
This is the same as what was described above, so it is not repeated here.
To sum up, this article has walked through unified stream/batch development, the source, transformation, and sink APIs in detail, and Flink's integration with Kafka.