860MHz

Apache Flink Task类源码分析

1. 简介

Apache Flink由两类运行时JVM进程管理分布式集群的计算资源。

JobManager进程负责分布式任务管理，如任务调度、检查点、故障恢复等。在高可用性（HA）分布式部署时，系统存在多个JobManager，一个leader和多个standby。JobManager是Flink主从架构中的master。
TaskManager进程负责执行任务线程（即子任务subtask）、缓存和传输stream。TaskManager是Flink主从架构中的slave。

Task.java类表示在TaskManager上执行的operator subtask（子任务），这些operator subtask在不同的线程、不同的物理机或不同的容器中彼此互不依赖得执行。

每个operator subtask由一个专用的线程运行。

2. 代码分析

org.apache.flink.runtime.taskmanager.Task类在flink 1.8中有1645行，是个非常冗长的类，它实现了Runnable、TaskActions、CheckpointListener接口。

public interface TaskActions {

	/** * Check the execution state of the execution producing a result partition. * * @param jobId ID of the job the partition belongs to. * @param intermediateDataSetId ID of the parent intermediate data set. * @param resultPartitionId ID of the result partition to check. This * identifies the producing execution and partition. */
	void triggerPartitionProducerStateCheck(
		JobID jobId,
		IntermediateDataSetID intermediateDataSetId,
		ResultPartitionID resultPartitionId);

	/** * Fail the owning task with the given throwable. * * @param cause of the failure */
	void failExternally(Throwable cause);
}

TaskActions接口定义了Task可以被执行的操作，目前包含两个方法：

triggerPartitionProducerStateCheck：检查执行状态
failExternally：根据输入的Throwable令当前Task失败

public interface CheckpointListener {

	/** * This method is called as a notification once a distributed checkpoint has been completed. * * Note that any exception during this method will not cause the checkpoint to * fail any more. * * @param checkpointId The ID of the checkpoint that has been completed. * @throws Exception */
	void notifyCheckpointComplete(long checkpointId) throws Exception;
}

CheckpointListener接口定义了checkpoint完成后的通知逻辑。

2.1 构造函数

    public Task(
        JobInformation jobInformation,
        TaskInformation taskInformation,
        ExecutionAttemptID executionAttemptID,
        AllocationID slotAllocationId,
        int subtaskIndex,
        int attemptNumber,
        Collection<ResultPartitionDeploymentDescriptor> resultPartitionDeploymentDescriptors,
        Collection<InputGateDeploymentDescriptor> inputGateDeploymentDescriptors,
        int targetSlotNumber,
        MemoryManager memManager,
        IOManager ioManager,
        NetworkEnvironment networkEnvironment,
        BroadcastVariableManager bcVarManager,
        TaskStateManager taskStateManager,
        TaskManagerActions taskManagerActions,
        InputSplitProvider inputSplitProvider,
        CheckpointResponder checkpointResponder,
        GlobalAggregateManager aggregateManager,
        BlobCacheService blobService,
        LibraryCacheManager libraryCache,
        FileCache fileCache,
        TaskManagerRuntimeInfo taskManagerConfig,
        @Nonnull TaskMetricGroup metricGroup,
        ResultPartitionConsumableNotifier resultPartitionConsumableNotifier,
        PartitionProducerStateChecker partitionProducerStateChecker,
        Executor executor) {

        Preconditions.checkNotNull(jobInformation);
        Preconditions.checkNotNull(taskInformation);

        Preconditions.checkArgument(0 <= subtaskIndex, "The subtask index must be positive.");
        Preconditions.checkArgument(0 <= attemptNumber, "The attempt number must be positive.");
        Preconditions.checkArgument(0 <= targetSlotNumber, "The target slot number must be positive.");

        this.taskInfo = new TaskInfo(
                taskInformation.getTaskName(),
                taskInformation.getMaxNumberOfSubtaks(),
                subtaskIndex,
                taskInformation.getNumberOfSubtasks(),
                attemptNumber,
                String.valueOf(slotAllocationId));

        this.jobId = jobInformation.getJobId();
        this.vertexId = taskInformation.getJobVertexId();
        this.executionId  = Preconditions.checkNotNull(executionAttemptID);
        this.allocationId = Preconditions.checkNotNull(slotAllocationId);
        this.taskNameWithSubtask = taskInfo.getTaskNameWithSubtasks();
        this.jobConfiguration = jobInformation.getJobConfiguration();
        this.taskConfiguration = taskInformation.getTaskConfiguration();
        this.requiredJarFiles = jobInformation.getRequiredJarFileBlobKeys();
        this.requiredClasspaths = jobInformation.getRequiredClasspathURLs();
        this.nameOfInvokableClass = taskInformation.getInvokableClassName();
        this.serializedExecutionConfig = jobInformation.getSerializedExecutionConfig();

        Configuration tmConfig = taskManagerConfig.getConfiguration();
        this.taskCancellationInterval = tmConfig.getLong(TaskManagerOptions.TASK_CANCELLATION_INTERVAL);
        this.taskCancellationTimeout = tmConfig.getLong(TaskManagerOptions.TASK_CANCELLATION_TIMEOUT);

        this.memoryManager = Preconditions.checkNotNull(memManager);
        this.ioManager = Preconditions.checkNotNull(ioManager);
        this.broadcastVariableManager = Preconditions.checkNotNull(bcVarManager);
        this.taskStateManager = Preconditions.checkNotNull(taskStateManager);
        this.accumulatorRegistry = new AccumulatorRegistry(jobId, executionId);

        this.inputSplitProvider = Preconditions.checkNotNull(inputSplitProvider);
        this.checkpointResponder = Preconditions.checkNotNull(checkpointResponder);
        this.aggregateManager = Preconditions.checkNotNull(aggregateManager);
        this.taskManagerActions = checkNotNull(taskManagerActions);

        this.blobService = Preconditions.checkNotNull(blobService);
        this.libraryCache = Preconditions.checkNotNull(libraryCache);
        this.fileCache = Preconditions.checkNotNull(fileCache);
        this.network = Preconditions.checkNotNull(networkEnvironment);
        this.taskManagerConfig = Preconditions.checkNotNull(taskManagerConfig);

        this.metrics = metricGroup;

        this.partitionProducerStateChecker = Preconditions.checkNotNull(partitionProducerStateChecker);
        this.executor = Preconditions.checkNotNull(executor);

        // create the reader and writer structures

        final String taskNameWithSubtaskAndId = taskNameWithSubtask + " (" + executionId + ')';

        // Produced intermediate result partitions
        this.producedPartitions = new ResultPartition[resultPartitionDeploymentDescriptors.size()];

        int counter = 0;

        for (ResultPartitionDeploymentDescriptor desc: resultPartitionDeploymentDescriptors) {
            ResultPartitionID partitionId = new ResultPartitionID(desc.getPartitionId(), executionId);

            this.producedPartitions[counter] = new ResultPartition(
                taskNameWithSubtaskAndId,
                this,
                jobId,
                partitionId,
                desc.getPartitionType(),
                desc.getNumberOfSubpartitions(),
                desc.getMaxParallelism(),
                networkEnvironment.getResultPartitionManager(),
                resultPartitionConsumableNotifier,
                ioManager,
                desc.sendScheduleOrUpdateConsumersMessage());

            ++counter;
        }

        // Consumed intermediate result partitions
        this.inputGates = new SingleInputGate[inputGateDeploymentDescriptors.size()];
        this.inputGatesById = new HashMap<>();

        counter = 0;

        for (InputGateDeploymentDescriptor inputGateDeploymentDescriptor: inputGateDeploymentDescriptors) {
            SingleInputGate gate = SingleInputGate.create(
                taskNameWithSubtaskAndId,
                jobId,
                executionId,
                inputGateDeploymentDescriptor,
                networkEnvironment,
                this,
                metricGroup.getIOMetricGroup());

            inputGates[counter] = gate;
            inputGatesById.put(gate.getConsumedResultId(), gate);

            ++counter;
        }

        invokableHasBeenCanceled = new AtomicBoolean(false);

        // finally, create the executing thread, but do not start it
        executingThread = new Thread(TASK_THREADS_GROUP, this, taskNameWithSubtask);
    }

构造函数代码较长，但是逻辑比较简单

首先，根据构造函数入参初始化类实例变量
根据分区数量，初始化结果分区数组
创建执行线程，但不启动它。执行线程启动的入口是在TaskManager的submitTask函数中

2.2 run

    public void run() {

        // ----------------------------
        // Initial State transition
        // ----------------------------
        while (true) {
            ExecutionState current = this.executionState;
            if (current == ExecutionState.CREATED) {
                if (transitionState(ExecutionState.CREATED, ExecutionState.DEPLOYING)) {
                    // success, we can start our work
                    break;
                }
            }
            else if (current == ExecutionState.FAILED) {
                // we were immediately failed. tell the TaskManager that we reached our final state
                notifyFinalState();
                if (metrics != null) {
                    metrics.close();
                }
                return;
            }
            else if (current == ExecutionState.CANCELING) {
                if (transitionState(ExecutionState.CANCELING, ExecutionState.CANCELED)) {
                    // we were immediately canceled. tell the TaskManager that we reached our final state
                    notifyFinalState();
                    if (metrics != null) {
                        metrics.close();
                    }
                    return;
                }
            }
            else {
                if (metrics != null) {
                    metrics.close();
                }
                throw new IllegalStateException("Invalid state for beginning of operation of task " + this + '.');
            }
        }

        // all resource acquisitions and registrations from here on
        // need to be undone in the end
        Map<String, Future<Path>> distributedCacheEntries = new HashMap<>();
        AbstractInvokable invokable = null;

        try {
            // ----------------------------
            // Task Bootstrap - We periodically
            // check for canceling as a shortcut
            // ----------------------------

            // activate safety net for task thread
            LOG.info("Creating FileSystem stream leak safety net for task {}", this);
            FileSystemSafetyNet.initializeSafetyNetForThread();

            blobService.getPermanentBlobService().registerJob(jobId);

            // first of all, get a user-code classloader
            // this may involve downloading the job's JAR files and/or classes
            LOG.info("Loading JAR files for task {}.", this);

            userCodeClassLoader = createUserCodeClassloader();
            final ExecutionConfig executionConfig = serializedExecutionConfig.deserializeValue(userCodeClassLoader);

            if (executionConfig.getTaskCancellationInterval() >= 0) {
                // override task cancellation interval from Flink config if set in ExecutionConfig
                taskCancellationInterval = executionConfig.getTaskCancellationInterval();
            }

            if (executionConfig.getTaskCancellationTimeout() >= 0) {
                // override task cancellation timeout from Flink config if set in ExecutionConfig
                taskCancellationTimeout = executionConfig.getTaskCancellationTimeout();
            }

            if (isCanceledOrFailed()) {
                throw new CancelTaskException();
            }

            // ----------------------------------------------------------------
            // register the task with the network stack
            // this operation may fail if the system does not have enough
            // memory to run the necessary data exchanges
            // the registration must also strictly be undone
            // ----------------------------------------------------------------

            LOG.info("Registering task at network: {}.", this);

            network.registerTask(this);

            // add metrics for buffers
            this.metrics.getIOMetricGroup().initializeBufferMetrics(this);

            // register detailed network metrics, if configured
            if (taskManagerConfig.getConfiguration().getBoolean(TaskManagerOptions.NETWORK_DETAILED_METRICS)) {
                // similar to MetricUtils.instantiateNetworkMetrics() but inside this IOMetricGroup
                MetricGroup networkGroup = this.metrics.getIOMetricGroup().addGroup("Network");
                MetricGroup outputGroup = networkGroup.addGroup("Output");
                MetricGroup inputGroup = networkGroup.addGroup("Input");

                // output metrics
                for (int i = 0; i < producedPartitions.length; i++) {
                    ResultPartitionMetrics.registerQueueLengthMetrics(
                        outputGroup.addGroup(i), producedPartitions[i]);
                }

                for (int i = 0; i < inputGates.length; i++) {
                    InputGateMetrics.registerQueueLengthMetrics(
                        inputGroup.addGroup(i), inputGates[i]);
                }
            }

            // next, kick off the background copying of files for the distributed cache
            try {
                for (Map.Entry<String, DistributedCache.DistributedCacheEntry> entry :
                        DistributedCache.readFileInfoFromConfig(jobConfiguration)) {
                    LOG.info("Obtaining local cache file for '{}'.", entry.getKey());
                    Future<Path> cp = fileCache.createTmpFile(entry.getKey(), entry.getValue(), jobId, executionId);
                    distributedCacheEntries.put(entry.getKey(), cp);
                }
            }
            catch (Exception e) {
                throw new Exception(
                    String.format("Exception while adding files to distributed cache of task %s (%s).", taskNameWithSubtask, executionId), e);
            }

            if (isCanceledOrFailed()) {
                throw new CancelTaskException();
            }

            // ----------------------------------------------------------------
            // call the user code initialization methods
            // ----------------------------------------------------------------

            TaskKvStateRegistry kvStateRegistry = network.createKvStateTaskRegistry(jobId, getJobVertexId());

            Environment env = new RuntimeEnvironment(
                jobId,
                vertexId,
                executionId,
                executionConfig,
                taskInfo,
                jobConfiguration,
                taskConfiguration,
                userCodeClassLoader,
                memoryManager,
                ioManager,
                broadcastVariableManager,
                taskStateManager,
                aggregateManager,
                accumulatorRegistry,
                kvStateRegistry,
                inputSplitProvider,
                distributedCacheEntries,
                producedPartitions,
                inputGates,
                network.getTaskEventDispatcher(),
                checkpointResponder,
                taskManagerConfig,
                metrics,
                this);

            // now load and instantiate the task's invokable code
            invokable = loadAndInstantiateInvokable(userCodeClassLoader, nameOfInvokableClass, env);

            // ----------------------------------------------------------------
            // actual task core work
            // ----------------------------------------------------------------

            // we must make strictly sure that the invokable is accessible to the cancel() call
            // by the time we switched to running.
            this.invokable = invokable;

            // switch to the RUNNING state, if that fails, we have been canceled/failed in the meantime
            if (!transitionState(ExecutionState.DEPLOYING, ExecutionState.RUNNING)) {
                throw new CancelTaskException();
            }

            // notify everyone that we switched to running
            taskManagerActions.updateTaskExecutionState(new TaskExecutionState(jobId, executionId, ExecutionState.RUNNING));

            // make sure the user code classloader is accessible thread-locally
            executingThread.setContextClassLoader(userCodeClassLoader);

            // run the invokable
            invokable.invoke();

            // make sure, we enter the catch block if the task leaves the invoke() method due
            // to the fact that it has been canceled
            if (isCanceledOrFailed()) {
                throw new CancelTaskException();
            }

            // ----------------------------------------------------------------
            // finalization of a successful execution
            // ----------------------------------------------------------------

            // finish the produced partitions. if this fails, we consider the execution failed.
            for (ResultPartition partition : producedPartitions) {
                if (partition != null) {
                    partition.finish();
                }
            }

            // try to mark the task as finished
            // if that fails, the task was canceled/failed in the meantime
            if (!transitionState(ExecutionState.RUNNING, ExecutionState.FINISHED)) {
                throw new CancelTaskException();
            }
        }
        catch (Throwable t) {

            // unwrap wrapped exceptions to make stack traces more compact
            if (t instanceof WrappingRuntimeException) {
                t = ((WrappingRuntimeException) t).unwrap();
            }

            // ----------------------------------------------------------------
            // the execution failed. either the invokable code properly failed, or
            // an exception was thrown as a side effect of cancelling
            // ----------------------------------------------------------------

            try {
                // check if the exception is unrecoverable
                if (ExceptionUtils.isJvmFatalError(t) ||
                        (t instanceof OutOfMemoryError && taskManagerConfig.shouldExitJvmOnOutOfMemoryError())) {

                    // terminate the JVM immediately
                    // don't attempt a clean shutdown, because we cannot expect the clean shutdown to complete
                    try {
                        LOG.error("Encountered fatal error {} - terminating the JVM", t.getClass().getName(), t);
                    } finally {
                        Runtime.getRuntime().halt(-1);
                    }
                }

                // transition into our final state. we should be either in DEPLOYING, RUNNING, CANCELING, or FAILED
                // loop for multiple retries during concurrent state changes via calls to cancel() or
                // to failExternally()
                while (true) {
                    ExecutionState current = this.executionState;

                    if (current == ExecutionState.RUNNING || current == ExecutionState.DEPLOYING) {
                        if (t instanceof CancelTaskException) {
                            if (transitionState(current, ExecutionState.CANCELED)) {
                                cancelInvokable(invokable);
                                break;
                            }
                        }
                        else {
                            if (transitionState(current, ExecutionState.FAILED, t)) {
                                // proper failure of the task. record the exception as the root cause
                                failureCause = t;
                                cancelInvokable(invokable);

                                break;
                            }
                        }
                    }
                    else if (current == ExecutionState.CANCELING) {
                        if (transitionState(current, ExecutionState.CANCELED)) {
                            break;
                        }
                    }
                    else if (current == ExecutionState.FAILED) {
                        // in state failed already, no transition necessary any more
                        break;
                    }
                    // unexpected state, go to failed
                    else if (transitionState(current, ExecutionState.FAILED, t)) {
                        LOG.error("Unexpected state in task {} ({}) during an exception: {}.", taskNameWithSubtask, executionId, current);
                        break;
                    }
                    // else fall through the loop and
                }
            }
            catch (Throwable tt) {
                String message = String.format("FATAL - exception in exception handler of task %s (%s).", taskNameWithSubtask, executionId);
                LOG.error(message, tt);
                notifyFatalError(message, tt);
            }
        }
        finally {
            try {
                LOG.info("Freeing task resources for {} ({}).", taskNameWithSubtask, executionId);

                // clear the reference to the invokable. this helps guard against holding references
                // to the invokable and its structures in cases where this Task object is still referenced
                this.invokable = null;

                // stop the async dispatcher.
                // copy dispatcher reference to stack, against concurrent release
                ExecutorService dispatcher = this.asyncCallDispatcher;
                if (dispatcher != null && !dispatcher.isShutdown()) {
                    dispatcher.shutdownNow();
                }

                // free the network resources
                network.unregisterTask(this);

                // free memory resources
                if (invokable != null) {
                    memoryManager.releaseAll(invokable);
                }

                // remove all of the tasks library resources
                libraryCache.unregisterTask(jobId, executionId);
                fileCache.releaseJob(jobId, executionId);
                blobService.getPermanentBlobService().releaseJob(jobId);

                // close and de-activate safety net for task thread
                LOG.info("Ensuring all FileSystem streams are closed for task {}", this);
                FileSystemSafetyNet.closeSafetyNetAndGuardedResourcesForThread();

                notifyFinalState();
            }
            catch (Throwable t) {
                // an error in the resource cleanup is fatal
                String message = String.format("FATAL - exception in resource cleanup of task %s (%s).", taskNameWithSubtask, executionId);
                LOG.error(message, t);
                notifyFatalError(message, t);
            }

            // un-register the metrics at the end so that the task may already be
            // counted as finished when this happens
            // errors here will only be logged
            try {
                metrics.close();
            }
            catch (Throwable t) {
                LOG.error("Error during metrics de-registration of task {} ({}).", taskNameWithSubtask, executionId, t);
            }
        }
    }

run方法是task的启动和执行核心方法。

2.2.1 状态初始化

自旋等待状态从CREATED修改为DEPLOYING成功，修改成功后退出自旋
如果task当前状态不是CREATED则退出run方法

2.2.2 启动

创建一个jobId粒度的class loader并下载缺失的jar files；基于不同的class loader的类加载隔离机制可以在JVM进程内隔离不同的task运行环境
将当前task实例注册到network stack，如果可用内存不足，注册可能会失败
后台拷贝分布式缓存文件
加载和初始化任务的invokable代码
subtask状态从DEPLOYING切换到RUNNING
通知订阅方subtask状态已修改

2.2.3 运行

将执行线程的线程上下文类加载器修改为刚刚启动时创建的class loader
调用invokable的invoke方法，执行用户代码

2.2.4 结束

逐一调用每个结果分区的finish方法
subtask状态从RUNNING切换到FINISHED

2.2.5 异常捕获

如果是OOM error，并且配置了jvm-exit-on-oom参数为true，则调用Runtime.getRuntime().halt(-1)来停止JVM。
自旋修改subtask状态为CANCELED或FAILED

2.3 createUserCodeClassloader

    private ClassLoader createUserCodeClassloader() throws Exception {
        long startDownloadTime = System.currentTimeMillis();

        // triggers the download of all missing jar files from the job manager
        libraryCache.registerTask(jobId, executionId, requiredJarFiles, requiredClasspaths);

        LOG.debug("Getting user code class loader for task {} at library cache manager took {} milliseconds",
                executionId, System.currentTimeMillis() - startDownloadTime);

        ClassLoader userCodeClassLoader = libraryCache.getClassLoader(jobId);
        if (userCodeClassLoader == null) {
            throw new Exception("No user code classloader available.");
        }
        return userCodeClassLoader;
    }

createUserCodeClassloader方法调用BlobLibraryCacheManager类的getClassLoader方法获取用户代码class loader。

  /** Registered entries per job */
  private final Map<JobID, LibraryCacheEntry> cacheEntries = new HashMap<>();

BlobLibraryCacheManager类内部维护了一个HashMap缓存每个job的class loader。

3. 总结

Flink Task.java类实现了Runnable接口，执行线程将执行Task类的run方法；
Flink task的线程模型没有使用线程池，仅仅只用了内置的thread group；
针对每个job创建一个class loader，并设置为执行线程的上下文类加载器。

女性职业新趋势：揭秘未来高薪热门行业氧惠爱高省
女生在职业选择上拥有广阔的空间，尤其是在当前快速发展的社会背景下，一些行业不仅成为了高薪热门，还提供了多样化的职业路径。以下是一些可能成为女生高薪热门选择的行业：➤推荐网购返利app“氧惠”，一个领隐藏优惠券+现金返利的平台。氧惠只提供领券返利链接，下单全程都在淘宝、京东、拼多多等原平台，更支持抖音、快手电商、外卖红包返利等。科技与互联网行业人工智能与大数据：随着人工智能和大数据技术的广泛应用，相
深入解析Hadoop中的Region分裂与合并机制码字的字节 hadoop布道师 hadoop 大数据分布式 Region 分裂合并
Hadoop与Region的基本概念Hadoop的分布式架构基础作为大数据处理的核心框架，Hadoop通过分布式存储和计算解决了海量数据的处理难题。其架构核心由HDFS（HadoopDistributedFileSystem）和MapReduce组成，前者负责数据的分布式存储，后者实现分布式计算。在HDFS中，数据被分割成固定大小的块（默认128MB）分散存储在集群节点上，而MapReduce则通
深入解析Hadoop RPC：技术细节与推广应用码字的字节 hadoop布道师 Hadoop RPC
HadoopRPC框架概述在分布式系统的核心架构中，远程过程调用（RPC）机制如同神经网络般连接着各个计算节点。Hadoop作为大数据处理的基石，其自主研发的RPC框架不仅支撑着内部组件的协同运作，更以独特的工程哲学诠释了分布式通信的本质。透明性：隐形的通信桥梁HadoopRPC最显著的特征是其对通信细节的完美封装。当NameNode接收DataNode的心跳检测，或ResourceManager
深入解析Hadoop：大数据处理的基石学习的锅 hadoop 大数据分布式
随着信息技术的快速发展和互联网的普及，数据的产生速度极具增加。面对如此海量的数据，传统的数据处理工具显得力不从心。在这种背景下，诞生了一系列用于处理大数据的框架与工具，而ApacheHadoop便是其中最为知名和应用最广泛的一个。本文将深入解析Hadoop的基本原理、架构及其在大数据处理中的重要性。1.Hadoop的起源与发展Hadoop起源于Google公司的三篇奠基性论文：GoogleFile
大数据技术关键技术组件
大数据技术是一组用于处理、分析和管理大规模数据集的复杂方法和技术。这些数据集的特点是容量大、增长速度快，且结构多样化，包括结构化、半结构化和非结构化数据。传统数据库管理和分析工具在处理此类数据时效率低下或无法胜任，因此需要专门的大数据技术栈来支持高效的数据处理和智能决策。大数据技术的关键组件通常包括：分布式存储系统：HadoopDistributedFileSystem(HDFS)：一个高度可扩展
大数据领域HDFS的集群资源管理优化大数据洞察大数据与AI人工智能大数据AI应用大数据 hdfs hadoop ai
大数据领域HDFS的集群资源管理优化关键词：HDFS；集群资源管理；存储优化；性能调优；副本策略；负载均衡；NameNode优化摘要：HDFS（Hadoop分布式文件系统）作为大数据领域的基石，承载着海量数据的存储与管理重任。随着数据规模爆炸式增长和业务复杂度提升，HDFS集群的资源管理面临着"存不下、跑不快、管不好"的三重挑战：存储资源浪费与不足并存、计算与存储资源匹配失衡、集群运维效率低下。本
深入探索Hadoop技术：全面学习指南
引言在大数据时代，高效地存储、处理和分析海量数据已成为企业决策与创新的关键驱动力。Hadoop，作为开源的大数据处理框架，以其强大的分布式存储和并行计算能力，以及丰富的生态系统，为企业提供了应对大规模数据挑战的有效解决方案。本文旨在为初学者和进阶者提供一份详尽的Hadoop技术学习指南，涵盖HDFS、MapReduce、YARN等核心组件，以及Hive、Pig、HBase等生态系统工具，助您踏上H
防不胜防!第六届研究所老姜（姜新宁）算力3.0亏损被骗曝光,巨额损失真相令人胆寒心惊！大盛律道
数字经济十选五投资诈骗套路频出，投资者股民的“钱袋子”多有损失，以投资理财获取大数据数字经济投资算法为由，将投资者的积蓄收入囊中，成为不法分子常用的诈骗手段之一。为守护好投资者的“钱袋子”，小编持续开展曝光数字经济诈骗行动，维护“投资者”合法权益。近年来，股市波动不断，投资者们无不渴望找到稳健的投资途径。而一些不法分子趁机利用第六届研究所荐股群的手段，设下重重陷阱，致使投资者损失惨重。骗子冒充姜新
大数据领域 Kafka 入门指南：从安装到基础使用大数据洞察大数据与AI人工智能大数据 kafka linq ai
大数据领域Kafka入门指南：从安装到基础使用关键词：Kafka、消息队列、分布式系统、大数据处理、实时数据流、生产者消费者模型、ZooKeeper摘要：本文是一篇全面介绍ApacheKafka的入门指南，从基本概念到实际应用。我们将详细讲解Kafka的核心架构、工作原理，并提供从安装配置到基础使用的完整实践指导。文章包含Kafka的生产者-消费者模型实现、集群部署策略、性能优化技巧，以及在大数据
python如何抓取网页里面的文字_如何利用python抓取网页文字、图片内容？ weixin_39917437
想必新老python学习者，对爬虫这一概念并不陌生，在如今大数据时代，很多场景都需要利用爬虫去爬取数据，而这刚好时python领域，如何实现？怎么做？一起来看下吧~获取图片：1、当我们浏览这个网站时，会发现，每一个页面的URL都是以网站的域名+page+页数组成，这样我们就可以逐一的访问该网站的网页了。2、当我们看图片列表时中，把鼠标放到图片，右击检查，我们发现，图片的内容由ul包裹的li组成，箭
Flink-Hadoop实战项目 Dylan_muc hadoop hdfs flink
项目说明文档1.项目概述1.1项目简介本项目是一个基于ApacheFlink的大数据流处理平台，专门用于处理铁路系统的票务和车次信息数据。系统包含两个核心流处理作业：文件处理作业和数据合并作业，采用定时调度机制，支持Kerberos安全认证，实现从文件读取到数据仓库存储的完整数据处理链路。1.2技术栈流处理引擎:ApacheFlink1.18.1存储系统:HDFS(Hadoop分布式文件系统)数据
飞算科技：以原创技术为翼，赋能产业数字化转型
在数字经济浪潮席卷全球的当下，一批专注于技术创新的中国企业正加速崛起，飞算数智科技（深圳）有限公司（简称“飞算科技”）便是其中的佼佼者。作为一家国家级高新技术企业，飞算科技以自主创新为核心驱动力，凭借互联网科技、大数据、人工智能等前沿技术，为各行业客户插上数字化转型的翅膀。飞算科技的定位清晰而坚定——自主创新型数字科技公司。这一定位不仅体现在其技术研发的方向上，更融入到为客户服务的每一个环节。无论
2018-03-19新零售是未来的商业模式吗？马云对新零售到底什么看法? 拼自己想要的梦想
马云对新零售到底什么不雅观不雅观点?其实，在此之前，新零售一词就已经在业界出现过，而马云此次的提出，使其作为一个正式的名词传布开来。马云认为互联网时代，传统零售行业受到了电商互联网的打击。将来，线下与线上零售将深度连系，再加当代物流，办事商把持大数据、云计较等立异手艺，构成将来新零售的概念。纯电商的时代很快将竣事，纯零售的情势也将被冲破，新零售将引领将来全新的商业形式。新零售是从哪里来的?新零售是
大数据集群运维常见的一些问题以及处理方式
态）；若为YARN节点，重启NodeManager后手动将其加入集群。若为节点整体宕机：排查电源和网络，重启节点后，依次启动HDFS、YARN等服务进程，确认数据块完整性（避免因节点宕机导致副本不足）。2.网络问题现象：节点间通信超时（如HDFS心跳超时、YARN任务调度延迟）、数据传输卡顿。可能原因：交换机故障、网线松动、网络带宽过载、防火墙规则拦截。处理方式：用ping、traceroute检
学习人工智能开发的详细指南 Ws＿学习人工智能 python
一、引言人工智能（AI）开发是一个充满挑战与机遇的领域，它融合了数学、计算机科学、统计学、认知科学等多个学科的知识。随着大数据、云计算和深度学习技术的快速发展，AI已经成为推动社会进步和产业升级的关键力量。本文将为初学者提供一份详细的学习指南，帮助大家逐步掌握AI开发的核心技能。二、基础知识准备数学基础：线性代数：理解向量、矩阵、线性变换等基本概念，掌握矩阵运算和特征值分解等技巧。概率论与统计学：
大数据技术是解决什么问题的？ @佳瑞大数据
基础知识1TB（太字节）=1024GB1PB（拍字节）=1024TB大数据核心框架HadoopHadoop作为大数据技术生态的核心框架，主要解决了海量数据（TB/PB级）的存储、处理和分析难题，尤其是在传统数据库（如MySQL）和单机计算无法应对的场景下，提供了低成本、高可靠、可扩展的解决方案。其核心解决的问题可归纳为以下几点：海量数据的存储问题传统痛点：单机存储容量有限（如单服务器硬盘通常在TB
Python爬虫【四十五章】爬虫攻防战：异步并发+AI反爬识别的技术解密程序员_CLUB Python入门到进阶 python 爬虫人工智能
目录引言：当爬虫工程师遇上AI反爬官一、异步并发基础设施层1.1混合调度框架设计1.2智能连接池管理二、机器学习反爬识别层2.1特征工程体系2.2轻量级在线推理三、智能决策系统3.1动态策略引擎3.2实时对抗案例四、性能优化实战4.1全链路压测数据4.2典型故障处理案例五、总结：构建智能化的爬虫生态系统Python爬虫相关文章（推荐）引言：当爬虫工程师遇上AI反爬官在大数据采集领域，我们正经历着技
Python处理MySQL大数据量：分页查询与性能优化 AI天才研究院 AI人工智能与大数据 python mysql 性能优化 ai
Python处理MySQL大数据量：分页查询与性能优化关键词：Python分页查询、MySQL性能优化、大数据量处理、LIMITOFFSET、索引优化摘要：当数据库表数据量达到百万级时，传统的LIMITOFFSET分页查询会出现明显性能瓶颈。本文从实际场景出发，用“图书馆找书”的通俗比喻拆解分页原理，结合Python代码示例和MySQL执行计划分析，详细讲解传统分页的痛点、优化思路（索引分页/覆盖
大学专业科普 | 计算智能、信息学与大数据鸭鸭鸭进京赶烤大数据
一、专业背景随着信息技术的飞速发展，数据的产生速度呈爆炸式增长，传统数据处理技术已经无法满足如此庞大的数据量和复杂的数据类型，大数据专业应运而生，旨在培养能够应对大数据挑战的专业人才。二、主要课程内容数学基础课程高等数学、概率论与数理统计、线性代数是大数据分析的核心数学基础，为数据处理、算法优化和模型构建提供必要的理论支持。计算机基础课程数据结构与算法、计算机网络、操作系统是大数据技术的重要支撑，
转行网络安全需要学什么？（非常详细）零基础入门到精通，收藏这一篇就够了网络安全苏柒 web安全计算机网络网络安全运维转业程序员编程
什么是网络安全？网络安全是指保护网络系统的硬件、软件及其系统中的数据，破坏、更改、泄露，使系统连续可靠正常地运行，网络服务不会中断。未来，我国将着重发展数字经济，发展云计算、大数据、物联网、工业互联网、区块链和人工智能等产业，这些产业全部都基于网络互联。网络的安全就是以上这些产业能够良性发展的基础，也是建设制造强国和网络强国的基础保障。什么是网络安全工程师？网络安全工程师是负责保护计算机网络系统，
转行网络安全需要学什么？（非常详细）从零基础到精通，收藏这篇就够了！～小羊没烦恼～黑客技术黑客网络安全 web安全安全学习运维网络
什么是网络安全？网络安全是指保护网络系统的硬件、软件及其系统中的数据，破坏、更改、泄露，使系统连续可靠正常地运行，网络服务不会中断。未来，我国将着重发展数字经济，发展云计算、大数据、物联网、工业互联网、区块链和人工智能等产业，这些产业全部都基于网络互联。网络的安全就是以上这些产业能够良性发展的基础，也是建设制造强国和网络强国的基础保障。什么是网络安全工程师？网络安全工程师是负责保护计算机网络系统，
转行网络安全需要学什么？（非常详细）零基础入门到精通，收藏这一篇就够了网络安全k叔 web安全计算机网络网络安全编程计算机转业信息安全
什么是网络安全？网络安全是指保护网络系统的硬件、软件及其系统中的数据，破坏、更改、泄露，使系统连续可靠正常地运行，网络服务不会中断。未来，我国将着重发展数字经济，发展云计算、大数据、物联网、工业互联网、区块链和人工智能等产业，这些产业全部都基于网络互联。网络的安全就是以上这些产业能够良性发展的基础，也是建设制造强国和网络强国的基础保障。什么是网络安全工程师？网络安全工程师是负责保护计算机网络系统，
新一轮黑产打击：上亿简历大数据公司被警方一锅端大数据的时代
近日，中国的简历大数据公司、曾获李开复旗下创新工场投资的“巧达科技”被警方一锅端，所有员工都被带走。随后，有部分员工被陆续放出。据悉，该公司被查可能缘起在没有获得授权下抓取用户简历。该公司此前曾获得天使轮、A轮和B轮融资，资方包括李开复的创新工场、中信产业基金等。有迹象显示，监管部门正在掀起对大数据灰产和黑产的新一轮打击。传公司被警方一锅端，网站已无法打开。3月23日，有网友在工商信息查询网站“天
贵州微商行业协会，今日成立我是磊少
图片发自App文/磊少2018年6.19是全国所有微商引以为傲的一天，因为这一天，微商立法了。且被纳入电子商务经营者范围。而我想说的是，今天（2018.8月28）是所有贵州微商最扬眉吐气的一天。因为今天，贵州省微商行业协会成立了。伴随着移动互联网的蓬勃发展，大数据的日新月异，尤其是贵州贵阳作为全球大数据研究中心，吸引了众多国际顶尖的互联网技术与核心人才，更是为贵州互联网的发展插上了理想的翅膀，飞翔
Hadoop与图像识别与处理 AI天才研究院 AI大模型企业级应用开发实战 Agentic AI 实战 AI人工智能与大数据计算科学神经计算深度学习神经网络大数据人工智能大型语言模型 AI AGI LLM Java Python 架构设计 Agent RPA
Hadoop与图像识别与处理作者：禅与计算机程序设计艺术/ZenandtheArtofComputerProgramming1.背景介绍1.1问题的由来在大数据时代，数据的爆炸性增长对数据处理技术提出了新的挑战。图像数据作为一种重要的数据形式，其处理和分析在许多领域中具有重要意义，如医疗影像分析、自动驾驶、安防监控等。然而，传统的图像处理方法在面对海量图像数据时显得力不从心。Hadoop作为一种分
大数据领域数据架构的实时数据可视化架构 AGI大模型与大数据研究院 AI大模型应用开发实战信息可视化大数据架构 ai
大数据领域数据架构的实时数据可视化架构关键词：大数据架构、实时数据处理、数据可视化、流式计算、数据管道、可视化工具、性能优化摘要：本文深入探讨了大数据领域中实时数据可视化架构的设计与实现。我们将从基础概念出发，逐步分析实时数据处理流程，介绍关键技术和工具，并通过实际案例展示如何构建高性能的实时可视化系统。文章将涵盖数据采集、处理、存储和可视化展示的全链路架构，同时讨论性能优化策略和未来发展趋势。1
践行乡村支教，助力乡村振兴 bc1bd9748b57
在大数据时代，大量农村青年进城寻求机遇，在工资待遇环境各个方面追求改善，导致大批留守儿童与孤寡老人，教育环境差，师资力量薄弱，这些孩子的教育问题受到大众关注。同时，大学毕业生在求职时也更加倾向于留在大城市，发展较快的地方寻求更大的发展机遇。当然也不乏大学生回乡为新一代的成长奉献自己，通过支教或者直接就业的形式，为乡村孩子的成长奉献自己的力量。有一些有才华的人放弃自己在大城市继续深造的机会，专心于这
时序数据库：数据库领域的未来之星数据库管理艺术数据库专家之路大数据AI人工智能 MCP&Agent SQL实战数据库时序数据库 ai
时序数据库：数据库领域的未来之星关键词：时序数据库、时间序列数据、物联网、大数据分析、数据库优化、TSDB、实时数据处理摘要：本文深入探讨了时序数据库(TimeSeriesDatabase,TSDB)这一新兴数据库技术。我们将从基本概念入手，分析时序数据库的核心原理和架构设计，详细讲解其特有的数据模型和存储机制。通过实际代码示例展示如何使用主流时序数据库处理时间序列数据，并探讨其在物联网、金融科技
MySQL 大数据量分页查询优化实战：从 90秒到 965毫秒的性能飞跃要阿尔卑斯吗. mysql 数据库分布式架构 java
在日常开发中，我们经常需要对数据库中的数据进行分页展示。特别是当表数据量达到几十万甚至上百万级时，传统的LIMIT分页方式会面临严重的性能瓶颈。今天，我将分享一个真实的性能优化案例，通过模拟大页码查询的现场，从90秒缩短到965毫秒，显著提升了查询效率。本篇文章将从问题出现的原因、索引原理、优化思路和最终实战效果等方面，为你全面讲解如何高效处理MySQL大数据分页查询问题。一、问题背景：大页码分页
老码农和你一起学AI：Python系列-Pandas大数据处理 chilavert318 熬之滴水穿石 pandas python
今天开始梳理一下pandas的大数据处理，在数据处理领域，Pandas凭借简洁的API和强大的功能成为Python开发者的首选工具。但当面对GB级甚至更大的数据集时，直接读取数据往往会触发“内存不足”的错误——这是因为Pandas默认将数据全部加载到内存中进行处理。此时，分块处理（Out-of-Core）技术就成为解决问题的关键。它通过将大文件拆分为小块，逐块加载并处理，最终整合结果，实现“用有限
分享100个最新免费的高匿HTTP代理IP mcj8089 代理IP 代理服务器匿名代理免费代理IP 最新代理IP
推荐两个代理IP网站： 1. 全网代理IP：http://proxy.goubanjia.com/ 2. 敲代码免费IP：http://ip.qiaodm.com/ 120.198.243.130:80,中国/广东省 58.251.78.71:8088,中国/广东省 183.207.228.22:83,中国/
mysql高级特性之数据分区 annan211 java 数据结构 mongodb 分区 mysql
mysql高级特性 1 以存储引擎的角度分析，分区表和物理表没有区别。是按照一定的规则将数据分别存储的逻辑设计。器底层是由多个物理字表组成。 2 分区的原理分区表由多个相关的底层表实现，这些底层表也是由句柄对象表示，所以我们可以直接访问各个分区。存储引擎管理分区的各个底层表和管理普通表一样(所有底层表都必须使用相同的存储引擎)，分区表的索引只是
JS采用正则表达式简单获取URL地址栏参数 chiangfai js 地址栏参数获取
GetUrlParam:function GetUrlParam(param){ var reg = new RegExp("(^|&)"+ param +"=([^&]*)(&|$)"); var r = window.location.search.substr(1).match(reg); if(r!=null
怎样将数据表拷贝到powerdesigner (本地数据库表) Array_06 powerDesigner
================================================== 1、打开PowerDesigner12，在菜单中按照如下方式进行操作 file->Reverse Engineer->DataBase 点击后，弹出 New Physical Data Model 的对话框 2、在General选项卡中 Model name:模板名字，自
logbackのhelloworld 飞翔的马甲日志 logback
一、概述 1.日志是啥？当我是个逗比的时候我是这么理解的：log.debug()代替了system.out.print(); 当我项目工作时，以为是一堆得.log文件。这两天项目发布新版本，比较轻松，决定好好地研究下日志以及logback。传送门1：日志的作用与方法： http://www.infoq.com/cn/articles/why-and-how-log 上面的作
新浪微博爬虫模拟登陆随意而生新浪微博
转载自：http://hi.baidu.com/erliang20088/item/251db4b040b8ce58ba0e1235 近来由于毕设需要，重新修改了新浪微博爬虫废了不少劲，希望下边的总结能够帮助后来的同学们。现行版的模拟登陆与以前相比，最大的改动在于cookie获取时候的模拟url的请求
synchronized 香水浓 java thread
Java语言的关键字，可用来给对象和方法或者代码块加锁，当它锁定一个方法或者一个代码块的时候，同一时刻最多只有一个线程执行这段代码。当两个并发线程访问同一个对象object中的这个加锁同步代码块时，一个时间内只能有一个线程得到执行。另一个线程必须等待当前线程执行完这个代码块以后才能执行该代码块。然而，当一个线程访问object的一个加锁代码块时，另一个线程仍然
maven 简单实用教程 AdyZhang maven
1. Maven介绍 1.1. 简介 java编写的用于构建系统的自动化工具。目前版本是2.0.9，注意maven2和maven1有很大区别，阅读第三方文档时需要区分版本。 1.2. Maven资源见官方网站；The 5 minute test，官方简易入门文档；Getting Started Tutorial，官方入门文档；Build Coo
Android 通过 intent传值获得null aijuans android
我在通过intent 获得传递兑现过的时候报错，空指针,我是getMap方法进行传值，代码如下 1 2 3 4 5 6 7 8 9 public void getMap(View view){ Intent i =
apache 做代理报如下错误：The proxy server received an invalid response from an upstream baalwolf response
网站配置是apache＋tomcat,tomcat没有报错，apache报错是： The proxy server received an invalid response from an upstream server. The proxy server could not handle the request GET /. Reason: Error reading fr
Tomcat6 内存和线程配置 BigBird2012 tomcat6
1、修改启动时内存参数、并指定JVM时区（在windows server 2008 下时间少了8个小时）在Tomcat上运行j2ee项目代码时，经常会出现内存溢出的情况，解决办法是在系统参数中增加系统参数： window下，在catalina.bat最前面 set JAVA_OPTS=-XX:PermSize=64M -XX:MaxPermSize=128m -Xms5
Karam与TDD bijian1013 Karam TDD
一.TDD 测试驱动开发（Test-Driven Development,TDD）是一种敏捷（AGILE）开发方法论，它把开发流程倒转了过来，在进行代码实现之前，首先保证编写测试用例，从而用测试来驱动开发（而不是把测试作为一项验证工具来使用）。 TDD的原则很简单： a.只有当某个
[Zookeeper学习笔记之七]Zookeeper源代码分析之Zookeeper.States bit1129 zookeeper
public enum States { CONNECTING, //Zookeeper服务器不可用，客户端处于尝试链接状态 ASSOCIATING, //？？？ CONNECTED, //链接建立，可以与Zookeeper服务器正常通信 CONNECTEDREADONLY, //处于只读状态的链接状态，只读模式可以在
【Scala十四】Scala核心八：闭包 bit1129 scala
Free variable A free variable of an expression is a variable that’s used inside the expression but not defined inside the expression. For instance, in the function literal expression (x: Int) => (x
android发送json并解析返回json ronin47 android
package com.http.test; import org.apache.http.HttpResponse; import org.apache.http.HttpStatus; import org.apache.http.client.HttpClient; import org.apache.http.client.methods.HttpGet; import
一份IT实习生的总结 brotherlamp PHP php资料 php教程 php培训 php视频
今天突然发现在不知不觉中自己已经实习了 3 个月了，现在可能不算是真正意义上的实习吧，因为现在自己才大三，在这边撸代码的同时还要考虑到学校的功课跟期末考试。让我震惊的是，我完全想不到在这 3 个月里我到底学到了什么，这是一件多么悲催的事情啊。同时我对我应该 get 到什么新技能也很迷茫。所以今晚还是总结下把，让自己在接下来的实习生活有更加明确的方向。最后感谢工作室给我们几个人这个机会让我们提前出来
据说是2012年10月人人网校招的一道笔试题-给出一个重物重量为X,另外提供的小砝码重量分别为1，3，9。。。3^N。将重物放到天平左侧，问在两边如何添加砝码 bylijinnan java
public class ScalesBalance { /** * 题目： * 给出一个重物重量为X,另外提供的小砝码重量分别为1，3，9。。。3^N。（假设N无限大，但一种重量的砝码只有一个） * 将重物放到天平左侧，问在两边如何添加砝码使两边平衡 * * 分析： * 三进制 * 我们约定括号表示里面的数是三进制，例如 47=(1202
dom4j最常用最简单的方法 chiangfai dom4j
要使用dom4j读写XML文档,需要先下载dom4j包,dom4j官方网站在 http://www.dom4j.org/目前最新dom4j包下载地址:http://nchc.dl.sourceforge.net/sourceforge/dom4j/dom4j-1.6.1.zip 解开后有两个包,仅操作XML文档的话把dom4j-1.6.1.jar加入工程就可以了,如果需要使用XPath的话还需要
简单HBase笔记 chenchao051 hbase
一、Client-side write buffer 客户端缓存请求描述：可以缓存客户端的请求，以此来减少RPC的次数，但是缓存只是被存在一个ArrayList中，所以多线程访问时不安全的。可以使用getWriteBuffer()方法来取得客户端缓存中的数据。默认关闭。二、Scan的Caching 描述： next( )方法请求一行就要使用一次RPC,即使
mysqldump导出时出现when doing LOCK TABLES daizj mysql mysqdump 导数据
　　执行　mysqldump -uxxx -pxxx -hxxx -Pxxxx database tablename > tablename.sql　导出表时，会报 mysqldump: Got error: 1044: Access denied for user 'xxx'@'xxx' to database 'xxx' when doing LOCK TABLES 解决
CSS渲染原理 dcj3sjt126com Web
从事Web前端开发的人都与CSS打交道很多，有的人也许不知道css是怎么去工作的，写出来的css浏览器是怎么样去解析的呢？当这个成为我们提高css水平的一个瓶颈时，是否应该多了解一下呢？一、浏览器的发展与CSS
《阿甘正传》台词 dcj3sjt126com
Part Ⅰ: 《阿甘正传》Forrest Gump经典中英文对白 Forrest: Hello! My names Forrest. Forrest Gump. You wanna Chocolate? I could eat about a million and a half othese. My momma always said life was like a box ochocol
Java处理JSON dyy_gusi json
Json在数据传输中很好用，原因是JSON 比 XML 更小、更快，更易解析。在Java程序中，如何使用处理JSON，现在有很多工具可以处理，比较流行常用的是google的gson和alibaba的fastjson，具体使用如下： 1、读取json然后处理 class ReadJSON { public static void main(String[] args)
win7下nginx和php的配置 geeksun nginx
1. 安装包准备 nginx : 从nginx.org下载nginx-1.8.0.zip php：从php.net下载php-5.6.10-Win32-VC11-x64.zip， php是免安装文件。 RunHiddenConsole: 用于隐藏命令行窗口 2. 配置 # java用8080端口做应用服务器，nginx反向代理到这个端口即可 p
基于2.8版本redis配置文件中文解释 hongtoushizi redis
转载自： http://wangwei007.blog.51cto.com/68019/1548167 在Redis中直接启动redis-server服务时, 采用的是默认的配置文件。采用redis-server xxx.conf 这样的方式可以按照指定的配置文件来运行Redis服务。下面是Redis2.8.9的配置文
第五章常用Lua开发库3-模板渲染 jinnianshilongnian nginx lua
动态web网页开发是Web开发中一个常见的场景，比如像京东商品详情页，其页面逻辑是非常复杂的，需要使用模板技术来实现。而Lua中也有许多模板引擎，如目前我在使用的lua-resty-template，可以渲染很复杂的页面，借助LuaJIT其性能也是可以接受的。如果学习过JavaEE中的servlet和JSP的话，应该知道JSP模板最终会被翻译成Servlet来执行；而lua-r
JZSearch大数据搜索引擎颠覆者 JavaScript
系统简介：大数据的特点有四个层面：第一，数据体量巨大。从TB级别，跃升到PB级别；第二，数据类型繁多。网络日志、视频、图片、地理位置信息等等。第三，价值密度低。以视频为例，连续不间断监控过程中，可能有用的数据仅仅有一两秒。第四，处理速度快。最后这一点也是和传统的数据挖掘技术有着本质的不同。业界将其归纳为4个“V”——Volume，Variety，Value，Velocity。大数据搜索引
10招让你成为杰出的Java程序员 pda158 java 编程框架
如果你是一个热衷于技术的 Java 程序员，那么下面的 10 个要点可以让你在众多 Java 开发人员中脱颖而出。　　 1. 拥有扎实的基础和深刻理解 OO 原则　　对于 Java 程序员，深刻理解 Object Oriented Programming（面向对象编程）这一概念是必须的。没有 OOPS 的坚实基础，就领会不了像 Java 这些面向对象编程语言
tomcat之oracle连接池配置小网客 oracle
tomcat版本7.0 配置oracle连接池方式：修改tomcat的server.xml配置文件： <GlobalNamingResources> <Resource name="utermdatasource" auth="Container" type="javax.sql.DataSou
Oracle 分页算法汇总 vipbooks oracle sql 算法 .net
这是我找到的一些关于Oracle分页的算法，大家那里还有没有其他好的算法没？我们大家一起分享一下！ -- Oracle 分页算法一 select * from ( select page.*,rownum rn from (select * from help) page -- 20 = (currentPag