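Hadoop 2.2.0 Job Source Code Reading Notes

Everything discussed here refers to the code as it appears in Hadoop 2.2.0.

Overview:

From the point of view of the person who creates and submits it, a Job lets the user configure the job while it is being set up, control its execution, and query its running state. Once the job has been submitted it can no longer be configured; calling a setter after that point throws an IllegalStateException.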

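Normally the user creates, describes, and submits a job through the Job class, and monitors its progress through the same object. Here is a simple example, adapted from the class Javadoc (MyJob and its nested MyMapper and MyReducer are placeholders for the user's own classes):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Create a new Job (the public constructors are deprecated in 2.2.0)
Job job = Job.getInstance(new Configuration());
job.setJarByClass(MyJob.class);

// Specify various job-specific parameters
job.setJobName("myjob");

// Job has no setInputPath/setOutputPath; paths go through the format classes
FileInputFormat.addInputPath(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));

job.setMapperClass(MyJob.MyMapper.class);
job.setReducerClass(MyJob.MyReducer.class);

// Submit the job, then poll for progress until the job is complete
job.waitForCompletion(true);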

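Basic structure:

The Job class lives in the org.apache.hadoop.mapreduce package. It extends JobContextImpl and implements the JobContext interface.

Static constants defined by Job (together with the JobState enum):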

private static final Log LOG = LogFactory.getLog(Job.class);



  @InterfaceStability.Evolving

  public static enum JobState {DEFINE, RUNNING};

  private static final long MAX_JOBSTATUS_AGE = 1000 * 2;

  public static final String OUTPUT_FILTER = "mapreduce.client.output.filter";

  /** Key in mapred-*.xml that sets completionPollInvervalMillis */

  public static final String COMPLETION_POLL_INTERVAL_KEY = 

    "mapreduce.client.completion.pollinterval";

  

  /** Default completionPollIntervalMillis is 5000 ms. */

  static final int DEFAULT_COMPLETION_POLL_INTERVAL = 5000;

  /** Key in mapred-*.xml that sets progMonitorPollIntervalMillis */

  public static final String PROGRESS_MONITOR_POLL_INTERVAL_KEY =

    "mapreduce.client.progressmonitor.pollinterval";

  /** Default progMonitorPollIntervalMillis is 1000 ms. */

  static final int DEFAULT_MONITOR_POLL_INTERVAL = 1000;



  public static final String USED_GENERIC_PARSER = 

    "mapreduce.client.genericoptionsparser.used";

  public static final String SUBMIT_REPLICATION = 

    "mapreduce.client.submit.file.replication";

  private static final String TASKLOG_PULL_TIMEOUT_KEY =

           "mapreduce.client.tasklog.timeout";

  private static final int DEFAULT_TASKLOG_TIMEOUT = 60000;

Private fields defined by Job:

 private JobState state = JobState.DEFINE;

 private JobStatus status;

 private long statustime;

 private Cluster cluster;

Static initializer that loads the configuration resources as soon as the Job class is loaded:

static {

    ConfigUtil.loadResources();

 }

The configuration files loaded are mapred-default.xml, mapred-site.xml, yarn-default.xml, and yarn-site.xml.

Job constructors:

  @Deprecated

  public Job() throws IOException {

    this(new Configuration());

  }



  @Deprecated

  public Job(Configuration conf) throws IOException {

    this(new JobConf(conf));

  }



  @Deprecated

  public Job(Configuration conf, String jobName) throws IOException {

    this(conf);

    setJobName(jobName);

  }



  Job(JobConf conf) throws IOException {

    super(conf, null);

    // propagate existing user credentials to job

    this.credentials.mergeAll(this.ugi.getCredentials());

    this.cluster = null;

  }



  Job(JobStatus status, JobConf conf) throws IOException {

    this(conf);

    setJobID(status.getJobID());

    this.status = status;

    state = JobState.RUNNING;

  }

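Note that Hadoop discourages constructing a Job through the no-argument constructor or the Configuration-based constructors: all three public constructors are marked @Deprecated. The two constructors that take a JobConf are package-private, so user code cannot call them directly. The supported way to obtain a Job is through the static factory methods described next.

Factory methods for obtaining a Job instance:

Besides the constructors, the Job class provides several static methods that return a Job instance. Their definitions are: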

 /**

   * Creates a new {@link Job} with no particular {@link Cluster} .

   * A Cluster will be created with a generic {@link Configuration}.

   * 

   * @return the {@link Job} , with no connection to a cluster yet.

   * @throws IOException

   */

  public static Job getInstance() throws IOException {

    // create with a null Cluster

    return getInstance(new Configuration());

  }

      

  /**

   * Creates a new {@link Job} with no particular {@link Cluster} and a 

   * given {@link Configuration}.

   * 

   * The <code>Job</code> makes a copy of the <code>Configuration</code> so 

   * that any necessary internal modifications do not reflect on the incoming 

   * parameter.

   * 

   * A Cluster will be created from the conf parameter only when it's needed.

   * 

   * @param conf the configuration

   * @return the {@link Job} , with no connection to a cluster yet.

   * @throws IOException

   */

  public static Job getInstance(Configuration conf) throws IOException {

    // create with a null Cluster

    JobConf jobConf = new JobConf(conf);

    return new Job(jobConf);

  }



      

  /**

   * Creates a new {@link Job} with no particular {@link Cluster} and a given jobName.

   * A Cluster will be created from the conf parameter only when it's needed.

   *

   * The <code>Job</code> makes a copy of the <code>Configuration</code> so 

   * that any necessary internal modifications do not reflect on the incoming 

   * parameter.

   * 

   * @param conf the configuration

   * @return the {@link Job} , with no connection to a cluster yet.

   * @throws IOException

   */

  public static Job getInstance(Configuration conf, String jobName)

           throws IOException {

    // create with a null Cluster

    Job result = getInstance(conf);

    result.setJobName(jobName);

    return result;

  }

  

  /**

   * Creates a new {@link Job} with no particular {@link Cluster} and given

   * {@link Configuration} and {@link JobStatus}.

   * A Cluster will be created from the conf parameter only when it's needed.

   * 

   * The <code>Job</code> makes a copy of the <code>Configuration</code> so 

   * that any necessary internal modifications do not reflect on the incoming 

   * parameter.

   * 

   * @param status job status

   * @param conf job configuration

   * @return the {@link Job} , with no connection to a cluster yet.

   * @throws IOException

   */

  public static Job getInstance(JobStatus status, Configuration conf) 

  throws IOException {

    return new Job(status, new JobConf(conf));

  }



  /**

   * Creates a new {@link Job} with no particular {@link Cluster}.

   * A Cluster will be created from the conf parameter only when it's needed.

   *

   * The <code>Job</code> makes a copy of the <code>Configuration</code> so 

   * that any necessary internal modifications do not reflect on the incoming 

   * parameter.

   * 

   * @param ignored

   * @return the {@link Job} , with no connection to a cluster yet.

   * @throws IOException

   * @deprecated Use {@link #getInstance()}

   */

  @Deprecated

  public static Job getInstance(Cluster ignored) throws IOException {

    return getInstance();

  }

  

  /**

   * Creates a new {@link Job} with no particular {@link Cluster} and given

   * {@link Configuration}.

   * A Cluster will be created from the conf parameter only when it's needed.

   * 

   * The <code>Job</code> makes a copy of the <code>Configuration</code> so 

   * that any necessary internal modifications do not reflect on the incoming 

   * parameter.

   * 

   * @param ignored

   * @param conf job configuration

   * @return the {@link Job} , with no connection to a cluster yet.

   * @throws IOException

   * @deprecated Use {@link #getInstance(Configuration)}

   */

  @Deprecated

  public static Job getInstance(Cluster ignored, Configuration conf) 

      throws IOException {

    return getInstance(conf);

  }

  

  /**

   * Creates a new {@link Job} with no particular {@link Cluster} and given

   * {@link Configuration} and {@link JobStatus}.

   * A Cluster will be created from the conf parameter only when it's needed.

   * 

   * The <code>Job</code> makes a copy of the <code>Configuration</code> so 

   * that any necessary internal modifications do not reflect on the incoming 

   * parameter.

   * 

   * @param cluster cluster

   * @param status job status

   * @param conf job configuration

   * @return the {@link Job} , with no connection to a cluster yet.

   * @throws IOException

   */

  @Private

  public static Job getInstance(Cluster cluster, JobStatus status, 

      Configuration conf) throws IOException {

    Job job = getInstance(status, conf);

    job.setCluster(cluster);

    return job;

  }

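As the code shows, obtaining a Job this way may involve a Cluster. Only the @Private overload taking (Cluster, JobStatus, Configuration) actually attaches the cluster it is given. The two deprecated overloads ignore their Cluster argument, and in every other case the Javadoc states that a Cluster is created from the configuration only when it is needed.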
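The Javadoc above repeats that the Job "makes a copy of the Configuration" so that internal changes do not leak back to the caller; in the quoted source that copy comes from new JobConf(conf). The following is a minimal sketch of the same factory-plus-defensive-copy idea using only java.util.Properties. CopyingJob and the "job.name" key are hypothetical stand-ins chosen for illustration, not Hadoop classes or keys.

```java
import java.util.Properties;

/** Hypothetical stand-in for Job; illustrates the copy-on-create factory only. */
final class CopyingJob {
    private final Properties conf;

    // Private constructor: instances are obtained through getInstance().
    private CopyingJob(Properties conf) {
        this.conf = conf;
    }

    /** Copies the caller's settings first, then builds the job around the copy. */
    static CopyingJob getInstance(Properties incoming) {
        Properties copy = new Properties();
        copy.putAll(incoming); // defensive copy of the caller's entries
        return new CopyingJob(copy);
    }

    /** Convenience overload, analogous in shape to getInstance(conf, jobName). */
    static CopyingJob getInstance(Properties incoming, String jobName) {
        CopyingJob result = getInstance(incoming);
        result.setJobName(jobName);
        return result;
    }

    void setJobName(String name) {
        conf.setProperty("job.name", name); // modifies only the private copy
    }

    String getJobName() {
        return conf.getProperty("job.name");
    }
}
```

Because the factory owns the copy, a caller can reuse one configuration object to create several jobs without one job's settings affecting another.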

Methods that return the polling intervals:

 /** The interval at which monitorAndPrintJob() prints status */

  public static int getProgressPollInterval(Configuration conf) {

    // Read progress monitor poll interval from config. Default is 1 second.

    int progMonitorPollIntervalMillis = conf.getInt(

      PROGRESS_MONITOR_POLL_INTERVAL_KEY, DEFAULT_MONITOR_POLL_INTERVAL);

    if (progMonitorPollIntervalMillis < 1) {

      LOG.warn(PROGRESS_MONITOR_POLL_INTERVAL_KEY + 

        " has been set to an invalid value; "

        + " replacing with " + DEFAULT_MONITOR_POLL_INTERVAL);

      progMonitorPollIntervalMillis = DEFAULT_MONITOR_POLL_INTERVAL;

    }

    return progMonitorPollIntervalMillis;

  }



  /** The interval at which waitForCompletion() should check. */

  public static int getCompletionPollInterval(Configuration conf) {

    int completionPollIntervalMillis = conf.getInt(

      COMPLETION_POLL_INTERVAL_KEY, DEFAULT_COMPLETION_POLL_INTERVAL);

    if (completionPollIntervalMillis < 1) { 

      LOG.warn(COMPLETION_POLL_INTERVAL_KEY + 

       " has been set to an invalid value; "

       + "replacing with " + DEFAULT_COMPLETION_POLL_INTERVAL);

      completionPollIntervalMillis = DEFAULT_COMPLETION_POLL_INTERVAL;

    }

    return completionPollIntervalMillis;

  }

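The first method returns the interval at which the job's running status is fetched and printed; the second returns the interval at which the job is checked for completion. Both read the value from the configuration and, if it is less than 1, log a warning and fall back to the default (1000 ms and 5000 ms respectively).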
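The read, validate, and fall-back pattern used by both methods can be sketched without Hadoop on the classpath. This is an illustrative sketch only: it substitutes java.util.Properties for Configuration and System.err for the logger, and the PollIntervals class name is made up. The key and default value are the ones quoted from the source above.

```java
import java.util.Properties;

/** Hypothetical helper mirroring the shape of getCompletionPollInterval(). */
final class PollIntervals {
    static final String COMPLETION_POLL_INTERVAL_KEY =
        "mapreduce.client.completion.pollinterval";
    static final int DEFAULT_COMPLETION_POLL_INTERVAL = 5000;

    static int getCompletionPollInterval(Properties conf) {
        // Use the default when the key is absent.
        String raw = conf.getProperty(COMPLETION_POLL_INTERVAL_KEY);
        int millis = DEFAULT_COMPLETION_POLL_INTERVAL;
        if (raw != null) {
            // A non-numeric value throws NumberFormatException here.
            millis = Integer.parseInt(raw.trim());
        }
        // Zero or negative intervals are rejected and replaced by the default.
        if (millis < 1) {
            System.err.println(COMPLETION_POLL_INTERVAL_KEY
                + " has been set to an invalid value; replacing with "
                + DEFAULT_COMPLETION_POLL_INTERVAL);
            millis = DEFAULT_COMPLETION_POLL_INTERVAL;
        }
        return millis;
    }
}
```

Replacing an invalid value instead of failing keeps a misconfigured client usable, at the cost of the setting being ignored with only a warning to show for it.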

Methods that require synchronization (all three are declared synchronized):

synchronized void ensureFreshStatus() 

      throws IOException {

    if (System.currentTimeMillis() - statustime > MAX_JOBSTATUS_AGE) {

      updateStatus();

    }

  }





 /** Some methods need to update status immediately. So, refresh

   * immediately

   * @throws IOException

   */

  synchronized void updateStatus() throws IOException {

    try {

      this.status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {

        @Override

        public JobStatus run() throws IOException, InterruptedException {

          return cluster.getClient().getJobStatus(status.getJobID());

        }

      });

    }

    catch (InterruptedException ie) {

      throw new IOException(ie);

    }

    if (this.status == null) {

      throw new IOException("Job status not available ");

    }

    this.statustime = System.currentTimeMillis();

  }





 private synchronized void connect()

          throws IOException, InterruptedException, ClassNotFoundException {

    if (cluster == null) {

      cluster = 

        ugi.doAs(new PrivilegedExceptionAction<Cluster>() {

                   public Cluster run()

                          throws IOException, InterruptedException, 

                                 ClassNotFoundException {

                     return new Cluster(getConfiguration());

                   }

                 });

    }

  }
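The pairing of ensureFreshStatus() and updateStatus() amounts to a small time-bounded cache: the stored status is reused until it is older than MAX_JOBSTATUS_AGE (2 seconds), and only then fetched again. Below is a minimal sketch of that idea under stated assumptions: FreshStatus is a hypothetical class, the fetcher stands in for the call to the cluster client, and the clock is injected so the behavior can be exercised without waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical time-bounded cache modelled on ensureFreshStatus()/updateStatus(). */
final class FreshStatus<T> {
    private static final long MAX_AGE_MILLIS = 1000 * 2; // same value as MAX_JOBSTATUS_AGE

    private final Supplier<T> fetcher;  // stands in for the remote status lookup
    private final LongSupplier clock;   // stands in for System.currentTimeMillis()
    private T status;
    private long statusTime;            // 0 until the first successful fetch

    FreshStatus(Supplier<T> fetcher, LongSupplier clock) {
        this.fetcher = fetcher;
        this.clock = clock;
    }

    /** Returns the cached status, refreshing it first if it is too old. */
    synchronized T get() {
        if (clock.getAsLong() - statusTime > MAX_AGE_MILLIS) {
            update();
        }
        return status;
    }

    /** Fetches unconditionally, for callers that need an up-to-date value now. */
    synchronized void update() {
        T fetched = fetcher.get();
        if (fetched == null) {
            // This sketch rejects a missing status before caching it.
            throw new IllegalStateException("status not available");
        }
        status = fetched;
        statusTime = clock.getAsLong();
    }
}
```

Both methods lock on the same object, so a refresh triggered by one thread is visible to any other thread that reads the status afterwards.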

Methods that set configuration parameters:

/**

   * Set the number of reduce tasks for the job.

   * @param tasks the number of reduce tasks

   * @throws IllegalStateException if the job is submitted

   */

  public void setNumReduceTasks(int tasks) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setNumReduceTasks(tasks);

  }



  /**

   * Set the current working directory for the default file system.

   * 

   * @param dir the new current working directory.

   * @throws IllegalStateException if the job is submitted

   */

  public void setWorkingDirectory(Path dir) throws IOException {

    ensureState(JobState.DEFINE);

    conf.setWorkingDirectory(dir);

  }



  /**

   * Set the {@link InputFormat} for the job.

   * @param cls the <code>InputFormat</code> to use

   * @throws IllegalStateException if the job is submitted

   */

  public void setInputFormatClass(Class<? extends InputFormat> cls

                                  ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setClass(INPUT_FORMAT_CLASS_ATTR, cls, 

                  InputFormat.class);

  }



  /**

   * Set the {@link OutputFormat} for the job.

   * @param cls the <code>OutputFormat</code> to use

   * @throws IllegalStateException if the job is submitted

   */

  public void setOutputFormatClass(Class<? extends OutputFormat> cls

                                   ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setClass(OUTPUT_FORMAT_CLASS_ATTR, cls, 

                  OutputFormat.class);

  }



  /**

   * Set the {@link Mapper} for the job.

   * @param cls the <code>Mapper</code> to use

   * @throws IllegalStateException if the job is submitted

   */

  public void setMapperClass(Class<? extends Mapper> cls

                             ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setClass(MAP_CLASS_ATTR, cls, Mapper.class);

  }



  /**

   * Set the Jar by finding where a given class came from.

   * @param cls the example class

   */

  public void setJarByClass(Class<?> cls) {

    ensureState(JobState.DEFINE);

    conf.setJarByClass(cls);

  }



  /**

   * Set the job jar 

   */

  public void setJar(String jar) {

    ensureState(JobState.DEFINE);

    conf.setJar(jar);

  }



  /**

   * Set the reported username for this job.

   * 

   * @param user the username for this job.

   */

  public void setUser(String user) {

    ensureState(JobState.DEFINE);

    conf.setUser(user);

  }



  /**

   * Set the combiner class for the job.

   * @param cls the combiner to use

   * @throws IllegalStateException if the job is submitted

   */

  public void setCombinerClass(Class<? extends Reducer> cls

                               ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setClass(COMBINE_CLASS_ATTR, cls, Reducer.class);

  }



  /**

   * Set the {@link Reducer} for the job.

   * @param cls the <code>Reducer</code> to use

   * @throws IllegalStateException if the job is submitted

   */

  public void setReducerClass(Class<? extends Reducer> cls

                              ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setClass(REDUCE_CLASS_ATTR, cls, Reducer.class);

  }



  /**

   * Set the {@link Partitioner} for the job.

   * @param cls the <code>Partitioner</code> to use

   * @throws IllegalStateException if the job is submitted

   */

  public void setPartitionerClass(Class<? extends Partitioner> cls

                                  ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setClass(PARTITIONER_CLASS_ATTR, cls, 

                  Partitioner.class);

  }



  /**

   * Set the key class for the map output data. This allows the user to

   * specify the map output key class to be different than the final output

   * value class.

   * 

   * @param theClass the map output key class.

   * @throws IllegalStateException if the job is submitted

   */

  public void setMapOutputKeyClass(Class<?> theClass

                                   ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setMapOutputKeyClass(theClass);

  }



  /**

   * Set the value class for the map output data. This allows the user to

   * specify the map output value class to be different than the final output

   * value class.

   * 

   * @param theClass the map output value class.

   * @throws IllegalStateException if the job is submitted

   */

  public void setMapOutputValueClass(Class<?> theClass

                                     ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setMapOutputValueClass(theClass);

  }



  /**

   * Set the key class for the job output data.

   * 

   * @param theClass the key class for the job output data.

   * @throws IllegalStateException if the job is submitted

   */

  public void setOutputKeyClass(Class<?> theClass

                                ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setOutputKeyClass(theClass);

  }



  /**

   * Set the value class for job outputs.

   * 

   * @param theClass the value class for job outputs.

   * @throws IllegalStateException if the job is submitted

   */

  public void setOutputValueClass(Class<?> theClass

                                  ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setOutputValueClass(theClass);

  }



  /**

   * Define the comparator that controls how the keys are sorted before they

   * are passed to the {@link Reducer}.

   * @param cls the raw comparator

   * @throws IllegalStateException if the job is submitted

   */

  public void setSortComparatorClass(Class<? extends RawComparator> cls

                                     ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setOutputKeyComparatorClass(cls);

  }



  /**

   * Define the comparator that controls which keys are grouped together

   * for a single call to 

   * {@link Reducer#reduce(Object, Iterable, 

   *                       org.apache.hadoop.mapreduce.Reducer.Context)}

   * @param cls the raw comparator to use

   * @throws IllegalStateException if the job is submitted

   */

  public void setGroupingComparatorClass(Class<? extends RawComparator> cls

                                         ) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setOutputValueGroupingComparator(cls);

  }



  /**

   * Set the user-specified job name.

   * 

   * @param name the job's new name.

   * @throws IllegalStateException if the job is submitted

   */

  public void setJobName(String name) throws IllegalStateException {

    ensureState(JobState.DEFINE);

    conf.setJobName(name);

  }



  /**

   * Turn speculative execution on or off for this job. 

   * 

   * @param speculativeExecution <code>true</code> if speculative execution 

   *                             should be turned on, else <code>false</code>.

   */

  public void setSpeculativeExecution(boolean speculativeExecution) {

    ensureState(JobState.DEFINE);

    conf.setSpeculativeExecution(speculativeExecution);

  }



  /**

   * Turn speculative execution on or off for this job for map tasks. 

   * 

   * @param speculativeExecution <code>true</code> if speculative execution 

   *                             should be turned on for map tasks,

   *                             else <code>false</code>.

   */

  public void setMapSpeculativeExecution(boolean speculativeExecution) {

    ensureState(JobState.DEFINE);

    conf.setMapSpeculativeExecution(speculativeExecution);

  }



  /**

   * Turn speculative execution on or off for this job for reduce tasks. 

   * 

   * @param speculativeExecution <code>true</code> if speculative execution 

   *                             should be turned on for reduce tasks,

   *                             else <code>false</code>.

   */

  public void setReduceSpeculativeExecution(boolean speculativeExecution) {

    ensureState(JobState.DEFINE);

    conf.setReduceSpeculativeExecution(speculativeExecution);

  }



  /**

   * Specify whether job-setup and job-cleanup is needed for the job 

   * 

   * @param needed If <code>true</code>, job-setup and job-cleanup will be

   *               considered from {@link OutputCommitter} 

   *               else ignored.

   */

  public void setJobSetupCleanupNeeded(boolean needed) {

    ensureState(JobState.DEFINE);

    conf.setBoolean(SETUP_CLEANUP_NEEDED, needed);

  }



  /**

   * Set the given set of archives

   * @param archives The list of archives that need to be localized

   */

  public void setCacheArchives(URI[] archives) {

    ensureState(JobState.DEFINE);

    DistributedCache.setCacheArchives(archives, conf);

  }



  /**

   * Set the given set of files

   * @param files The list of files that need to be localized

   */

  public void setCacheFiles(URI[] files) {

    ensureState(JobState.DEFINE);

    DistributedCache.setCacheFiles(files, conf);

  }



  /**

   * Add a archives to be localized

   * @param uri The uri of the cache to be localized

   */

  public void addCacheArchive(URI uri) {

    ensureState(JobState.DEFINE);

    DistributedCache.addCacheArchive(uri, conf);

  }

  

  /**

   * Add a file to be localized

   * @param uri The uri of the cache to be localized

   */

  public void addCacheFile(URI uri) {

    ensureState(JobState.DEFINE);

    DistributedCache.addCacheFile(uri, conf);

  }



  /**

   * Add an file path to the current set of classpath entries It adds the file

   * to cache as well.

   * 

   * Files added with this method will not be unpacked while being added to the

   * classpath.

   * To add archives to classpath, use the {@link #addArchiveToClassPath(Path)}

   * method instead.

   *

   * @param file Path of the file to be added

   */

  public void addFileToClassPath(Path file)

    throws IOException {

    ensureState(JobState.DEFINE);

    DistributedCache.addFileToClassPath(file, conf, file.getFileSystem(conf));

  }



  /**

   * Add an archive path to the current set of classpath entries. It adds the

   * archive to cache as well.

   * 

   * Archive files will be unpacked and added to the classpath

   * when being distributed.

   *

   * @param archive Path of the archive to be added

   */

  public void addArchiveToClassPath(Path archive)

    throws IOException {

    ensureState(JobState.DEFINE);

    DistributedCache.addArchiveToClassPath(archive, conf, archive.getFileSystem(conf));

  }



  /**

   * Originally intended to enable symlinks, but currently symlinks cannot be

   * disabled.

   */

  @Deprecated

  public void createSymlink() {

    ensureState(JobState.DEFINE);

    DistributedCache.createSymlink(conf);

  }

  

  /** 

   * Expert: Set the number of maximum attempts that will be made to run a

   * map task.

   * 

   * @param n the number of attempts per map task.

   */

  public void setMaxMapAttempts(int n) {

    ensureState(JobState.DEFINE);

    conf.setMaxMapAttempts(n);

  }



  /** 

   * Expert: Set the number of maximum attempts that will be made to run a

   * reduce task.

   * 

   * @param n the number of attempts per reduce task.

   */

  public void setMaxReduceAttempts(int n) {

    ensureState(JobState.DEFINE);

    conf.setMaxReduceAttempts(n);

  }



  /**

   * Set whether the system should collect profiler information for some of 

   * the tasks in this job? The information is stored in the user log 

   * directory.

   * @param newValue true means it should be gathered

   */

  public void setProfileEnabled(boolean newValue) {

    ensureState(JobState.DEFINE);

    conf.setProfileEnabled(newValue);

  }



  /**

   * Set the profiler configuration arguments. If the string contains a '%s' it

   * will be replaced with the name of the profiling output file when the task

   * runs.

   *

   * This value is passed to the task child JVM on the command line.

   *

   * @param value the configuration string

   */

  public void setProfileParams(String value) {

    ensureState(JobState.DEFINE);

    conf.setProfileParams(value);

  }



  /**

   * Set the ranges of maps or reduces to profile. setProfileEnabled(true) 

   * must also be called.

   * @param newValue a set of integer ranges of the map ids

   */

  public void setProfileTaskRange(boolean isMap, String newValue) {

    ensureState(JobState.DEFINE);

    conf.setProfileTaskRange(isMap, newValue);

  } 

  

  /**

   * Sets the flag that will allow the JobTracker to cancel the HDFS delegation

   * tokens upon job completion. Defaults to true.

   */

  public void setCancelDelegationTokenUponJobCompletion(boolean value) {

    ensureState(JobState.DEFINE);

    conf.setBoolean(JOB_CANCEL_DELEGATION_TOKEN, value);

  }

The point to watch closely is that every setter checks the job's state first: a Job can only be configured while it is in the DEFINE state.
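The body of ensureState() is not quoted in these notes, so the following is only a sketch of how such a guard could work, not the Hadoop implementation. GuardedJob, its submit() method, and the exception message are assumptions made for illustration; only the DEFINE and RUNNING state names come from the source.

```java
/** Hypothetical job object showing a state guard in front of every setter. */
final class GuardedJob {
    enum JobState { DEFINE, RUNNING }

    private JobState state = JobState.DEFINE;
    private int numReduceTasks = 1;

    // Assumed shape of the guard: fail fast when the job is in the wrong state.
    private void ensureState(JobState expected) {
        if (state != expected) {
            throw new IllegalStateException(
                "Job in state " + state + " instead of " + expected);
        }
    }

    void setNumReduceTasks(int tasks) {
        ensureState(JobState.DEFINE); // configuration is allowed only before submit
        numReduceTasks = tasks;
    }

    int getNumReduceTasks() {
        return numReduceTasks;
    }

    /** Moves the job out of DEFINE; setters throw from this point on. */
    void submit() {
        ensureState(JobState.DEFINE);
        state = JobState.RUNNING;
    }
}
```

With this guard in place, a call such as setNumReduceTasks(8) after submit() throws IllegalStateException and leaves the earlier value untouched.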

Method that dumps the job's status to the screen:

/**

   * Dump stats to screen.

   */

  @Override

  public String toString() {

    ensureState(JobState.RUNNING);

    String reasonforFailure = " ";

    int numMaps = 0;

    int numReduces = 0;

    try {

      updateStatus();

      if (status.getState().equals(JobStatus.State.FAILED))

        reasonforFailure = getTaskFailureEventString();

      numMaps = getTaskReports(TaskType.MAP).length;

      numReduces = getTaskReports(TaskType.REDUCE).length;

    } catch (IOException e) {

    } catch (InterruptedException ie) {

    }

    StringBuffer sb = new StringBuffer();

    sb.append("Job: ").append(status.getJobID()).append("\n");

    sb.append("Job File: ").append(status.getJobFile()).append("\n");

    sb.append("Job Tracking URL : ").append(status.getTrackingUrl());

    sb.append("\n");

    sb.append("Uber job : ").append(status.isUber()).append("\n");

    sb.append("Number of maps: ").append(numMaps).append("\n");

    sb.append("Number of reduces: ").append(numReduces).append("\n");

    sb.append("map() completion: ");

    sb.append(status.getMapProgress()).append("\n");

    sb.append("reduce() completion: ");

    sb.append(status.getReduceProgress()).append("\n");

    sb.append("Job state: ");

    sb.append(status.getState()).append("\n");

    sb.append("retired: ").append(status.isRetired()).append("\n");

    sb.append("reason for failure: ").append(reasonforFailure);

    return sb.toString();

  }

Methods that report task progress:

 /**

   * Get the <i>progress</i> of the job's map-tasks, as a float between 0.0 

   * and 1.0.  When all map tasks have completed, the function returns 1.0.

   * 

   * @return the progress of the job's map-tasks.

   * @throws IOException

   */

  public float mapProgress() throws IOException {

    ensureState(JobState.RUNNING);

    ensureFreshStatus();

    return status.getMapProgress();

  }



  /**

   * Get the <i>progress</i> of the job's reduce-tasks, as a float between 0.0 

   * and 1.0.  When all reduce tasks have completed, the function returns 1.0.

   * 

   * @return the progress of the job's reduce-tasks.

   * @throws IOException

   */

  public float reduceProgress() throws IOException {

    ensureState(JobState.RUNNING);

    ensureFreshStatus();

    return status.getReduceProgress();

  }



  /**

   * Get the <i>progress</i> of the job's cleanup-tasks, as a float between 0.0 

   * and 1.0.  When all cleanup tasks have completed, the function returns 1.0.

   * 

   * @return the progress of the job's cleanup-tasks.

   * @throws IOException

   */

  public float cleanupProgress() throws IOException, InterruptedException {

    ensureState(JobState.RUNNING);

    ensureFreshStatus();

    return status.getCleanupProgress();

  }



  /**

   * Get the <i>progress</i> of the job's setup-tasks, as a float between 0.0 

   * and 1.0.  When all setup tasks have completed, the function returns 1.0.

   * 

   * @return the progress of the job's setup-tasks.

   * @throws IOException

   */

  public float setupProgress() throws IOException {

    ensureState(JobState.RUNNING);

    ensureFreshStatus();

    return status.getSetupProgress();

  }

Hadoop之mapreduce -- WrodCount案例以及各种概念 lzhlizihang hadoop mapreduce 大数据
文章目录一、MapReduce的优缺点二、MapReduce案例--WordCount1、导包2、Mapper方法3、Partitioner方法（自定义分区器）4、reducer方法5、driver（main方法）6、Writable（手机流量统计案例的实体类）三、关于片和块1、什么是片，什么是块？2、mapreduce启动多少个MapTask任务？四、MapReduce的原理五、Shuffle过