k_lb

Hadoop definitive guide

1.Introduction to HDFS

1.1.HDFS Concepts

1.1.1.Blocks

- HDFS, too, has the concept of a block, but it is a much larger unit: 64 MB by default in the Hadoop release these notes cover (later releases default to 128 MB).

- As in a filesystem for a single disk, files in HDFS are broken into block-sized chunks, which are stored as independent units.

- Unlike a filesystem for a single disk, a file in HDFS that is smaller than a single block does not occupy a full block’s worth of underlying storage.
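
The chunking rule above can be sketched in a few lines of Java. This is only an illustration: BlockLayoutSketch and blockLengths() are hypothetical names, not part of Hadoop, and the 64 MB constant is simply the default quoted in these notes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): shows how a file of a given
// length would be cut into block-sized chunks.
class BlockLayoutSketch {
    static final long BLOCK_SIZE = 64L * 1024 * 1024; // 64 MB, the default quoted in the notes

    // Returns the length in bytes of each block of a file with the given length.
    static List<Long> blockLengths(long fileLength, long blockSize) {
        List<Long> lengths = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // The last block only holds whatever bytes remain.
            lengths.add(Math.min(blockSize, fileLength - offset));
        }
        return lengths;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // A 150 MB file: two full 64 MB blocks and one 22 MB block.
        System.out.println(blockLengths(150 * mb, BLOCK_SIZE));
        // A 1 MB file: a single block that stores only 1 MB.
        System.out.println(blockLengths(1 * mb, BLOCK_SIZE));
    }
}
```

The second call illustrates the last bullet: the small file maps to one block, but that block is only as long as the file.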

1.1.2.Namenodes and Datanodes

- The namenode manages the filesystem namespace.

  - It maintains the filesystem tree and the metadata for all the files and directories in the tree.

  - This information is stored persistently on the local disk in the form of two files: the namespace image and the edit log.

  - The namenode also knows the datanodes on which all the blocks for a given file are located. However, it does not store block locations persistently, since this information is reconstructed from the datanodes when the system starts.

- Datanodes are the workhorses of the filesystem.

  - They store and retrieve blocks when they are told to (by clients or the namenode).

  - They report back to the namenode periodically with lists of the blocks that they are storing.

- The secondary namenode

  - Despite its name, it does not act as a namenode.

  - Its main role is to periodically merge the namespace image with the edit log to prevent the edit log from becoming too large.

  - It keeps a copy of the merged namespace image, which can be used if the namenode fails.

Namenode directory structure

- The VERSION file is a Java properties file that contains information about the version of HDFS that is running.

  - The layoutVersion is a negative integer that defines the version of HDFS’s persistent data structures.

  - The namespaceID is a unique identifier for the filesystem, which is created when the filesystem is first formatted.

  - The cTime property marks the creation time of the namenode’s storage.

  - The storageType indicates that this storage directory contains data structures for a namenode.

The filesystem image and edit log

- When a filesystem client performs a write operation, it is first recorded in the edit log.

- The namenode also has an in-memory representation of the filesystem metadata, which it updates after the edit log has been modified.

- The edit log is flushed and synced after every write before a success code is returned to the client.

- The fsimage file is a persistent checkpoint of the filesystem metadata. It is not updated for every filesystem write operation.

- If the namenode fails, the latest state of its metadata can be reconstructed by loading the fsimage from disk into memory and then applying each of the operations in the edit log.

- This is precisely what the namenode does when it starts up.

- The fsimage file contains a serialized form of all the directory and file inodes in the filesystem.

- The role of the secondary namenode is to produce checkpoints of the primary’s in-memory filesystem metadata.

- The checkpointing process proceeds as follows:

  1. The secondary asks the primary to roll its edits file, so new edits go to a new file.

  2. The secondary retrieves fsimage and edits from the primary (using HTTP GET).

  3. The secondary loads fsimage into memory, applies each operation from edits, then creates a new consolidated fsimage file.

  4. The secondary sends the new fsimage back to the primary (using HTTP POST).

  5. The primary replaces the old fsimage with the new one from the secondary, and the old edits file with the new one it started in step 1. It also updates the fstime file to record the time that the checkpoint was taken.

- At the end of the process, the primary has an up-to-date fsimage file and a shorter edits file.
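
The "load the image, then replay the edits" idea behind both namenode startup and checkpointing can be sketched with a toy model. Everything here is an assumption made for illustration: the image is just a set of paths, each edit is a "CREATE path" or "DELETE path" string, and CheckpointSketch is not a Hadoop class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model of namenode metadata, not Hadoop's real data structures.
class CheckpointSketch {
    // Rebuilds the latest namespace by copying the image and replaying every edit in order.
    static TreeSet<String> replay(TreeSet<String> fsimage, List<String> edits) {
        TreeSet<String> namespace = new TreeSet<>(fsimage);
        for (String edit : edits) {
            String[] parts = edit.split(" ", 2);
            if (parts[0].equals("CREATE")) {
                namespace.add(parts[1]);
            } else if (parts[0].equals("DELETE")) {
                namespace.remove(parts[1]);
            } else {
                throw new IllegalArgumentException("unknown edit: " + edit);
            }
        }
        return namespace;
    }

    public static void main(String[] args) {
        TreeSet<String> fsimage = new TreeSet<>(Arrays.asList("/a", "/b"));
        List<String> edits = new ArrayList<>(Arrays.asList("CREATE /c", "DELETE /a"));

        // Checkpoint: merge the edits into a new image, then start with an empty edit log.
        TreeSet<String> newImage = replay(fsimage, edits);
        edits.clear();

        System.out.println(newImage); // [/b, /c]
    }
}
```

After the checkpoint the edit log is empty, which mirrors the last bullet: an up-to-date image and a shorter edits file.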

Secondary namenode directory structure

Datanode directory structure

- A datanode’s VERSION file

- The other files in the datanode’s current storage directory are the files with the blk_ prefix.

  - There are two types: the HDFS blocks themselves (which just consist of the file’s raw bytes) and the metadata for a block (with a .meta suffix).

  - A block file just consists of the raw bytes of a portion of the file being stored.

  - The metadata file is made up of a header with version and type information, followed by a series of checksums for sections of the block.

- When the number of blocks in a directory grows to a certain size, the datanode creates a new subdirectory in which to place new blocks and their accompanying metadata.

1.2.Data Flow

1.2.1.Anatomy of a File Read

- The client opens the file it wishes to read by calling open() on the FileSystem object (step 1).

- DistributedFileSystem calls the namenode, using RPC, to determine the locations of the first few blocks in the file (step 2).

- For each block, the namenode returns the addresses of the datanodes that have a copy of that block.

- The datanodes are sorted according to their proximity to the client.

- The DistributedFileSystem returns an FSDataInputStream to the client for it to read data from.

- The client then calls read() on the stream (step 3).

- DFSInputStream connects to the first (closest) datanode for the first block in the file.

- Data is streamed from the datanode back to the client (step 4).

- When the end of the block is reached, DFSInputStream closes the connection to the datanode, then finds the best datanode for the next block (step 5).

- When the client has finished reading, it calls close() on the FSDataInputStream (step 6).

- During reading, if the client encounters an error while communicating with a datanode, it tries the next closest one for that block.

- It also remembers datanodes that have failed so that it doesn’t needlessly retry them for later blocks.

- The client also verifies checksums for the data transferred to it from the datanode. If a corrupted block is found, it is reported to the namenode.

1.2.2.Anatomy of a File Write

- The client creates the file by calling create() (step 1).

- DistributedFileSystem makes an RPC call to the namenode to create a new file in the filesystem’s namespace, with no blocks associated with it (step 2).

- The namenode performs various checks to make sure the file doesn’t already exist, and that the client has the right permissions to create the file. If these checks pass, the namenode makes a record of the new file; otherwise, file creation fails and the client is thrown an IOException.

- The DistributedFileSystem returns an FSDataOutputStream for the client to start writing data to.

- As the client writes data (step 3), DFSOutputStream splits it into packets, which it writes to an internal queue, called the data queue.

- The data queue is consumed by the DataStreamer, whose responsibility it is to ask the namenode to allocate new blocks by picking a list of suitable datanodes to store the replicas. The list of datanodes forms a pipeline.

- The DataStreamer streams the packets to the first datanode in the pipeline, which stores the packet and forwards it to the second datanode in the pipeline. Similarly, the second datanode stores the packet and forwards it to the third (and last) datanode in the pipeline (step 4).

- DFSOutputStream also maintains an internal queue of packets that are waiting to be acknowledged by datanodes, called the ack queue. A packet is removed from the ack queue only when it has been acknowledged by all the datanodes in the pipeline (step 5).

- If a datanode fails while data is being written to it, the following happens:

  - First the pipeline is closed, and any packets in the ack queue are added to the front of the data queue.

  - The current block on the good datanodes is given a new identity by the namenode, so that the partial block on the failed datanode will be deleted if the failed datanode recovers later on.

  - The failed datanode is removed from the pipeline and the remainder of the block’s data is written to the two good datanodes in the pipeline.

  - The namenode notices that the block is under-replicated, and it arranges for a further replica to be created on another node.

- When the client has finished writing data, it calls close() on the stream (step 6). This action flushes all the remaining packets to the datanode pipeline and waits for acknowledgments before contacting the namenode to signal that the file is complete (step 7).
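
The interplay of the data queue and the ack queue, including what happens on a datanode failure, can be modeled with two deques. This is a toy sketch under stated assumptions: packets are plain integers, and WritePipelineSketch and its method names are invented here rather than taken from DFSOutputStream.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the two queues described above; not Hadoop's implementation.
class WritePipelineSketch {
    final Deque<Integer> dataQueue = new ArrayDeque<>();
    final Deque<Integer> ackQueue = new ArrayDeque<>();

    // The client writes a packet: it waits in the data queue.
    void write(int packet) { dataQueue.addLast(packet); }

    // The streamer sends the next packet down the pipeline; it now awaits acknowledgment.
    void send() { ackQueue.addLast(dataQueue.removeFirst()); }

    // Every datanode in the pipeline has acknowledged the oldest outstanding packet.
    void ack() { ackQueue.removeFirst(); }

    // A datanode failed: unacknowledged packets go back to the FRONT of the
    // data queue, in their original order, so nothing is lost.
    void onDatanodeFailure() {
        while (!ackQueue.isEmpty()) {
            dataQueue.addFirst(ackQueue.removeLast());
        }
    }

    public static void main(String[] args) {
        WritePipelineSketch p = new WritePipelineSketch();
        for (int i = 1; i <= 4; i++) p.write(i);
        p.send(); p.send(); p.send();    // packets 1, 2, 3 are in flight
        p.ack();                         // packet 1 is fully acknowledged
        p.onDatanodeFailure();           // packets 2 and 3 must be resent
        System.out.println(p.dataQueue); // [2, 3, 4]
    }
}
```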

2.Meet Map/Reduce

- MapReduce has two phases: the map phase and the reduce phase.

- Each phase has key-value pairs as input and output (the types can be specified).

  - The input key-value types of the map phase are determined by the input format.

  - The output key-value types of the map phase should match the input key-value types of the reduce phase.

  - The output key-value types of the reduce phase can be set on the JobConf object.

- The programmer specifies two functions: the map function and the reduce function.

2.1.MapReduce logical data flow

2.2.MapReduce Code

2.2.1.The map function is represented by an implementation of the Mapper interface, which declares a map() method.

2.2.2.The reduce function is defined using a Reducer

- The input types of the reduce function must match the output types of the map function.

2.2.3.The code runs the MapReduce job

- An input path is specified by calling the static addInputPath() method on FileInputFormat.

  - It can be a single file, a directory, or a file pattern.

  - addInputPath() can be called more than once to use input from multiple paths.

- The output path is specified by the static setOutputPath() method on FileOutputFormat.

  - It specifies a directory where the output files from the reducer functions are written.

  - The directory shouldn’t exist before running the job.

- The mapper and reducer classes are specified via the setMapperClass() and setReducerClass() methods.

- The setOutputKeyClass() and setOutputValueClass() methods control the output types for the map and the reduce functions, which are often the same.

  - If they are different, the map output types can be set using the methods setMapOutputKeyClass() and setMapOutputValueClass().

- The input types are controlled via the input format, which does not need to be set explicitly when the default TextInputFormat is used.

2.3.Scaling Out

2.3.1.MapReduce data flow with a single reduce task

- A MapReduce job is a unit of work that the client wants to be performed: it consists of the input data, the MapReduce program, and configuration information.

- Hadoop runs the job by dividing it into tasks, of which there are two types: map tasks and reduce tasks.

- There are two types of nodes that control the job execution process: a jobtracker and a number of tasktrackers.

  - The jobtracker coordinates all the jobs run on the system by scheduling tasks to run on tasktrackers.

  - Tasktrackers run tasks and send progress reports to the jobtracker, which keeps a record of the overall progress of each job.

  - If a task fails, the jobtracker can reschedule it on a different tasktracker.

- Hadoop divides the input to a MapReduce job into fixed-size input splits.

- Hadoop creates one map task for each split, which runs the user-defined map function for each record in the split.

- Hadoop does its best to run the map task on a node where the input data resides in HDFS.

  - This is called the data locality optimization.

  - This is why the optimal split size is the same as the block size: it is the largest size of input that can be guaranteed to be stored on a single node.

- Reduce tasks don’t have the advantage of data locality.

  - The input to a single reduce task is normally the output from all mappers.

  - The output of the reduce is normally stored in HDFS for reliability.

2.3.2.MapReduce data flow with multiple reduce tasks

The number of reduce tasks is not governed by the size of the input, but is specified independently.

- When there are multiple reducers, the map tasks partition their output, each creating one partition for each reduce task.

- There can be many keys (and their associated values) in each partition, but the records for every key are all in a single partition.

- The partitioning can be controlled by a user-defined partitioning function.

  - Normally the default partitioner is used, which buckets keys using a hash function.

  - conf.setPartitionerClass(HashPartitioner.class);

  - conf.setNumReduceTasks(1);

- The data flow between map and reduce tasks is known as “the shuffle,” as each reduce task is fed by many map tasks.

- It’s also possible to have zero reduce tasks. This can be appropriate when you don’t need the shuffle, since the processing can be carried out entirely in parallel.
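
To make "buckets keys using a hash function" concrete, here is a minimal stand-alone sketch. The class and method names are made up for illustration; the formula is meant to mirror what Hadoop's HashPartitioner does with a key's hashCode(), but treat it as a sketch rather than the library source.

```java
// Stand-alone sketch of hash partitioning; not the Hadoop class itself.
class HashPartitionSketch {
    // Masking with Integer.MAX_VALUE clears the sign bit, so the result is
    // never negative even when hashCode() is.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"1949", "1950", "1951", "1949"};
        for (String key : keys) {
            // Equal keys always map to the same partition.
            System.out.println(key + " -> partition " + partitionFor(key, 3));
        }
    }
}
```

Because the partition depends only on the key, all records for a given key end up at the same reduce task, which is the property stated in the second bullet above.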

3.MapReduce Types and Formats

3.1.MapReduce Types

- The map and reduce functions in Hadoop MapReduce have the following general form:

  map: (K1, V1) → list(K2, V2)

  reduce: (K2, list(V2)) → list(K3, V3)

- The partition function operates on the intermediate key and value types (K2 and V2), and returns the partition index.
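
The way the key and value types flow from map input through to reduce output can be shown with a small in-memory model. The interfaces below (Emitter, MapFn, ReduceFn) are simplified stand-ins invented for this sketch, not Hadoop's Mapper and Reducer, and the sorted grouping step is only a rough substitute for the real shuffle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of the K1/V1 -> K2/V2 -> K3/V3 type flow.
class TypeFlowSketch {
    interface Emitter<K, V> { void emit(K key, V value); }
    interface MapFn<K1, V1, K2, V2> { void map(K1 key, V1 value, Emitter<K2, V2> out); }
    interface ReduceFn<K2, V2, K3, V3> { void reduce(K2 key, List<V2> values, Emitter<K3, V3> out); }

    // Runs map over every record, groups the map output by key in sorted
    // order, then calls reduce once per key.
    static <K1, V1, K2 extends Comparable<K2>, V2, K3, V3> Map<K3, V3> run(
            Map<K1, V1> input, MapFn<K1, V1, K2, V2> mapFn, ReduceFn<K2, V2, K3, V3> reduceFn) {
        TreeMap<K2, List<V2>> grouped = new TreeMap<>();
        Emitter<K2, V2> collect = (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v);
        for (Map.Entry<K1, V1> record : input.entrySet()) {
            mapFn.map(record.getKey(), record.getValue(), collect);
        }
        Map<K3, V3> output = new LinkedHashMap<>();
        Emitter<K3, V3> write = output::put;
        for (Map.Entry<K2, List<V2>> group : grouped.entrySet()) {
            reduceFn.reduce(group.getKey(), group.getValue(), write);
        }
        return output;
    }

    // Word count: (Long offset, String line) -> (String word, Integer 1) -> (String word, Integer total).
    static Map<String, Integer> wordCount(Map<Long, String> lines) {
        MapFn<Long, String, String, Integer> mapFn = (offset, line, out) -> {
            for (String word : line.split(" ")) out.emit(word, 1);
        };
        ReduceFn<String, Integer, String, Integer> reduceFn = (word, ones, out) -> {
            int sum = 0;
            for (int one : ones) sum += one;
            out.emit(word, sum);
        };
        return run(lines, mapFn, reduceFn);
    }

    public static void main(String[] args) {
        Map<Long, String> lines = new LinkedHashMap<>();
        lines.put(0L, "a b a");
        lines.put(6L, "b c");
        System.out.println(wordCount(lines)); // {a=2, b=2, c=1}
    }
}
```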

3.1.1.Configuration of MapReduce types

- Input types are set by the input format.

  - For instance, a TextInputFormat generates keys of type LongWritable and values of type Text.

- A minimal MapReduce driver relies on the following defaults, which can also be set explicitly:

- The default input format is TextInputFormat, which produces keys of type LongWritable (the offset of the beginning of the line in the file) and values of type Text (the line of text).

- Calling setNumMapTasks(1) does not necessarily set the number of map tasks to one; the value is only a hint.

  - The actual number of map tasks depends on the size of the input.

- The default mapper is IdentityMapper.

- Map tasks are run by MapRunner, the default implementation of MapRunnable, which calls the Mapper’s map() method sequentially with each record.

- The default partitioner is HashPartitioner, which hashes a record’s key to determine which partition the record belongs in.

  - Each partition is processed by a reduce task, so the number of partitions is equal to the number of reduce tasks for the job.

- The default reducer is IdentityReducer.

- Records are sorted by the MapReduce system before being presented to the reducer.

- The default output format is TextOutputFormat, which writes out records, one per line, by converting keys and values to strings and separating them with a tab character.

3.2.Input Formats

3.2.1.Input Splits and Records

- An input split is a chunk of the input that is processed by a single map.

- Each split is divided into records, and the map processes each record (a key-value pair) in turn.

- An InputSplit has a length in bytes and a set of storage locations, which are just hostname strings.

- A split doesn’t contain the input data; it is just a reference to the data.

- The storage locations are used by the MapReduce system to place map tasks as close to the split’s data as possible.

- The size is used to order the splits so that the largest get processed first.

- An InputFormat is responsible for creating the input splits and dividing them into records.

- The JobClient calls the getSplits() method, passing the desired number of map tasks as the numSplits argument.

- Having calculated the splits, the client sends them to the jobtracker, which uses their storage locations to schedule map tasks to process them on the tasktrackers.

- On a tasktracker, the map task passes the split to the getRecordReader() method on InputFormat to obtain a RecordReader for that split.

- A RecordReader is little more than an iterator over records, and the map task uses one to generate record key-value pairs, which it passes to the map function.

- The same key and value objects are used on each invocation of the map() method; only their contents are changed. If you need to hold on to a key or value outside of the map() call, make a copy of the object.

3.2.2.FileInputFormat

- FileInputFormat is the base class for all implementations of InputFormat that use files as their data source.

- It provides two things: a place to define which files are included as the input to a job, and an implementation for generating splits for the input files.

- FileInputFormat input paths may represent a file, a directory, or, by using a glob, a collection of files and directories.

- To exclude certain files from the input, you can set a filter using the setInputPathFilter() method on FileInputFormat.

- FileInputFormat splits only large files. Here “large” means larger than an HDFS block.

- Properties for controlling split size

  - The minimum split size is usually 1 byte. By setting it to a value larger than the block size, you can force splits to be larger than a block.

  - The maximum split size defaults to the maximum value that can be represented by a Java long type. It has an effect only when it is less than the block size, forcing splits to be smaller than a block.
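
The effect of the two properties can be summarized by the formula max(minimumSize, min(maximumSize, blockSize)). The helper below is a hypothetical stand-alone version of that calculation, written only to show how the three cases described above fall out of it.

```java
// Hypothetical helper illustrating the split-size rule; not a Hadoop class.
class SplitSizeSketch {
    static long computeSplitSize(long minSize, long maxSize, long blockSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long block = 64 * mb;

        // Defaults (min = 1 byte, max = Long.MAX_VALUE): the split size equals the block size.
        System.out.println(computeSplitSize(1, Long.MAX_VALUE, block) / mb);        // 64

        // Minimum larger than a block: splits are forced to be larger than a block.
        System.out.println(computeSplitSize(128 * mb, Long.MAX_VALUE, block) / mb); // 128

        // Maximum smaller than a block: splits are forced to be smaller than a block.
        System.out.println(computeSplitSize(1, 32 * mb, block) / mb);               // 32
    }
}
```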

Small files and CombineFileInputFormat

- Hadoop works better with a small number of large files than a large number of small files.

- Where FileInputFormat creates a split per file, CombineFileInputFormat packs many files into each split so that each mapper has more to process.

- One technique for avoiding the many small files case is to merge small files into larger files by using a SequenceFile: the keys can act as filenames and the values as file contents.

3.2.3.Text Input

- TextInputFormat is the default InputFormat.

  - Each record is a line of input.

  - The key, a LongWritable, is the byte offset within the file of the beginning of the line.

  - The value is the contents of the line, excluding any line terminators, and is packaged as a Text object.

- The logical records that a FileInputFormat defines do not usually fit neatly into HDFS blocks.

- A single file is broken into lines, and the line boundaries do not correspond with the HDFS block boundaries.

- Splits honor logical record boundaries.

  - In the book’s example, the first split contains line 5, even though it spans the first and second block.

  - The second split starts at line 6.

- As a consequence, data-local maps will perform some remote reads, because a line that crosses a block boundary has to be completed from the next block.
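
How a split can honor line boundaries that do not coincide with block boundaries is sketched below. The rule used here is a simplification chosen for the example (a line belongs to the split in which its first byte falls); it is not a copy of Hadoop's record reader, and LineSplitSketch and readSplit() are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of reading line records from one split of a file.
class LineSplitSketch {
    // Returns (byte offset -> line) for every line whose first byte lies in [start, end).
    static Map<Long, String> readSplit(byte[] data, int start, int end) {
        Map<Long, String> records = new LinkedHashMap<>();
        int pos = start;
        // A split that begins mid-line skips ahead: that line belongs to the previous split.
        if (start > 0 && data[start - 1] != '\n') {
            while (pos < data.length && data[pos] != '\n') pos++;
            pos++; // step over the newline itself
        }
        while (pos < end && pos < data.length) {
            int lineEnd = pos;
            // The last line of a split may run past 'end' into the next block.
            while (lineEnd < data.length && data[lineEnd] != '\n') lineEnd++;
            records.put((long) pos, new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\n".getBytes(StandardCharsets.UTF_8);
        // Pretend the block boundary falls at byte 8, in the middle of "bravo".
        System.out.println(readSplit(data, 0, 8));           // {0=alpha, 6=bravo}
        System.out.println(readSplit(data, 8, data.length)); // {12=charlie}
    }
}
```

The first split reads "bravo" to its end even though the line crosses byte 8, which corresponds to the remote read mentioned above.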

KeyValueTextInputFormat

- It is common for each line in a file to be a key-value pair, separated by a delimiter such as a tab character.

- You can specify the separator via the key.value.separator.in.input.line property.
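
Splitting a line into key and value at the separator can be sketched as follows. The helper is hypothetical, and two behaviors are assumptions made for the example: the line is split at the first occurrence of the separator, and a line without a separator becomes a key with an empty value.

```java
// Hypothetical helper illustrating key-value line parsing; not a Hadoop class.
class KeyValueLineSketch {
    // Returns {key, value} for one line of input.
    static String[] parse(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos == -1) {
            // Assumption: no separator means the whole line is the key.
            return new String[] {line, ""};
        }
        return new String[] {line.substring(0, pos), line.substring(pos + 1)};
    }

    public static void main(String[] args) {
        String[] kv = parse("line1\tOn the top of the Crumpetty Tree", '\t');
        System.out.println("key=" + kv[0] + ", value=" + kv[1]);
    }
}
```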

NLineInputFormat

- If you want your mappers to receive a fixed number of lines of input, then NLineInputFormat is the InputFormat to use.

- Like TextInputFormat, the keys are the byte offsets within the file and the values are the lines themselves.

- N refers to the number of lines of input that each mapper receives.

3.2.4.Binary Input

SequenceFileInputFormat

- Hadoop’s sequence file format stores sequences of binary key-value pairs.

- To use data from sequence files as the input to MapReduce, you use SequenceFileInputFormat.

- The keys and values are determined by the sequence file, and you need to make sure that your map input types correspond.

- For example, if your sequence file has IntWritable keys and Text values, then the map signature would be Mapper<IntWritable, Text, K, V>.

SequenceFileAsTextInputFormat

- SequenceFileAsTextInputFormat is a variant of SequenceFileInputFormat that converts the sequence file’s keys and values to Text objects.

SequenceFileAsBinaryInputFormat

- SequenceFileAsBinaryInputFormat is a variant of SequenceFileInputFormat that retrieves the sequence file’s keys and values as opaque binary objects.

- They are encapsulated as BytesWritable objects.

SequenceFile

- Writing a SequenceFile

  - To create a SequenceFile, use one of its createWriter() static methods, which returns a SequenceFile.Writer instance.

  - You specify a stream to write to (either an FSDataOutputStream or a FileSystem and Path pairing), a Configuration object, and the key and value types.

  - Once you have a SequenceFile.Writer, you write key-value pairs using the append() method.

  - When you have finished, you call the close() method.

- Reading a SequenceFile

  - Reading sequence files from beginning to end is a matter of creating an instance of SequenceFile.Reader and iterating over records by repeatedly invoking one of the next() methods.

- The SequenceFile Format

  - A sequence file consists of a header followed by one or more records.

  - The first three bytes of a sequence file are the bytes SEQ, which act as a magic number, followed by a single byte representing the version number.

  - The header contains other fields, including the names of the key and value classes, compression details, user-defined metadata, and the sync marker.

  - The sync marker is used to allow a reader to synchronize to a record boundary from any position in the file.
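
The start of the header described above (three magic bytes followed by a version byte) is easy to check by hand. The sketch below only inspects those first four bytes; SeqHeaderSketch is a hypothetical helper, and the version value in the example is arbitrary.

```java
// Hypothetical helper that checks the first four bytes of a sequence file header.
class SeqHeaderSketch {
    // Returns the version byte, or throws if the data does not start with "SEQ".
    static int readVersion(byte[] header) {
        if (header.length < 4
                || header[0] != 'S' || header[1] != 'E' || header[2] != 'Q') {
            throw new IllegalArgumentException("not a sequence file: missing SEQ magic number");
        }
        return header[3] & 0xFF; // treat the version as an unsigned byte
    }

    public static void main(String[] args) {
        byte[] header = {'S', 'E', 'Q', 6}; // example bytes; the version here is arbitrary
        System.out.println("version " + readVersion(header)); // version 6
    }
}
```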

3.2.5.Multiple Inputs

- The MultipleInputs class allows you to specify the InputFormat and Mapper to use on a per-path basis.

3.3.Output Formats

3.3.1.Text Output

- The default output format, TextOutputFormat, writes records as lines of text.

- Its keys and values may be of any type, since TextOutputFormat turns them into strings by calling toString() on them.

- Each key-value pair is separated by a tab character, although that may be changed using the mapred.textoutputformat.separator property.

3.3.2.Binary Output

- SequenceFileOutputFormat

- SequenceFileAsBinaryOutputFormat

- MapFileOutputFormat

Writing a MapFile

- You create an instance of MapFile.Writer, then call the append() method to add entries in order.

- Keys must be instances of WritableComparable, and values must be Writable.

- If we look at the MapFile, we see it’s actually a directory containing two files called data and index.

- Both files are SequenceFiles. The data file contains all of the entries, in order.

- The index file contains a fraction of the keys, and contains a mapping from each of those keys to that key’s offset in the data file.
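
Why a sparse index is enough for lookups can be seen in a toy model. In this sketch the "data file" is a sorted list of keys, an "offset" is simply a list position, and every third key is indexed; all of these are simplifications invented for the example, and MapFileIndexSketch is not a Hadoop class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a sorted data file plus a sparse index over it.
class MapFileIndexSketch {
    final List<String> data = new ArrayList<>();              // all keys, in sorted order
    final TreeMap<String, Integer> index = new TreeMap<>();   // every Nth key -> position in data

    MapFileIndexSketch(List<String> sortedKeys, int indexInterval) {
        for (int i = 0; i < sortedKeys.size(); i++) {
            data.add(sortedKeys.get(i));
            if (i % indexInterval == 0) index.put(sortedKeys.get(i), i);
        }
    }

    // Returns the position of the key in the data list, or -1 if it is absent.
    int find(String key) {
        // Jump to the last indexed key that is <= the target, then scan forward.
        Map.Entry<String, Integer> floor = index.floorEntry(key);
        if (floor == null) return -1; // smaller than the first key
        for (int pos = floor.getValue(); pos < data.size(); pos++) {
            int cmp = data.get(pos).compareTo(key);
            if (cmp == 0) return pos;
            if (cmp > 0) break; // passed the place where the key would be
        }
        return -1;
    }

    public static void main(String[] args) {
        MapFileIndexSketch file = new MapFileIndexSketch(
                Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h"), 3);
        System.out.println(file.index);     // {a=0, d=3, g=6}
        System.out.println(file.find("e")); // 4
        System.out.println(file.find("z")); // -1
    }
}
```

The lookup for "e" starts at the indexed key "d" and scans only two entries, rather than reading the data from the beginning.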

Reading a MapFile

- To read a MapFile, you create a MapFile.Reader, then call the next() method until it returns false.

3.3.3.Multiple Outputs

MultipleOutputFormat

- MultipleOutputFormat allows you to write data to multiple files whose names are derived from the output keys and values.

  - conf.setOutputFormat(StationNameMultipleTextOutputFormat.class);

MultipleOutputs

- MultipleOutputs can emit different types for each output.

4.Developing a MapReduce Application

4.1.The Configuration API

- An instance of the Configuration class (found in the org.apache.hadoop.conf package) represents a collection of configuration properties and their values.

- Configurations read their properties from resources: XML files with a simple structure for defining name-value pairs.

- Once the resources are loaded, the properties can be read through the typed getter methods of Configuration, such as get() and getInt().

4.2.Configuring the Development Environment

4.2.1.Managing Configuration

- When developing Hadoop applications, it is common to switch between running the application locally and running it on a cluster. One way to do this is to keep one configuration file per environment:

  - hadoop-local.xml

  - hadoop-localhost.xml

  - hadoop-cluster.xml

- With this setup, it is easy to use any configuration with the -conf command-line switch.

- For example, passing hadoop-localhost.xml with -conf to the hadoop fs -ls command shows a directory listing on the HDFS server running in pseudo-distributed mode on localhost.

4.2.2.GenericOptionsParser, Tool, and ToolRunner

5.How MapReduce Works

5.1.Anatomy of a MapReduce Job Run

- There are four independent entities:

  - The client, which submits the MapReduce job.

  - The jobtracker, which coordinates the job run. The jobtracker is a Java application whose main class is JobTracker.

  - The tasktrackers, which run the tasks that the job has been split into. Tasktrackers are Java applications whose main class is TaskTracker.

  - The distributed filesystem, which is used for sharing job files between the other entities.

5.1.1.Job Submission

- The runJob() method on JobClient creates a new JobClient instance and calls submitJob() on it.

- Having submitted the job, runJob() polls the job’s progress once a second, and reports the progress to the console if it has changed since the last report.

- When the job is complete, if it was successful, the job counters are displayed. Otherwise, the error that caused the job to fail is logged to the console.

The job submission process

- Asks the jobtracker for a new job ID (by calling getNewJobId() on JobTracker).

- Checks the output specification of the job.

- Computes the input splits for the job.

- Copies the resources needed to run the job, including the job JAR file, the configuration file, and the computed input splits, to the jobtracker’s filesystem in a directory named after the job ID.

- Tells the jobtracker that the job is ready for execution (by calling submitJob() on JobTracker).

5.1.2.Job Initialization

- When the JobTracker receives a call to its submitJob() method, it puts the job into an internal queue from where the job scheduler will pick it up and initialize it.

- Initialization involves creating an object to represent the job being run, which encapsulates its tasks, and bookkeeping information to keep track of the tasks’ status and progress.

- To create the list of tasks to run, the job scheduler first retrieves the input splits computed by the JobClient from the shared filesystem.

- It then creates one map task for each split.

- Tasks are given IDs at this point.

5.1.3.Task Assignment

- Tasktrackers run a simple loop that periodically sends heartbeat method calls to the jobtracker.

- As a part of the heartbeat, a tasktracker indicates whether it is ready to run a new task. If it is, the jobtracker allocates it a task, which it communicates to the tasktracker using the heartbeat return value.

- Before it can choose a task for the tasktracker, the jobtracker must choose a job to select the task from. By default, jobs are taken from a FIFO queue, taking into account any priority set with setJobPriority().

- Tasktrackers have a fixed number of slots for map tasks and for reduce tasks.

- The default scheduler fills empty map task slots before reduce task slots.

- To choose a reduce task, the jobtracker simply takes the next in its list of yet-to-be-run reduce tasks, since there are no data locality considerations.

5.1.4.Task Execution

- Now that the tasktracker has been assigned a task, the next step is for it to run the task.

- First, it localizes the job JAR by copying it from the shared filesystem to the tasktracker’s filesystem.

- It also copies any files the application needs from the distributed cache to the local disk.

- Second, it creates a local working directory for the task, and un-jars the contents of the JAR into this directory.

- Third, it creates an instance of TaskRunner to run the task.

- TaskRunner launches a new Java Virtual Machine to run each task in.

- It is, however, possible to reuse the JVM between tasks.

- The child process communicates with its parent through the umbilical interface.

5.1.5.Job Completion

- When the jobtracker receives a notification that the last task for a job is complete, it changes the status for the job to “successful.”

- Then, when the JobClient polls for status, it learns that the job has completed successfully, so it prints a message to tell the user, and then returns from the runJob() method.

5.2.Failures

5.2.1.Task Failure

- The most common way for a task to fail is when user code in the map or reduce task throws a runtime exception.

  - The child JVM reports the error back to its parent tasktracker before it exits.

  - The error ultimately makes it into the user logs.

  - The tasktracker marks the task attempt as failed, freeing up a slot to run another task.

- Another failure mode is the sudden exit of the child JVM.

  - The tasktracker notices that the process has exited, and marks the attempt as failed.

- Hanging tasks are dealt with differently.

  - The tasktracker notices that it hasn’t received a progress update for a while, and proceeds to mark the task as failed.

  - The child JVM process is automatically killed after this timeout period.

- When the jobtracker is notified of a task attempt that has failed (by the tasktracker’s heartbeat call), it reschedules execution of the task.

  - The jobtracker tries to avoid rescheduling the task on a tasktracker where it has previously failed.

  - If a task fails four times (the default maximum number of attempts), it is not retried further.

5.2.2.Tasktracker Failure

- If a tasktracker fails by crashing, or running very slowly, it will stop sending heartbeats to the jobtracker (or send them very infrequently).

- The jobtracker will notice a tasktracker that has stopped sending heartbeats and remove it from its pool of tasktrackers to schedule tasks on.

- The jobtracker arranges for map tasks that were run and completed successfully on that tasktracker to be rerun if they belong to incomplete jobs, since their intermediate output residing on the failed tasktracker’s local filesystem may not be accessible to the reduce task. Any tasks in progress are also rescheduled.

5.2.3.Jobtracker Failure

5.3.Shuffle and Sort

5.3.1.The Map Side

- When the map function starts producing output, it is not simply written to disk.

- Each map task has a circular memory buffer that it writes the output to.

- When the contents of the buffer reach a certain threshold size, a background thread will start to spill the contents to disk.

- Spills are written in round-robin fashion to the directories specified by the mapred.local.dir property.

- Before it writes to disk, the thread first divides the data into partitions corresponding to the reducers that they will ultimately be sent to.

- Within each partition, the background thread performs an in-memory sort by key.

- Each time the memory buffer reaches the spill threshold, a new spill file is created, so after the map task has written its last output record there could be several spill files.

- Before the task is finished, the spill files are merged into a single partitioned and sorted output file.

- The output file’s partitions are made available to the reducers over HTTP.

- The number of worker threads used to serve the file partitions is controlled by the tasktracker.http.threads property.
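
The buffer, spill, and final merge steps can be sketched in memory. This is a deliberately simplified model, not Hadoop's implementation: the buffer holds keys only, the threshold is a record count rather than a fraction of a byte buffer, spilling happens synchronously instead of in a background thread, and the final merge simply re-sorts. SpillSketch and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of map-side buffering, spilling, and merging.
class SpillSketch {
    final int spillThreshold;  // number of buffered records that triggers a spill
    final int numPartitions;   // one partition per reduce task
    final List<String> buffer = new ArrayList<>();
    // Each spill maps a partition number to that partition's sorted keys.
    final List<TreeMap<Integer, List<String>>> spills = new ArrayList<>();

    SpillSketch(int spillThreshold, int numPartitions) {
        this.spillThreshold = spillThreshold;
        this.numPartitions = numPartitions;
    }

    // Called for every map output record.
    void collect(String key) {
        buffer.add(key);
        if (buffer.size() >= spillThreshold) spill();
    }

    // Partitions the buffered keys, sorts each partition, and records one spill.
    void spill() {
        if (buffer.isEmpty()) return;
        TreeMap<Integer, List<String>> spill = new TreeMap<>();
        for (String key : buffer) {
            int partition = (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
            spill.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        for (List<String> keys : spill.values()) Collections.sort(keys);
        spills.add(spill);
        buffer.clear();
    }

    // Flushes the buffer and merges all spills into one sorted list per partition.
    TreeMap<Integer, List<String>> finish() {
        spill();
        TreeMap<Integer, List<String>> merged = new TreeMap<>();
        for (TreeMap<Integer, List<String>> spill : spills) {
            for (Map.Entry<Integer, List<String>> e : spill.entrySet()) {
                merged.computeIfAbsent(e.getKey(), p -> new ArrayList<>()).addAll(e.getValue());
            }
        }
        // Simplification: re-sort instead of doing a streaming merge of sorted runs.
        for (List<String> keys : merged.values()) Collections.sort(keys);
        return merged;
    }

    public static void main(String[] args) {
        SpillSketch task = new SpillSketch(2, 1);
        for (String key : new String[] {"c", "a", "b"}) task.collect(key);
        System.out.println(task.finish());      // {0=[a, b, c]}
        System.out.println(task.spills.size()); // 2
    }
}
```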

5.3.2.The Reduce Side

- As map tasks complete successfully, they notify their parent tasktracker of the status update, which in turn notifies the jobtracker.

- For a given job, the jobtracker knows the mapping between map outputs and tasktrackers.

- A thread in the reducer periodically asks the jobtracker for map output locations until it has retrieved them all.

- The reduce task needs the map output for its particular partition from several map tasks across the cluster.

- The map tasks may finish at different times, so the reduce task starts copying their outputs as soon as each completes. This is known as the copy phase of the reduce task.

- The reduce task has a small number of copier threads so that it can fetch map outputs in parallel.

- As the copies accumulate on disk, a background thread merges them into larger, sorted files.

- When all the map outputs have been copied, the reduce task moves into the sort phase (which should properly be called the merge phase, as the sorting was carried out on the map side), which merges the map outputs, maintaining their sort ordering.

- During the reduce phase, the reduce function is invoked for each key in the sorted output. The output of this phase is written directly to the output filesystem, typically HDFS.
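
Merging already-sorted map outputs while keeping their order is a classic k-way merge. The sketch below shows the idea with a priority queue over in-memory lists; MergeSketch and Cursor are hypothetical names, and the real reduce task merges files in rounds rather than lists in one pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Minimal k-way merge of sorted runs, standing in for the reduce-side merge.
class MergeSketch {
    // The current head of one sorted run plus the rest of that run.
    static final class Cursor {
        String head;
        final Iterator<String> rest;
        Cursor(String head, Iterator<String> rest) { this.head = head; this.rest = rest; }
    }

    static List<String> merge(List<List<String>> sortedRuns) {
        // The queue always yields the run whose current head is smallest.
        PriorityQueue<Cursor> queue = new PriorityQueue<>((a, b) -> a.head.compareTo(b.head));
        for (List<String> run : sortedRuns) {
            Iterator<String> it = run.iterator();
            if (it.hasNext()) queue.add(new Cursor(it.next(), it));
        }
        List<String> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor smallest = queue.poll();
            merged.add(smallest.head);
            if (smallest.rest.hasNext()) {
                smallest.head = smallest.rest.next();
                queue.add(smallest); // re-insert with its new head
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> mapOutputs = Arrays.asList(
                Arrays.asList("a", "d"),
                Arrays.asList("b", "c"),
                new ArrayList<String>());
        System.out.println(merge(mapOutputs)); // [a, b, c, d]
    }
}
```

No sorting happens here: each run is consumed in order, which is why the notes say this phase is more properly a merge.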

Original source: http://www.cnblogs.com/forfuture1978/archive/2010/02/27/1674955.html (thanks to the original author for sharing).

hadoop启动HDFS命令 m0_67401228 java 搜索引擎 linux 后端
启动命令：/hadoop/sbin/start-dfs.sh停止命令：/hadoop/sbin/stop-dfs.sh
【计算机毕设-大数据方向】基于Hadoop的电商交易数据分析可视化系统的设计与实现程序员-石头山大数据实战案例大数据 hadoop 毕业设计毕设
博主介绍：✌全平台粉丝5W+,高级大厂开发程序员，博客之星、掘金/知乎/华为云/阿里云等平台优质作者。【源码获取】关注并且私信我【联系方式】最下边感兴趣的可以先收藏起来，同学门有不懂的毕设选题，项目以及论文编写等相关问题都可以和学长沟通，希望帮助更多同学解决问题前言随着电子商务行业的迅猛发展，电商平台积累了海量的数据资源，这些数据不仅包括用户的基本信息、购物记录，还包括用户的浏览行为、评价反馈等多
分布式离线计算—Spark—基础介绍测试开发abbey 人工智能—大数据
原文作者：饥渴的小苹果原文地址：【Spark】Spark基础教程目录Spark特点Spark相对于Hadoop的优势Spark生态系统Spark基本概念Spark结构设计Spark各种概念之间的关系Executor的优点Spark运行基本流程Spark运行架构的特点Spark的部署模式Spark三种部署方式Hadoop和Spark的统一部署摘要：Spark是基于内存计算的大数据并行计算框架Spar
spark常用命令我是浣熊的微笑 spark
查看报错日志：yarnlogsapplicationIDspark2-submit--masteryarn--classcom.hik.ReadHdfstest-1.0-SNAPSHOT.jar进入$SPARK_HOME目录，输入bin/spark-submit--help可以得到该命令的使用帮助。hadoop@wyy:/app/hadoop/spark100$bin/spark-submit--
spark启动命令学不会又听不懂 spark 大数据分布式
hadoop启动：cd/root/toolssstart-dfs.sh，只需在hadoop01上启动stop-dfs.sh日志查看：cat/root/toolss/hadoop/logs/hadoop-root-datanode-hadoop03.outzookeeper启动：cd/root/toolss/zookeeperbin/zkServer.shstart，三台都要启动bin/zkServ
编程常用命令总结 Yellow0523 Linux BigData 大数据
编程命令大全1.软件环境变量的配置JavaScalaSparkHadoopHive2.大数据软件常用命令Spark基本命令Spark-SQL命令Hive命令HDFS命令YARN命令Zookeeper命令kafka命令Hibench命令MySQL命令3.Linux常用命令Git命令conda命令pip命令查看Linux系统的详细信息查看Linux系统架构(X86还是ARM，两种方法都可)端口号命令L
Hadoop常见面试题整理及解答叶青舟 Linux hdfs 大数据 hadoop linux
Hadoop常见面试题整理及解答一、基础知识篇：1.把数据仓库从传统关系型数据库转到hadoop有什么优势？答：（1）关系型数据库成本高，且存储空间有限。而Hadoop使用较为廉价的机器存储数据，且Hadoop可以将大量机器构建成一个集群，并在集群中使用HDFS文件系统统一管理数据，极大的提高了数据的存储及处理能力。（2）关系型数据库仅支持标准结构化数据格式，Hadoop不仅支持标准结构化数据格式
2025毕业设计指南：如何用Hadoop构建超市进货推荐系统？大数据分析助力精准采购计算机编程指导师 Java实战集 Python实战集大数据实战集课程设计 hadoop 数据分析 spring boot java 进货 python
✍✍计算机编程指导师⭐⭐个人介绍：自己非常喜欢研究技术问题！专业做Java、Python、小程序、安卓、大数据、爬虫、Golang、大屏等实战项目。⛽⛽实战项目：有源码或者技术上的问题欢迎在评论区一起讨论交流！⚡⚡Java实战|SpringBoot/SSMPython实战项目|Django微信小程序/安卓实战项目大数据实战项目⚡⚡文末获取源码文章目录⚡⚡文末获取源码基于hadoop的超市进货推荐系
Hadoop Common 之序列化机制小解猫君之上 #Apache Hadoop
1.JavaSerializable序列化该序列化通过ObjectInputStream的readObject实现序列化，ObjectOutputStream的writeObject实现反序列化。这不过此种序列化虽然跨病态兼容性强，但是因为存储过多的信息，但是传输效率比较低，所以hadoop弃用它。（序列化信息包括这个对象的类，类签名，类的所有静态，费静态成员的值，以及他们父类都要被写入）publ
深入理解hadoop(一)----Common的实现----Configuration maoxiao_jsd 深入理解----hadoop
属本人个人原创，转载请注明,希望对大家有帮助！！一,hadoop的配置管理a,hadoop通过独有的Configuration处理配置信息Configurationconf=newConfiguration();conf.addResource("core-default.xml");conf.addResource("core-site.xml");后者会覆盖前者中未final标记的相同配置项b
hadoop 0.22.0 部署笔记 weixin_33701564 大数据 java 运维
为什么80%的码农都做不了架构师？>>>因为需要使用hbase，所以开始对hbase进行学习。hbase是部署在hadoop平台上的NOSql数据库，因此在部署hbase之前需要先部署hadoop。环境：redhat5、hadoop-0.22.0.tar.gz、jdk-6u13-linux-i586.zipip192.168.1.128hostname：localhost.localdomain（
解决Windows环境下hadoop集群的运行_window运行hadoop,unknown hadoop01(4) 2401_84160087 大数据面试学习
网上学习资料一大堆，但如果学到的知识不成体系，遇到问题时只是浅尝辄止，不再深入研究，那么很难做到真正的技术提升。需要这份系统化资料的朋友，可以戳这里获取一个人可以走的很快，但一群人才能走的更远！不论你是正从事IT行业的老鸟或是对IT行业感兴趣的新人，都欢迎加入我们的的圈子（技术交流、学习资源、职场吐槽、大厂内推、面试辅导），让我们一起学习成长！org.apache.hadoophadoop-com
解决Windows环境下hadoop集群的运行_window运行hadoop,unknown hadoop01(3) 2401_84160087 大数据面试学习
网上学习资料一大堆，但如果学到的知识不成体系，遇到问题时只是浅尝辄止，不再深入研究，那么很难做到真正的技术提升。需要这份系统化资料的朋友，可以戳这里获取一个人可以走的很快，但一群人才能走的更远！不论你是正从事IT行业的老鸟或是对IT行业感兴趣的新人，都欢迎加入我们的的圈子（技术交流、学习资源、职场吐槽、大厂内推、面试辅导），让我们一起学习成长！xmlns:xsi="http://www.w3.or
深入解析HDFS：定义、架构、原理、应用场景及常用命令 CloudJourney hdfs 架构 hadoop
引言Hadoop分布式文件系统（HDFS，HadoopDistributedFileSystem）是Hadoop框架的核心组件之一，它提供了高可靠性、高可用性和高吞吐量的大规模数据存储和管理能力。本文将从HDFS的定义、架构、工作原理、应用场景以及常用命令等多个方面进行详细探讨，帮助读者全面深入地了解HDFS。1.HDFS的定义1.1什么是HDFSHDFS是Hadoop生态系统中的一个分布式文件系
Hadoop的搭建流程 lzhlizihang hadoop 大数据分布式
文章目录一、配置IP二、配置主机名三、配置主机映射四、关闭防火墙五、配置免密六、安装jdk1、第一步：2、第二步：3、第三步：4、第四步：5、第五步：七、安装hadoop1、上传2、解压3、重命名4、开始配置环境变量5、刷新配置文件6、验证hadoop命令是否可以识别八、全分布搭建7、修改配置文件core-site.xml8、修改配置文件hdfs-site.xml9、修改配置文件hadoop-en
hive搭建 -----内嵌模式和本地模式 lzhlizihang hive hadoop
文章目录一、内嵌模式（使用较少）1、上传、解压、重命名2、配置环境变量3、配置conf下的hive-env.sh4、修改conf下的hive-site.xml5、启动hadoop集群6、给hdfs创建文件夹7、修改hive-site.xml中的非法字符8、初始化元数据9、测试是否成功10、内嵌模式的缺点二、本地模式（最常用）1、检查mysql是否正常2、上传、解压、重命名3、配置环境变量4、修改c
Hadoop之mapreduce -- WrodCount案例以及各种概念 lzhlizihang hadoop mapreduce 大数据
文章目录一、MapReduce的优缺点二、MapReduce案例--WordCount1、导包2、Mapper方法3、Partitioner方法（自定义分区器）4、reducer方法5、driver（main方法）6、Writable（手机流量统计案例的实体类）三、关于片和块1、什么是片，什么是块？2、mapreduce启动多少个MapTask任务？四、MapReduce的原理五、Shuffle过
IAAS: IT公司去IOE-Alibaba系统构架解读 wishchin 心理学/职业 BigDataMini Spark PaaS
从Hadoop到自主研发，技术解读阿里去IOE后的系统架构原地址：......................云计算阿里飞天摘要：从IOE时代，到Hadoop与飞天并行，再到飞天单集群5000节点的实现，阿里一直摸索在技术衍变的前沿。这里，我们将从架构、性能、运维等多个方面深入了解阿里基础设施。【导读】互联网的普及，智能终端的增加，大数据时代悄然而至。在这个数据为王的时代，数十倍、数百倍的数据给各
Dom 周华华 JavaScript html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml&q
【Spark九十六】RDD API之combineByKey bit1129 spark
1. combineByKey函数的运行机制 RDD提供了很多针对元素类型为(K,V)的API，这些API封装在PairRDDFunctions类中，通过Scala隐式转换使用。这些API实现上是借助于combineByKey实现的。combineByKey函数本身也是RDD开放给Spark开发人员使用的API之一首先看一下combineByKey的方法说明：
msyql设置密码报错：ERROR 1372 (HY000): 解决方法详解 daizj mysql 设置密码
MySql给用户设置权限同时指定访问密码时，会提示如下错误： ERROR 1372 (HY000): Password hash should be a 41-digit hexadecimal number；问题原因：你输入的密码是明文。不允许这么输入。解决办法：用select password('你想输入的密码');查询出你的密码对应的字符串，然后
路漫漫其修远兮吾将上下而求索周凡杨学习思索
王国维在他的《人间词话》中曾经概括了为学的三种境界古今之成大事业、大学问者，罔不经过三种之境界。“昨夜西风凋碧树。独上高楼，望尽天涯路。”此第一境界也。“衣带渐宽终不悔，为伊消得人憔悴。”此第二境界也。“众里寻他千百度，蓦然回首，那人却在灯火阑珊处。”此第三境界也。学习技术，这也是你必须经历的三种境界。第一层境界是说，学习的路是漫漫的，你必须做好充分的思想准备，如果半途而废还不如不要开始。这里，注
Hadoop(二)对话单的操作朱辉辉33 hadoop
Debug： 1、 A = LOAD '/user/hue/task.txt' USING PigStorage(' ') AS (col1,col2,col3); DUMP A; //输出结果前几行示例： (>ggsnPDPRecord(21),,) (-->recordType(0),,) (-->networkInitiation(1),,)
web报表工具FineReport常用函数的用法总结（日期和时间函数）老A不折腾 finereport 报表工具 web开发
web报表工具FineReport常用函数的用法总结（日期和时间函数）说明：凡函数中以日期作为参数因子的，其中日期的形式都必须是yy/mm/dd。而且必须用英文环境下双引号(" ")引用。 DATE DATE(year,month,day):返回一个表示某一特定日期的系列数。 Year:代表年，可为一到四位数。 Month:代表月份。
c++ 宏定义中的##操作符墙头上一根草 C++
#与##在宏定义中的--宏展开 #include <stdio.h> #define f(a,b) a##b #define g(a) #a #define h(a) g(a) int main() { &nbs
分析Spring源代码之，DI的实现 aijuans spring DI 现源代码
(转) 分析Spring源代码之，DI的实现 2012/1/3 by tony 接着上次的讲，以下这个sample [java] view plain copy print
for循环的进化 alxw4616 JavaScript
// for循环的进化 // 菜鸟 for (var i = 0; i < Things.length ; i++) { // Things[i] } // 老鸟 for (var i = 0, len = Things.length; i < len; i++) { // Things[i] } // 大师 for (var i = Things.le
网络编程Socket和ServerSocket简单的使用百合不是茶网络编程基础 IP地址端口
网络编程;TCP/IP协议网络:实现计算机之间的信息共享,数据资源的交换协议:数据交换需要遵守的一种协议,按照约定的数据格式等写出去端口:用于计算机之间的通信每运行一个程序，系统会分配一个编号给该程序，作为和外界交换数据的唯一标识 0~65535 查看被使用的
JDK1.5 生产消费者 bijian1013 java thread 生产消费者 java多线程
ArrayBlockingQueue：一个由数组支持的有界阻塞队列。此队列按 FIFO（先进先出）原则对元素进行排序。队列的头部是在队列中存在时间最长的元素。队列的尾部是在队列中存在时间最短的元素。新元素插入到队列的尾部，队列检索操作则是从队列头部开始获得元素。 ArrayBlockingQueue的常用方法：
JAVA版身份证获取性别、出生日期及年龄 bijian1013 java 性别出生日期年龄
工作中需要根据身份证获取性别、出生日期及年龄，且要还要支持15位长度的身份证号码，网上搜索了一下，经过测试好像多少存在点问题，干脆自已写一个。 CertificateNo.java package com.bijian.study; import java.util.Calendar; import
【Java范型六】范型与枚举 bit1129 java
首先，枚举类型的定义不能带有类型参数，所以，不能把枚举类型定义为范型枚举类，例如下面的枚举类定义是有编译错的 public enum EnumGenerics<T> { //编译错，提示枚举不能带有范型参数 OK, ERROR; public <T> T get(T type) { return null;
【Nginx五】Nginx常用日志格式含义 bit1129 nginx
1. log_format 1.1 log_format指令用于指定日志的格式，格式： log_format name(格式名称) type(格式样式) 1.2 如下是一个常用的Nginx日志格式： log_format main '[$time_local]|$request_time|$status|$body_bytes
Lua 语言 15 分钟快速入门 ronin47 lua 基础
- - 单行注释 - - [[ [多行注释] - - ]] - - - - - - - - - - - 1. 变量 & 控制流 - - - - - - - - - - num = 23 - - 数字都是双精度 str = 'aspythonstring'
java-35.求一个矩阵中最大的二维矩阵 ( 元素和最大 ) bylijinnan java
the idea is from: http://blog.csdn.net/zhanxinhang/article/details/6731134 public class MaxSubMatrix { /**see http://blog.csdn.net/zhanxinhang/article/details/6731134 * Q35 求一个矩阵中最大的二维
mongoDB文档型数据库特点开窍的石头 mongoDB文档型数据库特点
MongoDD: 文档型数据库存储的是Bson文档-->json的二进制特点：内部是执行引擎是js解释器，把文档转成Bson结构，在查询时转换成js对象。 mongoDB传统型数据库对比传统类型数据库：结构化数据，定好了表结构后每一个内容符合表结构的。也就是说每一行每一列的数据都是一样的文档型数据库：不用定好数据结构，
[毕业季节]欢迎广大毕业生加入JAVA程序员的行列 comsci java
一年一度的毕业季来临了。。。。。。。。正在投简历的学弟学妹们。。。如果觉得学校推荐的单位和公司不适合自己的兴趣和专业，可以考虑来我们软件行业，做一名职业程序员。。。软件行业的开发工具中，对初学者最友好的就是JAVA语言了，网络上不仅仅有大量的
PHP操作Excel – PHPExcel 基本用法详解 cuiyadll PHP Excel
导出excel属性设置//Include classrequire_once('Classes/PHPExcel.php');require_once('Classes/PHPExcel/Writer/Excel2007.php');$objPHPExcel = new PHPExcel();//Set properties 设置文件属性$objPHPExcel->getProperties
IBM Webshpere MQ Client User Issue (MCAUSER) darrenzhu IBM jms user MQ MCAUSER
IBM MQ JMS Client去连接远端MQ Server的时候，需要提供User和Password吗？答案是根据情况而定，取决于所定义的Channel里面的属性Message channel agent user identifier (MCAUSER)的设置。 http://stackoverflow.com/questions/20209429/how-mca-user-i
网线的接法 dcj3sjt126com
一、PC连HUB (直连线)A端：（标准568B）：白橙，橙，白绿，蓝，白蓝，绿，白棕，棕。 B端：（标准568B）：白橙，橙，白绿，蓝，白蓝，绿，白棕，棕。二、PC连PC （交叉线）A端：(568A)：白绿，绿，白橙，蓝，白蓝，橙，白棕，棕； B端：（标准568B）：白橙，橙，白绿，蓝，白蓝，绿，白棕，棕。三、HUB连HUB&nb
Vimium插件让键盘党像操作Vim一样操作Chrome dcj3sjt126com chrome vim
什么是键盘党？键盘党是指尽可能将所有电脑操作用键盘来完成，而不去动鼠标的人。鼠标应该说是新手们的最爱，很直观，指哪点哪，很听话！不过常常使用电脑的人，如果一直使用鼠标的话，手会发酸，因为操作鼠标的时候，手臂不是在一个自然的状态，臂肌会处于绷紧状态。而使用键盘则双手是放松状态，只有手指在动。而且尽量少的从鼠标移动到键盘来回操作，也省不少事。在chrome里安装 vimium 插件
MongoDB查询（2）——数组查询[六] eksliang mongodb MongoDB查询数组
MongoDB查询数组转载请出自出处：http://eksliang.iteye.com/blog/2177292 一、概述 MongoDB查询数组与查询标量值是一样的，例如，有一个水果列表，如下所示： > db.food.find() { "_id" : "001", "fruits" : [ "苹
cordova读写文件（1） gundumw100 JavaScript Cordova
使用cordova可以很方便的在手机sdcard中读写文件。首先需要安装cordova插件：file 命令为： cordova plugin add org.apache.cordova.file 然后就可以读写文件了，这里我先是写入一个文件，具体的JS代码为： var datas=null;//datas need write var directory=&
HTML5 FormData 进行文件jquery ajax 上传到又拍云 ileson jquery Ajax html5 FormData
html5 新东西：FormData 可以提交二进制数据。页面test.html <!DOCTYPE> <html> <head> <title> formdata file jquery ajax upload</title> </head> <body> <
swift appearanceWhenContainedIn:(version1.2 xcode6.4) 啸笑天 version
swift1.2中没有oc中对应的方法： + (instancetype)appearanceWhenContainedIn:(Class <UIAppearanceContainer>)ContainerClass, ... NS_REQUIRES_NIL_TERMINATION; 解决方法：在swift项目中新建oc类如下： #import &
java实现SMTP邮件服务器 macroli java 编程
电子邮件传递可以由多种协议来实现。目前，在Internet 网上最流行的三种电子邮件协议是SMTP、POP3 和 IMAP，下面分别简单介绍。　　◆ SMTP 协议　　简单邮件传输协议(Simple Mail Transfer Protocol,SMTP)是一个运行在TCP/IP之上的协议，用它发送和接收电子邮件。SMTP 服务器在默认端口25上监听。SMTP客户使用一组简单的、基于文本的
mongodb group by having where 查询sql qiaolevip 每天进步一点点学习永无止境 mongo 纵观千象
SELECT cust_id, SUM(price) as total FROM orders WHERE status = 'A' GROUP BY cust_id HAVING total > 250 db.orders.aggregate( [ { $match: { status: 'A' } }, { $group: {
Struts2 Pojo（六） Luob. POJO strust2
注意：附件中有完整案例 1.采用POJO对象的方法进行赋值和传值 2.web配置 <?xml version="1.0" encoding="UTF-8"?> <web-app version="2.5" xmlns="http://java.sun.com/xml/ns/javaee&q
struts2步骤 wuai struts
1、添加jar包 2、在web.xml中配置过滤器 <filter> <filter-name>struts2</filter-name> <filter-class>org.apache.st