Analysis is done by the analyzer, which is built of a tokenizer and zero or more filters, and can also have zero or more character mappers. Mastering ElasticSearch
Up until we start using the StringTokenizer towards the bottom of the code, we’re just cleaning up the string. MapReduce Design Patterns
The mapper takes a column and breaks it into tokens (individual words) using StringTokenizer, a class that splits strings on common delimiters such as spaces and commas. Cassandra High Performance Cookbook
Ironically, the next step may be to fall back on the StringTokenizer class, use of which is discouraged in the class’s Javadoc comments in favor of the newer regular expression-based tokenizers. Hadoop in Practice
TokenCounterMapper: A mapper that tokenizes the input value into words (using Java’s StringTokenizer) and emits each word along with a count of one. Hadoop:The Definitive Guide 3rd Edition
Tokenizer: Next, the string is tokenized into individual terms by a tokenizer. Elasticsearch The Definitive Guide
We see that WordCount uses Java’s StringTokenizer in its default setting, which tokenizes based only on whitespace. Hadoop in Action
It has a more complicated API and has more powerful features than StringTokenizer. Java in a Nutshell 5th Edition
Note that better tokenization could be provided by use of regular expressions or the StringTokenizer class, but for our purposes this simple approach will suffice. Hadoop Beginners Guide
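Several of the excerpts above describe whitespace-based tokenization with java.util.StringTokenizer. A minimal sketch, not drawn from any of the quoted books (the sample string is invented):

```java
import java.util.StringTokenizer;

public class TokenizeExample {
    public static void main(String[] args) {
        String line = "the quick brown fox";
        // With no delimiter argument, StringTokenizer splits on whitespace.
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }
    }
}
```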
For example, if a low-level IO exception occurs in the midst of a StreamTokenizer or ObjectInputStream operation, there is no sensible recovery action. Concurrent Programming in Java Design Principles and Pattern
By the end of this chapter, you will be able to integrate Apache Nutch with Apache Accumulo as well as with MySQL. Web Crawling and Data Mining with Apache Nutch
Adam Shook is a software engineer at ClearEdge IT Solutions, LLC, working with a number of big data technologies such as Hadoop, Accumulo, Pig, and ZooKeeper. MapReduce Design Patterns
Providing column and table-level authorization works well for situations that require column-based security, but it does not address fine-grained access controls at the cell level, such as what is provided by Apache Accumulo (discussed in Chapter 12) Professional Hadoop Solutions
This book covers unloading/loading to and from HDFS, graph analytics with Giraph, batch data analysis using Hive, Pig, and MapReduce, machine-learning approaches with Mahout, debugging and troubleshooting MapReduce jobs, and columnar storage and retrieval of structured data using Apache Accumulo. Hadoop RealWorld Solutions Cookbook
With sFlow instrumentation of the JVM, data is pushed from the JVMs to gmond, with no polling necessary. Monitoring with Ganglia
There will be multiple instances of your map function running, potentially in multiple Java Virtual Machines (JVMs), and potentially on multiple machines. Pro Hadoop
The number of map and reduce slots determines the number of concurrent map/reduce tasks for a TaskTracker, which forks multiple JVMs to run these tasks. Hadoop Operations and Cluster Management Cookbook
In the case of very small, fast-running map tasks (where the order of execution time is in seconds), Hadoop allows you to enable several tasks to reuse JVMs by specifying the job configuration mapreduce.job.jvm.numtasks. Professional Hadoop Solutions
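A minimal sketch of setting the JVM-reuse property just described, assuming Hadoop's Configuration API; the value 10 is an arbitrary example, and -1 would permit unlimited reuse:

```java
import org.apache.hadoop.conf.Configuration;

public class JvmReuseConfig {
    public static Configuration configure() {
        Configuration conf = new Configuration();
        // Let up to 10 short-running tasks share each forked task JVM.
        conf.setInt("mapreduce.job.jvm.numtasks", 10);
        return conf;
    }
}
```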
However, the licensing and maturity of these other JVMs vary. Cassandra High Performance Cookbook
If multiple reduce tasks are specified, cityA and cityB could be processed independently on separate reduce JVMs. Hadoop RealWorld Solutions Cookbook
Bear in mind, too, that tasktrackers in a mini-cluster launch separate JVMs to run tasks in, which can make debugging more difficult. Hadoop:The Definitive Guide 3rd Edition
Although there is a single TaskTracker per slave node, each TaskTracker can spawn multiple JVMs to handle many map or reduce tasks in parallel. Hadoop in Action
The Hadoop community keeps a list of tested JVMs at http://wiki.apache.org/hadoop/HadoopJavaVersions where users can post their experiences with various Java VMs and versions. Hadoop Operations
Gradle provides an intelligent compromise in that it offers a maxParallelForks that governs the maximum simultaneous JVMs that are spawned. Building and Testing with Gradle
Being able to control the versioning scheme programmatically will become a necessity the more you want to automate your project lifecycle. Gradle in Action
Configuration changes, which we would normally make in Eclipse, can now be done programmatically in a Gradle build file. Gradle Effective Implementation Guide
The ElasticSearch API has some methods to programmatically create JSON documents which we will show. Mastering ElasticSearch
To rename files programmatically, we can pass a closure to the rename method (Example 1-7) Gradle Beyond The Basics
This will generate object allocation stack sizes that are too small to be useful, so instead you’ll programmatically set custom HPROF parameters: Hadoop in Practice
Section 3.1.2 covers Hadoop Java libraries for handling HDFS files programmatically. Hadoop in Action
Programmatically Defining Keyspaces and Column Families: You can create keyspaces and column families through the API as well. Cassandra The Definitive Guide
It is possible to use the JobConf class to programmatically set configuration properties on the executing job. Hadoop Beginners Guide
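A minimal sketch of setting job properties programmatically through JobConf, as the excerpt above describes; the job name and property key here are hypothetical examples:

```java
import org.apache.hadoop.mapred.JobConf;

public class JobSetup {
    public static JobConf buildJob() {
        JobConf job = new JobConf(JobSetup.class);
        job.setJobName("example-job");          // typed setter for a well-known property
        job.setNumReduceTasks(2);
        job.set("my.custom.property", "value"); // arbitrary key/value pair (hypothetical key)
        return job;
    }
}
```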
If you want to embed an ActiveMQ broker inside an application, the message store can also be configured programmatically. ActiveMQ in Action
Gradle also implements a strong domain model that is easily programmatically accessible. Building and Testing with Gradle
If a worker does not receive a ping message after a specified interval, the worker process terminates. Pregel - A System for Large-Scale Graph Processing
We can use this plugin to send a ping to a Hadoop/HBase daemon's RPC port, to check whether the daemon is alive. HBase Administration Cookbook
Imagine a check_ping plug-in that, when executed by the Nagios scheduler, pings a host and then returns the following output: Monitoring with Ganglia
The counter values from individual worker machines are periodically propagated to the master (piggybacked on the ping response) MapReduce - Simplied Data Processing on Large Clusters
The master node will send the ping requests to other nodes and wait for the response. Mastering ElasticSearch
If an OSD has not heard from a peer recently, an explicit ping is sent. Ceph A Scalable, High-Performance Distributed File System
Open a shell or Terminal window on each machine, and use the ping command to see whether it can find the other. Beginning CouchDB
Creating a Rule Let’s create a rule to ping an arbitrary server and store the results of the ping. Gradle Beyond The Basics
Sessions are kept alive by the client sending ping requests (also known as heartbeats) whenever the session is idle for longer than a certain period. Hadoop:The Definitive Guide 3rd Edition
When a RPC client detects a tcp-socket timeout, instead of declaring a RPC timeout, it sends a ping to the RPC server. Apache Hadoop Goes Realtime at Facebook
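The ping-style liveness checks described above can be approximated in plain Java with InetAddress.isReachable(); a minimal sketch (the hostname is hypothetical, and the call may use ICMP or a TCP echo depending on privileges and platform):

```java
import java.net.InetAddress;

public class ReachabilityCheck {
    public static void main(String[] args) throws Exception {
        InetAddress host = InetAddress.getByName("node1.example.com"); // hypothetical host
        boolean alive = host.isReachable(3000); // 3-second timeout
        System.out.println(host + (alive ? " is alive" : " did not respond"));
    }
}
```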
For example, the following command uses sflowtool to convert the binary sFlow HTTP operation data into ASCII CLF so that the operations can be visually inspected or exported to a web log analyzer such as Webalizer: Monitoring with Ganglia
If they are the same, the _rev values are compared in ASCII sort order, and the highest wins. CouchDB The Definitive Guide
Lowercase filter: it makes all the tokens lowercase. ASCII folding filter: it removes non-ASCII parts from tokens. Mastering ElasticSearch
Use assume so the CLI formats the columns as ASCII text, and then fetch all the columns of the 'mynewcar' row: Cassandra High Performance Cookbook
If both documents have the same number of revisions, then CouchDB simply does an ASCII comparison of the revision number and the document with the highest sort order wins. Scaling CouchDB
The data is stored using a line-oriented ASCII format, in which each line is a record. Hadoop:The Definitive Guide 3rd Edition
Without this clause, Hive defaults to the newline character as the row delimiter and an ASCII value of 001 (Control-A) as the field delimiter. Hadoop in Action
This sorts by directly comparing the bytes, validating that the input can be parsed as US-ASCII. Cassandra The Definitive Guide
Our file is tab separated, but Hive, by default, expects its input files to have fields separated by the ASCII code 001 (Control-A) Hadoop Beginners Guide
Finally the frame is ended by the ASCII null character (^@) ActiveMQ in Action
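A minimal sketch of the two ASCII operations the excerpts above mention — validating that input parses as US-ASCII, and comparing strings in a way that, for pure ASCII input, matches byte-wise (ASCII sort order) comparison; the sample values are invented:

```java
import java.nio.charset.StandardCharsets;

public class AsciiCheck {
    public static void main(String[] args) {
        String value = "1-abc";
        // canEncode() is false if the string contains any non-ASCII character.
        boolean isAscii = StandardCharsets.US_ASCII.newEncoder().canEncode(value);
        // compareTo() compares char by char; for pure ASCII strings this is
        // equivalent to comparing the underlying bytes.
        int order = value.compareTo("2-abc"); // negative: "1-abc" sorts first
        System.out.println(isAscii + " " + order);
    }
}
```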
You may be bound by other regulatory limitations on access to data such as Payment Card Industry (PCI—http://en.wikipedia. Apache Flume Distributed Log Collection for Hadoop
The Wikipedia article on B-trees is a good starting point for further investigations. CouchDB The Definitive Guide
For more information about SPNEGO, please refer to its wiki page at http://en.wikipedia.org/wiki/SPNEGO. Hadoop Operations and Cluster Management Cookbook
From the client’s point of view, a document is a JSON object (see more about the JSON format at http://en.wikipedia.org/wiki/JSON) Mastering ElasticSearch
Suppose we want to add StackOverflow links to each Wikipedia page that is referenced in a StackOverflow comment. MapReduce Design Patterns
A more in-depth explanation can be found in Wikipedia at http://en.wikipedia.org/wiki/Z-order_curve. Hadoop RealWorld Solutions Cookbook
You can find more information about the Bayesian classifier at http://en.wikipedia.org/wiki/Naive_Bayes_classifier and Bayes’ theorem at http://betterexplained.com/articles/an-intuitive-and-short-explanation-of-bayes-theorem/ Hadoop MapReduce Cookbook
Wikipedia provides some good background articles on XML and JSON, if needed. Hadoop in Practice
The reader is configured by setting job configuration properties to tell it the patterns for the start and end tags (see the class documentation for details). To take an example, Wikipedia provides dumps of its content in XML form, which are appropriate for processing in parallel with MapReduce using this approach. Hadoop:The Definitive Guide 3rd Edition
The original Google paper and the Wikipedia entry use the CamelCase version MapReduce. Hadoop in Action
When you use the switch statement with enumerated constants, you need not supply the name of the enumeration in each label—it is deduced from the switch value. Core Java Volume I Fundamentals 9th Edition
When the argument is create (or an equivalent enumeration value), then the index operation will fail if a document with a given identifier already exists. Mastering ElasticSearch
The interface provides a public enumeration type, named ReturnCode, that is used by the filterKeyValue() method to indicate what the execution framework should do next. HBase:The Definitive Guide
Obviously, this is not what we want to happen when dealing with identification codes, or any kind of precise enumeration. Elasticsearch The Definitive Guide
Enumeration entries() returns an Enumeration object that enumerates the ZipEntry objects that describe the entries of the ZipFile. Core Java Volume II Advanced Features 9th Edition
The basic idea behind Java’s enum types is simple: they are classes that export one instance for each enumeration constant via a public static final field. Effective Java 2nd Edition
The getStatus operation is not an enumeration with some set list of possible statuses. Cassandra The Definitive Guide
What just happened? The first thing we need to do in order to add new counters is to create a standard Java enumeration that will hold them. Hadoop Beginners Guide
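A minimal sketch of the counter enumeration just described, using Hadoop's new-style Mapper API; the class, enum, and counter names are hypothetical:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CountingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    // A standard Java enumeration holding the custom counters.
    public enum RecordCounters { VALID, MALFORMED }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        if (value.getLength() == 0) {
            context.getCounter(RecordCounters.MALFORMED).increment(1);
        } else {
            context.getCounter(RecordCounters.VALID).increment(1);
        }
    }
}
```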
The getPropertyNames() method returns an Enumeration of all the properties on a given message to easily iterate through all of them. ActiveMQ in Action
Another possibility is to use the sleep() method of an element of the TimeUnit enumeration. Java 7 Concurrency Cookbook
Two of the three (Percona Server and MariaDB) are drop-in replacements, so the effort involved in trying them out is not large. High Performance MySQL 3rd Edition
In fact, for some systems, such as MariaDB or Percona, this may be your only option. For example, to check the status of the cluster, open the MySQL client and check the status of the various parameters: high-availability-guide
Each calendar receiving the message will process the invitation in its own transaction which updates the invitee's state and deletes the message. Megastore - Providing Scalable, Highly Available Storage for Interactive Services
The merge cost is calculated in such a way that merges reclaiming more deletes and having a smaller size are favored. Mastering ElasticSearch
FlumeJava automatically deletes any temporary intermediate files it creates when they are no longer needed by later operations in the execution plan. FlumeJava - Easy, Efficient Data-Parallel Pipelines
Deletes in Cassandra are implemented as special writes known as tombstones. Cassandra High Performance Cookbook
Description: Deletes documents marked as deleted and old revisions of documents. Beginning CouchDB
Whenever CouchDB deletes a document, it creates a new revision with _deleted field set to true. Scaling CouchDB
After some interval (a minute, say), the master deletes ephemeral files that have no open file handles. Chubby - The lock service for loosely-coupled distributed systems
Each segment is almost a self-sufficient index by itself, immutable up to deletes. elasticsearch. The Road to a Distributed, (Near) Real Time, Search Engine
It is still a common design requirement to store IDs related to other entities in your tables, but operations such as cascading deletes are not available. Cassandra The Definitive Guide
Irrelevant details were elided to keep the example short. The call to Apply performs an atomic mutation to the Webtable: it adds one anchor to www.cnn.com and deletes a different anchor. Bigtable - A Distributed Storage System for Structured Data
A Callable is similar to a Runnable, but it returns a value. Core Java Volume I Fundamentals 9th Edition
In practice, a command named runMyAwesomeTask would be callable with the abbreviation rMAT. Gradle in Action
We can also use a closure or an instance of the Callable interface to define a list of files: Gradle Effective Implementation Guide
Functions in Pig are callable units of work that can be referenced in the context of specific operators. Hadoop in Practice
Endpoints are dynamic extensions to the RPC protocol, adding callable remote procedures. HBase:The Definitive Guide
In our example here, we are submitting a Callable object to the ExecutorService instance. Getting Started with Google Guava
There are two kinds of tasks: Runnable and its close cousin, Callable (which is like Runnable, except that it returns a value) Effective Java 2nd Edition
Another important advantage of the Executor framework is the Callable interface. Java 7 Concurrency Cookbook
In the same way that interface Runnable describes any pure action, a Callable interface can be used to describe any service method that accepts an Object argument, returns an Object result, and may throw an Exception: Concurrent Programming in Java Design Principles and Pattern
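A minimal sketch contrasting Callable with Runnable as the excerpts above describe: call() returns a value and may throw a checked exception, and the result comes back through a Future (the task itself is invented):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Unlike Runnable.run(), call() returns a value and may throw a checked exception.
        Callable<Integer> task = () -> 6 * 7;
        Future<Integer> result = executor.submit(task);
        System.out.println(result.get()); // blocks until the value is available
        executor.shutdown();
    }
}
```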
The CallableStatement in the following code executes this stored procedure that is contained within the database, passing the necessary parameters. Java 7 Recipes A Problem-Solution Approach
solrconfig.xml: This is the main configuration file of your Solr install. Scaling Big Data with Hadoop and Solr Chapter-02 Understanding Solr
Solr's solrconfig.xml file contains lots of parameters that can be tweaked. Solr 3 Enterprise Search Server
I'm talking about files like schema.xml, solrconfig.xml, and so on. Solr 3.1 Cookbook
We'll learn more about the main Solr configuration file for a core, named solrconfig.xml, in chapter 4 Solr in action
I'm talking about files such as schema.xml, solrconfig.xml, solr.xml, and so on. Apache Solr 4 Cookbook
In SBT, a resolver is mapped to a URL that hosts dependencies (like Maven repositories) Scala in Action
This resolver handles certain standard Spring MVC exceptions by setting a specific response status code. Pro Spring 3
NameResolver’s act method can be rewritten to use loop as shown in Listing 32.4 Programming in Scala 2nd Edition
If you must parse from an input stream, supply an entity resolver (see the following note) Core Java Volume II Advanced Features 9th Edition
The DefaultHandler class provides an empty implementation of all the methods of the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces. Java in a Nutshell 5th Edition
Hive uses Java Reflection to select the correct evaluation method, and if you require a finer-grained selection, you can develop your own utility class that implements the UDFMethodResolver interface. Hadoop Beginners Guide
Hostname canonicalization is the process of finding the complete, official, hostname according to the resolution system, in this case, the host’s resolver library. Hadoop Operations
As an example, we will look at a new UDF, MetroResolver. Programming Pig
One way to improve the performance is to load the entities into memory using an entity resolver. Java Performance
You can catch these exceptions and handle the conflicts manually using a SyncResolver object. Java 7 Recipes A Problem-Solution Approach
He has worked with Perl/PHP/Python, primarily for coding, and MySQL/Oracle as the backend, for several years prior to moving into Hadoop full-time. Apache Flume Distributed Log Collection for Hadoop
The prospect of firing up rsync or kludging together some Perl is altogether different when 20,000 hosts are involved. Monitoring with Ganglia
The multiple language code generation allows programs written in C++ or Perl to call the same methods as a Java client. Cassandra High Performance Cookbook
Most of these packages (Perl, gcc, GNU make, and so on) are preinstalled on most Linux distributions. Beginning CouchDB
The results, shown in Table 1, demonstrate Sawzall is significantly faster than Python, Ruby, or Perl, at least for these microbenchmarks. Interpreting the Data Parallel Analysis with Sawzall
Hadoop has a contrib project (contributed in JIRA ticket HADOOP-3754) that contains a Thrift server and bindings for various client languages including Python, Ruby, and Perl. Hadoop in Practice
We wrote a naive single-node Perl hash-based program as an example of a typical quick solution a sysadmin may create. Hadoop in Action
It’s increasingly popular and is used by projects such as Android, Fedora, Ruby on Rails, Perl, and many Cassandra clients (as we’ll see in Chapter 8) Cassandra The Definitive Guide
If you know Perl, Python, Ruby, JavaScript, or even shell script, you can learn Pig’s syntax in the morning and be running MapReduce jobs by lunchtime. Hadoop Operations
Although ActiveMQ is written in Java, APIs for many languages other than Java are provided, including C/C++, .NET, Perl, PHP, Python, Ruby, and many more. ActiveMQ in Action
The type of object returned will vary, depending on the MIME content type of the remote resource and the available content handlers (classes responsible for processing and retrieving objects from a URLConnection) Java Network Programming and Distributed Computing
ContentHandler objects are typically used indirectly through the getContent method of URL and URLConnection. Java EE 7 Specification
The HttpURLConnection class that uses this method is abstract, so the actual behavior of this class may be overridden in a particular implementation. Java Security 2nd Edition
The URL and URLConnection classes encapsulate much of the complexity of retrieving information from a remote site. Core Java Volume II Advanced Features 9th Edition
At a slightly more complex level, a URLConnection object can be obtained from a given URL object. Java in a Nutshell 5th Edition
Chapter 7, URLConnections, shows you how to use the URLConnection and HttpURLConnection classes not just to download data from web servers, but to upload documents and configure connections. Java Network Programming 4th Edition
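A minimal sketch of the pattern the excerpts above describe — obtaining a URLConnection from a URL object and reading the remote content (the URL is a placeholder):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class FetchExample {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://example.com/");  // placeholder URL
        URLConnection conn = url.openConnection(); // obtained from the URL object
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```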
Use the CLI to see that the data is stored in a single column: [default@parking] assume parking validator as ascii; Cassandra High Performance Cookbook
java:comp/ValidatorFactory for ValidatorFactory, or by requesting the injection of an object of the appropriate type via the Resource annotation. Java EE 7 Specification
This command runs the SortValidator program, which performs a series of checks on the unsorted and sorted data to check whether the sort is accurate. Hadoop:The Definitive Guide 3rd Edition
This is achieved by way of a nested f:converter and f:validator, just like with any h:inputText, and can be done as shown: Java EE 7 Essentials
Several HTML validity checkers are available for download and you can validate HTML online [W3C-validator] Effective Java 2nd Edition
Obtain a Validator from the Schema, and then use the Validator to validate your XML document. Java in a Nutshell 5th Edition
Once the rows have been added to the validator, if the column family needs to be validated, it will create the Merkle tree and broadcast it to the neighboring nodes. Cassandra The Definitive Guide
The example has two UserValidator objects that return a random boolean value. Java 7 Concurrency Cookbook
The following code snippet shows how to use a validator to validate a document using SAX. Java Performance
You can use the Schema instance to retrieve a Validator instance that understands this grammar. Java 7 Recipes A Problem-Solution Approach
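A minimal sketch of the javax.xml.validation flow several excerpts above outline — obtain a Schema, get a Validator from it, and validate a document; the file names are hypothetical:

```java
import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

public class XmlValidation {
    public static void main(String[] args) throws Exception {
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new File("books.xsd"));    // hypothetical schema file
        Validator validator = schema.newValidator();                 // retrieved from the Schema
        validator.validate(new StreamSource(new File("books.xml"))); // throws SAXException on failure
    }
}
```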
The hypervisor acts as a virtual switch, connecting virtual machines to each other and to the physical network. Monitoring with Ganglia
This refers to the time during which something was runnable on the virtual machine, but the hypervisor chose to run something else instead. High Performance MySQL 3rd Edition
The nova-compute process is primarily a worker daemon that creates and terminates virtual machine instances via the hypervisor's APIs (XenAPI for XenServer/XCP, libvirt for KVM or QEMU, VMwareAPI for VMware, etc.) bk-compute-adminguide-grizzly
A worker daemon that creates and terminates virtual machine instances through hypervisor APIs. bk-admin-guide-cloud-latest
The Vagrant file simply defines the hypervisor-level configuration and provisioning; the remaining provisioning is done through Puppet and is defined at two levels. Storm Real-Time Processing Cookbook
The piece of software that enables multiple guests, for example operating systems, to run on a single system is called a hypervisor. Oracle JRockit The Definitive Guide
Virtualization works by running a hypervisor either in a host OS or directly on bare metal, replacing the host OS entirely. Hadoop Operations
To view the list of valid compute hosts, use the nova hypervisor-list command, as follows: admin-user-guide-trunk
The actual device name might differ due to hypervisor settings. user-guide
Probably the most important factor in your choice of hypervisor is your current usage or experience. OpenStack Operations Guide
Imagine you want to have a codec that decodes bytes to some kind of message (a POJO) and encodes the message back to bytes. Netty in Action MEAP V08
You’ll create a new Java class called ToDoItem, a plain old Java object (POJO) acting as a model. Gradle in Action
This is a simple POJO (plain old Java object) with three properties—nothing much to it. Scala in Action
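A minimal sketch of such a POJO, echoing the ToDoItem example above with three invented properties and conventional getters and setters:

```java
// A hypothetical plain old Java object (POJO): no framework types, just state and accessors.
public class ToDoItem {
    private Long id;
    private String name;
    private boolean completed;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public boolean isCompleted() { return completed; }
    public void setCompleted(boolean completed) { this.completed = completed; }
}
```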
There is little else to configure: just add a dependency to the test framework (or frameworks) your tests are using, and Maven will automatically look for and execute the JUnit, TestNG, or even Plain Old Java Objects (POJO) tests contained in this directory structure. Jenkins The Definitive Guide
Enables a WebSocket client and server endpoint to be defined declaratively via annotations on a POJO, or programmatically via interface implementation. Java EE 7 Essentials
A configurable POJO is associated with the ActiveMQ connection that you can tune to set different policies. ActiveMQ in Action
Arcane interfaces for using persistent entities have been removed, creating a simplified Plain Old Java Object (POJO) based persistence model. Java Performance
I created a constraint object called MyCellConstraint which is a plain old Java object (POJO) used when calling the add(JComponent comp, Object constraint) method of a JPanel component. Java 7 Recipes A Problem-Solution Approach
Because the objects were regular Java objects, this persistence model came to be known as Plain Old Java Object (POJO) persistence. Pro JPA2
Formally, each of these kinds of fields is known as a property in Gradle, since we are saving state in POJO objects under the hood. Building and Testing with Gradle
When you add many components into a window, you need to give some thought to the traversal order. Core Java Volume I Fundamentals 9th Edition
There is also a version of parallelDo() that allows multiple output PCollections to be produced simultaneously from a single traversal of the input PCollection. FlumeJava - Easy, Efficient Data-Parallel Pipelines
For example, Cascalog adds a logic-based query capability inspired by Datalog, while Scalding adds math libraries that are useful for graph-traversal problems and many machine-learning algorithms. Professional Hadoop Solutions
For example, owner and mode are required for the security check during path traversal but rarely change, requiring very few states, while the file lock reflects a wider range of client access modes as it controls an MDS’s ability to issue client capabilities. Ceph A Scalable, High-Performance Distributed File System
Because of the limited set of methods available in the Iterator interface, it is hard to use the Iterator for anything more than a simple traversal through the Collection. Data Structures and Algorithm Analysis in Java 3rd Edition
Computer scientists also call this postorder traversal because the search process visits the children before visiting the parents. Core Java Volume II Advanced Features 9th Edition
The serialization logic has no knowledge of the topology of the object graph, so it must go through an expensive graph traversal. Effective Java 2nd Edition
It performs a depth first traversal, starting at a specific root directory. Java 7 New Features Cookbook
The most fundamental graph traversal algorithms are depth-first search (DFS) and breadth-first search (BFS). The difference between the algorithms is the ordering in which a node is processed in relationship to its neighbors. Hadoop Beginners Guide
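A minimal BFS sketch illustrating the frontier/visited-set structure behind the breadth-first ordering just described; the adjacency-list representation is an assumption:

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class Bfs {
    // Breadth-first traversal: every node at distance d is processed
    // before any node at distance d + 1.
    static void bfs(Map<String, List<String>> graph, String start) {
        Queue<String> frontier = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        frontier.add(start);
        visited.add(start);
        while (!frontier.isEmpty()) {
            String node = frontier.remove();
            System.out.println(node);
            for (String neighbor : graph.getOrDefault(node, List.of())) {
                if (visited.add(neighbor)) { // add() returns false if already visited
                    frontier.add(neighbor);
                }
            }
        }
    }
}
```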
However, this strategy doesn't work for another common usage of collections, traversal. Concurrent Programming in Java Design Principles and Pattern
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub. Apache Flume Distributed Log Collection for Hadoop
Such a problem often occurs when Hadoop is used to store and retrieve large items, such as PDF files, large data samples, images, movies, or other multimedia data. Professional Hadoop Solutions
Aside from his work at Core, Joe has worked on several successful web development projects, including Learn French by Podcast, a language-learning web site that offers its members free high-quality MP3 podcasts and allows them to purchase accompanying professional PDF guides for a small fee. Beginning CouchDB
This example uses PNG, but gnuplot supports many other terminals like SCREEN, PDF, EPS, and so on. Hadoop MapReduce Cookbook
Nutch provides parsers for documents in many popular formats, such as HTML, PDF, Open Office and Microsoft Office, RSS, and others. Hadoop:The Definitive Guide 3rd Edition
See Eric Brewer’s original paper on this topic and the follow-up post by Coda Hale, as well as this PDF by Gilbert and Lynch. HBase:The Definitive Guide
The MIME standard defines common data formats such as "image/jpeg" and "application/pdf" Core Java Volume II Advanced Features 9th Edition
Because the Times had stored its older articles as scanned TIFF images, they needed image processing to combine different pieces of each article together into a single file in the desired PDF format. Hadoop in Action
You can read the original Phi Accrual Failure Detection paper on which Cassandra’s design is based at http://ddg.jaist.ac.jp/pub/HDY+04.pdf. Cassandra The Definitive Guide
This book was written and typeset by the author using Adobe FrameMaker, Acrobat, and Distiller on Macintosh and PC platforms, and supplied to the publisher and printer as an Adobe Portable Document Format (PDF) file. Fundamental Networking in Java
When two values are combined with a binary operator (such as n + f where n is an integer and f is a floating-point value), both operands are converted to a common type before the operation is carried out. Core Java Volume I Fundamentals 9th Edition
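A one-line illustration of the binary numeric promotion just described (the values are invented):

```java
public class Promotion {
    public static void main(String[] args) {
        int n = 3;
        float f = 2.5f;
        // n is promoted to float before the addition, so the result is a float.
        float sum = n + f;
        System.out.println(sum); // 5.5
    }
}
```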
Italic is used for Java Virtual Machine "assembly language", its opcodes and operands, as well as items in the Java Virtual Machine's run-time data areas. The Java Virtual Machine Specification Java SE 7 Edition
The most common use for operators that modify their operands is to generate the side effect, but you should keep in mind that the value produced is available for your use, just as in operators without side effects. Thinking in Java 4th Edition
The leaves of an expression tree are operands, such as constants or variable names, and the other nodes contain operators. Data Structures and Algorithm Analysis in Java 3rd Edition
Numeric promotions are used to convert the operands of a numeric operator to a common type where an operation can be performed. The Java Language Specification Java SE 7 Edition
All bytecode operations, with few exceptions, are computed on an evaluation stack by popping operands from the stack, executing the operation and pushing the result back to the stack. Oracle JRockit The Definitive Guide
In a bitwise AND operation, each bit of one operand is ANDed with the corresponding bit of the other. Java 7 New Features Cookbook
An operator is a key part of Java syntax: it performs a computation on one or two operands to produce a new value. Java in a Nutshell 5th Edition
Other constraints that check the type and number of arguments for other instructions require dynamic analysis of the code to determine which operands will be present on the expression stack during execution. Java Performance
If the operands are integer types, the quotient of the division is returned. Programming Hive
These metrics are then polled by Ganglia Meta Daemon (Gmetad) servers, where the metrics are computed and saved in round-robin, time-series databases using the round-robin database tool (RRDtool) HBase Administration Cookbook
I should mention here that Federico Sacerdoti was heavily involved in the implementation of gmetad and wrote a nice academic paper highlighting the strength of its design. Monitoring with Ganglia
Ganglia is composed of two servers: the gmetad server, which provides historical data and collects current data, and the gmond server, which collects and serves current statistics. Pro Hadoop
Add all the hostnames in the cluster to the gmetad configuration /etc/ganglia/ gmetad.conf file, for example, this file should contain the following: Hadoop Operations and Cluster Management Cookbook
You will also need the Ganglia meta daemon (gmetad) running on at least one node, and another node running the Ganglia web frontend. Hadoop RealWorld Solutions Cookbook
Ganglia meta daemon (gmetad) The meta daemon is installed on a central node and acts as the federation node to the entire cluster. HBase:The Definitive Guide
Each gmond process relays data to a central gmetad process that records data in a series of RRD, or round-robin database files, which are fixed-size files that efficiently store time series data. Hadoop Operations
In this way, the hash table is just a mapping from hashes to backend database URIs. CouchDB The Definitive Guide
The Random Partitioner hashes the user supplied key and uses the result to place this data on nodes in the cluster. Cassandra High Performance Cookbook
When the high write levels are detected, Ceph hashes the shared directory and relaxes the directory’s mtime coherence to distribute the workload across all MDS nodes. Ceph A Scalable, High-Performance Distributed File System
Redis supports many other data structures, including hashes, lists, and sorted sets. Hadoop RealWorld Solutions Cookbook
Gradle relies heavily on hashes to optimize download performance; when resolving a dependency, it will first attempt to download the much-smaller SHA from the repo, skipping the download if it determines that it already has that content in the artifact cache. Gradle Beyond The Basics
The hashes are the same as those you generated with the write. Hadoop in Practice
The default partitioner is HashPartitioner, which hashes a record’s key to determine which partition the record belongs in. Hadoop:The Definitive Guide 3rd Edition
The last native JSON datatype that we need to discuss is the object—known in other languages as hashes, hashmaps, dictionaries or associative arrays. Elasticsearch The Definitive Guide
By default, Hadoop will use a strategy that hashes the output key to perform the partitioning. Hadoop Beginners Guide
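A minimal sketch of the hash-based partitioning strategy described above; the sign-bit mask and modulo mirror what Hadoop's default HashPartitioner does (the keys are invented):

```java
public class HashPartitioning {
    // Mask off the sign bit so the result is non-negative,
    // then take the remainder modulo the number of partitions.
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        System.out.println(partitionFor("cityA", 4));
        System.out.println(partitionFor("cityB", 4)); // may land on a different reducer
    }
}
```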
Note that the field has many distinct values, leading to large internal hash-tables; computing the hashes themselves on possibly large strings is already computationally quite expensive. PowerDrill - Processing a Trillion Cells per Mouse Click
The libvirt project has created a common set of tools for managing virtualization resources on different virtualization platforms, currently including: Xen, QEMU, KVM, LXC, OpenVZ, User Mode Linux, VirtualBox, and VMware ESX and GSX. Monitoring with Ganglia
In contrast, a SAN can be great when you don’t demand a lot of high-performance random I/O, but you are interested in features such as snapshots, storage consolidation, data deduplication, and virtualization. High Performance MySQL 3rd Edition
Server virtualization is the partitioning of a physical server into smaller virtual servers to help maximize the server resources. Gradle in Action
Virtualization: Virtualization is a great way to get cheap hardware and be able to expand fast. MongoDB The Definitive Guide 2nd Edition
It may seem strange, but think about virtualization and the modern world, where the operating system cannot determine on which physical machine it works. Mastering ElasticSearch
Some virtualization types are better than others, but none come without an impact on performance. HBase in Action
Billing for resources consumed, including the cost of hardware, virtualization, network provision, and even the costs of software development and operations (since these are a simply logical extension of the system we see here) Graph Databases
Since 2008, he has worked for Oracle on a fast virtualization platform. Oracle JRockit The Definitive Guide
Blades, SANs, and Virtualization: The large-scale data storage and processing industry moves in cycles. Hadoop Operations
It is also a good practice in the setup and configuration of a benchmark to consider leveraging operating system capabilities such as processor sets, or partitioning off portions of a system so they are viewed as distinct systems, such as virtualization or Oracle Solaris zones. Java Performance
This includes putting data directly into an HBase table and writing to HFile format files on Hadoop Distributed File System (HDFS) HBase Administration Cookbook
A storage file format, HFile, is under development and due for HBase version 0.20.0, and is expected to solve many of the performance and reliability issues. Pro Hadoop
The implementation of HFile in a region server is responsible for reading and writing HFiles to and from HDFS. Professional Hadoop Solutions
Its contents are flushed to disk to form an HFile when the MemStore fills up. HBase in Action
All columns in a column family are stored together in the same lowlevel storage file, called an HFile. HBase:The Definitive Guide
When the MemStore reaches a certain threshold it is written out as an HFile. Apache Hadoop Goes Realtime at Facebook
The HFile writer has a straightforward write path, with four elements: Hadoop in Action
While JMX is fine for casual browsing of metric values, the number of eyeballs looking at JConsole doesn't scale when you have hundreds or even thousands of servers sending data all over the place. Apache Flume Distributed Log Collection for Hadoop
It is also possible to change logging levels in a running program by using the jconsole program. Core Java Volume I Fundamentals 9th Edition
Use the following command to start the monitor user interface: jconsole. Hadoop Operations and Cluster Management Cookbook
Also described are the Java profiling tools such as JConsole. Cassandra High Performance Cookbook
The JDK comes with a tool called JConsole for viewing MBeans in a running JVM. Hadoop:The Definitive Guide 3rd Edition
JConsole Java ships with a helper application called JConsole, which can be used to connect to local and remote Java processes. HBase:The Definitive Guide
The jconsole application, which is available as part of the JDK, provides an alternative technique for determining which MXBeans are available. Java 7 New Features Cookbook
You may start jconsole with no local or remote process specified and use its Connection menu to establish connections. Java in a Nutshell 5th Edition
To do this, open a console, start the jconsole application, and connect to Cassandra via JMX. Cassandra The Definitive Guide
It shows how to configure ActiveMQ for JMX monitoring and demonstrates this using JConsole. ActiveMQ in Action
You can also check the detailed statistics of your tasks per job by opening the Tasktracker component of Apache Hadoop. Web Crawling and Data Mining with Apache Nutch
Jobtracker and tasktracker tunable parameters: The JobTracker is the server that handles the management of the queued and executing jobs. Pro Hadoop
Another two EC2 small instances are used with roles hadoop-datanode and hadoop-tasktracker in each instance. Hadoop MapReduce Cookbook
There are two types of client: external clients (run by users) and internal clients (run on the cluster as a part of the system, e.g., datanode and tasktracker daemons) Hadoop:The Definitive Guide 3rd Edition
Tasktracker: The process responsible for running map or reduce tasks from the jobtracker. Cassandra The Definitive Guide
Daemons: There are two major daemons in Hadoop MapReduce: the jobtracker and the tasktracker. Hadoop Operations
Core Instance Group: The nodes in the core instance group have the same function as Hadoop slave nodes that run both the datanode and tasktracker daemons. Programming Hive
This option, in turn, defaults to a subdirectory named Ganglia in the directory where libganglia will be installed. Monitoring with Ganglia
Cluster Administration Tools: The Hadoop Core installation provides a number of scripts in the bin subdirectory of the installation that are used to stop and start the entire cluster or various pieces of the cluster. Pro Hadoop
In addition, you can specify a directory and CouchApp will load all the files in this directory and any subdirectory. CouchDB The Definitive Guide
Because I don’t want to make this example too complex to grasp, you’ll just copy it to the subdirectory build/backup. Gradle in Action
I will now show how to change to the source subdirectory inside this new directory and change the permissions of a few files to make sure they are ready for the build process: Beginning CouchDB
Whirr generates a script that we can use to start this proxy, under a subdirectory named after your Hadoop cluster inside the ~/.whirr directory. Hadoop MapReduce Cookbook
The extra trailing /foo subdirectory is needed on the destination, because now the contents of the source directory are copied to the contents of the destination directory. Hadoop:The Definitive Guide 3rd Edition
A task can have more than one attempt if the first one fails. These user log files are under the /logs/userlogs subdirectory. Hadoop in Action
Now we have a subdirectory named after the Git project so that we can build the project and start using it. Cassandra The Definitive Guide
If you include in a JAR file class files from a subdirectory, the class may not be stored with the path you expect. Hadoop Beginners Guide
A user request comes in, and the web server takes care of the networking and parses the HTTP request. CouchDB The Definitive Guide
The compiler parses the Groovy source files and generates stubs for them. Gradle in Action
It parses key/value pairs out of the input data and passes each pair to the user-defined Map function. MapReduce - Simplied Data Processing on Large Clusters
ElasticSearch parses the input as lines, where every pair of lines contains information about the target index along with additional parameters and a query itself. Mastering ElasticSearch
The program parses the input record and then does a relatively sophisticated emit statement. Interpreting the Data Parallel Analysis with Sawzall
It parses out all the links in every Web page and stores important information about them in an anchors file. The anatomy of a large-scale hypertextual Web search engine
The query server parses the query, applies optimizations and sends the plan to the master for execution. Tenzing - A SQL Implementation On The MapReduce Framework
As we know there is only one file in the cache, we feel safe in using the first index in this array, and pass that to a utility method that parses the file and uses the contents to populate the state abbreviation lookup map. Hadoop Beginners Guide
We can do this using the receive() method, which reads frames from the TCP socket and parses them. ActiveMQ in Action
In the background, the “engine” provides an abstraction layer for the UI based on SQL: the user constructs charts via drag’n’drop operations, they get translated to group-by SQL queries, which the engine parses and processes. PowerDrill - Processing a Trillion Cells per Mouse Click
To recursively list files in the home directory, we can use the command hadoop fs -lsr ... Hadoop Operations and Cluster Management Cookbook
The procedure is invoked recursively on nested records (Line 18) Dremel - Interactive Analysis of WebScale Datasets
There are convenience methods for converting paths into project-relative java.io.File objects, making collections of files, and recursively turning directory trees into file collections. Gradle Beyond The Basics
The -R option recursively sets the replication factor for files and directories within a directory. Hadoop MapReduce Cookbook
PageRank handles both these cases and everything in between by recursively propagating weights through the link structure of the Web. The anatomy of a large-scale hypertextual Web search engine
Both files and directories (which are scanned recursively for LZOP files) are supported: Hadoop in Practice
In practice we don’t need to check on all files recursively, and we may restrict ourselves to what’s in our own working directory. Hadoop in Action
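A minimal sketch of a recursive (depth-first) directory traversal using java.nio.file.Files.walk, in the spirit of the recursive listing commands above:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class RecursiveList {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get("."); // current working directory
        // Files.walk performs a depth-first traversal of the directory tree.
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile)
                 .forEach(System.out::println);
        }
    }
}
```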
This can be done by using the -setrep command, which takes a replication factor and an optional flag (-R) to indicate it should operate recursively (see Example 2-5) Hadoop Operations
For example, STOCKS.> means the entry applies to all destinations in the STOCKS path recursively. ActiveMQ in Action
This rewrite can be applied recursively, to support deeper trees. PowerDrill - Processing a Trillion Cells per Mouse Click
However, in the subflow, when selecting the customer, the information about the order should be available because it contains some filter criteria (for example, an international order can be placed only by customers living in certain countries) Pro Spring 3
The other states are decision state, action state, subflow state, and end state. Pro Spring MVC
The start state is a marker state that designates another state (view, action, decision, end, subflow) defined in the flow as being the flow’s starting point. Spring in Practice
The boxes represent various states (the View state, Action state, Decision state, Subflow state, and others), and the arrows represent transitions. Pro Spring Security
Subflow: A subflow state starts a new flow in the context of a flow that is already underway. Spring in Action 4th Edition
The globals section configures the general characteristics of the daemon itself. Monitoring with Ganglia
Configures each subproject by project path; actual configuration happens in the closure. Gradle in Action
In addition, the master node is a critical failure point in a default cluster configuration, which configures only one master node. Hadoop Operations and Cluster Management Cookbook
The build script configures the Project object, and finally the set of tasks to be executed is determined and executed. Gradle Effective Implementation Guide
After validating the authenticity of the lease [R7], the NM configures the environment for the container, including initializing its monitoring subsystem with the resource constraints specified in the lease. Apache Hadoop YARN Yet Another Resource Negotiator
To recover from a failed namenode in this situation, an administrator starts a new primary namenode with one of the filesystem metadata replicas and configures datanodes and clients to use this new namenode. Hadoop:The Definitive Guide 3rd Edition
When Elasticsearch detects a new string field in your documents, it automatically configures it as a full text string field and analyzes it with the standard analyzer. Elasticsearch The Definitive Guide
The driver instantiates, configures, and passes a JobConf object named job to JobClient. Hadoop in Action
Since the tasktracker runs as an unprivileged user (user mapred, by default, in the case of CDH and whatever user the administrator configures in Apache Hadoop), it isn’t possible for it to launch task JVMs as a different user. Hadoop Operations
As you can see, listing 7.2 instantiates the BrokerService and configures the brokerName and dataDirectory properties. ActiveMQ in Action
Whenever you use a class that is not defined in the basic java.lang package, you need to use an import directive. Core Java Volume I Fundamentals 9th Edition
The assumptions are that the hadoopprobook and commons-lang JARs are in the current working directory. Pro Hadoop
Your Java source code has a dependency on a class from the external library Apache Commons Lang. Gradle in Action
In the preceding example, ElasticSearch will take the value of the lang field and will use it as the analyzer for that document. Mastering ElasticSearch
The following snippet shows how we could define and use a trim UDF that uses the Apache Commons Lang StringUtils class. Hadoop:The Definitive Guide 3rd Edition
Minimally, a Java client needs the following JAR files specified in its CLASSPATH, when connecting to HBase: hbase, hadoop-core, zookeeper, log4j, commons-logging, and commons-lang. HBase:The Definitive Guide
For more sophisticated caches, you may need to use java.lang.ref directly. Effective Java 2nd Edition
Using a string reference variable that is assigned a null value will result in a java.lang.NullPointerException. Java 7 New Features Cookbook
Primitive wrapper classes: These classes are conceptually similar to the primitive wrapper classes, such as Integer and Long, found in java.lang. Hadoop Beginners Guide
The lang and country arguments are optional; if omitted, the default locale is used. Programming Hive
There were rumors or promises of many new features, such as online backups and server-level foreign keys for all storage engines, subquery improvements, and thread pooling. High Performance MySQL 3rd Edition
The CLUSTER BY clause in the sub-query specifies the output columns that are hashed on to distribute the data to the reducers, and finally the REDUCE clause specifies the user program to invoke (python wc_reduce.py in this case) on the output columns of the subquery. Hive A Petabyte Scale Data Warehouse Using Hadoop
As mentioned earlier, if the job is to return all the twits for all the users you follow, you can’t rely on a join or subquery to do this, as you can in SQL: HBase in Action
The second subquery calculates the second leg of the route, which comprises the path from the end location to another delivery center elsewhere in the network. Graph Databases
Consider this IN subquery, which finds all the items in the things table that are in the sales table: Hadoop:The Definitive Guide 3rd Edition
A correlated subquery contains a reference to the outer query, and so it must be re-evaluated for each row of the outer query. SQL Antipatterns
A clause is a subquery that can be optional, required, or prohibited. Lucene in Action 2nd Edition
Each subquery of the union query must produce the same number of columns, and for each column, its type must match all the column types in the same position. Programming Hive
A subquery is a complete select query inside a pair of parentheses that is embedded within a conditional expression. Pro JPA2
The optimizer needs quick access to various parts of the WHERE clause, table, field, and key lists, ORDER BY and GROUP BY expressions, subquery structuring, and other data. Understanding MySQL Internals
In the job definition, only one step is defined, which contains one tasklet and chunk process. Pro Spring 3
The chunk element configures the chunk-processing step, in a step element, which is itself in a tasklet element. Spring Batch in Action
The most common way to implement a step is delegating to a tasklet, which defines the activity that must be performed, leaving the proper step implementation to deal with the boilerplate aspects of the execution (maintaining state, sending events, and so on) Spring Integration in Action
Processing and Loading Data from a Database To process and load data from a relational database to HDFS, we need to configure a Spring Batch tasklet with a JdbcItemReader and a HdfsTextItemWriter. Spring Data Modern Data Access for Enterprise Java
A graph mining algorithm might simply output aggregated statistics mined from the graph. Pregel - A System for Large-Scale Graph Processing
The gweb Aggregated Graphs Tab: Aggregate graphs (Figure 4-11) allow you to create composite graphs combining different metrics. Monitoring with Ganglia
That is, an integer; a string; or a small, fixed-size list or object that includes an aggregated value (or values) from the values argument. CouchDB The Definitive Guide
After running gradle build, you’ll find the aggregated HTML test report under the directory build/reports/test. Gradle in Action
Enterprise-level switches often support Link Aggregation Groups (LAG), which bundle multiple interfaces together in an active/active fashion to make a single logical interface that is as fast as the sum of the links aggregated. Cassandra High Performance Cookbook
In all the examples so far, you have aggregated by department. Beginning CouchDB
Solid arcs represent data flowing from the analysis machines to the aggregators; dashed arcs represent the aggregated data being merged, first into one file per aggregation machine and then to a single final, collated output file. Interpreting the Data Parallel Analysis with Sawzall
Good keyword search is often not enough: you need some aggregated data, so you can narrow the results down to what you're interested in. Elasticsearch in Action MEAP V07 cHAPTER 1
However, shared storage services like those described in Section 6.5 require aggregated information as soon as possible during a sudden spike in user activity. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
Read requests are issued on these multiple HFiles in parallel & aggregated for a final result. Apache Hadoop Goes Realtime at Facebook
Make sure that the gora-hbase dependency is available in ivy.xml. Web Crawling and Data Mining with Apache Nutch
It will be helpful to have some knowledge of tools like Ant, Ivy, and Maven; practices like continuous integration and delivery; and concepts like dependency management. Gradle in Action
Maven and Ivy repositories are supported to publish or fetch dependencies. Gradle Effective Implementation Guide
Apache Ivy is a dependency manager with flexibility and configurability. Scala in Action
They also play a key role for non-Maven projects using tools like Ivy and Gradle, both of which rely on standard Maven repositories. Jenkins The Definitive Guide
There are three kinds of repositories currently supported by Gradle: Maven repositories, Ivy repositories, and static directories. Gradle Beyond The Basics
It is straightforward to use another build tool, such as Gradle or Ant with Ivy, as long as you use the same set of dependencies defined here. For building MapReduce jobs you only need to have the hadoop-core dependency, which contains all the Hadoop classes. Hadoop:The Definitive Guide 3rd Edition
The better approach is to let Ivy load the appropriate version from the remote repositories, and then compile Hive normally. HBase:The Definitive Guide
For example, Apache Ivy is used to build Cassandra, and a popular client (Hector) is available via Git. Cassandra The Definitive Guide
Apache Ivy later added badly needed dependency management to Java builds, but still didn’t address the lack of conventions inherent in Ant. Building and Testing with Gradle
Without such an element, significant time could be spent nailing down the right pattern matcher, as it would involve rerunning your test plan several times with various inaccurate expressions, hoping it eventually matches. Performance Testing With JMeter 2.9
HBase solves the problem by using a QueryMatcher in combination with a ColumnTracker, which comes in a few variations: one for explicit matching, for when you specify a list of columns to retrieve, and another that includes all columns. HBase:The Definitive Guide
Then, get a Matcher object from the pattern and call its matches method: Core Java Volume II Advanced Features 9th Edition
Working with strings; specifically, how to perform common operations such as removing parts of a string, matching strings, and more using the CharMatcher and Strings classes. Getting Started with Google Guava
These are defined in the documentation for the Files class' getPathMatcher method. Java 7 New Features Cookbook
Regular expression support is provided by the Pattern and Matcher classes. Java in a Nutshell 5th Edition
By running the output through a pattern matcher to extract the GC times and amount freed and totalling those values, you can identify the cost of GC to your application. Java Performance Tuning 2nd Edition
Use the regular expression Pattern and Matcher classes for a better performing and more versatile matching solution than the String matches() method. Java 7 Recipes A Problem-Solution Approach
In Java 8, you can use the names in the start, end, and group methods of Matcher: Java SE 8 for the Really Impatient
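A minimal sketch of the Pattern/Matcher usage described above, including a named group usable in the group method; the pattern and input string are invented:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherDemo {
    public static void main(String[] args) {
        // "(?<word>...)" declares a named group; Java 8 also accepts the name
        // in Matcher's start and end methods.
        Pattern pattern = Pattern.compile("(?<word>\\w+)");
        Matcher matcher = pattern.matcher("find every word here");
        while (matcher.find()) {
            System.out.println(matcher.group("word"));
        }
    }
}
```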
The power of Groovy’s operators can also be leveraged inside AntBuilder calls, such as using the regular expression matcher inside an evaluation of all System properties in Example 3-19 Building and Testing with Gradle
The code that tries to process the information in a file therefore will need to notify the compiler that it can throw some sort of IOException. Core Java Volume I Fundamentals 9th Edition
If no path can be constructed, an IOException will be thrown. Pro Hadoop
You need to create one that matches the signature BufferedReader -> String and that may throw an IOException. Java 8 in Action Lambdas, Streams and Functional-style Programming
Calling seek() with a position that is greater than the length of the file will result in an IOException. Hadoop:The Definitive Guide 3rd Edition
First, the method that is used to retrieve specific values from an HBase table: Result get(Get get) throws IOException. HBase:The Definitive Guide
FileVisitResult visitFileFailed(T path, IOException exc) is called if an exception was thrown in an attempt to get information about the given file. Core Java Volume II Advanced Features 9th Edition
Thus, catching the IOException may be sufficient as illustrated in the following code: Java 7 New Features Cookbook
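The book's original listing is not reproduced here; a minimal sketch of such a broad catch block (the file name is invented):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadWithCatch {
    public static void main(String[] args) {
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("data.txt"))) {
            System.out.println(reader.readLine());
        } catch (IOException e) {
            // Catching the broad IOException covers FileNotFoundException
            // and every other I/O failure in one handler.
            System.err.println("I/O failure: " + e.getMessage());
        }
    }
}
```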
An IOException that signals that a specified file cannot be found. Java in a Nutshell 5th Edition
If you try to reset too far back, an IOException is thrown. Java Network Programming 4th Edition
If one thread performs s.close() on an IO object (for example, an InputStream) s, then any other thread attempting to use s (for example, s.read()) will receive an IOException. Concurrent Programming in Java Design Principles and Pattern
The core framework is large, and dozens of portfolio projects extend that core, covering things like security, web flow, SOAP web services (REST web services are part of the core), enterprise integration, batch processing, mobile, various flavors of social (Facebook, LinkedIn, Twitter, GitHub, and so on), various flavors of NoSQL (MongoDB, Neo4j, Riak, and so on), BlazeDS/Flex, AMQP/Rabbit, and many more. Spring in Practice
It’s generally applied to a number of recent nonrelational databases such as Cassandra, Mongo, Neo4J, and Riak. NoSQL Distilled
He has given talks at PyCon India on Storm and Neo4J and has published. Storm Blueprints Patterns for Distributed Real-time Computation
One of the biggest logistics companies in the world uses a graph database in real time to route physical parcels; a major airline is leveraging graphs for its media content metadata; and a top-tier financial services firm has rewritten its entire entitlements infrastructure on Neo4j. Graph Databases
Unlike single node databases like Neo4j or PostgreSQL, Riak does not support ACID transactions. A Little Riak Book
Graph databases such as Neo4J and the set of databases derived at least in part from the design of Google’s Bigtable database (such as MongoDB, HBase, Hypertable, and Redis) all are focused slightly less on Availability and more on ensuring Consistency and Partition Tolerance. Cassandra The Definitive Guide
They were trying to integrate the Neo4j graph database with the Spring Framework and evaluated different approaches. Spring Data Modern Data Access for Enterprise Java
Whether you’re using a document database like MongoDB, a graph database such as Neo4j, or even a traditional relational database, Spring Data offers a simplified programming model for persistence. Spring in Action 4th Edition
Set up cron jobs to prepare static HTML reports at desired intervals. Monitoring with Ganglia
You could use this feature to synchronize database servers within a cluster or between data centers using a job scheduler such as cron, or you could use it to synchronize data with your laptop for offline work as you travel. CouchDB The Definitive Guide
Think of scheduled automation as a time-based job scheduler (in the context of a Unix-based operating system, also known as a cron job) Gradle in Action
It is commonly used to run commands on remote computers typically from non-interactive jobs such as rsync-based backups through cron. Cassandra High Performance Cookbook
One way of doing this would be to set up a cron job that runs at a frequent interval, calling a script that uses curl to replicate your primary database to a database on your backup server. Beginning CouchDB
This can be done with a cron job or through another automated process. Scaling CouchDB
Summary: I showed you one example of the use of the Oozie coordinator, which offers cron-like capabilities to launch periodic Oozie workflows. Hadoop in Practice
For example, you may want all cron jobs to go to a single pool rather than have them run under each individual user’s pool. Hadoop in Action
Obvious tools such as cron may be good enough, but as system complexity increases, you may need to investigate more sophisticated scheduling systems. Hadoop Beginners Guide
AMQ_SCHEDULED_CRON String Use a cron entry to set the schedule. ActiveMQ in Action
A quorum of Chubby locks detects most network partitions and node unavailability. Megastore - Providing Scalable, Highly Available Storage for Interactive Services
However the GFS master is now made fault tolerant using the Chubby[3] abstraction. Cassandra A Decentralized Structured Storage System
Bigtable in turn uses both Chubby[7] (a distributed lock system) and GFS. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
ZooKeeper is the comparable system to Google’s use of Chubby for Bigtable. HBase:The Definitive Guide
Abstract: We describe our experiences with the Chubby lock service, which is intended to provide coarse-grained locking as well as reliable (though low-volume) storage for a loosely-coupled distributed system. Chubby - The lock service for loosely-coupled distributed systems
In Bigtable there is a master server that controls operations using the Chubby persistent distributed locking mechanism; in Cassandra, all the nodes are on even par with no centralized control, and they communicate using a gossip model. Cassandra The Definitive Guide
Bigtable relies on a highly-available and persistent distributed lock service called Chubby [8] Bigtable - A Distributed Storage System for Structured Data
Running workers write a token into the Chubby lockservice [8] to indicate they belong to the system; other workers can use the existence of this token as a sign that the worker is alive (the token is automatically deleted when the process exits) Percolator - Large-scale Incremental Processing Using Distributed Transactions and Notifications
In plain English, if you flip the parameters of compareTo, the sign (but not necessarily the actual value) of the result must also flip. Core Java Volume I Fundamentals 9th Edition
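The compareTo contract described above is easy to demonstrate: flipping the arguments must flip the sign of the result, though not necessarily its magnitude. A tiny illustration (values are arbitrary):

    public class CompareToContract {
        public static void main(String[] args) {
            Integer a = 3, b = 7;
            int ab = a.compareTo(b);  // negative, because a < b
            int ba = b.compareTo(a);  // positive, because b > a
            // The contract requires sgn(ab) == -sgn(ba); the magnitudes may differ.
            System.out.println(ab + " and " + ba);
            System.out.println(Integer.signum(ab) == -Integer.signum(ba)); // true
        }
    }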
Now if our user wants to narrow down their results to the English-language films category, we need to send the following query: Mastering ElasticSearch
The description provides a useful English description of what the metric actually measures. Beginning CouchDB
You don’t want words that commonly appear in the English language (referred to in text mining circles as stopwords) to be used as features because they’ll appear in every single email. Hadoop in Practice
Then create a new index, specifying that the tweet field should use the english analyzer: Elasticsearch The Definitive Guide
We will close this chapter with a complete example—a retirement calculator with a user interface in English, German, and Chinese. Core Java Volume II Advanced Features 9th Edition
The English language has about one million words, a size that fits comfortably into an iPod, but our word counting program will deal with many unique words not found in any standard English dictionary. Hadoop in Action
It has been said, though it is hard to verify, that the 17th-century English poet John Milton had actually read every published book on the face of the earth. Cassandra The Definitive Guide
Apache Camel uses simple English prepositions, such as from and to, to denote a route. ActiveMQ in Action
The easyB specification file reads like plain English in its execution of web browser driving events and querying of the page responses. Building and Testing with Gradle
You need to extend the Formatter class and override its format method. Core Java Volume I Fundamentals 9th Edition
A StringBuilder and Formatter are ThreadLocal instance variables, making this class thread-safe. Pro Hadoop
You can now represent multiple formatting behaviors by implementing the AppleFormatter interface: Java 8 in Action Lambdas, Streams and Functional-style Programming
Therefore, we will write a Hadoop input formatter to process the e-mail archive. Hadoop MapReduce Cookbook
There are several formatter classes that take these differences into account. Core Java Volume II Advanced Features 9th Edition
As we’ll see below, a Formatter object can send its output to any Appendable object. Java in a Nutshell 5th Edition
The final component in log4j is the layout which acts as a formatter for log events. Hadoop Operations
First of all, you have implemented the MyFormatter class to give a format to the log messages. Java 7 Concurrency Cookbook
Some properties are good candidates for being transient, like cached calculations, or a date formatter that you always instantiate to the same value. Java 7 Recipes A Problem-Solution Approach
When this log level is selected, Gradle automatically provides a richer log formatter, including the timestamp, log level, and logger name of each message. Building and Testing with Gradle
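Several quotes describe extending a Formatter and overriding its format method. A minimal sketch against java.util.logging (the MyFormatter name echoes the Java 7 Concurrency Cookbook quote; the output format itself is arbitrary):

    import java.util.logging.ConsoleHandler;
    import java.util.logging.Formatter;
    import java.util.logging.LogRecord;
    import java.util.logging.Logger;

    // Override format(LogRecord) to control the text emitted per record.
    class MyFormatter extends Formatter {
        @Override
        public String format(LogRecord record) {
            return record.getLevel() + ": " + record.getMessage()
                    + System.lineSeparator();
        }
    }

    public class FormatterDemo {
        public static void main(String[] args) {
            Logger logger = Logger.getLogger("demo");
            logger.setUseParentHandlers(false);      // suppress the default format
            ConsoleHandler handler = new ConsoleHandler();
            handler.setFormatter(new MyFormatter()); // plug in the custom formatter
            logger.addHandler(handler);
            logger.info("hello");                    // prints "INFO: hello"
        }
    }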
Because Lounge allocates a portion of the hash (known as a keyspace) to each node, you can add as many nodes as you like. CouchDB The Definitive Guide
Many cryptographic algorithms suffer from the problem that a small number of keys in the keyspace are considered weak. Pro Spring 3
This randomly distributes the keyspace evenly over the reducers, but still ensures that keys with the same value in different mappers end up at the same reducer. MapReduce Design Patterns
A keyspace is similar to a database in RDBMS where all column families related to the application are stored. NoSQL Distilled
These need to be created so data can be stored in them: [default@unknown] create keyspace testkeyspace. Cassandra High Performance Cookbook
There are more keywords that may be used in the USING clause, including "skewed" (for large datasets with a skewed keyspace) and "merge" (to effect a merge join for inputs that are already sorted on the join key) Hadoop:The Definitive Guide 3rd Edition
Schema-Free: Cassandra requires you to define an outer container, called a keyspace, that contains column families. Cassandra The Definitive Guide
The following is an example of creating an external Hive table that maps to an existing Cassandra keyspace and column family: Programming Hive
The channel names are not related to the Redis keyspace so you don’t have to worry about conflicts with existing keys. Redis Cookbook
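Creating a keyspace, as in the Cassandra CLI quote above, can also be done programmatically. A hedged sketch using the DataStax Java driver (assumes a 2.x/3.x driver on the classpath and a node on localhost; the keyspace name and replication settings are illustrative):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class CreateKeyspace {
        public static void main(String[] args) {
            // Cluster and Session are Closeable in driver 2.x and later.
            try (Cluster cluster = Cluster.builder()
                                          .addContactPoint("127.0.0.1")
                                          .build();
                 Session session = cluster.connect()) {
                session.execute(
                    "CREATE KEYSPACE IF NOT EXISTS testkeyspace WITH replication = "
                    + "{'class': 'SimpleStrategy', 'replication_factor': 1}");
            }
        }
    }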
The Hadoop DFS metrics provide information about data block operations (read, removed, replicated, verify, written), verification failures, bytes read and written, volume failures, and local/remote client reads and writes. Monitoring with Ganglia
The dfs.data.dir property specifies the location of the data directory on the host Linux filesystem. Hadoop Operations and Cluster Management Cookbook
The number of mappers launched is roughly equal to the input size divided by dfs.block.size (the default block size is 64 MB) Hadoop RealWorld Solutions Cookbook
Finally, shut down the HDFS cluster using the following command: >bin/stop-dfs.sh. Hadoop MapReduce Cookbook
Check the value of dfs.block.size to see what it’s set to in your cluster. Hadoop in Practice
We set the second property, dfs.replication, to 1 so that HDFS doesn’t replicate filesystem blocks by the default factor of three. Hadoop:The Definitive Guide 3rd Edition
The dfs stands for Distributed Frequency Search and it tells Elasticsearch to first retrieve the local IDF from each shard in order to calculate the global IDF. Elasticsearch The Definitive Guide
Some older documentation shows file utilities in the form of hadoop dfs -cmd Hadoop in Action
The dfs.replication variable specifies how many times each HDFS block should be replicated. Hadoop Beginners Guide
For this reason, the underlying disks need not be part of a RAID group, although some administrators choose to do so and forego specifying multiple directories in dfs.name.dir (although an NFS mount should still be used, no matter what) Hadoop Operations
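The dfs.* names quoted above are ordinary Hadoop configuration keys. A small sketch of setting them from Java (the values mirror the quotes, a replication factor of 1 and a 64 MB block size; the property names are the pre-Hadoop-2 spellings used in these books):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DfsConfigDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Don't replicate blocks; use an explicit 64 MB block size.
            conf.set("dfs.replication", "1");
            conf.set("dfs.block.size", String.valueOf(64 * 1024 * 1024));
            FileSystem fs = FileSystem.get(conf);
            System.out.println(fs.exists(new Path("/")));
        }
    }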
In addition, federated NameNodes can isolate namespaces for different applications or situations. Hadoop Operations and Cluster Management Cookbook
With multiple NameNodes, different categories of applications and users can be isolated to different namespaces. Professional Hadoop Solutions
Compared to P2P storage systems that only support flat namespaces, distributed file systems typically support hierarchical namespaces. Cassandra A Decentralized Structured Storage System
Hadoop version 2.x supports HDFS Federation, where HDFS is partitioned into multiple distinct namespaces, with each independently managed by a separate NameNode. Hadoop in Practice
Namespace volumes are independent of each other, which means namenodes do not communicate with one another, and furthermore the failure of one namenode does not affect the availability of the namespaces managed by other namenodes. Hadoop:The Definitive Guide 3rd Edition
Our near-term solution to scalability is to allow multiple namespaces (and NameNodes) to share the physical storage within a cluster. The Hadoop Distributed File System
Data is organized into buckets, which are little more than flat namespaces for logically grouping key-value pairs. Cassandra The Definitive Guide
In Example 5-16, we generate a one gigabyte file and copy it into the two separate filesystem namespaces. Hadoop Operations
You’ll see how to configure ActiveMQ using both pure Spring XML and custom namespaces, as well as the standard ActiveMQ XML via Apache XBean (http://mng.bz/EAfX) ActiveMQ in Action
When you use this feature, Hive puts the key-value pair in the hivevar “namespace” to distinguish these definitions from three other built-in namespaces, hiveconf, system, and env. Programming Hive
The way MongoDB accomplishes this is by keeping a log of operations, or oplog, containing every write that a primary performs. MongoDB The Definitive Guide 2nd Edition
We’ll also look at the oplog and how it fits in with the communication between members of a replication topology. The Definitive Guide to MongoDB
Each member of a replica set logs all its writes to a special capped collection called oplog.rs. MongoDB in Action
You'll see two compressed tar archives, available along with checksum and gpg signature files used to verify the archives. Apache Flume Distributed Log Collection for Hadoop
When it finds a footer pair, it makes some checks: if the first 2k are corrupt (a footer includes a checksum), CouchDB replaces it with the second footer and all is well. CouchDB The Definitive Guide
Gradle detects if an artifact was changed in the repository by comparing its local and remote checksum. Gradle in Action
Current doc status, a pointer into the repository, a doc checksum, and various statistics. The anatomy of a large-scale hypertextual Web search engine short
The information stored in each entry includes the current document status, a pointer into the repository, a document checksum, and various statistics. The anatomy of a large-scale hypertextual Web search engine
Chubby also exposes a 64-bit file-content checksum so clients may tell whether files differ. Chubby - The lock service for loosely-coupled distributed systems
To avoid this problem we had to change the format of the edits log to have a transaction length, transaction id and the checksum for each transaction written to the file. Apache Hadoop Goes Realtime at Facebook
This performs a checksum against the database and compares checksums of peers; if the checksums differ, then the data is exchanged. Cassandra The Definitive Guide
By default, there is a checksum for each 512-byte chunk within the block. Hadoop Beginners Guide
Figure 2: Example usage of the Percolator API to perform basic checksum clustering and eliminate documents with the same content. Percolator - Large-scale Incremental Processing Using Distributed Transactions and Notifications
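The Hadoop Beginners Guide quote notes a checksum per 512-byte chunk within a block. A loose illustration of per-chunk checksumming with java.util.zip.CRC32 (HDFS's actual checksum implementation differs; this only mirrors the chunking idea):

    import java.util.zip.CRC32;

    public class ChunkChecksum {
        // One CRC32 per fixed-size chunk, loosely mirroring how HDFS
        // checksums 512-byte chunks within a block.
        public static long[] checksums(byte[] data, int chunkSize) {
            int n = (data.length + chunkSize - 1) / chunkSize;
            long[] sums = new long[n];
            CRC32 crc = new CRC32();
            for (int i = 0; i < n; i++) {
                crc.reset();
                int off = i * chunkSize;
                crc.update(data, off, Math.min(chunkSize, data.length - off));
                sums[i] = crc.getValue();
            }
            return sums;
        }

        public static void main(String[] args) {
            byte[] data = new byte[1300]; // three chunks: 512, 512, 276 bytes
            for (long s : checksums(data, 512)) {
                System.out.println(Long.toHexString(s));
            }
        }
    }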
After all, what can go wrong? Unfortunately, with modern processors and compilers, there is plenty of room for error. Core Java Volume I Fundamentals 9th Edition
In these times of nearly continuous performance improvements of compilers and JVMs, performance measurements are only of transient value. A Java ForkJoin Framework
It’s typically useful when building compilers or engines for processing business rules. Java 8 in Action Lambdas, Streams and Functional-style Programming
Similar to compilers in traditional databases, the Hive compiler processes HiveQL statements in the following steps: Hive A Petabyte Scale Data Warehouse Using Hadoop
Balancing Symbols: Compilers check your programs for syntax errors, but frequently a lack of one symbol (such as a missing brace or comment starter) will cause the compiler to spill out a hundred lines of diagnostics without identifying the real error. Data Structures and Algorithm Analysis in Java 3rd Edition
Avro compilers come in two flavors: they can either compile a protocol file or a schema file. Hadoop in Practice
As with many modern compilers, the Pig compiler can reorder the execution sequence to optimize performance, as long as the execution plan remains logically equivalent to the original program. Hadoop in Action
Reflection is also appropriate for use in remote procedure call (RPC) systems to eliminate the need for stub compilers. Effective Java 2nd Edition
To achieve this goal, it is essential to use an IDE that integrates under one common interface several tools (compilers and debuggers) that facilitate the development of applications. Java 7 Concurrency Cookbook
This eliminates the possibility of some common errors, and also enables compilers to automatically and safely perform some optimizations that would need to be performed manually in C. Concurrent Programming in Java Design Principles and Pattern
What started as a refactoring branch eventually became the main line of development as Flume 1.X. Apache Flume Distributed Log Collection for Hadoop
At Acme, developers do all their coding within the IDE, from navigating through the source code, implementing new features, and compiling and refactoring code, to running unit and integration tests. Gradle in Action
The HFile refactoring into a reader and writer hierarchy allows for significant improvements in code maintainability. Professional Hadoop Solutions
The API is constantly being enhanced with new features, bug fixes, and occasional refactoring. Hadoop RealWorld Solutions Cookbook
The Example Plug-In: In this chapter, we will create a Gradle plug-in to automate the use of the open-source database refactoring tool, Liquibase. Gradle Beyond The Basics
Therefore, the WebMap applications would not require extensive refactoring to fit into MapReduce. Hadoop:The Definitive Guide 3rd Edition
This refactoring better aligns the configuration settings to the subsystem of Hadoop that they control. Hadoop in Action
If a JMS client uses such additional features, this client may not be portable to another JMS provider without a refactoring effort. ActiveMQ in Action
When the representations and behavior of one class can be partitioned into independent, noninteracting, or just non-conflicting subsets, it is almost always worth refactoring the class to use distinct finer-granularity helper objects whose actions are delegated by the host. Concurrent Programming in Java Design Principles and Pattern
Only the modest addition of the useTestNG() call to the test closure is needed in addition to the refactoring of the unit test class. Building and Testing with Gradle
The default configuration provides for only local access to the managed beans (MBeans) Pro Hadoop
From the window, we can check the memory usage, threads, classes, summary of JVM, and details of MBeans. Hadoop Operations and Cluster Management Cookbook
Cassandra provides access to this information through standard JMX MBeans (Java Management eXtension Managed Bean) Cassandra High Performance Cookbook
Hadoop includes several managed beans (MBeans), which expose Hadoop metrics to JMX-aware applications. Hadoop:The Definitive Guide 3rd Edition
There is a loose overlap between the metric context, as provided by the metrics framework, and the MBeans exposed over JMX. HBase:The Definitive Guide
MBeans and the platform MBean server will be discussed in the next chapter. Oracle JRockit The Definitive Guide
Overview of JMX and MBeans: In this section, we explore how Cassandra makes use of Java Management Extensions (JMX) to enable remote management of your servers. Cassandra The Definitive Guide
Internal MBeans in Hadoop rely on a metric plug-in that has an update thread running to collect data from the system. Hadoop Operations
These interfaces consist of management beans, usually called MBeans, which expose resources of your application to external management applications. ActiveMQ in Action
Some applications also build-in monitoring and management capabilities using MBeans via Java SE’s monitoring and management APIs. Java Performance
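The recurring pattern in these quotes is exposing application state as an MBean so JMX-aware tools can inspect it. A minimal sketch using the platform MBean server (the Counter bean and its ObjectName are invented for illustration; the standard MBean convention requires the interface to be named after the class plus "MBean"):

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    // Interface name = implementation class name + "MBean".
    interface CounterMBean {
        long getCount();
        void increment();
    }

    class Counter implements CounterMBean {
        private long count;
        public synchronized long getCount() { return count; }
        public synchronized void increment() { count++; }
    }

    public class MBeanDemo {
        public static void main(String[] args) throws Exception {
            MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
            // Domain and key naming here is arbitrary.
            ObjectName name = new ObjectName("com.example:type=Counter");
            mbs.registerMBean(new Counter(), name);
            // Keep the JVM alive so a JMX console (e.g., JConsole) can attach.
            Thread.sleep(Long.MAX_VALUE);
        }
    }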
The responsibility of the Namenode is to manage the filesystem metadata and direct clients to the Datanodes that hold the actual data. Web Crawling and Data Mining with Apache Nutch
Since MapReduce servers (task-trackers) share nodes with HDFS datanodes, map output inevitably competes with HDFS reads. HDFS scalability the limits to growth
Namenodes and Datanodes An HDFS cluster has two types of nodes operating in a master-worker pattern: a namenode (the master) and a number of datanodes (workers) Hadoop:The Definitive Guide 3rd Edition
It has locations of data blocks stored in several datanodes and often runs on the same server as the jobtracker in smaller clusters. Cassandra The Definitive Guide
Storage can be added to a cluster by adding more datanodes with additional disk capacity, or even adding disks to existing datanodes. Hadoop Operations
It is also one of the biggest contributors to the Hadoop codebase. Hadoop Operations and Cluster Management Cookbook
To tell the truth, it was much harder not only because of more advanced topics covered in this book, but also because of the constantly introduced changes in the ElasticSearch codebase. Mastering ElasticSearch
Because of high performance, a small codebase, and compact resulting data, there is a wide adoption of Avro not only in the Hadoop community, but also by many other NoSQL implementations (including Cassandra) Professional Hadoop Solutions
Familiarity with the Pregel design will help to explain many components found in the Giraph codebase. Hadoop RealWorld Solutions Cookbook
Due to the number of moving parts and the size and scope of the codebase and deployment, debugging services like universal search (described earlier in Section 1) is very challenging. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
That raises the question: what’s the best approach to integrate sampling support into an existing codebase? One approach would be to add a configurable option that would toggle the use of the sampling InputFormat, similar to the following code: Hadoop in Practice
The shuffle is an area of the codebase where refinements and improvements are continually being made, so the following description necessarily conceals many details (and may change over time; this is for version 0.20) Hadoop:The Definitive Guide 3rd Edition
These benchmarks are available as part of the Hadoop codebase. The Hadoop Distributed File System
Powerset was acquired in July 2008, and after a couple of months during which Powerset employees were disallowed from contributing while Microsoft’s legal department vetted the HBase codebase to see if it impinged on SQLServer patents, we were allowed to resume contributing (I was a Microsoft employee working near full time on an Apache open source project) HBase:The Definitive Guide
By unifying the code paths, the codebase was simplified and files were kept as small as possible. Apache Hadoop Goes Realtime at Facebook
This is not exactly how Craigslist works (because we don’t know how Craigslist works), but it is a close enough approximation to illustrate problems with benchmarking. CouchDB The Definitive Guide
Benchmarking of a Hadoop cluster is the first step to tune the performance of a Hadoop cluster. Hadoop Operations and Cluster Management Cookbook
We demonstrate through benchmarking that our system is effective at transforming logical computations into efficient programs. FlumeJava - Easy, Efficient Data-Parallel Pipelines
Stress is an easy-to-use command-line tool for stress testing and benchmarking Cassandra. Cassandra High Performance Cookbook
Generic benchmarking can be of some use, but a test specifically designed for your system will be more useful. Scaling CouchDB
Benchmarking HDFS: Running benchmarks is a good way to verify whether your HDFS cluster is set up properly and performs as expected. Hadoop MapReduce Cookbook
Benchmarking a Hadoop Cluster: Is the cluster set up correctly? The best way to answer this question is empirically: run some jobs and confirm that you get the expected results. Hadoop:The Definitive Guide 3rd Edition
Only after real-world usage in your environment and some performance benchmarking with the aid of heap analysis tools and observation of your specific application’s behavior should you dive into tuning the more advanced JVM settings. Cassandra The Definitive Guide
The results of benchmarking LLVM vs vector on some typical aggregation queries are shown in table 3 Tenzing - A SQL Implementation On The MapReduce Framework
After benchmarking a particular cluster size, we deleted the data directories on each node, reformatted and reloaded HDFS to ensure uniform data distribution across all nodes. HadoopDB An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
If the system obtains an IP address dynamically, it is possible that DHCP is not completed before the attempt to start gmond, and so gmond fails to run. Monitoring with Ganglia
In an intranet, a network administrator may assign a specific address to your machine, or you may have a dynamically assigned address allocated by a Dynamic Host Configuration Protocol (DHCP) server. Java Network Programming and Distributed Computing
Only the first NIC of the VM will get the default gateway from the DHCP server. bk-compute-adminguide-grizzly
We will learn steps to configure this server, including the configuration of the following two services: DHCP and FTP. Hadoop Operations and Cluster Management Cookbook
FlatDHCP builds upon the Flat manager by providing DHCP services to handle instance addressing and creation of bridges. Deploying OpenStack
Tracks IP address leases and records them in the database by using the dnsmasq dhcp-script facility. bk-admin-guide-cloud-latest
When booting, the kernel will try to use the DHCP protocol, if available, to configure networking. Oracle JRockit The Definitive Guide
High Availability for the DHCP agent is achieved by adopting Pacemaker. high-availability-guide
Others, especially clients on local area networks and wireless connections, receive a different address every time they boot up, often provided by a DHCP server. Java Network Programming 4th Edition
You should be familiar with concepts such as DHCP, Linux bridges, VLANs, and iptables. OpenStack Operations Guide
We recommend limiting this user's privileges to prevent it from creating or altering tables in the metastore database schema. HBase Administration Cookbook
Solution You can take advantage of the built-in Sqoop metastore that allows you to save all parameters for later reuse. Apache Sqoop Cookbook
To support this abstraction, Hive provides a metastore in the form of a relational database that enables you to capture dependencies between actual physical files (and HBase tables) and tables (virtual) used to access this data. Professional Hadoop Solutions
Following this, the script issues a CREATE command with the EXTERNAL keyword, which tells the Hive Metastore that the data is not managed by the Hive warehouse in HDFS. Hadoop RealWorld Solutions Cookbook
Hive also includes a system catalog, the Metastore, that contains schemas and statistics, which are useful in data exploration, query optimization and query compilation. Hive A Petabyte Scale Data Warehouse Using Hadoop
This merely saves the notion of this command as a job in something called the Sqoop metastore. Hadoop in Practice
The Apache HCatalog project (http://incubator.apache.org/hcatalog/) solves this problem by providing a table metadata service, based on Hive’s metastore, so that Pig queries can reference schemas by name, rather than specifying them in full each time. Hadoop:The Definitive Guide 3rd Edition
To support these additional features, a new and important component of Hive is a metastore for storing schema information. Hadoop in Action
Another option is to use Sqoop’s saved job feature, which stores both connection credentials and incremental state information in a directory on the local filesystem of the client machine in what is called the metastore. Hadoop Operations
The Metastore is a separate relational database (usually a MySQL instance) where Hive persists table schemas and other system metadata. Programming Hive
To install gweb on a Debian-based Linux distribution, execute the following command as either root or user with high privilege: Monitoring with Ganglia
The Cloudera folks have Amazon machine images (AMIs), Debian and RPM installer files, and an online configuration tool to generate configuration files. Pro Hadoop
Debian GNU/Linux: You can install the CouchDB package by running: CouchDB The Definitive Guide
If you are using a Linux distribution other than CentOS, such as Debian, please refer to corresponding documentation. Hadoop Operations and Cluster Management Cookbook
At the time of writing this book, the current release is available as a Debian package. Gradle Effective Implementation Guide
Fuse development files can be found in the fuse-devel RPM for Redhat/Fedora and in the libfuse-dev package for Debian/Ubuntu. Hadoop MapReduce Cookbook
The Apache Bigtop project (http://incubator.apache.org/bigtop/) runs interoperability tests on stacks of Hadoop components and provides Linux packages (RPMs and Debian packages) for easy installation. Hadoop:The Definitive Guide 3rd Edition
When installing Elasticsearch in production, you can use the method described above, or the Debian or RPM packages provided on the downloads page. Elasticsearch The Definitive Guide
You can simplify your deployment of Hadoop by using their free Hadoop distribution, in either RPM or Ubuntu/Debian packages. Hadoop in Action
Starting around version 1.0, support for building RPM and Debian packages was added to the build system, and later releases provide these artifacts for download. Hadoop Operations
The next step is to create an instance of the RMI lightbulb service defined by the RMILightBulbImpl class. Java Network Programming and Distributed Computing
Putting it all together, the complete RelationCountImpl is shown next. HBase in Action
To do this globally in an application, it suffices to remap the implementation for file URIs by setting the property fs.file.impl to the value org.apache. Hadoop:The Definitive Guide 3rd Edition
This code returns PartitionPlanImpl, a convenient basic implementation of the PartitionPlan interface. Java EE 7 Essentials
The WarehouseImpl constructor is declared to throw a RemoteException because the superclass constructor can throw that exception. Core Java Volume II Advanced Features 9th Edition
SocketImplFactory objects may be registered to create SocketImpl objects for the Socket and ServerSocket classes. Java in a Nutshell 5th Edition
This depends heavily on the native UDP software in the OS and the native code that interfaces between this and Java’s DatagramSocketImpl class. Java Network Programming 4th Edition
You can almost do this generically using the SocketImplFactory, but if you install your own SocketImplFactory, there is no protocol to allow you to access the default socket implementation, so another way must be used. Java Performance Tuning 2nd Edition
For example, assuming that we have an implementation class, say TankImpl, we can write the following AdaptedTank class. Concurrent Programming in Java Design Principles and Pattern
First, a new JoinRowSet is created by instantiating a JoinRowSetImpl() object: Java 7 Recipes A Problem-Solution Approach
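The Core Java quote explains why WarehouseImpl's constructor declares RemoteException: the UnicastRemoteObject superclass constructor can throw it. A stripped-down sketch of that conventional *Impl pattern (the Warehouse name follows the book's example; the price data is illustrative):

    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.rmi.server.UnicastRemoteObject;

    // The remote interface: every method must declare RemoteException.
    interface Warehouse extends Remote {
        double getPrice(String description) throws RemoteException;
    }

    // The constructor must declare RemoteException because the
    // UnicastRemoteObject constructor can throw it.
    class WarehouseImpl extends UnicastRemoteObject implements Warehouse {
        WarehouseImpl() throws RemoteException {}

        public double getPrice(String description) throws RemoteException {
            return "toaster".equals(description) ? 24.95 : 0.0; // sample data
        }
    }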
Alternatively, you can also use Redis as a cache, storing recently posted messages in Redis, and older (rarely read) messages in a primarily on-disk storage server (like PostgreSQL, MySQL, Riak, MongoDB, and so on) Redis in Action
Riak is a Dynamo-inspired key/value store that offers fault tolerance out of the box. RabbitMQ in Action
The core framework is large, and dozens of portfolio projects extend that core, covering things like security, web flow, SOAP web services (REST web services are part of the core), enterprise integration, batch processing, mobile, various flavors of social (Facebook, LinkedIn, Twitter, GitHub, and so on), various flavors of NoSQL (MongoDB, Neo4j, Riak, and so on), BlazeDS/Flex, AMQP/Rabbit, and many more. Spring in Practice
It’s generally applied to a number of recent nonrelational databases such as Cassandra, Mongo, Neo4J, and Riak. NoSQL Distilled
Links and Walking The Riak key-value store allows each of its stored values to be augmented with link metadata. Graph Databases
Riak is an open-source, distributed key/value database for high availability, fault-tolerance, and near-linear scalability. A Little Riak Book
This design is observed by Cassandra as well as by straight key/value stores such as Project Voldemort and Riak. Cassandra The Definitive Guide
Although Dynamo is a proprietary system, the ideas used to build it have inspired many systems falling under the NoSQL umbrella, including Cassandra, Project Voldemort, and Riak. MongoDB in Action
What it lacks in complex querying functionality (like that found in Riak or MongoDB), it makes up for in speed and efficiency. Spring Data Modern Data Access for Enterprise Java
Fortunately, JSON doesn’t care about whitespace, and the compact nature of the output keeps the size of the responses down, but isn’t very human-readable. Getting Started with CouchDB
Now, the actual result is not as nicely formatted and doesn’t include any superfluous whitespace or newlines, but this is better for you (and us!) to read and understand. CouchDB The Definitive Guide
For example, the following code will result in a request that checks how the analysis will be performed for the ElasticSearch Servers phrase in the library index using the whitespace tokenizer and the nGram filter: Mastering ElasticSearch
The call to the strip() method removes any leading/trailing whitespace, and then we tokenize it into an array of named variables. Hadoop RealWorld Solutions Cookbook
In the previous function, I am taking the content field of the document, lowercasing it, and replacing any nonalphabetical characters with a space, before tokenizing the document into an array of words, split by a single whitespace token. Beginning CouchDB
We already covered this topic in section 13.2.4, but I want to reemphasize that a high percentage of problems you’ll encounter in your job are due to unexpected input, and can be as simple an issue as leading or trailing whitespace characters that cause parsing issues. Hadoop in Practice
The script must accept a variable number of arguments that are the hostnames or IP addresses to be mapped, and it must emit the corresponding network locations to standard output, separated by whitespace. Hadoop:The Definitive Guide 3rd Edition
A simple tokenizer might split the text up into terms whenever it encounters whitespace or punctuation. Elasticsearch The Definitive Guide
Tokenization is based purely on whitespace characters and not punctuation marks, making States, States., and States: separate words. Hadoop in Action
Unlike a property file, however, whitespace is not permitted between the equals sign and its key or value. Hadoop Operations
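Whitespace-only tokenization, as the Hadoop in Action quote notes, keeps punctuation attached to words. A small illustration, including the trim that guards against the leading/trailing-whitespace bugs mentioned in the Hadoop in Practice quote:

    public class WhitespaceTokenize {
        public static void main(String[] args) {
            String line = "  States States. States:  ";
            // Trim first so leading/trailing whitespace doesn't yield
            // empty tokens.
            String[] tokens = line.trim().split("\\s+");
            for (String t : tokens) {
                System.out.println("[" + t + "]");
            }
            // Splitting on whitespace alone keeps punctuation attached, so
            // "States", "States." and "States:" remain distinct tokens.
        }
    }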
HBase servers, particularly the version using memcached, are memory intensive and generally require at least a gigabyte of real memory per server; any paging will drastically affect performance. Pro Hadoop
Slow Paging (Do Not Use): Don’t use this method! We just show it because it might seem natural to use, and you need to know why it is a bad idea. CouchDB The Definitive Guide
Other alerting methods such as SMS messages and paging, are also available. Hadoop Operations and Cluster Management Cookbook
Paging: When the result list is large, it is necessary to use paging to limit the number of documents fetched in a single request and go through the subsequent ones. Mastering ElasticSearch
Create a scanner over all users whose username is less than I, paging one cell at a time. HBase in Action
Beware of paging too deep or requesting too many results at once. Elasticsearch The Definitive Guide
In extreme cases, such memory leaks can cause disk paging and even program failure with an OutOfMemoryError, but such failures are relatively rare. Effective Java 2nd Edition
Adding RAM will help you if you find the OS doing a lot of paging on reads (in general, it is better to enable the various caches Cassandra has) Cassandra The Definitive Guide
The default queue configuration uses a separate thread for paging messages from the message store into the queue to be dispatched to interested message consumers. ActiveMQ in Action
Improved Cache Heuristics: It is a known problem in disk-cache / paging algorithms. PowerDrill - Processing a Trillion Cells per Mouse Click
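The HBase in Action quote pages through a scan one cell at a time. A hedged sketch against the HBase 0.9x-era client API (the users table is the quote's example; setBatch(1) limits each step of the scanner to a single cell):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PagedScan {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "users"); // the quote's example table
            try {
                Scan scan = new Scan();
                scan.setStopRow(Bytes.toBytes("I"));  // usernames less than "I"
                scan.setBatch(1);                     // page one cell at a time
                ResultScanner scanner = table.getScanner(scan);
                try {
                    for (Result r : scanner) {
                        System.out.println(Bytes.toString(r.getRow()));
                    }
                } finally {
                    scanner.close();
                }
            } finally {
                table.close();
            }
        }
    }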
The view definition below shows a new view that emits a row of data based on the ingredient text by iterating over the ingredient list from each recipe document, and emitting each ingredient name. Getting Started with CouchDB
If your module requires a simple string value, referencing the module_params string from the mmodule structure is much more convenient than iterating through an APR array of name/value pairs. Monitoring with Ganglia
Perhaps it is programmed to stop iterating when it sees a particular user’s document or is based on a tally it’s been keeping of some resource allocated in the configuration. CouchDB The Definitive Guide
This is because you are now iterating over all values that have that key, instead of just one at a time. MapReduce Design Patterns
Iterating all the columns of a large key: In some designs, a particular row key may have a large number of associated columns. Cassandra High Performance Cookbook
Running your MapReduce application against the dataset may take hours, and constantly iterating with code refinements and rerunning against it isn’t an optimal workflow. Hadoop in Practice
And in this case, the output types of the reduce function are Text and IntWritable, for a year and its maximum temperature, which we find by iterating through the temperatures and comparing each with a record of the highest found so far. Hadoop:The Definitive Guide 3rd Edition
When iterating over super columns in your mapper, each IColumn would need to be cast to a SuperColumn, and it would contain nested column information. Cassandra The Definitive Guide
In other words, when “synchronously” iterating over all columns, the original rows can be reconstructed. PowerDrill - Processing a Trillion Cells per Mouse Click
Iterating through all dependencies can take simultaneous advantage of the dependencies object and the Groovy each method that accepts a closure. Building and Testing with Gradle
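The Hadoop: The Definitive Guide quote describes the canonical max-temperature reduce function: iterate over all values for a key and keep the highest seen so far. A sketch of that reducer:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // For each year (key), iterate the temperatures (values) and emit
    // the maximum found.
    public class MaxTemperatureReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
                              Context context)
                throws IOException, InterruptedException {
            int max = Integer.MIN_VALUE;
            for (IntWritable value : values) {
                max = Math.max(max, value.get());
            }
            context.write(key, new IntWritable(max));
        }
    }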
Another set of properties you may find useful allows for rounding down event times at an hour, minute, or second granularity while still maintaining those elements in file paths. Apache Flume Distributed Log Collection for Hadoop
The granularity of the data blocks can affect the distribution and parallel execution of the tasks. Hadoop Operations and Cluster Management Cookbook
We can also set the maximum and minimum granularity of the log messages with the minGranularity and maxGranularity properties. Gradle Effective Implementation Guide
Ultimately, HoD had too little information to make intelligent decisions about its allocations, its resource granularity was too coarse, and its API forced users to provide misleading constraints to the resource layer. Apache Hadoop YARN Yet Another Resource Negotiator
Although, technically, updates can be implemented as overwrites, the granularity of such an approach (overwrite will work only on the file level) can be cost-prohibitive in most cases. Professional Hadoop Solutions
To achieve efficiency, Incoop relies on memoization, but goes beyond the straightforward task-level application of this technique by performing a stable partitioning of the input and by reducing the granularity of tasks to maximize result reuse. Incoop MapReduce for Incremental Computations
However, they tend to be at the wrong level of granularity for creating sophisticated, highly composable code that can be shared among different developers. Hadoop:The Definitive Guide 3rd Edition
We are currently implementing an object-granularity clone operation to create object copies with copy-on-write behavior for efficient storage utilization, and are extending the RADOS client interface to allow transparent versioning for logical point-in-time copies across sets of objects (i. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
Hadoop is unable to efficiently divide this task at a finer granularity. Hadoop in Action
Users and groups may be granted permission to perform any combination of the above operations at the table, column family, or column qualifier granularity. Hadoop Operations
End-to-end application which makes use of the new clientjar option in the JAX-WS wsimport tool. Gradle Effective Implementation Guide
The jaxws-api.jar file is required for JAX-WS support in Spring remoting, and jta.jar is used for JTA transaction support. Pro Spring 3
For the APIs exposed via the public Internet, you can implement an HBase-based real-time application by using REST APIs and leveraging one of the JAX-RS frameworks (for example, RestEasy or Jersey) Professional Hadoop Solutions
The Java API for XML Web Services (JAX-WS) and the Java API for XML-based RPC (JAX-RPC) both provide support for web service calls using the HTTP protocol. Java EE 7 Specification
Using JAX-RS annotations, we decorate each extension class to indicate to the server which HTTP requests it handles. Graph Databases
Without the Client API (introduced in JAX-RS 2), developers are required to use basic HttpUrlConnection APIs and write all the surrounding code. Java EE 7 Essentials
The second part of this chapter focuses on Web services performance, which starts with a brief look at the implementation of the Java API for XML Web Services (JAX-WS) Reference Implementation. Java Performance
In doing so, you’ll explore various remoting technologies, including RMI, Hessian/ Burlap, and SOAP web services with JAX-WS. Spring in Action 4th Edition
To prove the point: The Java date library is a bit muddled, and a major redesign is underway; see http://jcp.org/en/jsr/detail?id=310. Core Java Volume I Fundamentals 9th Edition
If tryItOut returns without raising an exception, control is transferred to the finally block using a jsr instruction. The Java Virtual Machine Specification Java SE 7 Edition
This document is the public review version of the JSR-133 specification, the Java Memory Model (JMM) and Thread Specification. JSR-133 Java Memory Model and Thread
This scenario was discussed at length in the Servlet expert group (JSR-53), but we were unable to reach consensus on the appropriate solution. Java EE 7 Specification
No one deserves more credit for this version of the Java programming language than he - but any blame for its deficiencies should be directed at myself and the members of the many JSR Expert Groups! The Java Language Specification Java SE 7 Edition
Thanks to my colleagues at Oracle and the different JSR specification leads for explaining the intended use cases of different technologies. Java EE 7 Essentials
For more information about using Java bytecode for dynamic languages, please refer to Java Specification Request (JSR) 292 on the Internet. Oracle JRockit The Definitive Guide
It's important to mention that Guava has only one dependency, JSR-305 Getting Started with Google Guava
Customize Liferay portal using JSR-286 portlets, hooks, themes, layout templates, web plugins, and diverse portlet bridges. Java 7 New Features Cookbook
One of the issues addressed with JSR-133 is the use of the Java keyword volatile. Java Performance
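The Java Performance quote ties JSR-133 to the volatile keyword, whose key guarantee is visibility: a write to a volatile field is visible to any thread that subsequently reads it. A small illustration (the stop-flag pattern is a standard example, not drawn from any one of the books above):

    public class StopFlag {
        // Under the JSR-133 memory model, writes to a volatile field are
        // guaranteed visible to other threads that later read it.
        private volatile boolean running = true;

        public void stop() { running = false; }

        public void run() throws InterruptedException {
            Thread worker = new Thread(() -> {
                while (running) {
                    // do work; without volatile, this loop might never
                    // observe the update made by stop()
                }
            });
            worker.start();
            Thread.sleep(100);
            stop();
            worker.join();
        }

        public static void main(String[] args) throws InterruptedException {
            new StopFlag().run();
        }
    }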
Remember to include standard fields such as timestamps (created, updated), and status so that you can organize and identify the content. Getting Started with CouchDB
Because validation functions are run at replication time as well as during normal client access, we can’t require that timestamps be set close to the server’s system time. CouchDB The Definitive Guide
Data is stored in schematized semi-relational tables; data is versioned, and each version is automatically timestamped with its commit time; old versions of data are subject to configurable garbage-collection policies; and applications can read data at old timestamps. Spanner - Google’s Globally-Distributed Database
When two columns for a row key have the same name, the timestamps of the columns are compared and the value of the column with the highest timestamp is the final value. Cassandra High Performance Cookbook
A write operation in Dynamo also requires a read to be performed for managing the vector timestamps. Cassandra A Decentralized Structured Storage System
In this way, we have a lower and upper bound for the span timestamps on the server side of RPCs. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
This information is stored as metadata in each HFile and queries that ask for a specific timestamp or range of timestamps will check if the request intersects with the ranges of each file, skipping those which do not overlap. Apache Hadoop Goes Realtime at Facebook
Instead of relying on timestamps, causal consistency instead takes a more semantic approach, attempting to determine the cause of events to create some consistency in their order. Cassandra The Definitive Guide
In Webtable, we would use URLs as row keys, various aspects of web pages as column names, and store the contents of the web pages in the contents: column under the timestamps when they were fetched, as illustrated in Figure 1 Bigtable - A Distributed Storage System for Structured Data
The timestamp oracle provides strictly increasing timestamps: a property required for correct operation of the snapshot isolation protocol. Percolator - Large-scale Incremental Processing Using Distributed Transactions and Notifications
Alternatively, you can overlay a timeline to indicate the duration of a particular event. Monitoring with Ganglia
For more information on its timeline, please refer to the roadmap’s dashboard (http://www.gradle.org/roadmap) Gradle in Action
You can now view trend reports and see a timeline of success and failure. Jenkins Continuous Integration Cookbook
From a machine point of view, the most natural format to model time is with a single large number representing a point on a continuous timeline. Java 8 in Action Lambdas, Streams and Functional-style Programming
Sorted by time, it can provide a timeline view on the data. MapReduce Design Patterns
It primarily retrieves events by looking up all logs containing a keyword from a range of time. To make this efficient, the event timeline is denormalized to produce one timeline per keyword. Cassandra High Performance Cookbook
Figure 6.13 shows the reduce task timeline with the units of work and potential areas impacting performance. Hadoop in Practice
Start by summarizing the problem and the end result, and then walk through the timeline of events, calling out anything interesting along the way. Hadoop Operations
If you observe a wide distribution of pause times, you should look at the GC Timeline tab to identify peaks in garbage collection activity. Java Performance
Quite similar to a cartoon flip book, each page represents a frame or picture that will be displayed on the timeline for a period of time. Java 7 Recipes A Problem-Solution Approach
This API and its implementation are critical to supporting external consistency and a variety of powerful features: nonblocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner. Spanner - Google’s Globally-Distributed Database
Use Job.isComplete(), a nonblocking job completion check, to constantly poll to see whether all of the jobs are complete. MapReduce Design Patterns
Twisted provides a Deferred object for building chains of nonblocking event handlers. HBase in Action
Thrift also includes an RPC layer with various types of servers, including a nonblocking implementation. Hadoop in Practice
Updates are a nonblocking operation, so a client that loses an update (because another process updated the znode in the meantime) can decide whether to try again or take some other action, and it can do so without blocking the progress of any other process. Hadoop:The Definitive Guide 3rd Edition
From the command-line help you can see that, for example, using the nonblocking server implies the framed transport. HBase:The Definitive Guide
If you do synchronize your class internally, you can use various techniques to achieve high concurrency, such as lock splitting, lock striping, and nonblocking concurrency control. Effective Java 2nd Edition
This flushing is a nonblocking operation; multiple memtables may exist for a single column family, one current and the rest waiting to be flushed. Cassandra The Definitive Guide
The most prominent features of NIO are selectors and nonblocking I/O programming, allowing developers to use the same resources to handle more network clients and generally heavier loads on their servers. ActiveMQ in Action
With event-based programming in mind, you will discover nonblocking or callback behaviors when invoking functions. Java 7 Recipes A Problem-Solution Approach
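The MapReduce Design Patterns quote suggests polling Job.isComplete() as a nonblocking completion check. A sketch of that loop (job configuration is elided; the five-second interval is arbitrary):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class PollJob {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "example");
            // ... configure mapper, reducer, input/output paths here ...
            job.submit();               // returns immediately, unlike waitForCompletion()
            while (!job.isComplete()) { // nonblocking status check
                Thread.sleep(5000);     // poll instead of blocking
            }
            System.out.println(job.isSuccessful() ? "done" : "failed");
        }
    }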
The keyword public is called an access modifier; these modifiers control the level of access other parts of a program have to this code. Core Java Volume I Fundamentals 9th Edition
Other features such as class inheritance and visibility modifiers (the public, private, and protected keywords of Java) make object-oriented languages much easier and safer to work with than older procedural languages. Java Network Programming and Distributed Computing
Term modifiers: In addition to the standard field query with a simple term or clause, Lucene allows us to modify the terms we pass in the query with modifiers. Mastering ElasticSearch
Like the access modifiers that started this discussion, bounds checking is generally thought of in terms other than security: the prime benefit to bounds checking is that it leads to fewer bugs and more robust programs. Java Security 2nd Edition
Documented annotations should be treated just like other modifiers such as protected or static for documentation purposes. Core Java Volume II Advanced Features 9th Edition
In order to make a class cloneable, it may be necessary to remove final modifiers from some fields. Effective Java 2nd Edition
In this release of Java, there are no defined modifiers. Java 7 New Features Cookbook
To return to line 5, the public and static keywords are modifiers. Java in a Nutshell 5th Edition
When using the -O option with javac prior to SDK 1.3, the compiler considered inlining methods defined with any of the following modifiers: private, static, or final. Java Performance Tuning 2nd Edition
In any case, constants are typically defined with the static and final modifiers: Java 7 Recipes A Problem-Solution Approach
It is similar in concept to MapReduce [14], but with a natural graph API and much more efficient support for iterative computations over the graph. Pregel - A System for Large-Scale Graph Processing
A friend has told him about agile development, a time-boxed iterative approach for implementing and releasing software. Gradle in Action
So it is ideal for iterative and responsive Big Data applications. Hadoop Operations and Cluster Management Cookbook
Examples of (ab)uses include forking web servers and gang-scheduled computation of iterative workloads. Apache Hadoop YARN Yet Another Resource Negotiator
Most of these problems require iterative steps and/or knowledge of complex algorithms, which can be cumbersome to implement in MapReduce. Hadoop RealWorld Solutions Cookbook
Finally, although not fully transparent, Haloop [4] provides task-level memoization techniques for memoization in the context of iterative data processing applications. Incoop MapReduce for Incremental Computations
How it works: The Mahout CVB version of LDA implements the Collapsed Variational Bayesian inference algorithm using an iterative MapReduce approach: Hadoop MapReduce Cookbook
A basic implementation uses a sequential iterative process to traverse the entire graph from the starting node, as seen in the algorithm presented in figure 7.7 Hadoop in Practice
Iterative application: If we apply this algorithm once, we will get node 1 marked as Done, several more (its immediate neighbors) as Current, and a few others as Pending. Hadoop Beginners Guide
In order to reassemble the files into a single sorted file, we open some number of files (defined by io.sort.factor), and begin the iterative process of comparing the heads of the streams, picking the lowest key, and writing the result to a new file. Hadoop Operations
Notice that in case of conflicting ACLs, the first match wins. Monitoring with Ganglia
If no ACLs have been configured, we will get an output similar to the following: Hadoop Operations and Cluster Management Cookbook
What is not covered by this LDIF example is the adding of objectClasses and Access Control Lists (ACLs) Jenkins Continuous Integration Cookbook
Service-level authorization is a capability that provides access control lists (ACLs) of which users have permission to access particular services. Professional Hadoop Solutions
By default, all ACLs are set to *, which means that all users have permission to access each service, but on a real cluster you should lock the ACLs down to only those users and groups that should have access. Hadoop:The Definitive Guide 3rd Edition
Each node has various meta-data, including three names of access control lists (ACLs) used to control reading, writing and changing the ACL names for the node. Chubby - The lock service for loosely-coupled distributed systems
We displayed the list of current ACLs only to see what they were, and to eventually verify that the attributes for the file have been changed. Java 7 New Features Cookbook
The metadata server is also responsible for storing ACLs (Access Control Lists) and other security related information about the tables. Tenzing - A SQL Implementation On The MapReduce Framework
When enabling security, this feature should be enabled as well and meaningful ACLs configured. Hadoop Operations
For example, does user X have the necessary permissions to execute program Y on system Z? Such privileges are often referred to as access control lists (ACLs) and determine who or what can access a given resource to perform a given operation. ActiveMQ in Action
In the latest Ganglia versions, some of these metrics now have modules that report their values, including NFS. Monitoring with Ganglia
There are installation methods other than FTP; for example, we can also use NFS and HTTP. Hadoop Operations and Cluster Management Cookbook
Traditional solutions, exemplified by NFS [20], provide a straightforward model in which a server exports a file system hierarchy that clients can map into their local name space. Ceph A Scalable, High-Performance Distributed File System
This time is mostly dominated by disk IO over NFS (since our disks are spread over a number of machines) The anatomy of a large-scale hypertextual Web search engine
Scribe supports multiple data sinks, including HDFS, regular filesystems, and NFS. Hadoop in Practice
Previously, data generated by Scribe was stored in expensive and hard to manage NFS servers. Apache Hadoop Goes Realtime at Facebook
NFS: may not be a good fit for many deployments, as NFS is an external device, costly, with less control over timeouts, etc. Namenode High Availability
As the dfs.name.dir directory on the SNN is unused, we can expose it to the NameNode via the Network File System (NFS) Hadoop in Action
Be it block storage via a Storage Area Network (SAN) or file-based via Network File System (NFS) or similar protocols, these approaches constrain Hadoop by introducing unnecessary bottlenecks and additional shared devices that would have a critical impact on failure. Hadoop Beginners Guide
Currently, an NFS filer is required as the shared storage, although there are plans to remove this dependency.2 As the active namenode writes to the edits log, the standby namenode is constantly replaying transactions to ensure it is up to date and ready to take over in the case of failure. Hadoop Operations
You will pull in the Jenkins RSS feeds using a Firefox add-on. Jenkins Continuous Integration Cookbook
You can also subscribe to RSS feeds for various levels of log messages. Jenkins The Definitive Guide
RabbitMQ development RSS feed—The RabbitMQ Mercurial repository offers an RSS feed with the latest code changes to the server. RabbitMQ in Action
Tip: Other ideas for Puppet report processors include RSS feeds for new reports, IRC, XMPP or instant messaging, or SMS notifications of new reports. Pro Puppet
It’s employed by many standards today such as SOAP and RSS, and used as an open data format for products such as Microsoft Office. Hadoop in Practice
Streamy crawls thousands of RSS feeds and aggregates hundreds of millions of items from them. Hadoop:The Definitive Guide 3rd Edition
To emulate this use case, the book makes use of a freely available data set on the Internet: the Delicious RSS feed. HBase:The Definitive Guide
You can create Atom or RSS feeds in just a few lines that transform database records into the appropriate XML. Beginning Scala
Then, in the early part of this century, RSS feeds, blogs, and wikis started cropping up, adding automation and linking of reference data in new ways that stretched the relational model. Cassandra The Definitive Guide
How It Works If you’ve ever seen a simple RSS reader or the Javadocs, you’ll notice that the screen is divided into sections with dividers that allow the user to adjust. Java 7 Recipes A Problem-Solution Approach
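As a rough illustration of consuming such a feed, here is a small JDK-only sketch that prints the channel and item titles of an RSS document; the feed URL is a placeholder:

    import java.net.URL;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    public class RssTitles {
        public static void main(String[] args) throws Exception {
            // Placeholder feed URL; substitute any RSS 2.0 feed.
            URL feed = new URL("http://example.com/feed.rss");
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(feed.openStream());
            // Prints the channel title followed by each item title.
            NodeList titles = doc.getElementsByTagName("title");
            for (int i = 0; i < titles.getLength(); i++) {
                System.out.println(titles.item(i).getTextContent());
            }
        }
    }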
Through the membership key, you can block/unblock datagrams from different addresses, drop membership, get the channel and/or multicast group for which this membership key was created, and more. Pro Java 7 NIO.2
Troubleshooting Most problems with sFlow deployments occur because the sFlow datagrams are dropped somewhere between the sFlow agent and gmond. Monitoring with Ganglia
Information is exchanged between two hosts in the form of IP packets, also known as IP datagrams. Java Network Programming and Distributed Computing
In this book, we cover only the Transmission Control Protocol (TCP).The Java platform also supports the User Datagram Protocol (UDP), which can be used to send packets (also called datagrams) with much less overhead than that of TCP. Core Java Volume II Advanced Features 9th Edition
This class implements a communication channel based on network datagrams. Java in a Nutshell 5th Edition
The internet layer fragments the segments into IP datagrams of the necessary size for the local network and passes them to the host-to-network layer for transmission onto the wire. Java Network Programming 4th Edition
The purpose of these two protocols is identical—to send and receive data packets (datagrams) over the network. ActiveMQ in Action
Its sequencing with respect to other datagrams between the same two endpoints is not guaranteed. Fundamental Networking in Java
Problem You would like to broadcast datagrams to zero or more hosts identified by a single address. Java 7 Recipes A Problem-Solution Approach
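A minimal JDK sketch of the send/receive datagram model these excerpts describe; the port number is arbitrary:

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;

    public class DatagramDemo {
        public static void main(String[] args) throws Exception {
            try (DatagramSocket receiver = new DatagramSocket(9876);
                 DatagramSocket sender = new DatagramSocket()) {
                byte[] payload = "hello".getBytes("UTF-8");
                sender.send(new DatagramPacket(payload, payload.length,
                        InetAddress.getLoopbackAddress(), 9876));

                byte[] buf = new byte[512];
                DatagramPacket in = new DatagramPacket(buf, buf.length);
                receiver.receive(in); // blocks until a datagram arrives
                System.out.println(new String(in.getData(), 0, in.getLength(), "UTF-8"));
            }
        }
    }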
In this chapter, we will use VisualVM (http://visualvm.java.net/index.html) as the application monitoring tool. Pro Spring 3
You start it in this manner so you can connect a local JMX console (like JConsole or Java VisualVM) to it. Spring in Practice
You can try to reproduce the problem locally by downloading the file that the task is failing on and running the job locally, possibly using a debugger such as Java’s VisualVM. Hadoop:The Definitive Guide 3rd Edition
This is what we’ll look at in the next section on how to monitor the Course Manager application using tools like VisualVM. Spring Roo in Action
The following graphical tools can be used to monitor the HotSpot VM: JConsole, VisualGC, and VisualVM. Java Performance
Solution 2 Analyze your program’s memory consumption by using VisualVM (an external GNU Open Source Java profiler and visualization tool).To use and install VisualVM, go to http://visualvm.java.net/, download and follow the steps for installation (usually uncompressing in the folder of your choice and then clicking the provided executable) Java 7 Recipes A Problem-Solution Approach
As illustrated in figure 20.1, exporting Spring beans as JMX MBeans makes it possible for a JMX-based management tool such as JConsole or VisualVM to peer inside a running application to view the beans’ properties and invoke their methods. Spring in Action 4th Edition
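The numbers that JConsole or VisualVM display over JMX can also be read in-process through the platform MXBeans; a small sketch:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.lang.management.ThreadMXBean;

    public class JmxPeek {
        public static void main(String[] args) {
            MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();
            System.out.println("heap used:    " + mem.getHeapMemoryUsage().getUsed());
            System.out.println("live threads: " + threads.getThreadCount());
        }
    }

To let a remote JConsole or VisualVM attach, the JVM is typically started with the com.sun.management.jmxremote.* system properties.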
Mesos uses resource hoarding to achieve gang scheduling, and can potentially deadlock as a result. Omega - flexible, scalable schedulers for large compute clusters
The system runs over the Mesos cluster manager [17], allowing it to share resources with Hadoop, MPI and other applications. Resilient Distributed Datasets - A Fault-Tolerant Abstraction for In-Memory Cluster Computing
HoD proved itself as a versatile platform, anticipating some qualities of Mesos[17], which would extend. Apache Hadoop YARN Yet Another Resource Negotiator
Paco is an evangelist for the Mesos and Cascading open source projects. Storm Real-Time Processing Cookbook
Abstract We present Mesos, a platform for sharing commodity clusters between multiple diverse cluster computing frameworks, such as Hadoop and MPI. Mesos A Platform for Fine-Grained Resource Sharing in the Data Center
The screenshot in Figure 10-10 shows the word cloud generated from a MapReduce view of CouchDB’s archived tweets. CouchDB The Definitive Guide
You need to be able to provide credentials so that your plugin can send tweets. Jenkins Continuous Integration Cookbook
In order to get the data we needed, we decided to use the Twitter river plugin to index some public tweets from Twitter. Mastering ElasticSearch
Let’s say you have a list of strings representing tweets people are sending to you. Java 8 in Action Lambdas, Streams and Functional-style Programming
By monitoring, collecting, and aggregating data from various Internet sources such as blogs, boards, news feeds, tweets, and social media, companies are using Hadoop to extract and aggregate information about their products, services, and competitors, discovering patterns and revealing upcoming trends important for understanding their business. Professional Hadoop Solutions
HBase stores structured and semistructured data naturally so you can load it with tweets and parsed log files and a catalog of all your products right along with their customer reviews. HBase in Action
Imagine you wanted to join that set of users with all their tweets, which is a much larger dataset than the set of users. Hadoop in Practice
For example, emitting a stream of trend analysis by processing a stream of tweets. Apache Kafka
Something seems to be broken: we have 12 tweets in our indices, and only one. Elasticsearch The Definitive Guide
You can use what’s likely a familiar data model (Twitter’s) and see how users, time lines, and tweets all fit into a simple Cassandra data model. Cassandra The Definitive Guide
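Picking up the list-of-tweets example above, a small Java 8 streams sketch (the tweet contents are made up):

    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collectors;

    public class TweetFilter {
        public static void main(String[] args) {
            List<String> tweets = Arrays.asList(
                    "I love Hadoop", "Java 8 streams are neat", "lunch time");
            // Keep only tweets mentioning Java, normalized to upper case.
            List<String> javaTweets = tweets.stream()
                    .filter(t -> t.toLowerCase().contains("java"))
                    .map(String::toUpperCase)
                    .collect(Collectors.toList());
            System.out.println(javaTweets); // [JAVA 8 STREAMS ARE NEAT]
        }
    }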
Each document has to be converted from BSON to a JavaScript object and then run through the "$where" expression. MongoDB The Definitive Guide 2nd Edition
The format of the document in which the information is stored in MongoDB is BSON (binary JSON) Scala in Action
Finally, you don’t need to worry about mapping the Java Article object to MongoDB’s native BSON format,8 because Spring Data MongoDB also handles that for you. Spring in Practice
In addition to standard relational data, Drill supports (using ANSI SQL extensions) hierarchical data, including XML, JavaScript Object Notation (JSON), Binary JSON (BSON), Avro, protocol buffers, and so on. Professional Hadoop Solutions
The database stores and retrieves documents, which can be XML, JSON, BSON, and so on. NoSQL Distilled
MongoDB uses these BSON objects to efficiently serialize, transfer, and store data. Hadoop RealWorld Solutions Cookbook
MongoDB doesn’t actually use JSON to store the data; rather, it uses an open data format developed by the MongoDB team called BSON (pronounced Bee-Son), which is short for Binary-JSON. The Definitive Guide to MongoDB
Chapter 3 introduces the MongoDB drivers and MongoDB’s data format, BSON. MongoDB in Action
MongoDB stores these documents in BSON, a binary derivative of JSON. Spring Data Modern Data Access for Enterprise Java
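A hedged sketch of building such a document with the MongoDB Java driver's org.bson.Document class, assuming a 3.x-or-later driver on the classpath; the driver handles the conversion to binary BSON:

    import java.util.Arrays;
    import org.bson.Document;

    public class BsonSketch {
        public static void main(String[] args) {
            Document article = new Document("title", "BSON in brief")
                    .append("views", 42)
                    .append("tags", Arrays.asList("mongodb", "bson"));
            // toJson() renders the JSON view; the stored/wire form is binary BSON.
            System.out.println(article.toJson());
        }
    }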
The support for Hibernate, MyBatis, JDO, and JPA makes already simple APIs even simpler, thus easing the burden on developers. Pro Spring 3
For example, it provides a consistent and transparent way to work with a variety of different data access frameworks (e.g., JDBC, JPA, Hibernate, iBATIS, and JDO) Pro Spring MVC
A quick note about iBATIS Apache iBATIS was retired in 2010 and has been superseded by MyBatis (mybatis.org) Spring in Practice
There are many other tools for such upgrades, including Liquibase [Liquibase], MyBatis. NoSQL Distilled
As an example, consider the ORM module in the Spring Framework: this module includes support for popular ORM tools such as Hibernate, iBATIS, and OpenJPA. Spring Dynamic Modules in Action
Just configure your Spring application with the appropriate JDBC driver and data source, and then start using the Spring JdbcTemplate API, or MyBatis (a common SQL-mapping API), or even a non-SQL database platform. Spring Roo in Action
It significantly simplified the implementation of data access layers, regardless of whether JDBC, Hibernate, TopLink, JDO, or iBatis was used as persistence technology. Spring Data Modern Data Access for Enterprise Java
Spring doesn’t attempt to implement its own ORM solution but does provide hooks into several popular ORM frameworks, including Hibernate, Java Persistence API, Java Data Objects, and iBATIS SQL Maps. Spring in Action 4th Edition
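As a concrete taste of that simplification, a minimal Spring JdbcTemplate sketch; the table and column names are made up:

    import java.util.List;
    import javax.sql.DataSource;
    import org.springframework.jdbc.core.JdbcTemplate;

    public class UserDao {
        private final JdbcTemplate jdbc;

        public UserDao(DataSource dataSource) {
            this.jdbc = new JdbcTemplate(dataSource);
        }

        // Row mapping as a lambda; no connection or statement handling needed.
        public List<String> findNames() {
            return jdbc.query("SELECT name FROM users",
                    (rs, rowNum) -> rs.getString("name"));
        }
    }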
Iff the predicate classIsTypeSafe is not true, the type checker must throw the exception VerifyError to indicate that the class file is malformed. The Java Virtual Machine Specification Java SE 7 Edition
For every other kind of expression, the expression can throw an exception class E iff one of its immediate subexpressions can throw E. The Java Language Specification Java SE 7 Edition
Apply a predicate p to all elements of this iterable object and return true, iff there is at least one element for which p yields true. Programming Scala
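The contract in that comment is exactly what Java's Stream.anyMatch provides; a minimal equivalent sketch:

    import java.util.Arrays;
    import java.util.List;
    import java.util.function.Predicate;

    public class AnyMatch {
        // Returns true iff p yields true for at least one element.
        static <T> boolean exists(List<T> xs, Predicate<T> p) {
            return xs.stream().anyMatch(p);
        }

        public static void main(String[] args) {
            List<Integer> xs = Arrays.asList(1, 3, 5, 8);
            System.out.println(exists(xs, n -> n % 2 == 0)); // true
            System.out.println(exists(xs, n -> n > 10));     // false
        }
    }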
If it does not find that file, it automatically searches for Employee.java and compiles it. Core Java Volume I Fundamentals 9th Edition
This code compiles and executes on Android without any conspicuous error. Monitoring with Ganglia
The build task compiles your code, runs your tests, and assembles the JAR file, all in the correct order. Gradle in Action
The apiClasses task is the task that compiles the Java source files, and processes the resources into the build directory, so Gradle will first invoke the apiClasses task before the apiJar task. Gradle Effective Implementation Guide
Query Compiler – The component that compiles HiveQL into a directed acyclic graph of map/reduce tasks. Hive A Petabyte Scale Data Warehouse Using Hadoop
For example, the build task in a Java project compiles and tests all of the code. Gradle Beyond The Basics
When we run the Pig script, Pig internally compiles Pig commands to MapReduce jobs in an optimized form and runs it in a MapReduce cluster. Hadoop MapReduce Cookbook
Pig compiles all the statements together and can locate the dependency and redundancy. Hadoop in Action
The command compiles the Java examples and starts up a simple JMS consumer. ActiveMQ in Action
All commands and queries go to the Driver, which compiles the input, optimizes the computation required, and executes the required steps, usually with MapReduce jobs. Programming Hive
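Compilation can also be driven programmatically from the JDK itself; a small javax.tools sketch (the source file name is a placeholder, echoing the Employee.java example above):

    import javax.tools.JavaCompiler;
    import javax.tools.ToolProvider;

    public class CompileDemo {
        public static void main(String[] args) {
            // Requires a JDK, not a bare JRE (returns null there); 0 means success.
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            int result = compiler.run(null, null, null, "Employee.java");
            System.out.println(result == 0 ? "compiled" : "failed");
        }
    }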
In the past, you had to come up with a solution for this problem, but no more! Flume started as a project out of Cloudera when their integration engineers had to keep writing tools over and over again for their customers to import data automatically. Apache Flume Distributed Log Collection for Hadoop
Hadoop was created by Doug Cutting, who now works as an architect at Cloudera and serves as chair of the Apache Software Foundation. Monitoring with Ganglia
Prerequisites For those of you who are new to Hadoop, I strongly urge you to try Cloudera’s open source Distribution for Hadoop (http://www.cloudera.com/hadoop) Pro Hadoop
The success of MR led to a wide range of third-party implementations (notably open-source Hadoop [15]), and a number of hybrid systems that combine parallel DBMSs with MR, offered by vendors like Aster, Cloudera, Greenplum, and Vertica. Dremel - Interactive Analysis of WebScale Datasets
Prior to the release of MRUnit by Cloudera, even the simplest tests running in local mode would have to read from the disk and take several seconds each to set up and run. Hadoop RealWorld Solutions Cookbook
After tackling Hadoop’s architecture and its weaknesses you’re probably ready to roll up your sleeves and get hands-on with Hadoop, so let’s take a look at how to get the Cloudera Distribution for Hadoop (CDH) up and running on your system, which you can use for all the examples in this book. Hadoop in Practice
There are Hadoop distributions from the large, established enterprise vendors, including EMC, IBM, Microsoft, and Oracle, as well as from specialist Hadoop companies such as Cloudera, Hortonworks, and MapR. Hadoop:The Definitive Guide 3rd Edition
Cloudera is now bringing Hadoop to a broader enterprise audience. Hadoop in Action
Though Doug has now moved on to Cloudera (another prominent startup supporting the Hadoop community) and much of Yahoo's Hadoop team has been spun off into a startup called Hortonworks, Yahoo remains a major Hadoop contributor. Hadoop Beginners Guide
Special thanks to Christophe Bisciglia for giving me an opportunity at Cloudera and for the advice along the way. Hadoop Operations
The libvirt project has created a common set of tools for managing virtualization resources on different virtualization platforms, currently including: Xen, QEMU, KVM, LXC, OpenVZ, User Mode Linux, VirtualBox, and VMware ESX and GSX. Monitoring with Ganglia
The nova-compute process is primarily a worker daemon that creates and terminates virtual machine instances via hypervisor's APIs (XenAPI for XenServer/XCP, libvirt for KVM or QEMU, VMwareAPI for VMware, etc.) bk-compute-adminguide-grizzly
If our system is running within Xen virtualization, those numbers can be closer to 200–300ms per gigabyte of memory used by Redis, depending on the Xen configuration. Redis in Action
While libvirt may be the most common, nova-compute also uses the Xen API, vSphere API, Windows Management Interface, and others to support other virtualization technologies. Deploying OpenStack
However, the majority of OpenStack development is done on systems using KVM and Xen-based hypervisors. bk-admin-guide-cloud-latest
Hardware device drivers can be provided by the hypervisor, either in a special virtual machine for isolation (for example in Oracle VM- and Xen-based solutions) or as part of the hypervisor itself (for example in VMware ESX) Oracle JRockit The Definitive Guide
To boot a Xen image from a volume, the image you launch in must be the same type, fully virtualized or paravirtualized, as the one on the volume. user-guide
However, C++ has a special initializer list syntax, such as. Core Java Volume I Fundamentals 9th Edition
A compiler writer can work around this bug by limiting the maximum size of the generated Java Virtual Machine code for any method, instance initialization method, or static initializer (the size of any code array) to 65534 bytes. The Java Virtual Machine Specification Java SE 7 Edition
The value of the variable Point.origin can never change, so it always refers to the same Point object, the one created by its initializer. The Java Language Specification Java SE 7 Edition
The delegate injection point follows the normal rules for injection and therefore must be an injected field, initializer method parameter, or bean constructor method parameter: Java EE 7 Essentials
This statement causes the driver class to be loaded, thereby executing a static initializer that registers the driver. Core Java Volume II Advanced Features 9th Edition
The version that follows avoids this inefficiency with a static initializer: Effective Java 2nd Edition
A variable declaration can also include an initializer: an expression that specifies an initial value for the variable. Java in a Nutshell 5th Edition
Rather than initialize explicitly in the constructor (or class static initializer), it is left until access time for the variable to be initialized, using a test for null to determine if it has been initialized. Java Performance Tuning 2nd Edition
As of this writing, the JLS does not yet clearly state that the visible initial value read for an initialized final field is the value assigned in its initializer or constructor. Concurrent Programming in Java Design Principles and Pattern
The initGrades2() method combines array creation and initialization in one line using the special array initializer syntax: Java 7 Recipes A Problem-Solution Approach
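To keep the variants mentioned in these excerpts straight, a compact sketch showing the array initializer syntax, a static initializer, and an instance initializer:

    public class InitDemo {
        static final int[] GRADES = {90, 85, 72};   // array initializer syntax

        static int counter;
        static {                                     // static initializer: runs once,
            counter = Integer.parseInt("0");         // when the class is loaded
        }

        private final long createdAt;
        {                                            // instance initializer: runs
            createdAt = System.nanoTime();           // before each constructor body
        }

        public static void main(String[] args) {
            System.out.println(GRADES.length + " " + counter + " "
                    + new InitDemo().createdAt);
        }
    }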
Java has an extensive library of routines for coping with TCP/IP protocols like HTTP and FTP. Core Java Volume I Fundamentals 9th Edition
The backup server could either be accessible over a mounted shared drive or you could transfer the file over FTP. Gradle in Action
We will learn steps to configure this server, including the configuration of the following two services: DHCP and FTP. Hadoop Operations and Cluster Management Cookbook
Once a jar file is created, you can FTP it to the cluster’s edge node and execute it using the command shown in Listing 3-4 Professional Hadoop Solutions
This recipe requires an SSH Server and SSH Client, but any method of transferring binary data such as FTP is sufficient. Cassandra High Performance Cookbook
Using curl curl is a lightweight command-line tool for transferring data to or from a server using various protocols, such as HTTP, HTTPS, and FTP. Beginning CouchDB
These techniques are particularly helpful in situations where the data you’re working with is in a binary or semistructured form, or is only accessible via interfaces such as HTTP or FTP. Hadoop in Practice
It functions much like a GUI-based FTP client in managing remote storage. Hadoop in Action
The act of copying files to or from HDFS is more like SCP or FTP than working with an NFS mounted filesystem, for example. Hadoop Operations
The blob itself is transferred out of bounds, by either FTP or HTTP. ActiveMQ in Action
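A minimal JDK sketch of this kind of URL-based transfer; java.net.URL handles http:, https:, and ftp: schemes out of the box (the URL is a placeholder):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class FetchDemo {
        public static void main(String[] args) throws Exception {
            // Works the same with an ftp:// URL for anonymous FTP.
            URL url = new URL("http://example.com/data.txt");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }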
Part of the transition from a single executable also included splitting out other components such as the Apache Portable Runtime (APR) library, which was previously being statically linked with gmond as well. Monitoring with Ganglia
You may prefer Java for its statically typed nature and library support to implement your business logic. Gradle in Action
Structural conditions — These are statically defined in a Workflow DAG (transitions and fork-join construct) Professional Hadoop Solutions
The mapper class defines a statically initialized position to read the country from each line, as well as the regex pattern to split each line. Hadoop RealWorld Solutions Cookbook
Many US hotels and conference centers are still woefully unprepared for a few hundred software developers, each with two or three devices on the wireless network and a seemingly insatiable appetite for bandwidth. While I strongly prefer dependencies to be managed by my build tool, it might make sense for me to prepare lab materials with all of the dependencies statically located in the project in the style of old Ant builds. For some Java frameworks and APIs, chasing all of these JARs down by hand can be a burden. Gradle Beyond The Basics
In contexts where the Writable is defined by type, such as in SequenceFile keys or values or as input to MapReduce in general, you need to subclass ArrayWritable (or TwoDAr rayWritable, as appropriate) to set the type statically. Hadoop:The Definitive Guide 3rd Edition
In more basic designs, the ID is hashed statically to one of a fixed number of database machines. Hadoop in Action
Keyspaces Keyspaces used to be defined statically in an XML configuration file, but as of 0.7, you can use the API to create keyspaces and column families. Cassandra The Definitive Guide
When you want to configure a network of brokers, the first obvious question is, do you know the exact network address of each broker in the network? If the answer is yes, then you can proceed configuring your network statically and also connect your clients to predefined broker URIs. ActiveMQ in Action
The concrete nature (and ideally, number) of all objects and threads comprising the subsystem are statically known. Concurrent Programming in Java Design Principles and Pattern
It consisted of a federation of worker daemons (agents) configured from a centralized master (or masters) via Zookeeper (a federated configuration and coordination system) Apache Flume Distributed Log Collection for Hadoop
The examples at http://hadoop.apache.org/zookeeper/docs/current/recipes.html provide ZooKeeper recipes for two-phase commit, leader election, barriers, queues, and locks. Pro Hadoop
HBase uses Zookeeper for its management to ensure that all of its components are up and running. Professional Hadoop Solutions
In Chapter 10, Third-party Libraries and Applications, the recipes Setting up Zookeeper to support Cages for transactional locking and Using Cages to implement an atomic read and set. Cassandra High Performance Cookbook
The shell script in this recipe assumes that Zookeeper is running on the host localhost on port 2181; you can change this to suit your environment needs. Hadoop RealWorld Solutions Cookbook
Cassandra system elects a leader amongst its nodes using a system called Zookeeper[13] Cassandra A Decentralized Structured Storage System
Create two directories to store the HDFS data and Zookeeper data. Hadoop MapReduce Cookbook
Windows is supported only as a development platform, not as a production platform.) Download a stable release of ZooKeeper from the Apache ZooKeeper releases page at http://zookeeper.apache.org/releases.html, and unpack the tarball in a suitable location: Hadoop:The Definitive Guide 3rd Edition
Our plan is to use Zookeeper, Yahoo’s distributed consensus technology to build an automated failover solution. The Hadoop Distributed File System
Zookeeper, also part of the Hadoop project, is a centralized service for maintaining configuration information and distributed synchronization across nodes in a cluster. Cassandra The Definitive Guide
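A hedged sketch of the basic ZooKeeper client calls these excerpts rely on, assuming the ZooKeeper client library on the classpath; the connection string and znode path are placeholders:

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class ZkDemo {
        public static void main(String[] args) throws Exception {
            // 15s session timeout; the watcher ignores connection events here.
            ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> {});
            zk.create("/demo-config", "v1".getBytes("UTF-8"),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            byte[] data = zk.getData("/demo-config", false, null);
            System.out.println(new String(data, "UTF-8"));
            zk.close();
        }
    }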
Use the ulimit command to increase the permitted number of open files or file descriptors for the rrdcached process. Monitoring with Ganglia
File Descriptors Hadoop Core uses large numbers of file descriptors for MapReduce, and the DFSClient uses a large number of file descriptors for communicating with the HDFS NameNode and DataNode server processes. Pro Hadoop
All this takes processing time and enough sockets (or file descriptors) must be available. CouchDB The Definitive Guide
Writing a plugin is surprisingly easy and doesn’t require a lot of additional descriptors. Gradle in Action
But those descriptor files can be misconfigured, and so we may want to override the descriptors ourselves to ensure the dependencies are correct. Gradle Effective Implementation Guide
Because of this, HBase tends to see issues not normally encountered by MapReduce clients: Running out of file descriptors. Hadoop:The Definitive Guide 3rd Edition
Clients open nodes to obtain handles that are analogous to UNIX file descriptors. Chubby - The lock service for loosely-coupled distributed systems
The new intermediate file, under output/test, is missing the quoted field descriptors. Hadoop in Action
Linux supports, via Pluggable Authentication Modules (PAM) system, the ability to control resources such as file descriptors and virtual memory at the user level. Hadoop Operations
This is common when working with a Java EE application server, since each one requires its own custom configuration file and possibly more deployment descriptors. ActiveMQ in Action
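On a Unix JVM, the process's descriptor usage can be observed through the non-standard com.sun.management MXBean; a hedged, HotSpot-specific sketch:

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;
    import com.sun.management.UnixOperatingSystemMXBean;

    public class FdUsage {
        public static void main(String[] args) {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            // The Unix-specific subtype is only present on Unix-like platforms.
            if (os instanceof UnixOperatingSystemMXBean) {
                UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
                System.out.println("open fds: " + unix.getOpenFileDescriptorCount());
                System.out.println("fd limit: " + unix.getMaxFileDescriptorCount());
            }
        }
    }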
For each named sink group, you need to specify the sinks it contains using the sinks property consisting of a space-delimited list of sink names: Apache Flume Distributed Log Collection for Hadoop
Because the output is just a text string, that means you can reformat the content as XML, a tab delimited string, even as a UUencoded string if that is what you need. Getting Started with CouchDB
Then add a list of checks that are delimited with a colon. Monitoring with Ganglia
This recipe will use delimited text as an alternative to storing multiple columns under the same key. Cassandra High Performance Cookbook
Note that the expression passed to the expand() method is a Groovy map literal—it is enclosed by square brackets, and a series of key/value pairs are delimited by commas, with the key and the value themselves separated by colons. Gradle Beyond The Basics
It reads records that are delimited by a specific XML begin and end tag. Hadoop in Practice
Popular formats, such as tab-delimited text files, are natively supported. Hadoop in Action
For example, for delimited text format, the user can specify the delimiter, encoding, quoting, escaping, headers, etc. Tenzing - A SQL Implementation On The MapReduce Framework
For simplicity, we keep to our previous validation strategy of looking for the number of fields and discarding lines that don't break into exactly six tab-delimited fields. Hadoop Beginners Guide
The users.properties file defines each user in a line-delimited manner along with its password, as shown: ActiveMQ in Action
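The validation strategy of keeping only lines with exactly six tab-delimited fields, as a minimal Java sketch:

    public class DelimitedFilter {
        // Keep only lines that split into exactly six tab-delimited fields.
        static boolean isValid(String line) {
            // The -1 limit keeps trailing empty fields instead of dropping them.
            return line.split("\t", -1).length == 6;
        }

        public static void main(String[] args) {
            System.out.println(isValid("a\tb\tc\td\te\tf")); // true
            System.out.println(isValid("a\tb\tc"));          // false
        }
    }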
Using specific rowkey filtering, scanning over a subset is faster than scanning over the entire table. Web Crawling and Data Mining with Apache Nutch
Within the enterprise, all applications can write their events to Cassandra with their own columns and the rowkey of the form appname:timestamp. NoSQL Distilled
It’s often described as a sparse, distributed, persistent, multidimensional sorted map, which is indexed by rowkey, column. HBase in Action
Iterate through all values for the rowkey and scan criteria (defined in the main method) Hadoop in Practice
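A hedged sketch of a rowkey-prefix scan using the HBase 1.x-style client API; the table handle, the "appname:" prefix, and the library availability are all assumptions:

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.filter.PrefixFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RowkeyScan {
        // Scan only rows whose key starts with "appname:" instead of the whole table.
        static void scanByPrefix(Table table) throws Exception {
            Scan scan = new Scan();
            scan.setFilter(new PrefixFilter(Bytes.toBytes("appname:")));
            try (ResultScanner results = table.getScanner(scan)) {
                for (Result r : results) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }
        }
    }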
Originally targeted at embedded network hardware, sFlow collectors now exist for general-purpose operating systems as well as popular applications such as Tomcat, memcached, and the Apache Web Server. Monitoring with Ganglia
More recent versions of HBase also provide a memcached-based intermediate layer between the user and the MapFiles (http://www.danga.com/memcached/) Pro Hadoop
BigTable, Hadoop, SimpleDB, and memcached restrict object lookups by key for exactly these reasons. CouchDB The Definitive Guide
Cache documents and query results using memcached or another caching system. Scaling CouchDB
Service becomes more popular; too many reads hitting the database: add memcached to cache common queries. Hadoop:The Definitive Guide 3rd Edition
Even projects like memcached are lumped into the NoSQL category, as if anything that is not an RDBMS is automatically NoSQL. HBase:The Definitive Guide
Instead, we have used a combination of large clusters of MySQL databases and caching tiers built using memcached[9] Apache Hadoop Goes Realtime at Facebook
Caching in Guava gives us more power and flexibility than using plain HashMap but is not as robust as EHCache or Memcached. Getting Started with Google Guava
This was true of many tools and techniques for scaling data processing, including caching (for example, memcached), replication, sharding, and, of course, MapReduce/ Hadoop. Hadoop in Action
For larger systems, this might include distributed caches such as memcached, EHCache, Oracle Coherence, or other related products. Cassandra The Definitive Guide
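Following the Guava remark above, a minimal in-process cache sketch (size and TTL are arbitrary); memcached fills the same role out of process and across machines:

    import java.util.concurrent.TimeUnit;
    import com.google.common.cache.Cache;
    import com.google.common.cache.CacheBuilder;

    public class QueryCache {
        public static void main(String[] args) {
            Cache<String, String> cache = CacheBuilder.newBuilder()
                    .maximumSize(10_000)
                    .expireAfterWrite(10, TimeUnit.MINUTES)
                    .build();
            cache.put("user:42", "{\"name\":\"Ada\"}");
            // Returns null on a miss; a real service would then hit the database.
            System.out.println(cache.getIfPresent("user:42"));
        }
    }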
He’s then taught the modern concurrency tools provided directly within the Java SDK before embarking upon an adventure through the exciting realms of STM and actors. Programming Concurrency on the JVM
Instead of working with tables and rows, STM controls the access to shared memory. Scala in Action
This method gets the path to the file, an iterable object over the char sequence, a charset to use for encoding, and options specifying how the file is opened. Pro Java 7 NIO.2
Passing this to Charset.forName( ) produces the Charset object that can be used to decode the string. Thinking in Java 4th Edition
You can use the CHARSET(), COLLATION(), and COERCIBILITY() functions to help debug errors related to character sets and collations. High Performance MySQL 3rd Edition
Each supports the same attributes (other than a charset attribute that's only present on the string transformer), so the primary difference is the type of the payload of the outgoing message. Spring Integration in Action
To obtain a Charset, call the static forName method with either the official name or one of its aliases: Core Java Volume II Advanced Features 9th Edition
The third parameter of this method defines the Charset to use. Java 7 New Features Cookbook
Charset and its associated classes, CharsetEncoder and CharsetDecoder, define methods for encoding strings of characters into sequences of bytes and decoding sequences of bytes into strings of characters. Java in a Nutshell 5th Edition
When you create a String instance from a byte array, you must provide a Charset argument to the String constructor so that the platform knows how to perform the mapping from the legacy encoding to Unicode. Java 7 Recipes A Problem-Solution Approach
In the (hopefully unlikely) case that you need a different encoding, you can supply a Charset argument. Java SE 8 for the Really Impatient
THR_LOCK_charset Protects the variables and data structures relevant to character set operations. Understanding MySQL Internals
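A compact sketch of the encode/decode round trip these excerpts describe:

    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;

    public class CharsetDemo {
        public static void main(String[] args) {
            // forName accepts the canonical name or an alias; constants avoid typos.
            Charset utf8 = Charset.forName("UTF-8"); // same as StandardCharsets.UTF_8
            byte[] encoded = "héllo".getBytes(utf8);     // chars -> bytes
            String decoded = new String(encoded, utf8);  // bytes -> chars
            System.out.println(decoded.equals("héllo")); // true
        }
    }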
In addition to extremely large-scale pipelines for Yahoo! Search, scientists optimizing advertising analytics, spam filtering, and content optimization drove many of its early requirements. Apache Hadoop YARN Yet Another Resource Negotiator
See the next recipe, Running the Yahoo! Cloud Serving Benchmark, for a more sophisticated load testing system. Cassandra High Performance Cookbook
Also, pages that have perhaps only one citation from something like the Yahoo! homepage are also generally worth looking at. The anatomy of a large-scale hypertextual Web search engine
You’ve most likely come across Hadoop as it’s been adopted by technology giants like Yahoo!, Facebook, and Twitter to address their big data needs, and it’s making inroads across all industrial sectors. Hadoop in Practice
Around that time, Yahoo! got interested, and quickly put together a team that I joined. Hadoop:The Definitive Guide 3rd Edition
BookKeeper is in production use at Yahoo! for guaranteed delivery of log messages to Hedwig Servers. Namenode High Availability
Doug Cutting got it all started, and Yahoo had the foresight to support it early on. Hadoop in Action
Pig Pig (http://hadoop.apache.org/pig) is a platform for data analytics developed at Yahoo! Cassandra The Definitive Guide
Thanks, Yahoo Yahoo hired Doug Cutting in 2006 and quickly became one of the most prominent supporters of the Hadoop project. Hadoop Beginners Guide
The team at Yahoo! ran into a number of scalability limitations that were difficult to overcome given Hadoop’s existing architecture and design. Hadoop Operations
CouchDB’s JSON documents are great for programmatic access in most environments. CouchDB The Definitive Guide
The Project instance gives you programmatic access to all Gradle features in your build, like task creation and dependency management. Gradle in Action
Introduction Programmatic access to a cluster of Cassandra servers is done though the Application Programmer Interface. Cassandra High Performance Cookbook
Dapper is used widely at Google, both directly through the Dapper user interface and indirectly through the programmatic APIs or applications built on top of those APIs. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
You have programmatic access to counters, and the job history command will also include the counters: Hadoop in Practice
The listStatus() and globStatus() methods of FileSystem take an optional PathFilter, which allows programmatic control over matching: Hadoop:The Definitive Guide 3rd Edition
Programmatic Authentication If you have set up authentication on your keyspace, your client application code will need to log in. Cassandra The Definitive Guide
Using web, commandline, or programmatic interfaces to create objects, which can be everything from text files to images to MP3s, you can store and retrieve your data based on a hierarchical model. Hadoop Beginners Guide
Oozie presents a REST service for programmatic management of workflows and status retrieval. Hadoop Operations
This kind of programmatic modification of that otherwise inaccessible build logic can be very powerful. Building and Testing with Gradle
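A hedged sketch of the listStatus/PathFilter combination, assuming the Hadoop client library; the directory and suffix are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListLogs {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // PathFilter gives programmatic control beyond what globbing offers.
            FileStatus[] matches = fs.listStatus(new Path("/logs"),
                    path -> path.getName().endsWith(".gz"));
            for (FileStatus status : matches) {
                System.out.println(status.getPath());
            }
        }
    }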
HBase uses CMS GC as its default garbage collector for the old generation. HBase Administration Cookbook
For instance, if you are using the blog.posts collection in the cms database, the namespace of that collection would be cms.blog.posts. MongoDB The Definitive Guide 2nd Edition
In the XML world he is known as the developer of DB Generator for the Apache Cocoon project, the open source projects DBPrism and DBPrism CMS, the Lucene-Oracle integration by using Oracle JVM Directory implementation and in the Restlet.org project the Oracle XDB Restlet Adapter, an alternative to write native REST web services inside the database-resident JVM. Mastering ElasticSearch
A content management system (CMS) allows for storing and serving content, as well as managing everything from a central location. HBase in Action
This is mitigated by using the Concurrent Mark-Sweep Collector (CMS), enabled with the latter option shown earlier. HBase:The Definitive Guide
Use the name or id attributes to name these variables as follows: public WebElement heightCMS; public WebElement weightKg; public WebElement Calculate; public WebElement bmi; public WebElement bmi_category; Selenium Testing Tools Cookbook
Use Liferay tools (CMS, WCM, collaborative API and social API) to create your own Web sites and WAP sites with hands-on examples. Java 7 New Features Cookbook
The CMS API (described later in this chapter) defines an API similar to JMS, and since the pyactivemq is just a wrapper around a CMS API implementation, we can expect a JMS-like API for Python. ActiveMQ in Action
Figure 3-8a illustrates how a garbage collection cycle works in CMS. Java Performance
In the XML world, he is known as the developer of the DB Generator for the Apache Cocoon project, the open source projects DBPrism and DBPrism CMS, the Lucene-Oracle integration by using Oracle JVM Directory implementation, and the Restlet.org project – the Oracle XDB Restlet Adapter, an alternative to writing native REST web services inside the database resident JVM. Apache Solr 4 Cookbook
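Which collectors a JVM is actually running (with CMS enabled, HotSpot typically reports ParNew plus ConcurrentMarkSweep) can be listed through the GC MXBeans; a small sketch:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcList {
        public static void main(String[] args) {
            for (GarbageCollectorMXBean gc :
                    ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.println(gc.getName()
                        + " collections=" + gc.getCollectionCount()
                        + " timeMs=" + gc.getCollectionTime());
            }
        }
    }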
Flume was created to meet this need and create a standard, simple, robust, flexible, and extensible tool for data ingestion into Hadoop. Apache Flume Distributed Log Collection for Hadoop
Ganglia 1.0 was not extensible, whereas Ganglia today can publish custom metrics via Python and C modules or a simple command-line tool. Monitoring with Ganglia
The nested data model backs a platform-neutral, extensible mechanism for serializing structured data at Google. Dremel - Interactive Analysis of WebScale Datasets
In fact, the AM can run arbitrary user code, and can be written in any programming language since all communication with the RM and NM is encoded using extensible communication protocols—as an example consider. Apache Hadoop YARN Yet Another Resource Negotiator
Attain a more flexible and extensible framework with the world class Axis2 architecture. Mastering ElasticSearch
It has extensible rules that can permit or deny traffic based on many attributes, including, but not limited to, source IP, destination IP, source port, and destination port. Cassandra High Performance Cookbook
The DDL forms a clear, compact, extensible notation describing the layout of the binary records and naming the fields. Interpreting the Data Parallel Analysis with Sawzall
One of the main design goals driving toward Hadoop’s major 1.0 release is a stable and extensible MapReduce API. Hadoop in Action
First we’ve started the XMPP transport connector on port 61222 to expose the broker to clients via XMPP (the Extensible Messaging and Presence Protocol) ActiveMQ in Action
Ant provided a way to describe a Java build that was extensible and platform independent, which was a large part of its success. Building and Testing with Gradle
If your view of the world is different, Gradle gives you the option of customizing the conventions. Gradle in Action
Next, we will go to a web page for customizing services, for example, configuring the location for the NameNode directory. Hadoop Operations and Cluster Management Cookbook
Customizing the Gradle wrapper We can change the names of the script files that are generated with the scriptFile property of the Wrapper task. Gradle Effective Implementation Guide
Our books and publications share the experiences of your fellow IT professionals in adapting and customizing today's systems, applications, and frameworks. Cassandra High Performance Cookbook
CouchDB’s configuration file includes many options for customizing your CouchDB installation. Beginning CouchDB
While this is the fundamental role of configurations, they also provide methods for querying the declared dependencies and customizing the resolution process. Gradle Beyond The Basics
The site XML files (those with site in their filenames) will grow as you start customizing your Hadoop cluster, and it can quickly become challenging to keep track of what changes you’ve made, and how they relate to the default configuration values. Hadoop in Practice
[Flattened table-of-contents excerpt, listing chapter topics: the inverted index; analysis and analyzers; mapping and core field types; the query DSL (queries vs. filters, combining and validating them); sorting; and relevance scoring.] Elasticsearch The Definitive Guide
In using FileInputFormat you focus on customizing RecordReader, which is responsible for parsing an input split into records and then parsing each record into a key/value pair. Hadoop in Action
To begin customizing the ActiveMQ resource adapter to start up an embedded ActiveMQ instance, you need to make some changes to META-INF/ra.xml as shown in the following steps: ActiveMQ in Action
Hive supports a SQL-like query language called HiveQL (HQL) to access data in its table. HBase Administration Cookbook
Similar to Apache Pig, the Hive runtime engine translates HiveQL statements into a sequence of MapReduce jobs for execution. Hadoop Operations and Cluster Management Cookbook
As will be seen throughout this book in the resemblances sections, Pig and SQL (or HiveQL) can be significantly more terse than the raw Hadoop implementations in Java. MapReduce Design Patterns
HiveQL supports ad-hoc queries, joins, summarization, and so on, by transparently converting them into MapReduce execution. Professional Hadoop Solutions
Hive lets you build analytics and manage data using a declarative, SQL-like dialect known as HiveQL. Hadoop RealWorld Solutions Cookbook
Discussion The following HiveQL creates a table which you’ll partition by date as well as by country. Hadoop in Practice
Its query language, HiveQL, is based on SQL, and anyone who is familiar with SQL would have little trouble writing queries in HiveQL. Hadoop:The Definitive Guide 3rd Edition
Its target users remain data analysts who are comfortable with SQL and who need to do ad hoc queries, summarization, and data analysis on Hadoop-scale data. You interact with Hive by issuing queries in a SQL-like language called HiveQL. Hadoop in Action
In particular, it provides a query language called HiveQL that closely resembles the common Structured Query Language (SQL) standard. Hadoop Beginners Guide
Developers, analysts, and existing third-party packages already know and speak SQL (Hive’s dialect of SQL is called HiveQL and implements only a subset of any of the common standards) Hadoop Operations
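A hedged sketch of issuing HiveQL from Java through the HiveServer2 JDBC driver; the connection URL, credentials, and table are placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQuery {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection con = DriverManager.getConnection(
                         "jdbc:hive2://localhost:10000/default", "user", "");
                 Statement stmt = con.createStatement();
                 // Hive compiles this HiveQL into MapReduce (or Tez/Spark) jobs.
                 ResultSet rs = stmt.executeQuery(
                         "SELECT country, COUNT(*) FROM logs GROUP BY country")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            }
        }
    }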
Intuitively speaking, wildcards with supertype bounds let you write to a generic object, while wildcards with subtype bounds let you read from a generic object. Core Java Volume I Fundamentals 9th Edition
The most common modifiers, which you are surely familiar with, are wildcards. Mastering ElasticSearch
Java supports wildcards as a flexible mechanism to allow subtyping for generics (more generally referred to as use-site variance) Java 8 in Action Lambdas, Streams and Functional-style Programming
We can specify all the part prefixed subdirectories as input to the dependent job using wildcards. Hadoop MapReduce Cookbook
Bash by default performs globbing, meaning that it’ll expand wildcards like “*” Hadoop in Practice
Fieldnames can be specified with wildcards—any field which matches the wildcard pattern will be included in the search. Elasticsearch The Definitive Guide
If an input parameter is both a producer and a consumer, then wildcard types will do you no good: you need an exact type match, which is what you get without any wildcards. Effective Java 2nd Edition
Lower-bounded wildcards are much less commonly used than upperbounded wildcards, and we discuss them later in the chapter. Java in a Nutshell 5th Edition
Chapter 11 presents advanced features provided by ActiveMQ such as wildcards and composite destinations, advisory messages, virtual topics, some info about ActiveMQ plug-ins, and an introduction to message routing with Apache Camel. ActiveMQ in Action
Wildcards are used to declare parameterized types that are either bounded or unbounded. Java 7 Recipes A Problem-Solution Approach
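The producer/consumer guidance above, often remembered as PECS (producer-extends, consumer-super), in a minimal sketch:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Collection;
    import java.util.List;

    public class Pecs {
        // src is a producer (we read from it): ? extends.
        // dst is a consumer (we write to it): ? super.
        static <T> void copy(Collection<? extends T> src, Collection<? super T> dst) {
            for (T item : src) {
                dst.add(item);
            }
        }

        public static void main(String[] args) {
            List<Integer> ints = Arrays.asList(1, 2, 3);
            List<Number> nums = new ArrayList<>();
            copy(ints, nums); // T is inferred as Integer
            System.out.println(nums); // [1, 2, 3]
        }
    }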
And finally, Appendix F is an introduction to Sphinx, a high-performance, full-text indexing system that can complement MySQL’s own abilities. High Performance MySQL 3rd Edition
Indexing and searching is fast in Sphinx Search, and it supports distributed queries as well. SQL Antipatterns
It’s common practice that a version control system (VCS) manages the versioning of the build definition and source code files. Gradle in Action
ElasticSearch tries to reduce these issues and provide additional mechanisms as versioning. Mastering ElasticSearch
Things to think about: major impact on storage design and performance; are these storage- or application-level features? Versioning of collaborative documents. Storage Architecture and Challenges
If you are developing an application that requires revision history, it is highly recommended that you build an additional versioning layer rather than use CouchDB's internal layer. Beginning CouchDB
This is a very efficient way to create a prototype, but of course a properly managed plug-in will need a build of its own, so it can be decorated with all of the appropriate testing, versioning, and release processes that attend mature software development. Gradle Beyond The Basics
Versioning—The ability for the file format to support backward or forward compatibility. Hadoop in Practice
We are currently implementing an object-granularity clone operation to create object copies with copy-on-write behavior for efficient storage utilization, and are extending the RADOS client interface to allow transparent versioning for logical point-in-time copies across sets of objects (i. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
Versioning support The data types are capable of being versioned to support updates to the client API. Cassandra The Definitive Guide
A note on versioning Flume has gone through some major changes in recent times. Hadoop Beginners Guide
Versioning As newer versions of Hadoop are related, the process of upgrading the software on disk can be deferred to the package manager. Hadoop Operations
In order to get your authentication details you need to log in to https://dev. Mastering ElasticSearch
This specification defines the mapping of servlets and JSP pages to the HTTP and HTTPS protocols, and the mapping of EJB components to IIOP and SOAP protocols. Java EE 7 Specification
Using curl curl is a lightweight command-line tool for transferring data to or from a server using various protocols, such as HTTP, HTTPS, and FTP. Beginning CouchDB
You can find more details about Hive commands from https://cwiki. Hadoop MapReduce Cookbook
The NameNode has an embedded Jetty HTTP/HTTPS web server, which is used for the SecondaryNameNode to read images and merge them back. Hadoop in Practice
For that reason, the URL class only works with schemes that the Java library knows how to handle, such as http:, https:, ftp:, the local file system (file:), and JAR files (jar:) Core Java Volume II Advanced Features 9th Edition
As of Java 1.4, secure HTTP is also supported with the https:// protocol. Java in a Nutshell 5th Edition
It is valid to specify the wild card IP 0.0.0.0 to indicate the HTTPS server should listen on all interfaces. Hadoop Operations
Standalone Java applications can indicate the proxy server to use by setting the socksProxyHost and socksProxyPort properties (if you’re using a SOCKS proxy server), or http.proxySet, http.proxyHost, http.proxyPort, https.proxySet, https.proxyHost, https.proxyPort, ftpProxySet, ftpProxyHost, ftpProxyPort, gopherProxySet, gopherProxyHost, and gopherProxyPort system properties (if you’re using protocol-specific proxies) Java Network Programming 4th Edition
Secure HTTP (HTTP over SSL or HTTPS) is also supported by this transport: ActiveMQ in Action
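A minimal JDK sketch of opening an HTTPS connection; certificates are validated against the default trust store, and the URL is a placeholder:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import javax.net.ssl.HttpsURLConnection;

    public class HttpsFetch {
        public static void main(String[] args) throws Exception {
            URL url = new URL("https://example.com/");
            HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
            System.out.println("status: " + conn.getResponseCode());
            System.out.println("cipher: " + conn.getCipherSuite());
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                System.out.println(in.readLine()); // first line of the response body
            }
        }
    }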
Databases in CouchDB have a sequence number that gets incremented every time the database is changed. CouchDB The Definitive Guide
They’ll change less often (for example, to indicate a new feature) and have to be incremented manually. Gradle in Action
For each record, numWords is incremented by the number of terms in input.txtField returned by the CountWords function. Dremel - Interactive Analysis of WebScale Datasets
The cluster map also includes a list of down or inactive devices and an epoch number, which is incremented each time the map changes. Ceph A Scalable, High-Performance Distributed File System
Generate a new vector with all the values incremented by one. Hadoop in Practice
In the response, we can see that Elasticsearch has incremented. Elasticsearch The Definitive Guide
For each word, its corresponding entry in a multiset called wordCount is incremented by one. Hadoop in Action
Each time a process experiences an event, its clock is incremented by 1 Cassandra The Definitive Guide
Each reduce task writes to a single file with the filename part-r-nnnnn, where nnnnn starts at 00000 and is incremented. Hadoop Beginners Guide
When a data log is full, a new one will be created, and the log number incremented. ActiveMQ in Action
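The counter-incrementing pattern running through these excerpts, in a small sketch: Map.merge for single-threaded counting, AtomicLong where increments may race:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.atomic.AtomicLong;

    public class Counters {
        public static void main(String[] args) {
            Map<String, Integer> wordCount = new HashMap<>();
            for (String word : "to be or not to be".split(" ")) {
                wordCount.merge(word, 1, Integer::sum); // entry incremented by one
            }
            // e.g. {not=1, or=1, to=2, be=2}; HashMap order is unspecified.
            System.out.println(wordCount);

            AtomicLong sequence = new AtomicLong();
            System.out.println(sequence.incrementAndGet()); // 1, thread-safe
        }
    }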
The problem with HDFS and streaming data/logs HDFS isn't a real filesystem, at least not in the traditional sense, and many of the things we take for granted with normal filesystems don't apply here, for example being able to mount it. Apache Flume Distributed Log Collection for Hadoop
The scan and DFSIO jobs are canonical benchmarks used to evaluate HDFS and other distributed filesystems run under Hadoop MapReduce; the results in table 1 are a coarse measure of the effect attributable to HDFS in our experiments. Apache Hadoop YARN Yet Another Resource Negotiator
Its implementation addresses a number of problems that are present in a number of distributed filesystems such as Network File System (NFS) Professional Hadoop Solutions
Scribe supports multiple data sinks, including HDFS, regular filesystems, and NFS. Hadoop in Practice
Chapter 3 looks at Hadoop filesystems, and in particular HDFS, in depth. Hadoop:The Definitive Guide 3rd Edition
Other developments include HBase running on filesystems other than Apache HDFS, such as MapR. HBase:The Definitive Guide
You can store a big data set of (say) 100 TB as a single file in HDFS , something that would overwhelm most other filesystems. Hadoop in Action
Like most filesystems, Hadoop has the concept of a home directory for each user. Hadoop Beginners Guide
This is similar to common Linux filesystems such as ext3 Hadoop Operations
Programming Hive introduces Hive, an essential tool in the Hadoop ecosystem that provides an SQL (Structured Query Language) dialect for querying data stored in the Hadoop Distributed Filesystem (HDFS), other filesystems that integrate with Hadoop, such as MapR-FS and Amazon’s S3 and databases like HBase (the Hadoop database) and Cassandra. Programming Hive
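A hedged sketch of reading a file through the Hadoop FileSystem API, assuming a configured Hadoop client on the classpath; the path is a placeholder:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class HdfsCat {
        public static void main(String[] args) throws Exception {
            // Picks up fs.defaultFS from core-site.xml on the classpath.
            FileSystem fs = FileSystem.get(new Configuration());
            try (FSDataInputStream in = fs.open(new Path("/data/sample.txt"))) {
                IOUtils.copyBytes(in, System.out, 4096, false);
            }
        }
    }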
Chapter 8 is a detailed discussion of the event model of the AWT, the abstract window toolkit. Core Java Volume I Fundamentals 9th Edition
The AWT event-handling model requires application developers to implement a listener interface and have that listener register with each component that needs to be monitored. Java Network Programming and Distributed Computing
A number of classes in the java.awt.geom package, such as Point2D.Double, are not serializable. Core Java Volume II Advanced Features 9th Edition
Prominent examples include the Point and Dimension classes in the java.awt package. Effective Java 2nd Edition
However, they were implemented as part of the private com.sun.awt. Java 7 New Features Cookbook
For the second edition, John Zukowski reviewed my Java 1.1 AWT quick reference material, and George Reese reviewed most of the remaining new material. Java in a Nutshell 5th Edition
In comparison to the AWT or I/O, there have been almost no changes and only a few additions. Java Network Programming 4th Edition
This technique is used extensively in the java.awt and javax.swing packages, which define interfaces and abstract classes associated with different kinds of event-handling methods. Concurrent Programming in Java Design Principles and Pattern
For example, you can’t serialize any Swing components (like JButton, JSpinner), and you can’t serialize File objects, but you can serialize the Color class (awt.color, to be more precise) Java 7 Recipes A Problem-Solution Approach
NOTE: This method uses the Color and Image classes from JavaFX, not from java.awt. Java SE 8 for the Really Impatient
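The serializability contrast above in a small sketch: java.awt.Color implements Serializable and round-trips cleanly, whereas writing a Swing component the same way would throw NotSerializableException:

    import java.awt.Color;
    import java.io.*;

    public class SerializeColor {
        public static void main(String[] args) throws Exception {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(Color.RED); // Color implements Serializable
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                System.out.println(in.readObject()); // java.awt.Color[r=255,g=0,b=0]
            }
        }
    }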
Packaged distributions (particularly on Linux and Debian) typically distribute those headers in a dev package and install them into the directory /usr/include. Monitoring with Ganglia
To push your application to a development database, you could use couchapp push dev. CouchDB The Definitive Guide
Special thanks to Spencer Allain, Jonathan Keam, and Robert Wenner for thoroughly reading every chapter of the book and providing me with line-by-line edits and comments at different stages of development; Michael McGarr and Samuel Brown for bouncing around ideas that involved content on continuous delivery and DevOps; and Baruch Sadogursky from JFrog for the technical review of chapter 14 and for promoting the book even before it was released. Gradle in Action
In the sample build file, we only want to sign the artifacts if the uploadArchives task is part of the Gradle task graph to be executed and if the version of the project doesn't end with the String value DEV: Gradle Effective Implementation Guide
In order to get your authentication details you need to log in to https://dev. Mastering ElasticSearch
For those interested in developing or fixing features in Cassandra, it is common to join the dev mailing list and chat room. Cassandra High Performance Cookbook
Fuse development files can be found in the fuse-devel RPM for Redhat/Fedora and in the libfuse-dev package for Debian/Ubuntu. Hadoop MapReduce Cookbook
Replace /dev/md1 with the name of your disk device (consult /etc/fstab for device names): Hadoop in Practice
Download the JDBC driver for your database; for MySQL, we find it at http://dev. Hadoop Beginners Guide
The continued growth in software engineering maturity today can be seen in the DevOps movement, climaxing with the publication last year of Jez Humble and David Farley’s book Continuous Delivery (AddisonWesley) Building and Testing with Gradle
You can easily combine these functions with the orElse and andThen combinators you learned at the beginning of this section: Scala in Action
In the next chapter, we look more closely at functional programming techniques such as higher-order functions, currying, persistent data structures, lazy lists, pattern matching, and combinators. Java 8 in Action Lambdas, Streams and Functional-style Programming
The MapReduce model originates from the map and reduce combinators concept in functional programming languages such as Lisp. Professional Hadoop Solutions
Examples can be found in standard libraries such as parser combinators and actors. Programming in Scala 2nd Edition
We want to start writing combinators that let us avoid explicitly passing along the state. Functional Programming in Scala
External DSLs with Parser Combinators When you write a parser for an external DSL, you can use a parser generator tool like Antlr (see [Antlr]) Programming Scala
Let's explore advanced CSS selectors such as adjacent sibling combinators and pseudoclasses as described in the following sections. Selenium Testing Tools Cookbook
In this chapter, we’re going to explore combinators and Scala’s parser combinatory library. Beginning Scala
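Java 8 ships the same style of combinators on its functional interfaces; a minimal sketch of andThen, compose, and predicate combinators:

    import java.util.function.Function;
    import java.util.function.Predicate;

    public class Combinators {
        public static void main(String[] args) {
            Function<Integer, Integer> doubled = n -> n * 2;
            Function<Integer, Integer> plusOne = n -> n + 1;
            // andThen applies the left function first: (3 * 2) + 1 = 7
            System.out.println(doubled.andThen(plusOne).apply(3));
            // compose applies the right function first: (3 + 1) * 2 = 8
            System.out.println(doubled.compose(plusOne).apply(3));

            Predicate<String> nonEmpty = s -> !s.isEmpty();
            Predicate<String> shortStr = s -> s.length() < 10;
            System.out.println(nonEmpty.and(shortStr).test("hi")); // true
        }
    }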
Object-oriented programming (OOP) is now in the mainstream of programming practice, and Java is an object-oriented programming language. Core Java Volume I Fundamentals 9th Edition
As Java programmers, we’re well versed in object-oriented programming (OOP) Programming Concurrency on the JVM
However, I will be introducing the concepts of object-oriented programming (OOP) and Java’s basic control mechanisms. Thinking in Java 4th Edition
This approach borrowed from an OOP world solves the problem of maintaining jobs sharing a number of identical properties. Jenkins The Definitive Guide
Because in the client you have to create a JSON object, you’ll import the JSONStringer object that lets you create JSON strings in an OOP way. RabbitMQ in Action
An OOP is normally the same size as the machine pointer. Cassandra High Performance Cookbook
Other languages that have integrated some elements of functional programming into OOP include Ruby, Smalltalk, and Python. Programming in Scala 2nd Edition
Smalltalk captured the essence of object-oriented programming (OOP) as that model of programming entered the mainstream. Programming Scala
But in fact, most of the Java classes we define have getters and setters, so there is data exposed in OOP. Beginning Scala
The HotSpot VM internally accesses the instanceKlass using an internal data structure called a klassOop. Java Performance
It also assumes that filenames never change; otherwise, the source would lose its place on restarts as to which files have been sent and which have not. Apache Flume Distributed Log Collection for Hadoop
As of Ganglia 3.3, hostnames are converted to lowercase, RRD files are created with lowercase filenames, and lowercase hostnames should be used in URLs to access the metrics. Monitoring with Ganglia
This section will explain how to configure your project to use custom build filenames. Gradle in Action
The matching pattern style is described as an ANT-style matching pattern because the ANT build tool uses this style to define a syntax for matching filenames in file trees. Gradle Effective Implementation Guide
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "What we would like to do is, use the BM25 similarity model for the name field and the contents field." Mastering ElasticSearch
A real-world build routinely copies files from place to place, recursing directory trees, pattern-matching filenames, and performing string operations on file content. Gradle Beyond The Basics
By Hadoop convention filenames for these files end in .deflate. Hadoop in Action
Indicates new terms, URLs, email addresses, filenames, and file extensions. Cassandra The Definitive Guide
On a computer filesystem, filenames (keys) allow access to any sort of data, such as text, images, and sound (values) Hadoop Beginners Guide
Italic Indicates new terms, URLs, email addresses, filenames, and file extensions. Building and Testing with Gradle
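The ANT-style filename matching mentioned above has a JDK analogue in NIO's PathMatcher; a small sketch:

    import java.nio.file.FileSystems;
    import java.nio.file.Path;
    import java.nio.file.PathMatcher;
    import java.nio.file.Paths;

    public class GlobDemo {
        public static void main(String[] args) {
            // "glob:" patterns: * matches within a segment, ** crosses directories.
            PathMatcher matcher = FileSystems.getDefault()
                    .getPathMatcher("glob:**/*.java");
            Path hit = Paths.get("src/main/App.java");
            Path miss = Paths.get("README.md");
            System.out.println(matcher.matches(hit));  // true
            System.out.println(matcher.matches(miss)); // false
        }
    }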
However, for those applications that were built on EJB or need to deploy the Spring-based applications in a JEE container and utilize the application server’s enterprise services (e.g., JTA Transaction Manager, data source connection pooling, JMS connection factories, etc.), Spring also provides simplified support for those technologies. Pro Spring 3
There is a transaction manager that handles the transactions, and it even has support for JTA (Java Transaction API), which enables distributed or global transactions. Pro Spring MVC
Additional threads that are created for any purpose should not attempt to start JTA transactions. Java EE 7 Specification
You can use the Java Transaction API (JTA) to make transactions span both databases: the batch tables and the business tables will always be synchronized, but you’ll add overhead because managing multiple transactional resources is more expensive than managing just one. Spring Batch in Action
Spring deals with common resource management “plumbing” code and allows for declarative transaction management in various situations ranging from managed environments with JTA (Java Transaction API, the transaction standard in Java) to native database transaction with plain JDBC. Spring Dynamic Modules in Action
TransactionScoped is a new CDI scope that defines bean instances whose life cycle is scoped to the currently active JTA transaction. Java EE 7 Essentials
It’s transactional, supporting JTA/JTS, two-phase commit, deadlock detection, and MVCC. Cassandra The Definitive Guide
Note that the JTA transaction manager has been added and a reference to it has been added to the listener-container. ActiveMQ in Action
The typical situation when running inside the Java EE container environment is that the standard Java Transaction API (JTA) is used. Pro JPA2
It is written predominantly in Java and leverages a custom storage format and the facilities of the Java Transaction Architecture (JTA) to provide XA transactions. Spring Data Modern Data Access for Enterprise Java
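The excerpts above all describe JTA's role in coordinating transactions that span multiple resources. As a minimal sketch (not tied to any one of the books above), programmatic demarcation through the standard javax.transaction.UserTransaction interface looks roughly like this; the two resource operations are hypothetical placeholders, while "java:comp/UserTransaction" is the standard JNDI name:

    import javax.naming.InitialContext;
    import javax.transaction.UserTransaction;

    public class TransferService {
        public void transfer() throws Exception {
            // In a Java EE container, UserTransaction is obtained via JNDI.
            UserTransaction utx = (UserTransaction)
                    new InitialContext().lookup("java:comp/UserTransaction");
            utx.begin();
            try {
                // Hypothetical operations: under JTA these would enlist two
                // XA resources (e.g., a database and a JMS broker) in the
                // same global transaction.
                debitDatabase();
                publishMessage();
                utx.commit();   // two-phase commit across the enlisted resources
            } catch (Exception e) {
                utx.rollback(); // both resources roll back together
                throw e;
            }
        }

        private void debitDatabase() { /* ... */ }
        private void publishMessage() { /* ... */ }
    }

As the Spring Batch in Action excerpt notes, this coordination keeps the resources synchronized at the cost of managing more than one transactional resource.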
Gradle’s answer to such a scenario is finalizer tasks, which are regular Gradle tasks scheduled to run even if the finalized task fails. Gradle in Action
If you don’t want to return this key, you can use a finalizer to change the final accumulator document into anything, even a nondocument (e.g., a number or string) MongoDB The Definitive Guide 2nd Edition
The particular definition of finalize that can be invoked for an object is called the finalizer of that object. JSR-133 Java Memory Model and Thread
If an object declares a finalizer, the finalizer is executed before the object's storage is reclaimed. The Java Language Specification Java SE 7 Edition
However, since a finalize method can contain arbitrary code, there are several potentially bad scenarios to consider as well, for example, the case that a finalizer resurrects the object, or clones a new object from the dying one, in effect preventing an object from being garbage collected. Oracle JRockit The Definitive Guide
It can take arbitrarily long between the time that an object becomes unreachable and the time that its finalizer is executed. Effective Java 2nd Edition
A finalizer in Java is the opposite of a constructor. Java in a Nutshell 5th Edition
Finalizers postpone garbage collection until the finalizer is run, adding yet more overhead to the GC algorithm. Java Performance Tuning 2nd Edition
These arise from a combination of internal HotSpot VM threads and HotSpot VM library related threads such as the reference handler and finalizer threads. Java Performance
In this case, you use the finalizer to calculate the average number of votes per review. MongoDB in Action
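As several of the excerpts above warn, a Java finalizer runs at an unpredictable time, can resurrect its object, and delays garbage collection. A minimal illustration of the mechanism (the class and field names are invented for the example):

    public class TempResource {
        private long nativeHandle = 42; // hypothetical native resource

        // Invoked by the garbage collector at some arbitrary point after the
        // object becomes unreachable -- possibly never before the VM exits.
        @Override
        protected void finalize() throws Throwable {
            try {
                nativeHandle = 0; // last-chance cleanup
            } finally {
                super.finalize(); // always chain to the superclass finalizer
            }
        }
    }

Because of the timing and resurrection problems the excerpts describe, explicit close() methods (or try-with-resources) are generally preferred over finalizers.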
In fact, in cooperation with the sFlow agent, Ganglia may be used to monitor network gear such as routers and switches (see Chapter 8 for more information) Monitoring with Ganglia
The Maven Reactor lays out the build order: the parent pom, then the subprojects. Netty in Action MEAP V08
You’ll start by deconstructing the existing project structure and break out individual, functional subprojects. Gradle in Action
In this case you have only one project, but multiproject builds will declare all the subprojects here. Scala in Action
Example 3-2 has three builds, the parent project and two subprojects. Gradle Beyond The Basics
Over the course of three years, the Hadoop project has blossomed and spun off half a dozen subprojects. Hadoop:The Definitive Guide 3rd Edition
The most prominent and well-supported ones have officially become subprojects under the umbrella of the Apache Hadoop project.1 These subprojects include. Hadoop in Action
Parts of Hadoop The top-level Hadoop project has many component subprojects, several of which we'll discuss in this book, but the two main ones are Hadoop Distributed File System (HDFS) and MapReduce. Hadoop Beginners Guide
Instead, you’ll find three subprojects called Common, HDFS, and MapReduce shown as tabs along the top of the page. Hadoop Operations
We’ve also seen how NMS and CMS subprojects help create more complex, JMS-like APIs for environments that deserve this kind of support. ActiveMQ in Action
However, for a given build, not all tasks must be present in the top-level build file, since Gradle supports dependent subprojects, or nested builds. Building and Testing with Gradle
The rest of the code block defines the parameters that will be passed to the metric init function and iterates through each metric definition, calls the specified callback function with the metric name as a parameter, and finally prints out the metric value that was gathered. Monitoring with Ganglia
Iterates through all subprojects accessible through the root project; sets a custom build filename for a subproject by using the subprojects’ names. Gradle in Action
The while-loop of the algorithm (Line 5) iterates over all atomic and record-valued fields contained in a given record. Dremel - Interactive Analysis of WebScale Datasets
The reduce worker iterates over the sorted intermediate data and for each unique intermediate key encountered, it passes the key and the corresponding set of intermediate values to the user’s Reduce function. MapReduce - Simplied Data Processing on Large Clusters
When an active OSD receives an updated cluster map, it iterates over all locally stored placement groups and calculates the CRUSH mapping to determine which ones it is responsible for, either as a primary or replica. Ceph A Scalable, High-Performance Distributed File System
The process iterates until the skip range is within the acceptable size. Hadoop in Action
Our engine iterates on input rows and uses generated procedures that do both. Tenzing - A SQL Implementation On The MapReduce Framework
The example created a Ruby script that opens an Avro datafile, iterates through each datum, and displays it based on explicitly named fields. Hadoop Beginners Guide
Next it iterates through all available queues and prints their current size and number of consumers subscribed to them. ActiveMQ in Action
This Groovy code declares a list of four strings (each of which is the name of a poet), then iterates over that list. Building and Testing with Gradle
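Several of these excerpts describe the same pattern: iterating over grouped or sorted data and applying a function per key. A plain-Java sketch of the reduce-style iteration the MapReduce excerpt describes (the data and names are illustrative):

    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    public class ReduceLoop {
        public static void main(String[] args) {
            // Sorted intermediate data: key -> all values emitted for that key.
            Map<String, List<Integer>> intermediate = new TreeMap<>(Map.of(
                    "apple", List.of(1, 1, 1),
                    "pear", List.of(1)));

            // Iterate over each unique key and pass its values to a "reduce" step.
            for (Map.Entry<String, List<Integer>> entry : intermediate.entrySet()) {
                int sum = 0;
                for (int v : entry.getValue()) {
                    sum += v;
                }
                System.out.println(entry.getKey() + "\t" + sum);
            }
        }
    }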
Two Jetty connectors are defined for HTTP and for the secure TLS traffic. Jenkins Continuous Integration Cookbook
This specification also describes standard ways to extend Java EE with connectors to other non-Java EE application systems, such as mainframe systems and ERP systems. Java EE 7 Specification
Sqoop has the notion of Connectors, which contain the specialized logic to read and write to external systems. Hadoop in Practice
Applications using the Thrift, JDBC, and ODBC connectors need to run a Hive server to communicate with Hive. Hadoop:The Definitive Guide 3rd Edition
To add a user-defined connection, simply click on the Connectors folder and click on the Create Connection button. Oracle JRockit The Definitive Guide
Many of these connectors are open source, while others are free or available from commercial vendors at a cost. Hadoop Operations
It discusses ActiveMQ URIs and all the transport connectors for both client-to-broker and broker-to-broker communications including TCP, NIO, STOMP, failover, SSL, HTTP/S, and much more. ActiveMQ in Action
The server consists of one or more connectors that share a single engine component. Java Performance
Such tools are typically designed to make it easy to load content stored in various systems, and sometimes provide pre-built connectors to common content stores, such as web sites, databases, popular content management systems, filesystems, etc. Lucene in Action 2nd Edition
Many NoSQL databases have implemented Hive connectors using custom adapters. Programming Hive
If it is successful, create a ByteBuf to hold the data. Netty in Action MEAP V08
The default implementation of this method simply calls the engineDigest( ) method and then copies the result into buf. Java Security 2nd Edition
The “buf.type” that appears in the interpreter responses in this and several other examples in this section is a singleton type. Programming in Scala 2nd Edition
Abbreviations, as in buf holding a pointer to a buffer of some kind. The Java Language Specification Java SE 7 Edition
The buf argument is the address of the String buffer that stores the resulting message. Understanding MySQL Internals
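For the Netty excerpt above, a hedged sketch of creating and filling a ByteBuf through Netty's Unpooled helper, assuming Netty 4.x is on the classpath:

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.Unpooled;
    import java.nio.charset.StandardCharsets;

    public class BufExample {
        public static void main(String[] args) {
            // Allocate an unpooled, heap-backed buffer to hold the data.
            ByteBuf buf = Unpooled.buffer(16);
            buf.writeBytes("hello".getBytes(StandardCharsets.UTF_8));

            // Reader and writer indexes are tracked independently.
            byte[] out = new byte[buf.readableBytes()];
            buf.readBytes(out);
            System.out.println(new String(out, StandardCharsets.UTF_8));

            buf.release(); // reference-counted: release when done
        }
    }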
Gradle figures out for you which of the subproject dependencies need to be rebuilt, without having to store a subproject’s artifact in the local cache. Gradle in Action
We apply the Java plugin to each subproject and add a testCompile dependency on the JUnit libraries. Gradle Effective Implementation Guide
This causes Gradle to look explicitly in the individual subproject’s directory for the grammars directory, rather than only at the root project level. Gradle Beyond The Basics
Chukwa is an Apache subproject of Hadoop that also offers a large-scale mechanism to collect and store data in HDFS. Hadoop in Practice
If you need to know which package a class is in, you can easily look it up in Hadoop’s Java API documentation for the relevant subproject, linked to from the Apache Hadoop home page at http://hadoop.apache.org/ Hadoop:The Definitive Guide 3rd Edition
Hadoop started out as a subproject of Nutch, which in turn was a subproject of Apache Lucene. Hadoop in Action
HBase HBase is a clone of Google’s Bigtable, originally created for use with Hadoop (it’s actually a subproject of the Apache Hadoop project) Cassandra The Definitive Guide
Doug started work on the implementations of these Google systems, and Hadoop was soon born, firstly as a subproject of Lucene and soon was its own top-level project within the Apache open source foundation. Hadoop Beginners Guide
The Common subproject was to contain code shared by both HDFS and MapReduce, while HDFS and MapReduce would be home to their respective components. Hadoop Operations
Ant, though the older of the JVM build tools, has been supplemented by the Ivy subproject to accomplish exactly this. Building and Testing with Gradle
We compare this approach to existing cluster scheduler designs, evaluate how much interference between schedulers occurs and how much it matters in practice, present some techniques to alleviate it, and finally discuss a use case highlighting the advantages of our approach – all driven by real-life Google production workloads. Omega - flexible, scalable schedulers for large compute clusters
Other administrative tasks include the management of Hadoop jobs, for example configuring job scheduling policy with schedulers. Hadoop Operations and Cluster Management Cookbook
This makes it possible to tightly enforce global scheduling properties (different schedulers in YARN focus on different global properties, such as capacity or fairness), but it requires the scheduler to obtain an accurate understanding of applications’ resource requirements. Apache Hadoop YARN Yet Another Resource Negotiator
When talking about merge policies and merge schedulers, it would be nice to visualize them. Mastering ElasticSearch
The good news is that Hadoop scheduler is pluggable, and it comes with two other schedulers. Hadoop MapReduce Cookbook
Summary The amount of parallelism available to you is a function of your cluster capacity, the other jobs running at the same time as your job, and any schedulers configured for your environment. Hadoop in Practice
The default in MapReduce 1 is the original FIFO queue-based scheduler, and there are also multiuser schedulers called the Fair Scheduler and the Capacity Scheduler. Hadoop:The Definitive Guide 3rd Edition
Fortunately, starting with version 0.19, Hadoop has a pluggable architecture for the scheduler, and two new schedulers have become available for resolving job contention. Hadoop in Action
Unless we use one of the alternative schedulers that we will discuss in later sections, the FIFO scheduler dedicates the full cluster to the sole currently running job. Hadoop Beginners Guide
In Chapter 7, we’ll cover how each of the schedulers may be configured to suit different environments and use cases. Hadoop Operations
The WAL is used to track all input and output from the channel in an atomically safe way. Apache Flume Distributed Log Collection for Hadoop
Storing all metadata in a single Bigtable row allows us to update it atomically through a single Bigtable transaction. Megastore - Providing Scalable, Highly Available Storage for Interactive Services
The entry data is accessed atomically, and changes are ordered. Pro Hadoop
When a reduce task completes, the reduce worker atomically renames its temporary output file to the final output file. MapReduce - Simplied Data Processing on Large Clusters
When it has moved all but a nominal amount of the data, it uses a transaction to atomically move that nominal amount and update the metadata for the two Paxos groups. Spanner - Google’s Globally-Distributed Database
It turns out there’s no direct analogue in the relational world! Column families exist in HBase as a way for a single row to contain disjoint sets of columns in a way that’s physically efficient but can be processed atomically. HBase in Action
Files must be atomically moved into the source directory (file moves in both the Linux3 and HDFS filesystems are atomic) Hadoop in Practice
Cell values can be interpreted as counters and updated atomically. HBase:The Definitive Guide
The contents of a file are read atomically and in their entirety. Chubby - The lock service for loosely-coupled distributed systems
In particular, composite read/write operations such as the "++'' operation on volatile variables are not performed atomically. Concurrent Programming in Java Design Principles and Pattern
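The last excerpt's point about "++" on volatile variables is easy to demonstrate: volatile guarantees visibility, not atomicity, so a composite read-modify-write needs an atomic class or a lock. A minimal sketch:

    import java.util.concurrent.atomic.AtomicInteger;

    public class Counters {
        private volatile int unsafeCount = 0;        // visibility only
        private final AtomicInteger safeCount = new AtomicInteger();

        public void incrementUnsafe() {
            unsafeCount++; // read-modify-write: NOT atomic, updates can be lost
        }

        public void incrementSafe() {
            safeCount.incrementAndGet(); // single atomic CAS-based operation
        }
    }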
Launch the console and issue the x command, which clears the classloader cache. Core Java Volume I Fundamentals 9th Edition
If a user-defined classloader prefetches binary representations of classes and interfaces, or loads a group of related classes together, then it must reflect loading errors only at points in the program where they could have arisen without prefetching or group loading. The Java Virtual Machine Specification Java SE 7 Edition
This is generally a good idea, as it can avoid classloader issues related to conflicts with Ant’s own libraries. Jenkins The Definitive Guide
Classes in Java are loaded by a classloader object, which consults the directories and. Java Threads 3rd Edition
It indicates that the dependencies you’re providing will be available to the classloader during the rest of the build script execution, which is exactly what we need. Gradle Beyond The Basics
The JNDI naming context, classloader, and security context is propagated to the proxied object. Java EE 7 Essentials
The option of splitting the classes per classloader is also available in the Type Graph. Oracle JRockit The Definitive Guide
A further extension to this idea is to include your own classloader that opens the ZIP/JAR file itself and reads in all files sequentially, loading them into memory immediately. Java Performance Tuning 2nd Edition
This will cause classloader issues and will prevent ActiveMQ from being deployed correctly. ActiveMQ in Action
Java 7 simply adds a close method to close the classloader. Java SE 8 for the Really Impatient
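Several of these excerpts refer to user-defined classloaders that locate class bytes themselves. A skeletal custom ClassLoader, with the byte-loading step left as an explicit assumption:

    public class MemoryClassLoader extends ClassLoader {
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Hypothetical lookup: a real loader might read from a ZIP/JAR
            // preloaded into memory, as one excerpt above suggests.
            byte[] bytecode = loadBytesFor(name);
            if (bytecode == null) {
                throw new ClassNotFoundException(name);
            }
            // defineClass turns raw bytecodes into a Class object.
            return defineClass(name, bytecode, 0, bytecode.length);
        }

        private byte[] loadBytesFor(String name) {
            return null; // placeholder -- supply your own source of class bytes
        }
    }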
You’ll configure the Maven deployer to upload both files to a local directory. Gradle in Action
We must configure the repository for our uploadArchives task via a closure argument of the mavenDeployer() method. Gradle Effective Implementation Guide
Because there’s no gold copy of the configuration for a given set of server instances, the deployer has to go onto each server and update the configuration manually. Spring in Practice
The Application Assembler is responsible for providing assembly instructions describing external dependencies of the application that the Deployer must resolve in the deployment process. Java EE 7 Specification
At the heart of the Spring DM web support is its web extender, which contains a WAR deployer, which we’ll take a look at to see how it differs from a standard web deployer. Spring Dynamic Modules in Action
You decide what components to include via the Geronimo deployer. ActiveMQ in Action
For GlassFish, it is recommended that the deployer use commit option C for entity beans that are used only in transactions or used only once or twice (hence no benefit in caching the instances) Java Performance
The deployment process is optimized for these kinds of deployment scenarios to minimize the amount of metadata that a deployer has to specify. Pro JPA2
The file chooser calls the isTraversable method to decide whether to open a directory when a user clicks on it. Core Java Volume I Fundamentals 9th Edition
Then we instantiate a variable x as an anonymous subclass of A, such that type B is stabilized at Traversable[Int] Scala in Depth
Traversable is the parent trait for all the collection types in Scala, and the ++ method is only defined in this trait. Scala in Action
Nodes know about INCOMING and OUTGOING relationships that are traversable both ways. NoSQL Distilled
If the current vertex value is still equal to Integer.MAX_VALUE, we know that the graph never received any incoming messages intended for that vertex, which implies it is not traversable by the source vertex. Hadoop RealWorld Solutions Cookbook
This method selects those elements of a collection that pass a test the user supplies. (A foreach method is defined in trait Traversable, a common supertrait of List, Set, Array, and Map.) Programming in Scala 2nd Edition
But traversable data types are too numerous for us to write specialized methods for each of them. Functional Programming in Scala
The xs parameter can be any collection with the TraversableOnce trait, the most general trait in the Scala collections hierarchy. Scala for the Impatient
As described earlier in Section 2.1, Dapper traces are modelled as trees of trace spans, so the Trace data structure is consequently a simple traversable tree of individual Span structures. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
In addition to receiving confined references along call chains, the method invocations executing within a single thread can access the Thread object representing the thread they are running in, and any further information traversable from there. Concurrent Programming in Java Design Principles and Pattern
In the Java source, to connect to HBase, we first create a Configuration object and then use it with the table name to create an HTable instance. HBase Administration Cookbook
All programmatic data-manipulation access to HBase is done through either the HTableInterface or the HTable class that implements HTableInterface. Professional Hadoop Solutions
The HTable constructor reads the default configuration information to locate HBase, similar to the way the shell did. HBase in Action
Finally, we call close() on our HTable instance to flush out any write buffers not yet cleared. Hadoop:The Definitive Guide 3rd Edition
Here is a summary of the points we just discussed: • Create HTable instances only once, usually when your application. HBase:The Definitive Guide
The HTable class is the main way in Java to interact with HBase. Spring Data Modern Data Access for Enterprise Java
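Tying the HTable excerpts together, a minimal sketch using the classic (pre-1.0) HBase client API these books describe; the table, family, and qualifier names are invented:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HTableExample {
        public static void main(String[] args) throws Exception {
            // Reads hbase-site.xml from the classpath to locate the cluster.
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "users"); // hypothetical table name

            Put put = new Put(Bytes.toBytes("row1"));
            put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Ada"));
            table.put(put);

            Result result = table.get(new Get(Bytes.toBytes("row1")));
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));

            table.close(); // flushes any buffered writes, as noted above
        }
    }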
You can now start CouchDB using the new user: sudo -i -u couchdb couchdb -b. Getting Started with CouchDB
Finally, to install gmond, type: user@host:# sudo yum install ganglia-gmond. Monitoring with Ganglia
Build and install the CouchDB ebuild by running: sudo emerge couchdb. CouchDB The Definitive Guide
This user should have sudo privileges to install software packages, configure system services, and so on. Hadoop Operations and Cluster Management Cookbook
Please remember to run the jstat command from the same account that ElasticSearch is running under or, if that is not possible, run the jstat command with administrator privileges (for example, using the sudo command on Linux systems) Mastering ElasticSearch
If you get a message informing you that curl is not available, simply use the command sudo apt-get install curl to download and install it automatically. Beginning CouchDB
You’ll need sudo access for your user to run this command (it launches the Hadoop services via init.d scripts): Hadoop in Practice
We also need to change the owner of the Hadoop files to be the hadoop user and group: % sudo chown -R hadoop:hadoop hadoop-x.y.z. Hadoop:The Definitive Guide 3rd Edition
In order to get autoscan, which the Thrift bootstrap needs, you’ll have to run this command: $ sudo apt-get install automake. Cassandra The Definitive Guide
For instance, assuming the HDFS super user is hdfs, it would normally be possible to perform administrative activities using sudo like in Example 6-3 Hadoop Operations
Of course, interpreting bytecodes is necessarily slower than running machine instructions at full speed, so it isn’t clear that this is even a good idea. Core Java Volume I Fundamentals 9th Edition
A class file contains Java Virtual Machine instructions (or bytecodes) and a symbol table, as well as other ancillary information. The Java Virtual Machine Specification Java SE 7 Edition
When a class must be loaded (typically, the first time you want to create an object of that class), the .class file is located, and the bytecodes for that class are brought into memory. Thinking in Java 4th Edition
But if the program came from an unknown source, how do we know that the bytecodes we've received are actually legal? Java Security 2nd Edition
Thus, the JVM is required to turn the bytecodes into native code for the CPU on which the Java application executes. Oracle JRockit The Definitive Guide
That lets you carry out custom checks before you pass the bytecodes to the virtual machine. Core Java Volume II Advanced Features 9th Edition
A long time ago, when Java was Oak and it was being designed as a way to distribute untrusted code into set-top boxes (and later browsers), the rules defining how a program executed and the meaning of the instruction set (bytecodes) were super-important. Beginning Scala
Generating the bytecodes without the extra constructor is perfectly legal. Java Performance Tuning 2nd Edition
People loved the simplicity of the language, the portability of bytecodes, and the safety of garbage collection (versus traditional malloc/free memory management of other systems) Java Performance
This mapping strategy also applies at the level of the CPU when operations are compiled into native code rather than interpreted as bytecodes. Concurrent Programming in Java Design Principles and Pattern
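Since several excerpts describe class files as containers of bytecodes plus a symbol table, here is a small sketch that inspects the documented class-file header: the magic number 0xCAFEBABE followed by the minor and major version numbers (this reads only the fixed header, nothing more):

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    public class ClassFilePeek {
        public static void main(String[] args) throws IOException {
            try (DataInputStream in =
                     new DataInputStream(new FileInputStream(args[0]))) {
                int magic = in.readInt();            // must be 0xCAFEBABE
                int minor = in.readUnsignedShort();  // minor_version
                int major = in.readUnsignedShort();  // major_version (51 = Java 7)
                System.out.printf("magic=%08X minor=%d major=%d%n",
                        magic, minor, major);
            }
        }
    }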
How all of these components can be used to build complicated data workflows will become clear as the book proceeds. Apache Flume Distributed Log Collection for Hadoop
With its plugin system and rapidly growing community, you’ll need to take into consideration existing JavaScript automation workflows. Gradle in Action
Still worse, instead of losing a single workflow, a JobTracker failure caused an outage that would lose all the running jobs in a cluster and require users to manually recover their workflows. Apache Hadoop YARN Yet Another Resource Negotiator
Hive and Pig provide a clean abstraction layer between your data flow and meaningful queries, and the complex MapReduce workflows they compile to. Hadoop RealWorld Solutions Cookbook
We will also be discussing other advanced Hadoop features such as using DistributedCache for distribute data, using Hadoop Streaming for quick prototyping of Hadoop computations, and using Hadoop counters to report custom metrics for your computation as well as adding job dependencies to manage simple DAG-based workflows of Hadoop MapReduce computations. Hadoop MapReduce Cookbook
For our scenario Oozie has a coordinator engine that can start workflows based on data and time triggers. Hadoop in Practice
Oozie A service for running and scheduling workflows of Hadoop jobs (including MapReduce, Pig, Hive, and Sqoop jobs) Hadoop:The Definitive Guide 3rd Edition
Each page in the ES2 collection is pushed through multiple logical workflows, each consisting of a local analysis phase, a global analysis phase, and an appropriate variant generation strategy. Hadoop in Action
Smart software, dumb hardware If we wish to see a cluster of hardware used in as flexible a way as possible, providing hosting to multiple parallel workflows, the answer is to push the smarts into the software and away from the hardware. Hadoop Beginners Guide
Workflows can be triggered by time or events such as data arriving in a directory, and job failure handling logic can be implemented so that policies are adhered to. Hadoop Operations
Identity (codenamed "Keystone") provides authentication and authorization for all the OpenStack services. bk-compute-adminguide-grizzly
Keystone Provides an authentication and authorization service for other OpenStack services. bk-admin-guide-cloud-latest
As a result of the influence elephants have over many plants and animals, they are often referred to as a keystone species because they are vital to the long-term survival of the ecosystems in which they live. Hadoop:The Definitive Guide 3rd Edition
OpenStack services that are stateless include nova-api, nova-conductor, glance-api, keystone-api, neutron-api and nova-scheduler. high-availability-guide
To see the version of the keystone client, run the following command: admin-user-guide-trunk
The OpenStack Identity Service (Keystone) is the point that provides the authentication decisions and user attribute information, which is then used by the other OpenStack services to perform authorization. OpenStack Operations Guide
In order to inform HBase where Java is installed, we will set JAVA_HOME in hbase-env. HBase Administration Cookbook
Per-process runtime environment: The file hadoop-env.sh is responsible for tailoring the per-process environment. Pro Hadoop
Based on the provided environment value with the key env, Maven will pick the appropriate profile. Gradle in Action
File Description hadoop-env.sh Configures the environment variable used by Hadoop core-site.xml Configures parameters for the whole Hadoop cluster hdfs-site.xml Configures parameters for HDFS and its clients mapred-site. Hadoop Operations and Cluster Management Cookbook
The #!/usr/bin/env python statement is a hint to tell the shell how to execute the script. Hadoop RealWorld Solutions Cookbook
You can check if it's there by using the env command on UNIX-like systems and the set command on Windows. Elasticsearch in Action MEAP V07 cHAPTER 1
If you’ve set up your cluster with LZO compression, you’ll need to create a flume-env.sh file and set the directory that contains the native compression codecs: Hadoop in Practice
Environment Settings In this section, we consider how to set the variables in hadoop-env.sh. Hadoop:The Definitive Guide 3rd Edition
In hadoop-env.sh define the JAVA_HOME environment variable to point to the Java installation directory. Hadoop in Action
Later, we’ll use configuration variables in hadoop-env.sh to specify alternate locations for Hadoop’s log and pid file directories. Hadoop Operations
Finally, while files are being written to the HDFS, a .tmp extension is added. Apache Flume Distributed Log Collection for Hadoop
We also need to create a /tmp directory on HDFS, so that Hive can save its temporary files under it. HBase Administration Cookbook
For instance, here’s a sample command for piping the output of running gmond in debug mode to the file /tmp/gmond.log: Monitoring with Ganglia
In a standard Cygwin installation, the /tmp directory is also the C:\cygwin\tmp directory. Pro Hadoop
Set the sticky bit on HDFS directory to prevent the directories or files from being deleted by unauthorized users with the following command: sudo -u hdfs hadoop fs -chmod 1777 /tmp. Hadoop Operations and Cluster Management Cookbook
You would therefore (incorrectly) assume that the command hadoop fs -ls /tmp/* would work. Hadoop in Practice
Test Drive To start a temporary instance of HBase that uses the /tmp directory on the local filesystem for persistence, type: Hadoop:The Definitive Guide 3rd Edition
The default values for dfs.name.dir and dfs.data.dir point to directories under /tmp, which is intended only for temporary storage in almost all Unix systems. Hadoop in Action
Regardless of the mode, Hadoop by default uses the hadoop.tmp.dir property as the base directory under which all files and data are written. Hadoop Beginners Guide
Many new to Hadoop have quickly set up Hadoop, missed setting dfs.name.dir, and found themselves with an unusable filesystem after rebooting the namenode because /tmp was cleared during boot. Hadoop Operations
You’ve most likely created these connections between nodes by declaring a dependsOn relationship for a task or by leveraging the implicit task dependency interference mechanism. Gradle in Action
It redefines the way that data is managed and processed by leveraging the power of computing resources composed of commodity hardware. Hadoop Operations and Cluster Management Cookbook
We showcase this by leveraging the notion of container reuse and push-based communications provided by REEF. Apache Hadoop YARN Yet Another Resource Negotiator
This chapter will introduce readers to these two frameworks, which are capable of leveraging the distributed power of MapReduce. Hadoop RealWorld Solutions Cookbook
By declaring and modelling dependencies, tooling such as Gradle can automate working with dependencies by leveraging this information. Gradle Beyond The Basics
Finally, by opening Dapper’s trace repositories to internal developers we have enabled the creation of many more analysis tools than the Dapper team alone could have been able to produce in isolation, greatly leveraging the design and implementation effort. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
Usage was important to us because we think some of the most interesting research will involve leveraging the vast amount of usage data that is available from modern Web systems. The anatomy of a large-scale hypertextual Web search engine
Among many notable “Aha!” moments I had while working with Cynthia, the biggest one was when she steered me into leveraging visual aids to help explain some of the complex concepts in this book. Hadoop in Practice
We present the design and implementation of RADOS, a reliable object storage service that can scale to many thousands of devices by leveraging the intelligence present in individual storage nodes. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
When considering leveraging an existing investment in Apache Ant, or perhaps using the broad set of tasks that the Ant community has created, Gradle has a great story to tell. Building and Testing with Gradle
If you’re interested in pushing the edge of CouchDB’s incremental reduce functionality, have a look at Google’s paper on Sawzall, which gives examples of some of the more exotic reductions that can be accomplished in a system with similar constraints. CouchDB The Definitive Guide
The handoptimized Sawzall version likewise intermixes logical computation with lower-level implementation details, in an effort to get better performance. FlumeJava - Easy, Efficient Data-Parallel Pipelines
The set of aggregations is limited but the query phase can involve more general computations, which we express in a new interpreted, procedural programming language called Sawzall.1 (An interpreted language is fast enough: most of the programs are small and on large data sets the calculation tends to be I/O bound, as is discussed in the section on performance.) Interpreting the Data Parallel Analysis with Sawzall
HBase has coprocessors that are different from what Sawzall, the scripting language used in Bigtable to filter or aggregate data, or the Bigtable Coprocessor framework,† provides. HBase:The Definitive Guide
Queries are also sometimes written in a scripting language developed at Google called Sawzall. Cassandra The Definitive Guide
An increasing number of analysts were being forced to write custom code for more complex analysis, often directly against the source (such as Sawzall against logs) Tenzing - A SQL Implementation On The MapReduce Framework
The scripts are written in a language developed at Google for processing data called Sawzall [28] Bigtable - A Distributed Storage System for Structured Data
Careful with those delimiters! With so many Nagios plug-ins, written by so many different authors, it’s important to carefully choose your delimiter and avoid using the same one returned by a plug-in. Monitoring with Ganglia
While these values can be crammed into a Text object with some delimiter, it is typically a better practice to create a custom Writable. MapReduce Design Patterns
A standard delimiter “/” is used to split a key name.) Professional Hadoop Solutions
Each line of the input data is broken into a key (text) and value (text) pair using a delimiter character. Hadoop MapReduce Cookbook
The default record delimiter is newline, but it can be overridden with. Hadoop in Practice
Each line is broken into fields using a configurable field delimiter (defaults to a tab character) to be stored in the tuple’s fields. Hadoop:The Definitive Guide 3rd Edition
To ignore standard punctuation marks, we add them to the StringTokenizer’s list of delimiter characters: Hadoop in Action
Instead of representing columns within a super column, the composite key approach means that you use a regular column family with regular columns, and then employ a custom delimiter in your key name and parse it on client retrieval. Cassandra The Definitive Guide
For example, for delimited text format, the user can specify the delimiter, encoding, quoting, escaping, headers, etc. Tenzing - A SQL Implementation On The MapReduce Framework
In order to access the different attributes at run time, the Map and Reduce functions split the record by the delimiter into an array of strings. HadoopDB An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
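As an illustration of the delimiter handling these excerpts describe, a small sketch that splits a record on a configurable field delimiter (a tab by default, as in the Hadoop excerpt) and then tokenizes words while ignoring punctuation; the sample record is invented:

    import java.util.StringTokenizer;

    public class DelimiterDemo {
        public static void main(String[] args) {
            String record = "key1\tsome value, with punctuation!";

            // Break the line into key and value on a configurable delimiter.
            String delimiter = "\t";
            String[] fields = record.split(delimiter, 2);
            System.out.println("key = " + fields[0]);

            // Add punctuation marks to the tokenizer's delimiter list,
            // as the Hadoop in Action excerpt suggests.
            StringTokenizer tok = new StringTokenizer(fields[1], " \t\n.,!?;:");
            while (tok.hasMoreTokens()) {
                System.out.println(tok.nextToken());
            }
        }
    }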
In the real world, Java uses the "sandbox" approach, wherein untrusted code, which includes classes downloaded over a network within a Web browser, is placed within the sandbox and required to meet certain expectations. Java Network Programming and Distributed Computing
If you’re deploying CouchDB on the Web, or any place where untrusted parties can access your server, it behooves you to research the recommended security measures for your operating system and take any additional steps needed. CouchDB The Definitive Guide
When we discuss the methods below, we speak of them in terms of trusted and untrusted classes. Java Security 2nd Edition
By contrast, SETI@home runs a perpetual computation on untrusted machines on the Internet with highly variable connection speeds and no data locality. Hadoop:The Definitive Guide 3rd Edition
If you write a class whose security depends on the immutability of a BigInteger or BigDecimal argument from an untrusted client, you must check to see that the argument is a “real” BigInteger or BigDecimal, rather than an instance of an untrusted subclass. Effective Java 2nd Edition
Chapter 6: Java Security This chapter explains the Java security architecture that allows untrusted code to run in a secure environment from which it cannot do any malicious damage to the host system. Java in a Nutshell 5th Edition
If, however, you are running Hadoop in an untrusted, multitenant environment or where accidental data exposure would be catastrophic, secure mode is the appropriate option. Hadoop Operations
An untrusted applet under the control of the default security manager will only be allowed to get the IP address of the host it came from (its codebase) and possibly the local host. Java Network Programming 4th Edition
This is a generic error all clients will receive when trying to connect to the untrusted broker (without the proper keystore and truststore) ActiveMQ in Action
These systems address concerns that do not arise for Bigtable, such as highly variable bandwidth, untrusted participants, or frequent reconfiguration; decentralized control and Byzantine fault tolerance are not Bigtable goals. Bigtable - A Distributed Storage System for Structured Data
In order to be truly object-oriented, a programming language must also support inheritance and polymorphism. Core Java Volume I Fundamentals 9th Edition
This code calculates the address of the method body, using information stored in the object (this process is covered in great detail in the Polymorphism chapter) Thinking in Java 4th Edition
For such data types, you might also use polymorphism and method overriding as an alternative to if-then-else, but there’s still language-design discussion as to which is more appropriate.[9] We’d say that both are useful tools and you should have both in your armory. Java 8 in Action Lambdas, Streams and Functional-style Programming
Serialization or polymorphism can be used if this is truly a problem, but adds a bit of complexity. MapReduce Design Patterns
The producer methods provide runtime polymorphism where the concrete type of the bean to be injected may vary at runtime, the injected object may not even be a bean, and objects may require custom initialization. Java EE 7 Essentials
In object-oriented polymorphism, two subtypes can be referenced similarly because they implicitly share a common supertype. SQL Antipatterns
Creating Buffers The buffer class hierarchy is based on inheritance but not really on polymorphism, at least not at the top level. Java Network Programming 4th Edition
On the other hand, good object-oriented design actually encourages many small methods and significant polymorphism in the method hierarchy. Java Performance Tuning 2nd Edition
Because of polymorphism, dynamic loading, etc., it is generally impossible to identify them all beforehand; instead, their identities become known only as the action unfolds. Concurrent Programming in Java Design Principles and Pattern
Inheritance is the natural way to express these relationships and allows for polymorphism in the application. Pro JPA2
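A compact sketch of the mechanism these excerpts describe: two subtypes referenced through a common supertype, with the overriding method resolved at run time (all names are invented for the example):

    public class PolymorphismDemo {
        // Common supertype: callers program against this interface.
        interface Shape {
            double area();
        }

        static class Circle implements Shape {
            private final double r;
            Circle(double r) { this.r = r; }
            @Override public double area() { return Math.PI * r * r; }
        }

        static class Square implements Shape {
            private final double side;
            Square(double side) { this.side = side; }
            @Override public double area() { return side * side; }
        }

        public static void main(String[] args) {
            // The same call site dispatches to different method bodies.
            Shape[] shapes = { new Circle(1.0), new Square(2.0) };
            for (Shape s : shapes) {
                System.out.println(s.area());
            }
        }
    }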
For instance, a sum aggregator applied to the out-degree of each vertex yields the. Pregel - A System for Large-Scale Graph Processing
Starting Up the Processes Starting the processes in a specific order is not necessary; however, if the daemons are started in the order recommended here, there won’t be a delay waiting for metadata to be retransmitted to the UDP aggregator and users won’t get error pages or incomplete data from the web server: Monitoring with Ganglia
The framework provides a set of aggregator services (see Table 8-14 for descriptions of the predefined aggregator services) Pro Hadoop
Of course, a node can be configured to be neither a master nor a data node; in such a case, the node will be used only as an aggregator node that handles user queries. Mastering ElasticSearch
Pipes consist of types Each (applying a function or filter), GroupBy (which groups streams on tuple fields), CoGroup (which joins a common set of values), Every (which applies an aggregator or sliding window operation), and SubAssembly (which combines other pipe assemblies) Professional Hadoop Solutions
The keyword table introduces an aggregator type; aggregators are called tables in Sawzall even though they may be singletons. Interpreting the Data Parallel Analysis with Sawzall
Use Case: HBase at Streamy.com Streamy.com is a real-time news aggregator and social sharing platform. Hadoop:The Definitive Guide 3rd Edition
The output string starts with the name of a value aggregator function (from the set of predefined functions available in the Aggregate package) Hadoop in Action
Flume is a relatively new log aggregator developed by Cloudera. Kafka a Distributed Messaging System for Log Processing
At first the value of aggregator is the initial document. MongoDB in Action
Many have already used the filesystem as a handoff point between the application creating the data (for instance, log4j) and the mechanism responsible for moving those files someplace else (for instance, syslog) Apache Flume Distributed Log Collection for Hadoop
Logs gmond and gmetad do not log many things to syslog; however, these logs should always be checked when there is a problem. Monitoring with Ganglia
The script is invoked with the additional arguments of the paths to the stdout, stderr, and syslog output files for the task. Pro Hadoop
As shown in Figure 5-4, log pages contain stdout, stderr, and syslog logs. Professional Hadoop Solutions
This recipe shows how to configure Cassandra’s logging mechanism, Log4J, to send events to its local logfile as well as a remote syslog server. Cassandra High Performance Cookbook
Sending data and running queries by sending JSON objects over HTTP makes it easy for someone to extend anything, from a syslog daemon like rsyslog to a connecting framework like ManifoldCF, to interact with Elasticsearch. Elasticsearch in Action MEAP V07 cHAPTER 1
You’ll want to preserve the original format of your syslog file, so you’ll need to create and edit flume-site.xml and indicate the raw output format. Hadoop in Practice
Flume supports a large variety of sources; some of the more commonly used ones include tail (which pipes data from a local file being written to into Flume, just like Unix tail), syslog, and Apache log4j (allowing Java applications to write events to files in HDFS via Flume) Hadoop:The Definitive Guide 3rd Edition
It supports native connectivity and support for writing directly to HDFS, and simplifies reliable, streaming data delivery from a variety of sources including RPC services, log4j appenders, syslog, and even the output from OS commands. Hadoop Operations
The Unix syslog utility provides a good starting point for designing such a framework. Java Performance Tuning 2nd Edition
This book provides in-depth coverage of several of the more intricate areas of Scala, including advanced aspects of its type system, implicits, composition techniques with traits, collections, actors, functional categories. Scala in Depth
There are many features such as pattern matching, for comprehensions and implicits supported in Scala but not in Java, that we won’t discuss. Java 8 in Action Lambdas, Streams and Functional-style Programming
This chapter shows you how implicits work, and it presents some of the most common ways they are used. Programming in Scala 2nd Edition
Trim the stream to at most a single element; see chapter code We can give eval infix syntax using implicits; see chapter code for details. Functional Programming in Scala
Instead, this syntax is defined with normal definitions of types and methods, combined with a few general-purpose parsing conventions, such as support for implicits. Programming Scala
They are very powerful and very dangerous, and I only invite them into my program’s scope when there is a very good reason.5 Using implicits to convert to a class that has a particular method is a good reason. Beginning Scala
Beware that this power comes with a price, as having too many implicits in scope can render the code not only less understandable but also incorrect. Steps in Scala An Introduction to Object-Functional Programming
They consist of a separate VLAN, volumes, instances, images, keys, and users. bk-compute-adminguide-grizzly
In this mode, nova-network creates a VLAN, a subnet, and a separate bridge for each project. Deploying OpenStack
They consist of an individual VLAN, and volumes, instances, images, keys, and users. bk-admin-guide-cloud-latest
With VLAN networking, this allows access between instances within the same project. admin-user-guide-trunk
VLANs VLAN configuration can be as simple or as complicated as desired. OpenStack Operations Guide
Here, we are using the System.out object and calling its println method. Core Java Volume I Fundamentals 9th Edition
If the specified boolean flag is set to "true," whenever a byte array, println method, or newline character is sent, the underlying buffer will be automatically flushed. Java Network Programming and Distributed Computing
So, we are defining here that we want to add the closure (with the statement println 'Hello world.') to our task with the name helloWorld. Gradle Effective Implementation Guide
This rule definition can be further expanded with actual network reachability code in place of the println() call, with that functionality comfortably factored into methods as we see fit. Gradle Beyond The Basics
To write to a print writer, use the same print, println, and printf methods that you used with System.out. Core Java Volume II Advanced Features 9th Edition
The toString method is automatically invoked when an object is passed to println, printf, the string concatenation operator, or assert, or printed by a debugger. Effective Java 2nd Edition
The object System.out has an instance method named println( ) Java in a Nutshell 5th Edition
For example, it’s relatively rare to use the write() method of PrintStream instead of one of its print() and println() methods. Java Network Programming 4th Edition
Add a breakpoint in the first call to the println() method of the run() method of the Task1 class. Java 7 Concurrency Cookbook
In a real-world build (still relying on println statements in place of actual testing actions), we would be more likely to structure the task as done in Example 2-16 Building and Testing with Gradle
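A short sketch contrasting the print, println, and printf calls the excerpts mention, including a PrintWriter over a file (the filename is illustrative):

    import java.io.IOException;
    import java.io.PrintWriter;

    public class PrintDemo {
        public static void main(String[] args) throws IOException {
            System.out.print("no newline, ");
            System.out.println("newline appended");
            System.out.printf("formatted: %s scored %d%n", "Ada", 42);

            // A PrintWriter exposes the same print/println/printf methods.
            try (PrintWriter out = new PrintWriter("out.txt")) {
                out.println("written to a file");
            }
        }
    }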
Consequently, if there is a truly huge number of RRD files and if the disk storage system does not cope well with a random access workload (e.g., if it is not an SSD or SAN), the startup phase may take many hours or may not even complete at all. Monitoring with Ganglia
There are too many to name, but all of you in Dublin, Portland, Lisbon, London, Zurich, San Francisco, Mountain View, Dortmund, Stockholm, Hamburg, Frankfurt, Salt Lake City, Blacksburg, San Diego, and Amsterdam: you know who you are—thanks! To my family, friends, and coworkers: thank you for your support and your patience with me over the last year. CouchDB The Definitive Guide
In contrast to systems like Lustre [4], which assume one can construct sufficiently reliable OSDs using mechanisms like RAID or fail-over on a SAN, we assume that in a petabyte or exabyte system failure will be the norm rather than the exception, and at any point in time several OSDs are likely to be inoperable. Ceph A Scalable, High-Performance Distributed File System
Broadly, the approach in HPC is to distribute the work across a cluster of machines, which access a shared filesystem, hosted by a Storage Area Network (SAN) Hadoop:The Definitive Guide 3rd Edition
Block pools are analogous to LUNs in a SAN storage system and a namespace with its pool of blocks is analogous as a file system volume. The Hadoop Distributed File System
As in conventional storage systems based on local or network-attached (SAN) disk drives or those embracing the proposed T10 OSD standard, devices passively respond to read and write commands, despite their potential to encapsulate significant intelligence. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
Thanks to my editor, Mike Loukides, for being a charming conversationalist at dinner in San Francisco. Cassandra The Definitive Guide
Be it block storage via a Storage Area Network (SAN) or file-based via Network File System (NFS) or similar protocols, these approaches constrain Hadoop by introducing unnecessary bottlenecks and additional shared devices that would have a critical impact on failure. Hadoop Beginners Guide
In the classic HPC model, data is usually stored on a large shared centralized storage system such as a SAN or NAS. Hadoop Operations
CPUs have become so much faster than disks that a few CPUs in a shared-memory machine can drive enough disk heads to service required loads without the complexity of distributed transactions: the top TPC-E benchmark results today are achieved on large shared-memory machines connected to a SAN. Percolator - Large-scale Incremental Processing Using Distributed Transactions and Notifications
He has more than 7 years of experience in software and middleware (Apache, Tomcat, PostgreSQL, Hadoop eco system) engineering. HBase Administration Cookbook
The use of technologies like CORBA as middleware is very important in organizations with many systems written in many languages. Java Network Programming and Distributed Computing
Because the test suite is run from the browser, not only does it test that CouchDB is functioning properly, it also verifies that your browser’s connection to the database is properly configured, which can be very handy for diagnosing misbehaving proxies or other HTTP middleware. CouchDB The Definitive Guide
Her interests and experiences lie across multiple tiers, from web frontend GUI to middleware, from middleware to backend SQL RDBMS and NoSQL data storage. Cassandra High Performance Cookbook
Before starting the graduate studies, Thilina worked as a Senior Software Engineer at WSO2 Inc., focusing on open source middleware development. Hadoop MapReduce Cookbook
We find that there are many middleware systems which return a result to their caller before all. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
Message-oriented middleware (MOM) allows sending and receiving messages between distributed systems. Java EE 7 Essentials
Instead, it calls on a middleware layer on the server that in turn makes the database queries. Core Java Volume II Advanced Features 9th Edition
Type 3 drivers provide a connection to a middleware product, which in turn connects to a database. Java Performance Tuning 2nd Edition
Commercial message-oriented middleware (MOM) did not help to encourage adoption of the process; most of the MOMs on the market were closed source, were costly to buy and support, and required trained system administrators for proper installation. ActiveMQ in Action
The import command is a basic JRuby feature, and it is not HBase-specific. HBase Administration Cookbook
I’ve written this book for experienced Java programmers who are interested in learning how to manage and make use of concurrency on the JVM, using languages such as Java, Clojure, Groovy, JRuby, and Scala. Programming Concurrency on the JVM
With R, simply handling the training data volumes and getting good integration with the existing Ruby/JRuby infrastructure at Shop It To Me proved very difficult. Mahout in Action
JRuby and JVM languages Those of you unfamiliar with Java may be confused by this JRuby concept. HBase in Action
This will bring up a JRuby IRB interpreter that has had some HBase-specific commands added to it. Hadoop:The Definitive Guide 3rd Edition
Since the shell is JRuby-based, you can mix Ruby with HBase commands, which enables you to do things like this: HBase:The Definitive Guide
He envisages himself using functional languages in projects and the decision to learn Clojure (a little bit of JRuby, Scala, F#, and Dart lately) influences his current self-learning activities. Java 7 New Features Cookbook
By using JRuby, you write ruby code that has access to any Java code, but will generally lose access to Ruby extensions that are implemented in C. Lucene in Action 2nd Edition
All are JVM (Java Virtual Machine) libraries that can be used from programming languages like Java, Clojure, Scala, JRuby, Groovy, and Jython, as opposed to tools with their own languages, like Hive and Pig. Programming Hive
Not only is Java 7 the most modern, statically typed, object-oriented language available for development, it also enhances the ability to integrate different languages that run on the JVM, such as Jython, Groovy, JRuby, and Scala to name a handful. Java 7 Recipes A Problem-Solution Approach
The classic Hashtable class serves the same purpose as the HashMap and has essentially the same interface. Core Java Volume I Fundamentals 9th Edition
Thus, read and write operations have improved performance compared to the synchronized Hashtable alternative. Java 8 in Action Lambdas, Streams and Functional-style Programming
You’ll use the distributed cache4 to copy the small dataset to the nodes running the map tasks, and use the initialization method of each map task to load the small dataset into a hashtable. Hadoop in Practice
The program reads this source file into a Java Hashtable called joinData that’s available throughout the mapper’s lifespan. Hadoop in Action
Failure to do so will result in a violation of the general contract for Object.hashCode, which will prevent your class from functioning properly in conjunction with all hash-based collections, including HashMap, HashSet, and Hashtable. Effective Java 2nd Edition
The Collections Framework was added in Java 1.2, but prior to that release you can use Vector and Hashtable, which are approximately the same as ArrayList and HashMap. Java in a Nutshell 5th Edition
When using regular columns, as we saw earlier, Cassandra looks like a four-dimensional hashtable. Cassandra The Definitive Guide
The java.util.Map interface is the parent of commonly used classes such as HashMap and (through some library backward reengineering) even the original Hashtable. Hadoop Beginners Guide
You should also prefer the URI class if you need to store URLs in a hashtable or other data structure, since its equals() method is not blocking. Java Network Programming 4th Edition
Even though the fully synchronized Hashtable class allows an Inventory class to be defined without worrying about some low-level synchronization details, we still want to place some semantic constraints on the store and retrieve operations. Concurrent Programming in Java Design Principles and Pattern
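To make the Hashtable-versus-HashMap contrast in these excerpts concrete: both offer essentially the same interface, but Hashtable synchronizes every method, while HashMap leaves synchronization to the caller (or to ConcurrentHashMap). A minimal sketch:

    import java.util.HashMap;
    import java.util.Hashtable;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class MapChoices {
        public static void main(String[] args) {
            // Legacy: every method is synchronized, so even reads pay for a lock.
            Map<String, Integer> legacy = new Hashtable<>();

            // Unsynchronized: fastest for single-threaded use.
            Map<String, Integer> plain = new HashMap<>();

            // Modern concurrent alternative with finer-grained locking.
            Map<String, Integer> concurrent = new ConcurrentHashMap<>();

            // All three satisfy the same Map interface.
            for (Map<String, Integer> m : List.of(legacy, plain, concurrent)) {
                m.put("count", 1);
                System.out.println(m.get("count"));
            }
        }
    }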
So this loop need only execute for each nonzero element of the user vector. Mahout in Action
IOException— reads a byte from the input stream, and returns "true" if the byte is nonzero. Java Network Programming and Distributed Computing
Each of the four value sets includes not only the finite nonzero values that are ascribed to it above, but also the five values positive zero, negative zero, positive infinity, negative infinity, and NaN. The Java Virtual Machine Specification Java SE 7 Edition
In C and C++ the result of this assignment will always be true if y is nonzero, and you’ll probably get an infinite loop. Thinking in Java 4th Edition
If any of the executed commands return a nonzero value, the build will fail. Jenkins The Definitive Guide
A compile-time error does not occur if a non-zero floating-point literal has a small value that, on rounded conversion to its internal representation, becomes a nonzero denormalized number. The Java Language Specification Java SE 7 Edition
The test is very fast because it is in-memory, but it has a nonzero probability of false positives, in which case the regular get() method is called. Hadoop:The Definitive Guide 3rd Edition
Most products have nonzero values for only a few of these optional fields. Effective Java 2nd Edition
It sets its exit code to a nonzero value if an extension with the same name is already installed and has a specification version number equal to or greater than the version of the specified file. Java in a Nutshell 5th Edition
However, if the scan rate is nonzero and the trend of reported free memory is decreasing, then swapping is likely occurring. Java Performance
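The Thinking in Java excerpt about assignment in a condition is worth a concrete note: in C, `if (x = y)` compiles and tests whether y is nonzero, while Java rejects it unless the operands are boolean, so the bug surfaces at compile time. A sketch:

    public class NonzeroDemo {
        public static void main(String[] args) {
            int x, y = 5;
            // if (x = y) { }   // compile error in Java: int is not boolean
            x = y;
            if (x != 0) {       // the nonzero test must be written explicitly
                System.out.println("x is nonzero");
            }
        }
    }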
With ElasticSearch 0.90.3 we have the possibility of using the prefix-based suggester which is very handy for implementing the autocomplete functionality and which we will discuss in the Completion suggester section. Mastering ElasticSearch
Query term completion via the Suggester New in Solr 3 is a high-speed approach to implement term completion, called the Suggester. Solr 3 Enterprise Search Server
Using the Suggester component Nowadays, it's common for web pages to give a search suggestion (or autocomplete as I tend to call it), just like many "big" search engines do—just like Google, Microsoft, and others. Solr 3.1 Cookbook
The completion suggester in Elasticsearch takes a completely different approach. Elasticsearch The Definitive Guide
The source, an exec source, will tail the /var/log/app.log file and follow any rotations outside applications may perform on that logfile. Apache Flume Distributed Log Collection for Hadoop
The check_log plugin greps a particular query word incrementally in a specified logfile. HBase Administration Cookbook
The HTTP operation records contain a superset of the attributes in the widely supported Combined Logfile Format (CLF) commonly used in web server logging. Monitoring with Ganglia
To troubleshoot the HDFS startup problem, we can open a new terminal and monitor the NameNode logfile on the master node with the following command: Hadoop Operations and Cluster Management Cookbook
This recipe shows how to configure Cassandra’s logging mechanism, Log4J, to send events to its local logfile as well as a remote syslog server. Cassandra High Performance Cookbook
Benchmark writes the results to the console and appends the results to a logfile similarly to the write benchmark. Hadoop MapReduce Cookbook
SequenceFile Imagine a logfile where each log record is a new line of text. Hadoop:The Definitive Guide 3rd Edition
Whenever you are not sure what a byte array contains you should use this method to print its content, for example, to the console, or into a logfile. HBase:The Definitive Guide
Watching the logfile is a good way to determine that the node is finished bootstrapping, but to watch for progress while it’s happening, use nodetool streams. Cassandra The Definitive Guide
Also, when starting a session, we’ll omit the line about the logfile. Programming Hive
The time for initializing the cluster, generating the test graphs in-memory, and verifying results is not included in the measurements. Pregel - A System for Large-Scale Graph Processing
The framework calls the configure() method upon initializing a task, and it calls the close() method when the task has finished processing its input split: Pro Hadoop
You’ll also add two more targets for initializing and cleaning up the required output directories. Gradle in Action
After validating the authenticity of the lease [R7], the NM configures the environment for the container, including initializing its monitoring subsystem with the resource constraints specified in the lease. Apache Hadoop YARN Yet Another Resource Negotiator
Initializing our river After we are done with installation, we can initialize our river. Mastering ElasticSearch
We start by initializing the output result for each input group. MapReduce Design Patterns
There's more Initializing connections in this manner does not account for server fail-over or retries. Cassandra High Performance Cookbook
Before a task starts, it shows its status as “initializing,” and then once it starts reading records, it shows the split information for the split it is reading as a filename with a byte offset and length. Hadoop:The Definitive Guide 3rd Edition
Backups provide both disaster recovery and a means for initializing the database of a newly replaced replica without placing load on replicas that are in service. Chubby - The lock service for loosely-coupled distributed systems
The JMS API can be tedious, as it requires you to write a lot of code for initializing all the necessary JMS objects such as connections, sessions, producers, consumers, and so forth. ActiveMQ in Action
More precisely, this method will establish an association between the socket and a local address, which is usually explicitly specified as an InetSocketAddress instance (this class represents a socket address with IP (or hostname) and port, and extends the abstract SocketAddress class) Pro Java 7 NIO.2
The InetAddress class is used to represent IP addresses within a Java networking application. Java Network Programming and Distributed Computing
The list of attributes the record must have is defined by the objectClass inetOrgPerson. Jenkins Continuous Integration Cookbook
This variable has a single use: it must be set by the security manager before the security manager calls most methods of the InetAddress class. Java Security 2nd Edition
This is used to force the built-in SNMP agent to bind to the given inetAddress. Oracle JRockit The Definitive Guide
However, you can use the InetAddress class if you need to convert between host names and Internet addresses. Core Java Volume II Advanced Features 9th Edition
In Java 5.0 the InetAddress class has an isReachable( ) method that attempts to determine whether the host is reachable. Java in a Nutshell 5th Edition
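Usage of that method is straightforward; a small sketch follows (host name and timeout are illustrative), keeping in mind that isReachable() can return false even for live hosts when ICMP is blocked.

```java
import java.net.InetAddress;

public class ReachableCheck {
    public static void main(String[] args) throws Exception {
        InetAddress host = InetAddress.getByName("www.example.com");
        // Attempts an ICMP echo if the JVM has permission, otherwise a TCP
        // connection to port 7 (echo), giving up after 3,000 ms.
        System.out.println(host.getHostAddress()
                + " reachable: " + host.isReachable(3000));
    }
}
```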
It shows how Java programs interact with the Domain Name System through the InetAddress class, the one class that’s needed by essentially all network programs. Java Network Programming 4th Edition
A session can cache the IP address explicitly after the first lookup, but this needs to be done at the application level by holding on to the InetAddress object. Java Performance Tuning 2nd Edition
The client is then bound to the port using a new InetSocketAddress instance. Java 7 Recipes A Problem-Solution Approach
If you want bidirectional replication, you need to trigger two replications with source and target swapped. CouchDB The Definitive Guide
The key to making this work is called joint compilation, which allows you to freely mix Java and Groovy source code with bidirectional dependencies on each other. Gradle in Action
Replication CouchDB is designed to allow bidirectional replication in an efficient and reliable manner. Beginning CouchDB
Finally, having assigned the bidirectional mapping, the UID is returned. HBase in Action
For example, you would use a directed graph to model relationships between users in a social network because relationships are not always bidirectional. Hadoop in Practice
We can see that one of the edges is bidirectional. Cassandra The Definitive Guide
Depending on the type of graph, the edges may be bidirectional or unidirectional and may have weights associated with them. Hadoop Beginners Guide
Sqoop is a tool that facilitates bidirectional exchange of data between HDFS and relational databases. Hadoop Operations
In some situations, you may want to create a bidirectional communication channel between brokers—a channel that communicates not only outward to the brokers on the other side of the connection, but also receives messages from other brokers on that same channel. ActiveMQ in Action
A Java channel represents an open data path, possibly bidirectional, to an external data source or sink such as a file or a socket. Fundamental Networking in Java
GridFS, a protocol for storing large files, uses subcollections to store file metadata separately from content chunks (see Chapter 6 for more information about GridFS) MongoDB The Definitive Guide 2nd Edition
As we move through the book, we cover essential topics such as how to store large files using the GridFS feature and how to administer and optimize your MongoDB installation. The Definitive Guide to MongoDB
Appendix C shows how to work with binary data in MongoDB and how to use GridFS, a spec implemented by all the drivers, to store especially large files in the database. MongoDB in Action
Caution The paths and URIs for DistributedCache items are stored as comma-separated lists of strings in the configuration. Pro Hadoop
Everything in CouchDB has an address, a URI, and you use the different HTTP methods to operate on these URIs. CouchDB The Definitive Guide
This EL function is commonly used to pass the URIs of data set instances that will be consumed by a Workflow job triggered by a Coordinator action. Professional Hadoop Solutions
The setup() method in the Mapper is told to pull the complete list of cache file URIs currently in the distributed cache. Hadoop RealWorld Solutions Cookbook
The subsequent rewrite rule then proxies all requests to URIs starting with /api to the equivalent URI on http://couch-master.example.com:5984 (again, only if the previous rewrite condition has been met): Scaling CouchDB
You’ll notice that all of the directory names are HDFS URIs. Hadoop in Practice
This is managed in configuration using ViewFileSystem and the viewfs:// URIs. Hadoop:The Definitive Guide 3rd Edition
Similar to the URIs used in a web browser, the file: scheme addresses local files. HBase:The Definitive Guide
The JobTracker will take this list of URIs and create a local copy of the files in all the TaskTrackers when it starts the job. Hadoop in Action
It discusses ActiveMQ URIs and all the transport connectors for both client-to-broker and broker-to-broker communications including TCP, NIO, STOMP, failover, SSL, HTTP/S, and much more. ActiveMQ in Action
As shown in table 1.1, Redis allows us to store keys that map to any one of five different data structure types: STRINGs, LISTs, SETs, HASHes, and ZSETs. Redis in Action
Gets the intersection of a given number of ZSETS and store the result in a new ZSET. Redis Cookbook
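The Redis Cookbook line describes the ZINTERSTORE command; a sketch of the same operation from Java, assuming the Jedis client is on the classpath and Redis is running on localhost (key names are made up):

```java
import redis.clients.jedis.Jedis;

public class ZInterDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.zadd("scores:day1", 10, "alice");
            jedis.zadd("scores:day1", 7, "bob");
            jedis.zadd("scores:day2", 5, "alice");
            // ZINTERSTORE keeps members present in every input ZSET; by
            // default the destination score is the sum of the input scores.
            jedis.zinterstore("scores:both", "scores:day1", "scores:day2");
            System.out.println(jedis.zscore("scores:both", "alice")); // 15.0
        }
    }
}
```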
Cryptographic techniques ensure that such a certificate cannot be forged, and that any effort to tamper with the signed file will be detected. Core Java Volume I Fundamentals 9th Edition
With the addition of new cryptographic codecs, a new token-based authentication mechanism that supports more authentication protocols, a unified authorization system that supports Attribute Based Access Control (ABAC) and supports policy enforcement using open standards and XACML, and changes to HBase to allow cell-level authorization, Hadoop will be able to move from isolated cluster environments with perimeter-level security to very secure deployments that can meet the requirements of highly secure environments. Professional Hadoop Solutions
They’re both faster than cryptographic hashers such as SHA-1 and produce good distributions. Hadoop in Practice
More significantly, objects in PAST are immutable, facilitating cryptographic protection and simplifying consistency and caching, but limiting the system's usefulness as a general storage service. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
Code signing, whereby code authors can use standard cryptographic algorithms to authenticate Java code. Core Java Volume II Advanced Features 9th Edition
Cryptographic hash functions While a full description of a cryptographic hash function is beyond the scope of this book, we can say that cryptographic hash functions are used for information security. Getting Started with Google Guava
These include support to protect against cipher block chaining attacks and new cryptographic algorithms. Java 7 New Features Cookbook
This package and its subpackages support cryptographic message digests and digital signatures. Java in a Nutshell 5th Edition
When a block is first stored, there is also a hidden file written to the same HDFS directory containing cryptographic checksums for the block. Hadoop Beginners Guide
The Handshake Protocol allows the server and client to authenticate each other and to negotiate an encryption algorithm and secret cryptographic keys before the application protocol transmits or receives its first byte of data. Fundamental Networking in Java
As such, this chapter is a great opportunity to showcase an example of natively serving up a dynamic web application using nothing more than CouchDB’s integrated web server, something you may wish to do with your own applications. CouchDB The Definitive Guide
The use of SequenceFile provides a storage mechanism natively supporting key/value structure, thus making implementations using this data layout much simpler. Professional Hadoop Solutions
For browsers that do not support JSON natively, the eval() function can be used to translate a string containing JSON code into JavaScript objects. Beginning CouchDB
There are few dependency management scenarios it can’t handle natively or be adapted to through a few easy customizations. Gradle Beyond The Basics
But when you’re working with big data volumes you’ll need serialization formats with compact serialized forms that natively support partitioning and have schema evolution features. Hadoop in Practice
It is, however, possible to disable checksums, typically when the underlying filesystem supports checksums natively. Hadoop:The Definitive Guide 3rd Edition
Keys and values can take on types beyond the basic ones which Hadoop natively supports. Hadoop in Action
You can do the same thing if you’re using Windows, but Windows doesn’t include a tail program natively. Cassandra The Definitive Guide
With its support for nested complex types, Avro can natively support a representation of a node that is much closer to the runtime object. Hadoop Beginners Guide
Nevertheless, if you know it, or already have a monitoring system that is natively JMX-aware, it’s a perfectly valid option for integration. Hadoop Operations
Timeouts and workers Finally, there are two miscellaneous properties related to timeouts and two for worker pools that you can change: Apache Flume Distributed Log Collection for Hadoop
Prior to Hadoop 0.18, blocking operations and fixed timeouts were used for the RPC calls. Pro Hadoop
It turns out that most of these overwrites came from client retries due to errors or timeouts. GFS - The Google File System
By default, both types of cached artifacts expire after 24 hours, but both timeouts can be set to arbitrary periods (including zero) using the resolutionStrategy block. Gradle Beyond The Basics
But setting the value too high would mean that the client’s interaction with HBase would have longer pauses, and this could result in timeouts on HBase’s side. HBase in Action
I felt like we’d need to precisely control how things like connections, timeouts and buffers are handled, and RMI gives you little control over those. Hadoop:The Definitive Guide 3rd Edition
TCP’s back off policies pay no attention to higher-level timeouts such as Chubby leases, so TCP-based KeepAlives led to many lost sessions at times of high network congestion. Chubby - The lock service for loosely-coupled distributed systems
NFS: may not be a good fit for many deployments, as NFS is an external device that is costly and offers less control over timeouts, etc. Namenode High Availability
Time for action – causing task failure Let's cause a task to fail; before we do, we will need to modify the default timeouts: Hadoop Beginners Guide
In general, timeouts should be set slightly too long rather than slightly too short. Fundamental Networking in Java
You can export the preferences of a subtree (or, less commonly, a single node) by calling the methods. Core Java Volume I Fundamentals 9th Edition
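The excerpt truncates before naming the methods; in the JDK's java.util.prefs API they are exportSubtree() (the node plus all of its descendants) and exportNode() (the single node). A minimal sketch with an illustrative node path:

```java
import java.io.FileOutputStream;
import java.util.prefs.Preferences;

public class ExportPrefs {
    public static void main(String[] args) throws Exception {
        Preferences node = Preferences.userRoot().node("/com/example/app");
        node.put("theme", "dark");
        // exportSubtree writes this node and every descendant as an XML document.
        try (FileOutputStream out = new FileOutputStream("prefs.xml")) {
            node.exportSubtree(out);
        }
    }
}
```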
The interactive facility allows simple subtree and summation views of the grid state XML tree. Monitoring with Ganglia
Ceph [9] has a cluster of namespace servers (MDS) and uses a dynamic subtree partitioning algorithm in order to map the namespace tree to MDSes evenly. HDFS scalability the limits to growth
The bin/hadoop script and all of the Cygwin utilities see a file system that is a subtree of the Windows file system, with the Windows drives mapped to the /cygdrive directory. Pro Hadoop
Ceph utilizes a novel metadata cluster architecture based on Dynamic Subtree Partitioning [30] that adaptively and intelligently distributes responsibility for managing the file system directory hierarchy among tens or even hundreds of MDSs. Ceph A Scalable, High-Performance Distributed File System
The problem only gets more complicated as we allow more complex changes, where for example the order of records may be permuted; such changes can be common, for instance, if a crawler uses a depth-first strategy to crawl the web, and a single link changing can move an entire subtree’s position in the input file. Incoop MapReduce for Incremental Computations
If we changed a file in the /foo subtree on the first cluster from the previous example, we could synchronize the change with the second cluster by running: Hadoop:The Definitive Guide 3rd Edition
Ceph [17] has a cluster of namespace servers (MDS) and uses a dynamic subtree partitioning algorithm in order to map the namespace tree to MDSs evenly. The Hadoop Distributed File System
A special cell, named global, contains a subtree /ls/global/master that is mirrored to the. Chubby - The lock service for loosely-coupled distributed systems
The line sticking out from the handle icon points to the right when the subtree is collapsed and down when the subtree is expanded (see Figure 6.20) Core Java Volume II Advanced Features 9th Edition
Alice in Wonderland has 5,909 unique words, including the copyright notice at the beginning. The words appear in random order. Core Java Volume I Fundamentals 9th Edition
To continue with our previous examples, we may wish to assign the "alice" user the "compute-user" role in the "acme" tenant: bk-compute-adminguide-grizzly
Suppose that Alice, an engineer at Google, comes up with a novel idea for extracting new kinds of signals from web pages. Dremel - Interactive Analysis of WebScale Datasets
You might assign the computeuser role to the alice user in the acme tenant: bk-admin-guide-cloud-latest
Bob uses the algorithm specification to generate his own public and private keys; he sends the public key to Alice. Java Security 2nd Edition
Based on our sample data, the answer is Alice and Zach. Graph Databases
Suppose Alice wants to send Bob a message, and Bob wants to know this message came from Alice and not an impostor. Core Java Volume II Advanced Features 9th Edition
The sender of a message, Alice, can compute a digest for a message and then encrypt that digest with her private key. Java in a Nutshell 5th Edition
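In the JDK, the digest-then-encrypt step that quote describes is packaged as the Signature class; a sketch (algorithm choice and message are illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignDemo {
    public static void main(String[] args) throws Exception {
        KeyPair alice = KeyPairGenerator.getInstance("RSA").generateKeyPair();
        byte[] message = "Meet at noon. -- Alice".getBytes(StandardCharsets.UTF_8);

        // Alice: digest the message and sign the digest with her private key.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(alice.getPrivate());
        signer.update(message);
        byte[] sig = signer.sign();

        // Bob: recompute the digest and check it against Alice's public key.
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(alice.getPublic());
        verifier.update(message);
        System.out.println("came from Alice: " + verifier.verify(sig));
    }
}
```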
So any commands executed by a user called alice will create files with an owner named alice and will only be able to read or write files to which this user has the correct access. Hadoop Beginners Guide
Consider the example of an over-supervised girl (Alice) trying to write to her boyfriend (Bob) when her outgoing mail is scrutinized by her parents. Fundamental Networking in Java
Indeed, it isn’t efficient to generate a new string that holds the concatenation of "Hel" and "p!" Core Java Volume I Fundamentals 9th Edition
The combiner can be used to do some concatenation prior to the reduce phase. MapReduce Design Patterns
When obtaining execution context, the class overwrites the output directory with a new value — concatenation of the base output with the output name. Professional Hadoop Solutions
This includes string concatenation, external table mapping, simple table joins, custom functions, and dependency distribution across the cluster. Hadoop RealWorld Solutions Cookbook
It consists of the concatenation of the compressed HTML of every page, preceded by a small header. The anatomy of a large-scale hypertextual Web search engine
One of the oldest rules in Java has always been that string concatenation using the plus operator should be avoided at all cost, which the following code example shows: Hadoop in Practice
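The usual fix that rule points at, sketched minimally: accumulate into one StringBuilder instead of creating a fresh String for every + in a loop.

```java
public class ConcatDemo {
    public static void main(String[] args) {
        String[] words = {"Hel", "p!"};
        // Each += on a String would allocate a brand-new String object;
        // StringBuilder appends into a single resizable buffer instead.
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            sb.append(word);
        }
        System.out.println(sb.toString()); // Help!
    }
}
```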
However, there is no easy way to combine the files (by concatenation, for example, in the case of plain-text files) to produce a globally sorted file. Hadoop:The Definitive Guide 3rd Edition
The row keys in the catalog tables are the region names, which are a concatenation of the region’s table name, its start row, and an ID (usually the current time in milliseconds) HBase:The Definitive Guide
In addition, the number of fields in the output relation is the sum of the number of fields in the input relations, and the output relation’s schema is a concatenation of the input relations’ schemas. Hadoop in Action
The key is essentially the concatenation of four dimension keys (row, column family, column qualifier, and timestamp) Cassandra The Definitive Guide
Client connections (dump): lists all the sessions and ephemeral znodes for the ensemble. Hadoop:The Definitive Guide 3rd Edition
It offers filesystem-like access with directories and files (called znodes) that. HBase:The Definitive Guide
ZooKeeper (http://zookeeper.apache.org) allows distributed processes coordinating with each other through a shared hierarchical name space of data registers (znodes), much like a file system. Apache Kafka
Like files in a traditional filesystem, ZNodes can store arbitrary bytes, have access control lists associated with them, and can be organized hierarchically. Hadoop Operations
Recipes include ZooKeeper operations that manipulate small data nodes, called znodes, that are organized hierarchically as a tree, just like in a file system. ZooKeeper Distributed Process Coordination
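A minimal sketch of creating and reading a znode with the ZooKeeper Java client (connection string and path are illustrative; the latch is a crude way to wait for the session to come up):

```java
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZnodeDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15_000,
                event -> connected.countDown());
        connected.await();
        // A persistent znode: a small data node at a path in the hierarchy,
        // much like a tiny file in a filesystem tree.
        zk.create("/demo", "hello".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        System.out.println(new String(zk.getData("/demo", false, null)));
        zk.close();
    }
}
```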
Currently, there can only be one network, and this setup would use the max IPs available in a /24 bk-compute-adminguide-grizzly
Nova defines two different types of IP addresses for an instance: Fixed IPs and Floating IPs. Deploying OpenStack
HAProxy uses a single configuration file to define everything from the frontend IPs being advertised to the servers behind them. RabbitMQ in Action
The templates enable you to create most OpenStack resource types, such as instances, floating IPs, volumes, security groups, users, and so on. bk-admin-guide-cloud-latest
To calculate page views, we need to calculate distinct IPs by page. Hadoop RealWorld Solutions Cookbook
You’ll then use sorting to order them by the number of IPs that viewed that movie category, thus giving you the most popular categories first. Hadoop in Practice
This snitch helps Cassandra know for certain if two IPs are in the same data center or on the same rack—because you tell it that they are. Cassandra The Definitive Guide
Global Network Usage tab to view duration and creation requests for networks, subnets, routers, ports, and floating IPs, per tenant (project) admin-user-guide-trunk
The templates allow creation of most OpenStack resource types such as instances, floating ips, volumes, security groups, users, etc. user-guide
Fixed IPs are assigned to instances on boot, whereas Floating IP addresses OpenStack Operations Guide
In the Java space, you can choose from a wide range of open source and commercial solutions, such as Checkstyle, PMD, Cobertura, FindBugs, and Sonar. Gradle in Action
This recipe shows you how to do this for checkstyle. Jenkins Continuous Integration Cookbook
There are tools already available for Java and Groovy projects to analyze and check source code, such as Checkstyle, JDepend, PMD, FindBugs, CodeNarc, and Sonar. Gradle Effective Implementation Guide
Modern IDEs have great support for many code quality tools—Checkstyle, PMD, and FindBugs all have plugins for Eclipse, NetBeans, and IntelliJ, which provide rapid feedback for developers on code quality issues. Jenkins The Definitive Guide
Earlier we discussed the jscheckstyle npm package, which coincidentally can output its report in Checkstyle format—and wouldn’t you know it, there is a Jenkins plug-in for that format! Even though you may be checking complexity numbers pre-commit, it is still a great idea to also generate those numbers as part of your build, for two reasons. Testable JavaScript
Checkstyle Checkstyle ensures that all source code adheres to coding standards. Sonar Code Quality Testing Essentials
These tools, such as Checkstyle, PMD, or FindBugs have a set of predefined rules of good practices and parse the source code looking for violations of those rules. Java 7 Concurrency Cookbook
For the first example, the Checkstyle Ant task will be loaded from a combination of a properties file and a set of local directory JARs. Building and Testing with Gradle
Figure 2.2 Comparing build script size and readability between Maven and Gradle. Gradle in Action
Some of the major benefits of using these higher-level of abstractions include readability, maintainability, development time, and automatic optimization. MapReduce Design Patterns
Adding a custom action node with a well-defined and meaningful business name can significantly improve the overall readability of the process. Professional Hadoop Solutions
One basic example might be the min() operation of the Reducer class that specifically formats output in a certain way for readability. Hadoop RealWorld Solutions Cookbook
The following code shows the implementations of the Serialized interface in the RegexSerDe class, with some unimportant code removed to help with readability: Hadoop in Practice
Statements that have to be terminated with a semicolon can be split across multiple lines for readability: Hadoop:The Definitive Guide 3rd Edition
For easier readability, the related details were broken up into groups using blank lines. HBase:The Definitive Guide
In the interest of readability, we’re ignoring the fact that this doesn’t yield identically formatted output; the data is the same. Hadoop Operations
Although the output has been truncated for readability, the command starts up a simple JMS producer and you can see from the output that it. ActiveMQ in Action
This technique increases readability by breaking up a long complex WHERE clause into two or more UNION queries. Programming Hive
If your input is comprised of tailed logfiles, the array is most likely a UTF-8 encoded String containing a line of text. Apache Flume Distributed Log Collection for Hadoop
The UTF-16 encoding represents all Unicode code points in a variable-length code. Core Java Volume I Fundamentals 9th Edition
In this case, the output is assumed by CouchDB to be HTML (i.e., with a content type of text/html) and UTF-8 formatting. Getting Started with CouchDB
The input bytes are considered to be in the UTF-8 character set. Pro Hadoop
This recipe is only applicable when your data is ASCII or UTF-8 Cassandra High Performance Cookbook
Your implementation stored them as strings, and used the DataOutput’s writeUTF method to write them out: Hadoop in Practice
Because of its emphasis on using standard UTF-8, there are some differences between Text and the Java String class. Hadoop:The Definitive Guide 3rd Edition
The writeUTF method writes string data using a modified version of 8-bit Unicode Transformation Format. Core Java Volume II Advanced Features 9th Edition
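A round trip through that format, as a sketch: DataOutputStream.writeUTF prefixes a two-byte length and then the modified UTF-8 bytes, and readUTF reverses it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;

public class ModifiedUtfDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            // Two-byte length prefix, then modified UTF-8 bytes, which caps
            // a single writeUTF string at 65,535 encoded bytes.
            out.writeUTF("café");
        }
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            System.out.println(in.readUTF()); // café
        }
    }
}
```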
BytesType is the default for a reason: it provides the correct sorting for most types of data (UTF-8 and ASCII included) Cassandra The Definitive Guide
Regardless of operating system, you should save the file in the UTF-8 encoding if your editor allows. Java 7 Recipes A Problem-Solution Approach
Interactive queries are performed via a text protocol (similar to SMTP or HTTP) Monitoring with Ganglia
Finally, we describe interfacing directly with CouchDB from Erlang, which can be a useful technique if you want to integrate CouchDB storage with a server for non-HTTP protocols, like SMTP (email) or XMPP (chat) CouchDB The Definitive Guide
The way you want to send an email (for example, via SMTP or a third-party email service provider) and the list of recipients are very specific to your build. Gradle in Action
The output tells us that the local machine has the following services running: ftp, ssh, smtp, rpcbind (service for remote procedure calls), and jpp (service for Java packaging) Hadoop Operations and Cluster Management Cookbook
The JavaMail API specification does not require any protocol providers, but the JavaMail reference implementation includes an IMAP message store provider, a POP3 message store provider, and an SMTP. Java EE 7 Specification
However, our core data model is not restricted to our particular RPC framework; we also trace activities such as SMTP sessions in Gmail, HTTP requests from the outside world, and outbound queries to SQL servers. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
The Exchange environment is independent by necessity, but one class of Postfix machines acts as an added layer of protection and uses SMTP to transfer messages between mailboxes hosted in each environment. Hadoop:The Definitive Guide 3rd Edition
In the past, it was simple to write a program that sends e-mail by making a socket connection to port 25, the SMTP port. Core Java Volume II Advanced Features 9th Edition
You’ll also get a behind-the-scenes look at HTTP, SMTP, TCP/IP, and the other protocols that support the Internet and the Web. Java Network Programming 4th Edition
In the example that follows, application-wide parameters are set up to store an organization’s e-mail address and SMTP host for sending mail. Java 7 Recipes A Problem-Solution Approach
Methods that change instance fields are called mutator methods, and those that only access instance fields without modifying them are called accessor methods. Core Java Volume I Fundamentals 9th Edition
While you can write accessor functions to obtain the value of a variable and/or modify it, an easier alternative is to define an attribute. Java Network Programming and Distributed Computing
But it isn’t always possible to use a method reference as a classification function, because you may wish to classify using something more complex than a simple property accessor. Java 8 in Action Lambdas, Streams and Functional-style Programming
A large number of accessor methods give you this information. Core Java Volume II Advanced Features 9th Edition
Note that a nonzero-length array is always mutable, so it is wrong for a class to have a public static final array field, or an accessor that returns such a field. Effective Java 2nd Edition
Certain important or frequently used fields can also be public, but it is common practice to make fields non-public and encapsulate them with public accessor methods. Java in a Nutshell 5th Edition
It is easy to use a method in a loop that has side effects such as making copies, or an accessor that returns a copy of some object you need only once. Java Performance Tuning 2nd Edition
The HotSpot VM’s TemplateTable defines all the templates and provides accessor functions to get the template for a given bytecode. Java Performance
We can again apply our default rules, which in this case lead to the creation of little synchronized get and set methods (also known as accessor and assignment methods) for particles, otherwise avoiding direct access of the particles variable itself. Concurrent Programming in Java Design Principles and Pattern
The following code demonstrates the declaration of a private field, followed by accessor and mutator methods that can be used to obtain or set the value of that field from an outside class: Java 7 Recipes A Problem-Solution Approach
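The code itself did not survive the excerpt; a minimal reconstruction in the same spirit (the class and field names are placeholders):

```java
public class Employee {
    // Non-public field, encapsulated per common practice.
    private String name;

    // Accessor (getter): obtains the value without modifying it.
    public String getName() {
        return name;
    }

    // Mutator (setter): the only way an outside class can change the field.
    public void setName(String name) {
        this.name = name;
    }
}
```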
We also define a logout method that takes a UserSession object and invalidates it and performs any cleanup that may be needed. Scala in Depth
To log out at the end of the transaction, use the method logout, passing to it the GUID. Jenkins Continuous Integration Cookbook
The logout() method sets any value that might be stored in the ThreadLocal to null. Pro Spring 3
Finally, we will change the logout so it is handled by Spring Security as well. Pro Spring MVC
To implement it, you’ll need to configure Spring Security, configure web.xml, and add login and logout links to the app. Spring in Practice
LogoutFilter no-op interaction when the logout URL is not requested. Pro Spring Security
From an authentication standpoint, you’ll use the form-login D and logout E elements to enable a login page, with a login failure URL and a security logout page, respectively, for your web application. Spring Roo in Action
Invokes the logout method on the managers in the JAAS configuration file. Core Java Volume II Advanced Features 9th Edition
This is one of the most important classes in the JAAS API for application programmers: it defines the login( ) method (and the corresponding logout( ) method) that allows an application to authenticate a user. Java in a Nutshell 5th Edition
As it turns out, logout capability is already enabled by your configuration without you having to do anything else. Spring in Action 4th Edition
Views are functions that take your document data and produce searchable lists of information based on the document’s contents. Getting Started with CouchDB
The package must live in the directory indicated by its name, which must be a directory that is searchable starting from the CLASSPATH. Thinking in Java 4th Edition
Why Subscribe? Fully searchable across every book published by Packt; copy and paste, print, and bookmark content; on demand and accessible via web browser. Hadoop Operations and Cluster Management Cookbook
Lucene operates on searchable documents, which are collections of fields, each having a value. Professional Hadoop Solutions
Why Subscribe? Fully searchable across every book published by Packt. Cassandra High Performance Cookbook
Subscribers have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley. Java EE 7 Essentials
Not only does it store them, it also indexes the contents of each document in order to make them searchable. Elasticsearch The Definitive Guide
If this is not the case, you may need to break objects and searchable data into fields that are individually compressed. Java Performance Tuning 2nd Edition
Organizations would like to build this living knowledge repository of Hive assets that is easily searchable by users. Programming Hive
Safari Books Online is a digital library providing searchable, on-demand access to thousands of technology, digital media, and professional development books and videos from leading publishers. Java SE 8 for the Really Impatient
Over time, the hdfs.path directory will get very full so you will want to add some kind of time element into the path to partition the files into subdirectories. Apache Flume Distributed Log Collection for Hadoop
The attributes group, name, and version directly map to subdirectories in the file system. Gradle in Action
The views directory is structured into subdirectories, each of which is the name of the view. Beginning CouchDB
The simplest use of fileTree() simply points it at a parent directory, allowing it to recurse through all subdirectories and add all of the files it finds into the resulting file collection. Gradle Beyond The Basics
We can specify all the part-prefixed subdirectories as input to the dependent job using wildcards. Hadoop MapReduce Cookbook
I also sometimes have data that’s specific to a chapter, which exists within chapter-specific subdirectories under the same GitHub location. Hadoop in Practice
If you want to see all the subdirectories, in a way similar to Unix’s ls with the -r option, you can use Hadoop’s lsr command. Hadoop in Action
Double-click on the ufodata directory to open it and within it create two subdirectories called ufo and states. Hadoop Beginners Guide
The AMQ message store directory contains subdirectories for all the brokers that are running on the machine. ActiveMQ in Action
The master project goes in a top-level directory, with subprojects arranged in subdirectories underneath. Building and Testing with Gradle
Tools (such as Eclipse PDE, Bnd, and Bundlor) can help you easily and efficiently manage the manifest file, and they can easily be integrated with building tools like Ant or Maven. Spring Dynamic Modules in Action
On the other hand, using snappy or lzo results in faster compression performance, but results in a lower compression ratio. Apache Flume Distributed Log Collection for Hadoop
This step requires the HBase cluster to support LZO compression. HBase Administration Cookbook
The framework supports the gzip, zip, sometimes LZO, and bzip2 compression codecs. Pro Hadoop
Also covered is how to set the block size and replication, and enable LZO compression. Hadoop RealWorld Solutions Cookbook
Various compression codecs are available to be used with HBase, including LZO, Snappy, and GZIP. HBase in Action
Twitter LZO-compresses all of its data, and uses Protocol Buffers for serialization purposes, all of which are geared to optimizing the use of its storage and computing resources. Hadoop in Practice
The LzopCodec is compatible with the lzop tool, which is essentially the LZO format with extra headers, and is the one you normally want. Hadoop:The Definitive Guide 3rd Edition
Before Snappy was made available in 2011, the recommended algorithm was LZO, even if it did not have the best compression ratio. HBase:The Definitive Guide
As a result of these experiments we chose a variant of LZO for production, since it gave about a 10% better compression ratio and was up to twice as fast when decompressing compared to Zippy. PowerDrill - Processing a Trillion Cells per Mouse Click
However, BZip2 and LZO provide block-level compression, where each block has complete records, so Hadoop can split these files on block boundaries. Programming Hive
For example, if you specify the codec as follows, all files written will have a .gzip extension, so you don't need to specify a hdfs.fileSuffix property in this case: Apache Flume Distributed Log Collection for Hadoop
The most notable things to do for performance are keeping data small (GZIP compression, high JPG compression) and avoiding requests all together (HTTP-level caching in the browser) CouchDB The Definitive Guide
We suggest downloading a .gzip archived file with the filename ending with tar.gz. Hadoop Operations and Cluster Management Cookbook
To create a TAR archive with the optional gzip or bzip2 compression, we must use the tarFiles task. Gradle Effective Implementation Guide
Since files compressed using the bzip2, gzip, and DEFLATE codecs cannot be split, the whole file must be given as a single input split to the mapper. Hadoop RealWorld Solutions Cookbook
Deflate Deflate is similar to zlib, which is the same compression algorithm that gzip uses without the gzip headers. Hadoop in Practice
Note that the gzip file format is DEFLATE with extra headers and a footer.) The .deflate filename extension is a Hadoop convention. Hadoop:The Definitive Guide 3rd Edition
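That relationship is easy to see from java.util.zip, as in this sketch: the same bytes written through DeflaterOutputStream and GZIPOutputStream differ only by the gzip framing.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPOutputStream;

public class GzipVsDeflate {
    public static void main(String[] args) throws Exception {
        byte[] data = "the same text, two containers".getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream deflated = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(deflated)) {
            out.write(data); // DEFLATE body in a thin zlib wrapper
        }
        ByteArrayOutputStream gzipped = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(gzipped)) {
            out.write(data); // same DEFLATE body plus gzip header and CRC footer
        }
        // The gzip copy is a few bytes larger, all of it framing.
        System.out.println("deflate: " + deflated.size()
                + " bytes, gzip: " + gzipped.size() + " bytes");
    }
}
```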
This problem was solved by GZIP compressing the intermediate map output. Apache Hadoop Goes Realtime at Facebook
For example, to use GZIP compression, you can set the configuration object: Hadoop in Action
On Linux, GZip extraction utilities should be preinstalled; on Windows, you’ll need to get a program such as WinZip, which is commercial, or something like 7-Zip, which is freeware. Cassandra The Definitive Guide
As compared to Ganglia, which stores its data in RRDtool, OpenTSDB leverages HBase's scalability to monitor it at a larger scale. HBase Administration Cookbook
Ganglia 1.0 had no time-series support, whereas Ganglia today leverages the power. Monitoring with Ganglia
The ElasticSearch point of view On top of all this is ElasticSearch which leverages Apache Lucene and thankfully allows us to change the default scoring algorithm (more about this can be found in the Altering Apache Lucene scoring section, Chapter 3, Low-level Index Control) Mastering ElasticSearch
Sqoop leverages the database to describe the schema for the imported/exported data, and MapReduce for parallelization and fault tolerance. Professional Hadoop Solutions
To control granularity we propose a new Contraction phase that leverages Combiner functions, normally used to reduce network traffic by anticipating a small part of the processing done by Reduce tasks. Incoop MapReduce for Incremental Computations
To support this throughput HDFS leverages unusually large (for a filesystem) block sizes and data locality optimizations to reduce network input/output (I/O) Hadoop in Practice
Instead of implementing a separate RPC server, it leverages the Accept header of HTTP to send and receive the data encoded in Protocol Buffers. HBase:The Definitive Guide
Although both consider only independent failures, RADOS leverages CRUSH to mitigate correlated failure risk with failure domains. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
HBase in turn leverages HDFS for scalable and fault tolerant storage and ZooKeeper for distributed consensus. Apache Hadoop Goes Realtime at Facebook
One of these methods leverages the Bloom filter, a data structure that’s useful in many data processing tasks. Hadoop in Action
Actually, the OpenOption is an interface from the java.nio.file package and it has two implementations: the LinkOption class (remember the well-known NOFOLLOW_LINKS enum constant) and the StandardOpenOption class, which defines the following enums: Pro Java 7 NIO.2
For instance, enums are compared by the order in which the symbol is defined and not by the value of the symbol string. Hadoop:The Definitive Guide 3rd Edition
The property values are defined on CacheRetrieveMode and CacheStoreMode enums and explained in Table 13-5 Java EE 7 Essentials
Typically, most of your string fields will be exact-value not_analyzed fields such as tags or enums, plus a handful of full-text fields which will use some default analyzer like standard or english or some other language. Elasticsearch The Definitive Guide
The language supports four kinds of types: interfaces (including annotations), classes (including enums), arrays, and primitives. Effective Java 2nd Edition
Arrays, enumerated types (or “enums”) and annotation types are the other three. Java in a Nutshell 5th Edition
Avro supports six kinds of complex types: records, enums, arrays, maps, unions, and fixed. Cassandra The Definitive Guide
Avro also supports primitive types such as integers, floats, and booleans as well as other structured types such as arrays and enums. Hadoop Beginners Guide
I could have used strings or int constants to specify the kind of record to search for and the database to search in, but because there are only a small number of valid values, lookUpNames() defines enums with a fixed number of members instead. Java Network Programming 4th Edition
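A sketch of that pattern with hypothetical members (the real lookUpNames() is not reproduced in the excerpt): the compiler rejects anything but the declared members, a guarantee int constants cannot give.

```java
public class EnumDemo {
    // Hypothetical members standing in for "kind of record" and "database".
    enum RecordKind { PERSON, HOST }
    enum Database { LOCAL, GLOBAL }

    static String lookUpNames(String query, RecordKind kind, Database db) {
        // No range checks needed: only declared members can reach here.
        return query + " via " + kind + " in " + db;
    }

    public static void main(String[] args) {
        System.out.println(lookUpNames("smith", RecordKind.PERSON, Database.LOCAL));
    }
}
```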
By calling conversion methods against these enums, long values representing the duration between two dates can be converted. Java 7 Recipes A Problem-Solution Approach
Using onlyIf predicates Every task has a method onlyIf that accepts a closure as an argument. Gradle Effective Implementation Guide
In addition, coprocessors can be used for implementation of new features — for example, secondary indexing, complex filtering (push down predicates), and access control. Professional Hadoop Solutions
Predicate pushdown – Predicates are pushed down to the scan if possible so that rows can be filtered early in the processing. Hive A Petabyte Scale Data Warehouse Using Hadoop
Filters are sometimes called push-down predicates, allowing you to push data-filtering criteria down to the server (see figure 4.16) HBase in Action
In addition, it provides push-down predicates, that is, filters, reducing data transferred over the network. HBase:The Definitive Guide
A second important and growing use case for graph databases is the Semantic Web, where predicates are given equal status to subjects and objects in a triple. Cassandra The Definitive Guide
The execution engine will ignore the file after processing the header if it can determine that the file does not contain any records of interest, based on the predicates defined for that table in the query. Tenzing - A SQL Implementation On The MapReduce Framework
As with traditional relational databases, Hive allows tables to be partitioned based on the values of virtual columns and for these values to then be used in query predicates later. Hadoop Beginners Guide
When we add predicates to WHERE clauses that filter on partition values, these predicates are called partition filters. Programming Hive
Many formalisms are available for precisely stating predicates describing requirements (as discussed in most of the texts on concurrent design methods listed in the Further Readings) Concurrent Programming in Java Design Principles and Pattern
The synchronicity of this model makes it easier to reason about program semantics when implementing algorithms, and ensures that Pregel programs are inherently free of deadlocks and data races common in asynchronous systems. Pregel - A System for Large-Scale Graph Processing
If no other thread calls signal again, then the system deadlocks. Core Java Volume I Fundamentals 9th Edition
To the scheduler they appeared as map-only jobs with radically different resource curves, thwarting the assumptions built into the platform and causing poor utilization, potential deadlocks, and instability. Apache Hadoop YARN Yet Another Resource Negotiator
Reads within read-write transactions use wound-wait [33] to avoid deadlocks. Spanner - Google’s Globally-Distributed Database
Locking, waits, and deadlocks It is a known fact that complex transactional processing, like two-phase commits, can increase the possibility of multiple clients waiting for a resource to become available. HBase:The Definitive Guide
Complete non-blocking, event-I/O-based communication (no blocking threads on sockets, no deadlocks, scalable with a large number of shards/replicas). The Road to a Distributed, (Near) Real Time, Search Engine
Depending on what an alien method does, calling it from a synchronized region can cause exceptions, deadlocks, or data corruption. Effective Java 2nd Edition
Java 7 has improved the use of class loaders by modifying the locking mechanism to avoid deadlocks. Java 7 New Features Cookbook
More Info You have to be very careful with the use of Locks to avoid deadlocks. Java 7 Concurrency Cookbook
In addition, you will find detailed coverage of all aspects of concurrent programming, including such topics as confinement and synchronization, deadlocks and conflicts, state-dependent action control, asynchronous message passing and control flow, coordinated interaction, and structuring web-based and computational services. Concurrent Programming in Java Design Principles and Pattern
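The classic shape behind several of the quotes above is the lock-ordering deadlock; a deliberately broken sketch (do not write this):

```java
public class DeadlockSketch {
    private static final Object A = new Object();
    private static final Object B = new Object();

    public static void main(String[] args) {
        // Each thread takes its first lock, then blocks forever waiting
        // for the other thread's lock: a deadlock by opposite lock order.
        new Thread(() -> { synchronized (A) { pause(); synchronized (B) { } } }).start();
        new Thread(() -> { synchronized (B) { pause(); synchronized (A) { } } }).start();
        // The standard fix: every thread acquires locks in one global order
        // (always A before B), so the wait cycle cannot form.
    }

    private static void pause() {
        try { Thread.sleep(100); } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```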
In this new criteria API, the criteria being passed into the query is based on the mapped entity classes’ metamodel. Pro Spring 3
Most of the time, online store applications don’t have a static structure for the products in their catalog: they use a metamodel configured with the structure of the products and a generic engine that uses this metamodel to display products dynamically. Spring Batch in Action
Criteria API The Criteria API is an object-based, type-safe API and operates on a metamodel of the entities. Java EE 7 Essentials
This reflects additions to support many modeling constructs natural to Java developers, expansion of the standardized mapping options, an object-based criteria query API, a metamodel API, support for automatic validation, support for pessimistic locking, and much more. Pro JPA2
Introduction to Querydsl When working with Querydsl, you will usually start by deriving a metamodel from your domain classes. Spring Data Modern Data Access for Enterprise Java
Note: Using well-supported third-party frameworks, such as JAAS and JDBC implementations, you can connect to a much wider set of services than mentioned in the table. Jenkins Continuous Integration Cookbook
Tomcat, for example, allows you to authenticate users against a relational database (using direct JDBC or a DataSource), JNDI, JAAS, or an XML configuration file. Jenkins The Definitive Guide
Standards such as Java Authentication and Authorization Service (JAAS) or Java EE Security do offer some ways of performing some of the same authentication and authorization functions, but Spring Security is a winner because it packages up everything you need to implement a top-to-bottom application security solution in a concise and sensible way. Spring Security 3.1
On the other hand, if you expected SASL to work, please fix your JAAS configuration. Professional Hadoop Solutions
So it overlaps mainly with the JAAS Java API, although they can be used together, as you will see later in the book. Pro Spring Security
Chapter 15 This chapter discusses how the Java Authentication and Authorization Service (JAAS) enables applications to authenticate users and grant them particular permissions based on their login ID or other credentials. Java Security 2nd Edition
Specifies the name of a Java Authentication and Authorization Service (JAAS) login configuration entry to use when the JMX agent authenticates users. Oracle JRockit The Definitive Guide
The string parameter "Login1" in the LoginContext constructor refers to an entry with the same name in the JAAS configuration file. Core Java Volume II Advanced Features 9th Edition
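A sketch of that flow with the JAAS classes from javax.security.auth.login (the configuration entry name mirrors the quote; the module shown in the comment is one possibility, not a requirement):

```java
import javax.security.auth.login.LoginContext;

public class JaasLogin {
    public static void main(String[] args) throws Exception {
        // "Login1" must name an entry in the JAAS configuration file, e.g.:
        //   Login1 { com.sun.security.auth.module.Krb5LoginModule required; };
        LoginContext context = new LoginContext("Login1");
        context.login();                          // runs the configured login modules
        System.out.println(context.getSubject()); // authenticated principals
        context.logout();
    }
}
```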
With the addition of the JAAS API to the core Java platform in Java 1.4, policytool allows the specification of a Principal to whom a set of permissions is granted. Java in a Nutshell 5th Edition
ActiveMQ also supports its own simple style of authentication and authorization using properties files as well as standard JAAS login modules. ActiveMQ in Action
This durability is provided by a combination of a Write Ahead Log (WAL) and one or more file storage directories. Apache Flume Distributed Log Collection for Hadoop
So, HBase can guarantee that when a region server dies, data can be recovered and replayed on other region servers using its Write-Ahead Log (WAL) HBase Administration Cookbook
It also means that if writing the record to the WAL fails, the whole operation must be considered a failure. Professional Hadoop Solutions
When a write is made, by default, it goes into two places: the write-ahead log (WAL), also referred to as the HLog, and the MemStore (figure 2.1) HBase in Action
When data is updated it is first written to a commit log, called a write-ahead log (WAL) in HBase, and then stored in the in-memory memstore. HBase:The Definitive Guide
A common practice for high-throughput data stores, use of a write ahead log (WAL) such as the edits file reduces I/O operations to sequential, append-only operations (in the context of the namenode, since it serves directly from RAM), which avoids costly seek operations and yields better overall performance. Hadoop Operations
But in case you have to know, it initializes to null several other internal elements of the mmodule structure and fills in a little bit of static information. Monitoring with Ganglia
At the start of a simulation, the lightweight simulator initializes cluster state using task-size data extracted from the. Omega - flexible, scalable schedulers for large compute clusters
More subtly, because the JobTracker needs to allocate tracking structures for every job it initializes, its admission control logic includes safeguards to protect its own availability; it may delay allocating fallow cluster resources to jobs because the overhead of tracking them could overwhelm the JobTracker process. Apache Hadoop YARN Yet Another Resource Negotiator
For each input record, Sawzall initializes the pre-defined variable input to the uninterpreted byte string of the input record. Interpreting the Data Parallel Analysis with Sawzall
The only feature that sets them apart from regular POJOs4 (or POGOs) is that they must have a property called name and a constructor that accepts a String and initializes the name property with it. Gradle Beyond The Basics
When we run the code, it initializes the K-means MapReduce job and executes it using the MapReduce framework. Hadoop MapReduce Cookbook
The first time we run this request, the upsert value is indexed as a new document, which initializes the views field to 1 Elasticsearch The Definitive Guide
When a task initializes, it calls configure(), which you override to retrieve and store your properties. Hadoop in Action
This tends to be a side effect of how the jobtracker initializes jobs submitted to the cluster and how it handles memory. Hadoop Operations
This simple class starts a minimal ActiveMQ broker configuration and initializes the Spring application context to start the JMS clients. ActiveMQ in Action
In order to extract one or more fields, you start by specifying the regex property with group matching parentheses. Apache Flume Distributed Log Collection for Hadoop
If you’re interested in something specific, you can specify a system name, or a regex or type-glob to combine various metrics from various hosts to create a custom graph of exactly what you want to see. Monitoring with Ganglia
ElasticSearch allows us to use two types of field data loading filtering: by term frequency, by using regex, or by combining both of them. Mastering ElasticSearch
We use the setup function to retrieve the map regex from the job configuration. MapReduce Design Patterns
The mapper class defines a statically initialized position to read the country from each line, as well as the regex pattern to split each line. Hadoop RealWorld Solutions Cookbook
To rename files using regular expressions, we can simply provide a source regex and a destination filename. Gradle Beyond The Basics
Keep in mind that this is a regex over the un-interpreted bytes in the rowkey. HBase in Action
Solution Use the RegexSerDe bundled with Hive and define a regular expression that can be used to parse the contents of Apache log files. Hadoop in Practice
RegexMapper A mapper that finds matches of a regular expression in the input value and emits the matches along with a count of one. Hadoop:The Definitive Guide 3rd Edition
Returns whether the string s matches the Java regular expression regex. Hadoop in Action
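On the Java side, the distinction worth remembering is between whole-string and substring matching, sketched here: String.matches() anchors the pattern to the entire input.

```java
import java.util.regex.Pattern;

public class RegexMatch {
    public static void main(String[] args) {
        // matches() succeeds only if the pattern covers the whole string...
        System.out.println("hadoop-1.0.3".matches("hadoop-\\d+\\.\\d+\\.\\d+")); // true
        System.out.println("hadoop-1.0.3".matches("\\d+\\.\\d+"));               // false
        // ...while Matcher.find() succeeds on any matching substring.
        System.out.println(Pattern.compile("\\d+\\.\\d+")
                .matcher("hadoop-1.0.3").find());                                // true
    }
}
```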
However, Spring can also integrate with other popular web frameworks such as Struts, JSF, Google Web Toolkit (GWT), and so on. Pro Spring 3
Google’s GWT compiler can also output source maps, and work is being done for the CoffeeScript compiler as well, so as far as you can tell you are debugging your original native code while the browser is actually running the compiled version. Testable JavaScript
Google Web Toolkit (GWT) integration There are a number of ways to set up a GWT-based application to communicate with the server. Spring Security 3.1
We’ll show you how to use servlet and JSP technologies with Spring DM, and how to use web frameworks like Spring MVC, JSF, and GWT to implement real web applications. Spring Dynamic Modules in Action
Using a similar approach, we can also build support for other widgets in jQuery UI or other UI frameworks such as Yahoo UI, Doojo, and GWT. Selenium Testing Tools Cookbook
You can use this technique when doing rapid prototyping; perhaps somebody has created a set of domain object models, and you want to try using them from various front ends such as Spring MVC, JSF, or GWT. Spring Roo in Action
If you are working with GWT and would like to take advantage of Guava in your code, there is also a GWT compatible version that can be downloaded by clicking on the guava-gwt-14.jar link on the same page. Getting Started with Google Guava
IoC is a technique that externalizes the creation and management of component dependencies. Pro Spring 3
To be able to do the things mentioned here, we will explore the Inversion of Control (IoC) container; the application context. Pro Spring MVC
Spring’s Core Container provides the inversion of control (IoC) and DI capabilities on which all the other modules are built. Spring in Practice
Dependency Injection The basic idea of DI, a type of Inversion of Control (IoC), is simply that instead of having an object instantiate its needed dependencies, the dependencies are somehow given to the object. Pro Spring Security
Dependency Injection in Scala: The Cake Pattern Dependency injection (DI), a form of inversion of control (IoC), is a powerful technique for resolving dependencies between “components” in larger applications. Programming Scala
Spring application context management in OSGi environments is an example of the inversion of control (IoC) pattern: Spring DM–powered bundles wait for the extender to inspect them and create their application context; no application code is needed for this task. Spring Dynamic Modules in Action
The theme of inversion of control (IoC) is central to this investigation because it’s a significant part of the Spring philosophy.3 For the purpose of this discussion, we consider IoC in broad terms. Spring Integration in Action
Using the Spring IoC container, these may be configured as normal Spring beans and injected into GemFire components. Spring Data Modern Data Access for Enterprise Java
Part 3 completes the content with methods to help you scale Redis using memory-saving techniques, horizontal sharding, and Lua scripting. Redis in Action
Redis 2.6 includes a built-in Lua interpreter which developers can leverage to write more advanced queries to be. The Little Redis Book
Integration of Lua and R has been proposed, and work is in progress. Tenzing - A SQL Implementation On The MapReduce Framework
This has been achieved by integrating the Redis server with the Lua C API. Redis Cookbook
The framework for coprocessors is very flexible, and allows implementing two basic coprocessor types: Professional Hadoop Solutions
Writing an observer coprocessor is tricky to get right, and such a coprocessor can be extremely difficult to debug when running at scale. HBase in Action
With the coprocessor feature in HBase, you can even move part of the computation to where the data lives. HBase:The Definitive Guide
HBase security is implemented as a coprocessor—a plug-in that, in many ways, works like a database trigger, firing authorization checking code before each operation is performed. Hadoop Operations
How can we fulfill the need for computational speed? Moore’s law takes us some of the way, but multicore is the real future. Programming Concurrency on the JVM
Stepping back from extremes of multicore CPU deployments, Ganglia has also proven itself to be truly adaptable and versatile in the face of dramatic change. Monitoring with Ganglia
Now, with Moore’s Law running out of steam (at least for conventional chips), speed improvements are appearing in the form of multicore processors rather than faster chips. Thinking in Java 4th Edition
It is effectively a single point of contention for the whole server, and it can cause severe stalls on multicore servers. High Performance MySQL 3rd Edition
Given that today’s computers have blazingly fast multicore processors, you should use their computing powers to their fullest. Gradle in Action
There’s also a hardware influence: commodity CPUs have become multicore—the processor in your laptop or desktop machine probably has four or more CPU cores within it. Java 8 in Action Lambdas, Streams and Functional-style Programming
For production use, it is typical that you use multicore processors. Quad-core processors are state of the art and affordable, while hexa-core processors are also becoming more popular. HBase:The Definitive Guide
It is inherent in much of what we do, and a requirement if you are to obtain good performance from multicore processors, which are now commonplace. Effective Java 2nd Edition
Being able to make efficient use of CPU cycles can be challenging for multithreaded applications running on multiprocessor and multicore. Java Performance
By making concurrent programs, you take advantage of today’s multicore CPUs. Java 7 Recipes A Problem-Solution Approach
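The common Java idiom for putting those cores to work is sizing a thread pool from Runtime.availableProcessors(); a minimal sketch (the summed range is arbitrary):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class CoreCountSketch {
        public static void main(String[] args) throws Exception {
            final int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores); // one worker per core
            List<Future<Long>> parts = new ArrayList<>();
            for (int i = 0; i < cores; i++) {
                final int offset = i;
                parts.add(pool.submit(() -> {
                    long sum = 0; // each task sums an interleaved slice of the range
                    for (long n = offset; n < 10_000_000L; n += cores) sum += n;
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get(); // combine partial sums
            pool.shutdown();
            System.out.println(total);
        }
    }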
During that incubation year, work had already begun to refactor Flume under the Star Trek-themed tag, Flume-NG (Flume the Next Generation) Apache Flume Distributed Log Collection for Hadoop
We refactor the TimerTest example and extract a TalkingClock class. Core Java Volume I Fundamentals 9th Edition
It’s easy to refactor the existing project structure into the identified modules. Gradle in Action
At Stage 4 of the cycle (where MapReduce job should be invoked from an Oozie action), it is possible to either use an existing driver as is by using a java action, or refactor it for use as an Oozie map-reduce action. Professional Hadoop Solutions
As the plug-in takes shape, we will slowly refactor it into a distributable plug-in project with a lifecycle of its own. Gradle Beyond The Basics
We can refactor the MapReduce program to track the count explicitly. Hadoop in Action
Once you do this, you can refactor the method to give it a single return: Effective Java 2nd Edition
Lastly, we revisited the portfolio example from chapter 3 in order to refactor it to use Spring JMS. ActiveMQ in Action
When locking presents liveness or performance problems for a given class or program, usually the best solution is to refactor the design to use one or more of the other approaches presented in this chapter. Concurrent Programming in Java Design Principles and Pattern
While you are always free to solve problems in a locally optimized way by writing Groovy code inside your build, the most maintainable builds will eventually refactor this code into organized plug-ins that expose new build language idioms to the build files that use them. Building and Testing with Gradle
There is also a Java Runtime Environment (JRE) that contains the virtual machine but not the compiler. Core Java Volume I Fundamentals 9th Edition
OSNaming JNDI service provider is a part of the Java SE 7 SDK and JRE. Java EE 7 Specification
HBase needs the Java Runtime Environment (JRE) to be installed and available on the system. HBase in Action
JAVA_HOME must be set to point to a JDK, not to a JRE. Hadoop MapReduce Cookbook
All of the following assumes you have the Java Runtime Environment (JRE) installed. HBase:The Definitive Guide
You can run into grief if you drop a JAR file into the jre/lib/ext directory and one of its classes needs to load a class that is not a system or extension class. Core Java Volume II Advanced Features 9th Edition
The deployJava.js is part of the Java Deployment Toolkit and is used to detect the presence of a JRE, install one if necessary, and then run an applet. Java 7 New Features Cookbook
Ivy requires Ant, and building from source requires the complete JDK, version 1.6.0_20 or better, not just the JRE. Cassandra The Definitive Guide
Although the JRE (Java Runtime Environment) is all you need to run Hive, you will need the full JDK (Java Development Kit) to build examples in this book that demonstrate how to extend Hive with Java code. Programming Hive
The Java platform consists of a hierarchy of components, starting with the Java Development Kit (JDK), which is composed of the Java Runtime Environment (JRE), the Java programming language, and platform tools that are necessary to develop and run Java applications. Java 7 Recipes A Problem-Solution Approach
You do this by instantiating implementations of AppleFormatter and giving them as arguments to prettyPrintApple: Java 8 in Action Lambdas, Streams and Functional-style Programming
To do that, you must first connect to the Zookeeper instance keeping track of Accumulo by instantiating a ZookeeperInstance class, which will return a connector. Professional Hadoop Solutions
Using an HTablePool is more common in practice than instantiating HTables directly: HBase in Action
Instantiating a Configuration object and obtaining a FileSystem handle within a Hadoop environment will point it to the HDFS NameNode of that environment. Hadoop MapReduce Cookbook
There is a cost to instantiating an HTable, so if you do this for each insert, you may have a negative impact on performance, hence our setup of HTable in the configure() step. Hadoop:The Definitive Guide 3rd Edition
You must provide a row key when instantiating an Increment, which sets the row containing all the counters that the subsequent call to increment() should modify. HBase:The Definitive Guide
Another difference is that Pig programs are written in Pig Latin, whereas Cascading works more like a Java framework in which you create a data processing flow through instantiating various Java classes (Each, Every, etc.) Hadoop in Action
For example, the following uses a raw type when instantiating the identifier, list: Java 7 New Features Cookbook
We introduce them here, however, because they are the next higher level of syntax after methods, and because the rest of this chapter requires a basic familiarity with the concept of class and the basic syntax for defining a class, instantiating it, and using the resulting object. Java in a Nutshell 5th Edition
The method with this signature can be invoked without instantiating its containing class because it is marked as static. Java 7 Recipes A Problem-Solution Approach
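The Hadoop-flavored quotes above share one shape: instantiate a Configuration, then ask for a handle. A minimal sketch; inside a configured Hadoop environment the handle points at that cluster's NameNode, and the path checked is just an example.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FsHandleSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration(); // picks up core-site.xml from the classpath
            FileSystem fs = FileSystem.get(conf);     // bound to whatever fs.defaultFS names
            System.out.println(fs.exists(new Path("/user"))); // sample path, for illustration
        }
    }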
For example, in a 32-bit Intel-compatible Linux host, the default is usually at /usr/lib/ganglia. Monitoring with Ganglia
The argument -reducer "/usr/bin/perl -w wordCount.pl" causes the Perl program wordCount.pl to be used to perform the reduce. Pro Hadoop
Be sure to read through the Debian-specific system documentation that can be found under /usr/share/couchdb. CouchDB The Definitive Guide
Suppose we have both OpenJDK and Oracle Java installed in /usr/openjdk and /usr/jdk respectively. Hadoop Operations and Cluster Management Cookbook
One variant uses a shared /lib for moderate sharing, while the other shares /usr/include, which is very heavily read. Ceph A Scalable, High-Performance Distributed File System
In this example, we connected to the MySQL server using hdp_usr. Hadoop RealWorld Solutions Cookbook
This will create a new directory called otp_src_R13B01 under /usr/local/erlang. Beginning CouchDB
Next you’ll need to untar the tarball that the build created under /usr/local: Hadoop in Practice
Python script to filter out poor-quality weather records #!/usr/bin/env python. Hadoop:The Definitive Guide 3rd Edition
This directory can be made to be read only when configured correctly and usually lives in /usr/local, /opt, or /usr when Hadoop is installed via packages. Hadoop Operations
He has more than 7 years of experience in software and middleware (Apache, Tomcat, PostgreSQL, Hadoop eco system) engineering. HBase Administration Cookbook
The command will automatically install the PostgreSQL database, which is required by Ambari. Hadoop Operations and Cluster Management Cookbook
Using HDFS in a Greenplum external table Greenplum is a parallel database that distributes data and queries to one or more PostgreSQL instances. Hadoop RealWorld Solutions Cookbook
You could either use one of the free databases, such as MySQL or PostgreSQL, or a pure key/value store like Berkeley DB. HBase in Action
He has more than seven years' experience in software and middleware (Apache, Tomcat, PostgreSQL, and Hadoop eco system) engineering. Hadoop MapReduce Cookbook
What are the disadvantages of fast connectors? First, only MySQL and PostgreSQL are currently supported. Hadoop in Practice
For a majority of small- to medium-volume applications, there is no substitute for the ease of use, flexibility, maturity, and powerful feature set of available open source RDBMS solutions such as MySQL and PostgreSQL. Hadoop:The Definitive Guide 3rd Edition
Today, Sqoop includes native connectors (called direct support) for MySQL and PostgreSQL. Hadoop Operations
The first iteration of the data warehouse was built using an open source system with a Java SQL optimizer and a set of underlying PostgreSQL databases. Programming Hive
Hence, we use PostgreSQL as the database layer and Hadoop as the communication layer, Hive as the translation layer, and all code we add we release as open source [2] HadoopDB An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
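Connecting to PostgreSQL from Java is a one-liner with JDBC; the host, database, and credentials below are placeholders, and the PostgreSQL driver JAR must be on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PgSketch {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:postgresql://localhost:5432/metastore"; // placeholder database
            try (Connection conn = DriverManager.getConnection(url, "hive", "secret");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT version()")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // server version string
                }
            }
        }
    }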
Programming with mutability in pure Java is like working with the mother-in-law who’s just waiting for you to fail. Programming Concurrency on the JVM
Another aspect of functional programming is that it doesn’t have side effects or mutability. Scala in Action
The old java.util.Date class and all other classes used to model date and time in Java before Java 8 have many inconsistencies and design flaws, including their mutability and some poorly chosen offsets, defaults, and naming. Java 8 in Action Lambdas, Streams and Functional-style Programming
For sets and maps, Scala models mutability in the class hierarchy. Programming in Scala 2nd Edition
These patterns focus on writing code that avoids mutability and favors a declarative style, which helps us write simpler, more maintainable code. Functional Programming Patterns in Scala and Clojure
If you remove mutability, then the problems essentially go away. Programming Scala
Given that mutability leads to unexpected defects, minimizing mutability in code minimizes mutability-related defects. Beginning Scala
If a class cannot be made immutable, limit its mutability as much as possible. Effective Java 2nd Edition
Its mutability makes it suitable for processing text in place, which is not possible with the immutable String class. Java in a Nutshell 5th Edition
Criteria Objects and Mutability Typical usage of the criteria API will result in many different objects being created. Pro JPA2
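The advice to minimize mutability boils down to a handful of mechanics in Java: a final class, final fields, no setters, and "mutators" that return new instances. A minimal sketch:

    public final class Point {          // final class: no subclass can add mutability
        private final int x;            // final fields: state is fixed at construction
        private final int y;

        public Point(int x, int y) { this.x = x; this.y = y; }

        public int x() { return x; }
        public int y() { return y; }

        // Instead of mutating, return a new value; the old Point stays valid forever.
        public Point translate(int dx, int dy) { return new Point(x + dx, y + dy); }
    }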
Thymeleaf: It is a modern templating framework that provides an excellent alternative to JSPs. Spring Security 3.1
Chapter 6 will explore various view technologies that can be used with Spring, including JSP, Apache Tiles, and Thymeleaf. Spring in Action 4th Edition
The Google File System (GFS [14]) is one such distributed storage layer widely used in the company. Dremel - Interactive Analysis of WebScale Datasets
A GFS cluster consists of a single master and multiple chunkservers. GFS - The Google File System
We conserve network bandwidth by taking advantage of the fact that the input data (managed by GFS [8]) is stored on the local disks of the machines that make up our cluster. MapReduce - Simplied Data Processing on Large Clusters
The Google File System (GFS)[9] is another distributed file system built for hosting the state of Google’s internal applications. Cassandra A Decentralized Structured Storage System
The data sets are often stored in GFS, the Google File System [9] Interpreting the Data Parallel Analysis with Sawzall
Bigtable in turn uses both Chubby[7] (a distributed lock system) and GFS. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
Another storage data structure that is often required at scale is a FIFO queue, like that provided by GFS [7] RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
In addition, both GFS and Bigtable use Chubby as a well-known and available location to store a small amount of meta-data; in effect they use Chubby as the root of their distributed data structures. Chubby - The lock service for loosely-coupled distributed systems
Bigtable runs on top of the Google File System (GFS) Cassandra The Definitive Guide
Bigtable uses the distributed Google File System (GFS) [17] to store log and data files. Bigtable - A Distributed Storage System for Structured Data
You can update or remove tag cloud, autocomplete, country code, and calendar widgets from reuters.js. Web Crawling and Data Mining with Apache Nutch
Additionally, a suggester with an autocomplete feature starts providing users with options as they type the search query, enhancing the overall experience. Scaling Big Data with Hadoop and Solr Chapter-02 Understanding Solr
Figure 4-9 shows how gweb search autocomplete allows you to find metrics across your entire deployment. Monitoring with Ganglia
We’ll begin by building autocomplete functions to quickly find users in short and long lists of items. Redis in Action
But the book doesn't only pay attention to low-level aspects of ElasticSearch; it includes tips for improving the user search experience, such as dealing with spelling mistakes, building a highly effective autocomplete feature, and a tutorial on how you can deal with query-related improvements. Mastering ElasticSearch
The name-to-UID rows enable support for autocomplete of tag names. HBase in Action
You can customize the completion tokens by creating a file named autocomplete and placing it on Pig’s classpath (such as in the conf directory in Pig’s install directory) or in the directory you invoked Grunt from. Hadoop:The Definitive Guide 3rd Edition
We can do this by using the autocomplete analyzer at index time and the standard analyzer at search time. Elasticsearch The Definitive Guide
If you start typing and hit the Tab key, the CLI will autocomplete possible keywords and function names. Programming Hive
Appendix, Real Life Situations, describes how to handle real-life situations such as implementing different autocomplete functionalities, using near real-time search, or improving query relevance. Apache Solr 4 Cookbook
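Stripped of the search-engine machinery above, autocomplete is a prefix range query over sorted terms. A minimal in-memory sketch with a TreeSet (the term list is invented):

    import java.util.Collections;
    import java.util.NavigableSet;
    import java.util.TreeSet;

    public class AutocompleteSketch {
        public static void main(String[] args) {
            NavigableSet<String> terms = new TreeSet<>();
            Collections.addAll(terms, "paxos", "pig", "postgresql", "puppet", "redis");
            String prefix = "p";
            // Everything starting with the prefix is a contiguous range in sorted order.
            for (String suggestion : terms.subSet(prefix, true, prefix + Character.MAX_VALUE, false)) {
                System.out.println(suggestion); // paxos, pig, postgresql, puppet
            }
        }
    }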
A novel extension to Paxos, detailed in Section 4.4.1, allows local reads at any up-to-date replica. Megastore - Providing Scalable, Highly Available Storage for Interactive Services
CouchDB differs from others by accepting eventual consistency, as opposed to putting absolute consistency ahead of raw availability, like RDBMS or Paxos. CouchDB The Definitive Guide
At the highest level of abstraction, it is a database that shards data across many sets of Paxos [21] state machines in datacenters spread all over the world. Spanner - Google’s Globally-Distributed Database
I suggest that you should look at Paxos and Cassandra's use of that algorithm to effect conditional updates at the following URLs: Storm Blueprints Patterns for Distributed Real-time Computation
Improved primitives: distributed cross-group transactions; synchronous replication groups (Paxos); automatic failover of client requests. Storage Architecture and Challenges
The cluster, which is based in part on the Paxos part-time parliament algorithm [14], is designed to favor consistency and durability over availability and update latency. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
One might argue that we should have built a library embodying Paxos, rather than a library that accesses a centralized lock service, even a highly reliable one. Chubby - The lock service for loosely-coupled distributed systems
Traditional protocols to implement replicated state machines, like Paxos [2], do not enable such a feature directly, however. Zab High-performance broadcast for primary-backup systems
Because this clause produces mutual exclusion, a common name for such a mechanism is mutex. Thinking in Java 4th Edition
A mutex is a data structure that guarantees mutually exclusive access to some resource, preventing concurrent access from causing problems. High Performance MySQL 3rd Edition
The volatile keyword accomplishes this: it forces reads and writes to the variable to go all the way out to main memory (where they’re visible to all threads), but there’s no mutex. Spring in Practice
A mutex lock is also known as a mutually exclusive lock. Java Threads 3rd Edition
Notice in the preceding code that we used a mutex to sequence the updates to scores, thereby eliminating the race condition mentioned earlier. Storm Blueprints Patterns for Distributed Real-time Computation
The Monitor class functions as a Mutex, ensuring serial access to the defined areas in our code, much like the synchronized keyword but with much easier semantics and some useful additional features. Getting Started with Google Guava
This allowed for operating system tools such as Solaris mpstat to easily monitor a Java application for lock contention by observing the “smtx” (spin on mutex) column along with observing system or kernel CPU utilization. Java Performance
A Mutex (short for mutual exclusion lock) class can be defined as (omitting implementation code): Concurrent Programming in Java Design Principles and Pattern
Most global variables have an associated mutex that other threads will lock before accessing it. Understanding MySQL Internals
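In Java the mutex idea appears either as the synchronized keyword or, more explicitly, as ReentrantLock; a sketch of the lock/try/finally shape the quotes above allude to (field names are invented):

    import java.util.concurrent.locks.ReentrantLock;

    public class MutexSketch {
        private final ReentrantLock mutex = new ReentrantLock();
        private int score; // shared state the mutex protects

        public void addPoints(int points) {
            mutex.lock();        // at most one thread gets past this line at a time
            try {
                score += points; // the read-modify-write race, now serialized
            } finally {
                mutex.unlock();  // always release, even if the body throws
            }
        }
    }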
The methods in turn use JNI to actually interact with an embedded Java Virtual Machine (JVM) which actually interacts with HDFS. Pro Hadoop
The only way to write a true ping application in Java would be to use the Java Native Interface (JNI) to access native code; such a discussion is beyond the scope of this book. Java Network Programming and Distributed Computing
For example, a Java version of MapReduce is implemented as a JNI veneer on top of the C++ version of MapReduce. FlumeJava - Easy, Efficient Data-Parallel Pipelines
Using JNI: JNI is a programming framework that allows Java code running in a JVM to call and to be called by native applications. Professional Hadoop Solutions
You also have to include the JNI header files of your Java installation. Hadoop MapReduce Cookbook
It works using the Java Native Interface (JNI) to call a Java filesystem client. Hadoop:The Definitive Guide 3rd Edition
Refer to the source code of the JNI library to find out what is available to you. HBase:The Definitive Guide
The usual Java mechanism for accessing non-native libraries is JNI [15], but it is regarded as slow and cumbersome. Chubby - The lock service for loosely-coupled distributed systems
The Java platform has an API for interoperating with native C code called the Java Native Interface (JNI) Core Java Volume II Advanced Features 9th Edition
In Java 1.1, the -jni option specified that javah should generate new-style files. Java in a Nutshell 5th Edition
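The Java half of a JNI binding is small; the matching C function (named Java_PingSketch_nativePing by JNI convention) lives in a native library. The library name and method here are hypothetical.

    public class PingSketch {
        static {
            System.loadLibrary("ping"); // resolves libping.so / ping.dll via java.library.path
        }

        // No body in Java: the implementation is supplied by the native library.
        private static native boolean nativePing(String host, int timeoutMillis);

        public static void main(String[] args) {
            System.out.println(nativePing("example.com", 1000));
        }
    }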
Further, it is possible to mix and match the deaf/mute and default topologies to create a system architecture that better suits your environment. Monitoring with Ganglia
Preventing infinite loops is important for some of the more useful replication topologies, such as master-master replication.5 High Performance MySQL 3rd Edition
As shown in Figure 9-19, Storm applications (or topologies) are defined in the form of input streams (called spouts) and a connected graph of processing nodes (called bolts) Professional Hadoop Solutions
Next in the Storm cluster, jobs are typically referred to as topologies; the only difference is that these topologies run forever. Apache Kafka
A table lookup may make more sense for more complex cluster topologies. Hadoop in Action
Network Usage in Hadoop: A Review. Hadoop was developed to exist and thrive in real-world network topologies. Hadoop Operations
This is known as a network of brokers and can support many different topologies. ActiveMQ in Action
In this chapter, we will cover the following replication topologies that MongoDB supports: The Definitive Guide to MongoDB
Approaches that assume particular architectures, topologies, processor capabilities, or other fixed environmental constraints are ill suited to commonly available JVM implementations. Concurrent Programming in Java Design Principles and Pattern
Replica sets provide other enhancements too, such as easier recovery and more sophisticated deployment topologies. MongoDB in Action
With replicas spread over a wide area, communication latencies limit overall throughput. Megastore - Providing Scalable, Highly Available Storage for Interactive Services
Cluster allocation latency was so high, users would often share long-awaited clusters with colleagues, holding on to nodes for longer than anticipated, raising latencies still further. Apache Hadoop YARN Yet Another Resource Negotiator
Our implementation of Paxos is pipelined, so as to improve Spanner’s throughput in the presence of WAN latencies; but writes are applied by Paxos in order (a fact on which we will depend in Section 4) Spanner - Google’s Globally-Distributed Database
Using nodetool cfhistograms to see the distribution of query latencies. Cassandra High Performance Cookbook
Larger clusters have imperfect load distributions, resulting in lower average per-MDS throughput (but, of course, much higher total throughput) and slightly higher latencies. Ceph A Scalable, High-Performance Distributed File System
Since users are served from data centers that are geographically distributed, being able to replicate data across data centers was key to keep search latencies down. Cassandra A Decentralized Structured Storage System
So, in this example, we can see that there’s a roughly log normal distribution of latencies for the selected execution pattern. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
After launching to employees, we noticed that our put and sync latencies were very high. Apache Hadoop Goes Realtime at Facebook
This allows us to track down all clusters, discover how big they are, see which versions of our software they are running, how much traffic they are receiving, and whether or not there are any problems such as unexpectedly large latencies. Bigtable - A Distributed Storage System for Structured Data
Here and in all further experiments we do not show the corresponding latencies, since they do not change significantly (the main goal is to reduce the memory footprint) PowerDrill - Processing a Trillion Cells per Mouse Click
This registration is performed by calling an addXXXListener() method in the event-firing component, in which "XXX" represents the type of event listened for. Thinking in Java 4th Edition
Note that lack of _source on the list of returned fields causes the sourceXXX() methods not to work (as discussed in the next point) Mastering ElasticSearch
The super column version of these commands uses an extra ['xxx'] to represent the extra level of the map called the sub-column. Cassandra High Performance Cookbook
A signature file (XXX.SF, where XXX is the name of the entity that signed the archive) that contains signature information. Java Security 2nd Edition
The resource method parameters, fields, or bean properties are bound by way of xxxParam annotations added during object creation time. Java EE 7 Essentials
Xxx getXxx(int columnNumber) and Xxx getXxx(String columnLabel), where Xxx is a type such as int, double, String, Date, etc. Core Java Volume II Advanced Features 9th Edition
To single out one record, look it up with "xxx", where xxx is one of the records displayed above. Java Network Programming 4th Edition
If you printed out the class name of the Connection object or other objects that you are using, you would probably see something like XXX Connection , XXX Statement , XXX ConnectionImpl , or XXX StatementImpl , where XXX is the name of the database you are using (Oracle, for example) Java Performance Tuning 2nd Edition
Lucene has a sandbox component, spellchecker, covered in section 8.XXX, that you can use. Lucene in Action 2nd Edition
In addition, you can call one of the ResultSet object’s updateXXX() methods to update any value within the ResultSet. Java 7 Recipes A Problem-Solution Approach
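A concrete instance of the getXxx pattern, using an in-memory H2 database purely so the snippet is self-contained (H2 is an assumption, chosen only to avoid external setup):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class GetXxxSketch {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT 42 AS answer, 'deep thought' AS source")) {
                while (rs.next()) {
                    int answer = rs.getInt("answer");  // getXxx(String columnLabel)
                    String source = rs.getString(2);   // getXxx(int columnNumber)
                    System.out.println(source + ": " + answer);
                }
            }
        }
    }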
The reason for this is the log4j configuration file would be included on the classpath. Apache Flume Distributed Log Collection for Hadoop
A dependency is generally an external, reusable library in the form of a JAR file (for example, Log4J for logging support) Gradle in Action
The following table shows all the logging levels provided by Log4j: Hadoop Operations and Cluster Management Cookbook
Forwarding Log4j logs to a central server: The faster a problem can be diagnosed and corrected, the better. Cassandra High Performance Cookbook
Within each of the directories in the previous code there are three files, corresponding to standard out, standard error, and the system log (output from both the infrastructure task code, as well as any of your own log4j logging): Hadoop in Practice
Flume supports a large variety of sources; some of the more commonly used ones include tail (which pipes data from a local file being written to into Flume, just like Unix tail), syslog, and Apache log4j (allowing Java applications to write events to files in HDFS via Flume) Hadoop:The Definitive Guide 3rd Edition
There are three basic functions: the storage-conf.xml file allows you to create your data store by configuring your keyspace and column families; there are files related to setting up authentication; and finally, the log4j properties let you change the logging levels to suit your needs. Cassandra The Definitive Guide
More precisely, this is the default behavior as specified within the log4j property file within our conf directory. Hadoop Beginners Guide
It supports native connectivity and support for writing directly to HDFS, and simplifies reliable, streaming data delivery from a variety of sources including RPC services, log4j appenders, syslog, and even the output from OS commands. Hadoop Operations
By adding the custom Log4J RepositorySelector to the application, you’ll avoid this error. ActiveMQ in Action
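Typical Log4j 1.x usage from application code; which of these lines reaches a file or console is decided entirely by the log4j configuration on the classpath, as several of the quotes above note.

    import org.apache.log4j.Logger;

    public class LogSketch {
        private static final Logger LOG = Logger.getLogger(LogSketch.class);

        public static void main(String[] args) {
            LOG.debug("dropped unless the configured level is DEBUG or lower");
            LOG.info("starting up");
            LOG.error("something went wrong", new IllegalStateException("example"));
        }
    }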
The following example shows how to use the --hive-overwrite in conjunction with enabling LZOP compression. Hadoop in Practice
The LzopCodec is compatible with the lzop tool, which is essentially the LZO format with extra headers, and is the one you normally want. Hadoop:The Definitive Guide 3rd Edition
Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs) Ceph A Scalable, High-Performance Distributed File System
Emerging clustered storage architectures constructed from storage bricks or object storage devices (OSDs) seek to distribute low-level block allocation. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
In the Java space, you can choose from a wide range of open source and commercial solutions, such as Checkstyle, PMD, Cobertura, FindBugs, and Sonar. Gradle in Action
You will also find recipes on static code review through PMD and FindBugs. Jenkins Continuous Integration Cookbook
There are tools already available for Java and Groovy projects to analyze and check source code, such as Checkstyle, JDepend, PMD, FindBugs, CodeNarc, and Sonar. Gradle Effective Implementation Guide
Modern IDEs have great support for many code quality tools—Checkstyle, PMD, and FindBugs all have plugins for Eclipse, NetBeans, and IntelliJ, which provide rapid feedback for developers on code quality issues. Jenkins The Definitive Guide
Although JSLint does a better/more thorough job, it is relatively easy to add new rule sets to PMD for more custom checking in your environment. Testable JavaScript
Head on to PMD's home page for a more comprehensive list. Sonar Code Quality Testing Essentials
These tools, such as Checkstyle, PMD, or FindBugs have a set of predefined rules of good practices and parse the source code looking for violations of those rules. Java 7 Concurrency Cookbook
In the dependencies closure, we once again address the myPmd grouping and indicate that it should download the PMD JAR as needed. Building and Testing with Gradle
JavaMail supports MIME attachments, and makes it easy to compose and transmit a multipart message with a file attachment. Java Network Programming and Distributed Computing
In addition to checking the option to make a post request multipart, you will need to specify the absolute path of the file, in cases where the file you are uploading is not within JMeter's bin directory, or the relative path in cases where the file resides within JMeter's bin directory. Performance Testing With JMeter 2.9
Instead of integration with Apache Commons File Upload, we will discuss how we can use Spring MVC with the Servlet 3.0 container’s built-in multipart support for file upload. Pro Spring 3
Next, the incoming request is checked for whether it is a multipart HTTP request (this is used when doing file uploads) Pro Spring MVC
There are two flavors of a put operation — a single upload and a multipart upload. Professional Hadoop Solutions
The getParts method provides a Collection of parts for this multipart request. Java EE 7 Essentials
Link walking always returns a multipart/mixed, since a single key can contain any number of links, meaning any number of objects returned. A Little Riak Book
An XOP package is created by placing a serialization of the XML infoset inside of an extensible packaging format (such as MIME Multipart/Related) Java Performance
Solution: Creating a message that contains different parts (a multipart message) is what allows you to send attachments such as files and images. Java 7 Recipes A Problem-Solution Approach
In this chapter, you’ll learn how to customize Spring MVC configuration, handle multipart file uploads, deal with exceptions that may occur in a controller, and pass data between requests with flash attributes. Spring in Action 4th Edition
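A sketch of the Servlet 3.0 built-in multipart support mentioned above: @MultipartConfig turns the parsing on, and getParts() hands back each uploaded piece. The URL pattern and target directory are placeholders, and getSubmittedFileName requires Servlet 3.1.

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.annotation.MultipartConfig;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import javax.servlet.http.Part;

    @WebServlet("/upload")
    @MultipartConfig
    public class UploadServlet extends HttpServlet {
        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            for (Part part : req.getParts()) {                     // one Part per field or file
                part.write("/tmp/" + part.getSubmittedFileName()); // write the upload to disk
            }
            resp.setStatus(HttpServletResponse.SC_NO_CONTENT);
        }
    }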
Here, we print a stack trace by using the printStackTrace method of the Throwable class. Core Java Volume I Fundamentals 9th Edition
Throwable: In case the servers return an exception for the operation, it is returned to the client as-is. HBase:The Definitive Guide
In this code, the onComplete method is invoked when the request processing is finished, after a response is processed and is sent back to the client, or when an unmapped throwable has been propagated to the hosting I/O container. Java EE 7 Essentials
This exception chain is in addition to the “cause” chain of Throwable objects that every exception has. Core Java Volume II Advanced Features 9th Edition
In the event of a failure, we are getting the error message from the Throwable object. Getting Started with Google Guava
There is some confusion among programmers as to when it is appropriate to use each kind of throwable. Effective Java 2nd Edition
The first is the message to be associated with the AssertionError and the second is a Throwable cause. Java 7 New Features Cookbook
The Throwable class and all its subclasses include a String field that stores a human-readable error message that describes the exceptional condition. Java in a Nutshell 5th Edition
All exceptions inherit from Throwable, as shown in Figure 6-1 Java 7 Recipes A Problem-Solution Approach
NOTE: The classes Throwable, Exception, RuntimeException, and Error have constructors with an option for disabling suppressed exceptions and for disabling stack traces. Java SE 8 for the Really Impatient
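The cause chain those quotes refer to, end to end: wrap the low-level exception, then walk getCause() back down. printStackTrace renders the same chain as "Caused by:" sections.

    import java.io.IOException;

    public class CauseChainSketch {
        public static void main(String[] args) {
            try {
                try {
                    throw new IOException("disk unplugged");             // low-level failure
                } catch (IOException e) {
                    throw new RuntimeException("could not save doc", e); // e becomes the cause
                }
            } catch (RuntimeException e) {
                for (Throwable t = e; t != null; t = t.getCause()) {
                    System.out.println(t); // the wrapper first, then the IOException
                }
            }
        }
    }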
If you define a method inside a class, then it is, automatically, an inline method. Core Java Volume I Fundamentals 9th Edition
In this case, we’re using them to import a template and some metadata about the blog (as JSON data), as well as to include link and template rendering functions as inline code. CouchDB The Definitive Guide
The special symbol @ specifies that the configuration is pointing to a file rather than inline. Hadoop Operations and Cluster Management Cookbook
Script location must be set to File instead of Inline. Gradle Effective Implementation Guide
This recipe will outline a shorthand technique for inline table creation when the query is executed. Hadoop RealWorld Solutions Cookbook
The shell provides you with tab-completion of your commands and inline access to command documentation: HBase in Action
For short scripts, you can use the -e option to specify the commands inline, in which case the final semicolon is not required: Hadoop:The Definitive Guide 3rd Edition
The C++ version of the JNIEnv class has inline member functions that take care of the function pointer lookup for you. Core Java Volume II Advanced Features 9th Edition
Views are simply named SELECT statements which are expanded inline during compilation. Tenzing - A SQL Implementation On The MapReduce Framework
The result file is large and printing the complete results inline will kill too many trees. Programming Hive
However, Java’s syntax isn’t a natural fit for producing DSLs. Gradle in Action
How can you simplify MapReduce development using domain-specific languages (DSLs)? Professional Hadoop Solutions
And second, they support writing domain-specific languages (DSLs) within the language. Programming in Scala 2nd Edition
The Gradle APIs are rich, the possibilities for DSLs matching your domain are abundant, and the path towards finally having a build system that conforms to your product is clear. Gradle Beyond The Basics
On the other hand, we’ve got internal DSLs, also known as embedded languages. Functional Programming Patterns in Scala and Clojure
Domain-Specific Languages: Domain-Specific Languages, or DSLs, provide a convenient syntactical means for expressing goals in a given problem domain. Programming Scala
First off, developers can create domain-specific languages (DSLs) using their favorite language, such as Groovy, JRuby, Jython, Scala, and others (see the project site for examples) Hadoop:The Definitive Guide 3rd Edition
Cascading to Hive and Pig offers a Java API, as opposed to the domain-specific languages (DSLs) provided by the others. HBase:The Definitive Guide
Although we refer to the Query DSL, in reality there are two DSLs: the Query DSL and the Filter DSL. Elasticsearch The Definitive Guide
Choosing implicit conversions and designing domain-specific languages (DSLs) takes time, thought, and deliberation. Beginning Scala
The description starts with zero or more modifiers (shift, control, ctrl, meta, alt, altGraph) and ends with either the string typed, followed by a one-character string (for example, "typed a"), or an optional event specifier (pressed or released, with pressed being the default), followed by a key code. Core Java Volume I Fundamentals 9th Edition
If you started in a console, then hitting Ctrl-C and rerunning the same command you used to start it is the simplest way. CouchDB The Definitive Guide
Windows: Open the task manager with the keyboard shortcut Ctrl+Shift+Esc and click the Processes tab. Gradle in Action
To stop the Jetty web container, we press Ctrl + C at the command line, to return to our prompt. Gradle Effective Implementation Guide
Not to worry, just open another Terminal tab by hitting Shift+Ctrl+T. Beginning CouchDB
This lets you see what it's doing, and you can easily stop it with CTRL+C. Elasticsearch in Action MEAP V07 Chapter 1
To run the program, we specify the name of the file to write output to (pairs.avro) and send input pairs over standard in, marking the end of file by typing Ctrl-D: Hadoop:The Definitive Guide 3rd Edition
Stopping the server requires a Ctrl-C to abort the start script. HBase:The Definitive Guide
We type a word on each line and hit Enter, then kill the session with a Ctrl + C command. Hadoop Beginners Guide
You can press CTRL-C in the second terminal to shut down the Consumer at this point. ActiveMQ in Action
Abstract We present Resilient Distributed Datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner. Resilient Distributed Datasets - A Fault-Tolerant Abstraction for In-Memory Cluster Computing
To prevent conflicts from causing starvation, Omega schedulers typically choose to use incremental transactions, which accept all but the conflicting changes (i.e., the transaction provides atomicity but not independence) Omega - flexible, scalable schedulers for large compute clusters
This 5-tuple key provides atomicity, locality, uniqueness, access control, and versioning. Professional Hadoop Solutions
Future work involves adding compression, the ability to support atomicity across keys, and secondary index support. Cassandra A Decentralized Structured Storage System
The CouchDB layout and committing system adheres to the ACID properties (atomicity, consistency, isolation, and durability), which guarantee the reliable execution of database transactions. Beginning CouchDB
For those who don’t know (or don’t remember), ACID is an acronym standing for atomicity, consistency, isolation, and durability. HBase in Action
Designing the schema in a way to completely avoid explicit locking, combined with row-level atomicity, gives you the ability to scale your system without any notable effect on read or write performance. HBase:The Definitive Guide
It also provides row-level atomicity guarantees, but no native cross-row transactional support. Apache Hadoop Goes Realtime at Facebook
Record-Level Atomicity on Writes Cassandra automatically gives you record-level atomicity on every write operation. Cassandra The Definitive Guide
The performance gains and atomicity of the service co-located with its broker can be an attractive architectural change, too. ActiveMQ in Action
In other words, synchronized is not equivalent to atomic, but synchronization can be used to achieve atomicity. Concurrent Programming in Java Design Principles and Pattern
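That last distinction in code: both counters below give an atomic increment, one by excluding other threads with a lock, the other with a single atomic operation. The class and field names are invented.

    import java.util.concurrent.atomic.AtomicInteger;

    public class CounterSketch {
        private int locked;
        private final AtomicInteger lockFree = new AtomicInteger();

        public synchronized void incrementLocked() {
            locked++; // read-modify-write, made atomic by holding the monitor
        }

        public void incrementLockFree() {
            lockFree.incrementAndGet(); // atomic without blocking other threads
        }
    }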
Use instanceof to check before casting from a superclass to a subclass. Core Java Volume I Fundamentals 9th Edition
Even if an HttpURLConnection object is expected, it is good programming practice to test the class type using the instanceof keyword. Java Network Programming and Distributed Computing
Adding Scala-style pattern matching in full generality to Java seems quite a big job, but following the recent generalization to switch to allow Strings, you can imagine a more-modest syntax extension, which allows switch to operate on objects, using the instanceof syntax. Java 8 in Action Lambdas, Streams and Functional-style Programming
As a result, the typecast will generate a warning, and a corresponding instanceof test is illegal. Data Structures and Algorithm Analysis in Java 3rd Edition
You have two options available: you can either determine the type (using instanceof) in your exec method, or you can specify in your UDF type-specific alternative implementations, which is neater and removes ugly type-switching code. Hadoop in Practice
You can use the instanceof operator to find out whether a particular remote object implements an interface. Core Java Volume II Advanced Features 9th Edition
If o is a normal Point, do a color-blind comparison if (!(o instanceof ColorPoint)) Effective Java 2nd Edition
The instanceof operator requires an object or array value as its left operand and the name of a reference type as its right operand. Java in a Nutshell 5th Edition
In the application layer where Java programs reside, you simply don’t need to know this (and even if you do need to know, it’s quicker to check the size of the byte array returned by getAddress() than to use instanceof to test which subclass you have) Java Network Programming 4th Edition
Or you could focus on particular classes within a more general monitoring class by filtering interesting hierarchies using instanceof. Java Performance Tuning 2nd Edition
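The check-before-cast idiom the quotes recommend, in its plainest form:

    public class CastSketch {
        public static void main(String[] args) {
            Object[] values = { "forty-two", 42 };
            for (Object value : values) {
                if (value instanceof String) {   // narrow only after the test succeeds
                    String s = (String) value;
                    System.out.println(s.toUpperCase());
                } else {
                    System.out.println("not a string: " + value);
                }
            }
        }
    }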
The nodetool program communicates with the JMX port to confirm that the server is alive. Cassandra High Performance Cookbook
It also has scripts to run the nodetool, which is a utility for inspecting a cluster to determine whether it is properly configured, and to perform a variety of maintenance operations. Cassandra The Definitive Guide
Nowadays, the covariant return type feature lets you specify the correct return type. Core Java Volume I Fundamentals 9th Edition
Covariant return types: Java SE5 adds covariant return types, which means that an overridden method in a derived class can return a type derived from the type returned by the base-class method: Thinking in Java 4th Edition
The Maybe class is declared as covariant on type A. Scala in Action
Or, since it just has one type parameter, you could say simply that Queues are covariant. Programming in Scala 2nd Edition
In Java, as we mentioned in Section 1.4.4, arrays are covariant. Data Structures and Algorithm Analysis in Java 3rd Edition
This rule allows for covariant return types - refining the return type of a method when overriding it. The Java Language Specification Java SE 7 Edition
This is especially true for library writers, who will want to understand when to use parameterized types versus abstract types, which type parameters should be covariant, contravariant, or invariant under subtyping, and so forth. Programming Scala
We’ll discuss why this happened and the very complex topic of invariant, covariant, and contravariant types later in this chapter. Beginning Scala
Covariant returns to allow a subclass to override a superclass method and narrow the return type of the method. Java in a Nutshell 5th Edition
If List was immutable, as it is in a functional programming language, then the problem would disappear, and one could have a covariant list. Java SE 8 for the Really Impatient
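A compact Java illustration of covariant return types: the override narrows Shape to Circle, so callers of the subclass need no cast. The class names are invented.

    class Shape {
        Shape copy() { return new Shape(); }
    }

    class Circle extends Shape {
        @Override
        Circle copy() { return new Circle(); } // legal: Circle is a subtype of Shape
    }

    public class CovariantSketch {
        public static void main(String[] args) {
            Circle original = new Circle();
            Circle duplicate = original.copy(); // no cast needed thanks to covariance
            System.out.println(duplicate != original); // true: a fresh instance
        }
    }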
We will start two EC2 instances; one is a DNS/NTP server, and the other one is the client: HBase Administration Cookbook
Enable the NTP server with the following command: sudo service ntpd start. Hadoop Operations and Cluster Management Cookbook
Network Time Protocol (NTP) is a distributed hierarchical system used to keep system clocks in sync. Cassandra High Performance Cookbook
As we described earlier, this can be achieved using NTP. HBase in Action
We synchronized the two masters’ clocks using NTP and measured the mean time. Mesos A Platform for Fine-Grained Resource Sharing in the Data Center
Run NTP on your cluster, or an equivalent application, to synchronize the time on all servers. HBase:The Definitive Guide
One good strategy here is the use of a Network Time Protocol (NTP) server. Cassandra The Definitive Guide
This contrasts with the various NTP and Time Server protocols, which do not return data that can be easily read by humans. Java Network Programming 4th Edition
Clock synchronization can be achieved through the use of Network Time Protocol (NTP) or using the rdate utility available on Oracle Solaris and Linux systems. Java Performance
You need to ensure that each of your servers uses NTP (Network Time Protocol) to keep their clocks synchronized. MongoDB in Action
The recommended solution is the current beta project for a CouchDB installer. Getting Started with CouchDB
The Cloudera folks have Amazon machine images (AMIs), Debian and RPM installer files, and an online configuration tool to generate configuration files. Pro Hadoop
CouchDB intends to provide an official Windows installer at some point in the future, so this may change. CouchDB The Definitive Guide
Download the distribution for your OS (at the time of writing, this is version 7.3) and run the installer. Gradle in Action
We simply run the installer software to install Jenkins on our computer. Gradle Effective Implementation Guide
This recipe also requires Python and the Python installer pip. Cassandra High Performance Cookbook
To build the file, download the Jython java installer, run the installer, and select Standalone from the installation menu. Hadoop RealWorld Solutions Cookbook
The Xcode Tools Installer will now load, and you can click the Continue button to begin the install. Beginning CouchDB
RedHat uses packages called RPMs for installation, and Yum as a package installer that can fetch RPMs from remote Yum repositories. Hadoop in Practice
Crucially, they also provide hooks to run scripts at the end of the process, which are invaluable for doing the final system tweaks and customization that are not covered by the standard installer. Hadoop:The Definitive Guide 3rd Edition
Any existing errata can be viewed by selecting your title from http://www.packtpub. Apache Flume Distributed Log Collection for Hadoop
If you purchased this book elsewhere, you can visit http://www.packtpub. HBase Administration Cookbook
If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Hadoop Operations and Cluster Management Cookbook
If you find any errata, please report them by visiting http://www.packtpub. Gradle Effective Implementation Guide
Downloading the example code You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. Mastering ElasticSearch
If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Cassandra High Performance Cookbook
Downloading the example code for this book You can download the example code files for all the Packt books you have purchased from your account at http://www.packtpub.com. Hadoop RealWorld Solutions Cookbook
Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support. Hadoop MapReduce Cookbook
If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. Hadoop Beginners Guide
Get in touch with us at service@packtpub.com for more details. Java 7 Concurrency Cookbook
I’ll cover the basics of using Puppet manifests for configuration management and techniques for executing and managing those configurations with MCollective and Facter. Managing Infrastructure with Puppet
It shows you how to create Puppet recipes, extend Puppet, and use Facter to gather configuration data from your servers. Pro Puppet
For each GPU discovered on the system, the NVML modules expose the maximum and current clock speeds, utilization information for the GPU memory and SM, temperature, fan speeds, power draw, ECC mode, used and total GPU memory, performance state, and identifiers such as the PCI bus ID, the GPU UUID, and the brand. Monitoring with Ganglia
Values generated by UUID() have different characteristics from those generated by a cryptographic hash function such as SHA1(): the UUID values are unevenly distributed and are somewhat sequential. High Performance MySQL 3rd Edition
Make sure to leave the document ID blank, and CouchDB will generate a UUID for you. CouchDB The Definitive Guide
Similarly, if you're sharding on a coarsely-grained key like log level, add a second field to your shard key with a lot of granularity, such as an MD5 hash or UUID. MongoDB The Definitive Guide 2nd Edition
As shown in Listing 9-1, this table will have a key that contains a concatenation of the user's Universally Unique Identifier (UUID) and a timestamp for a picture that shows when the picture was taken. Professional Hadoop Solutions
This unique ID can be assigned by the user or application, or it can use a universally unique identifier (UUID)—a random number generated by CouchDB that greatly reduces the chance of duplicate IDs being used. Beginning CouchDB
This is a UUID generated when HBase starts with an empty storage directory. HBase:The Definitive Guide
This class is an immutable representation of 128-bit Universal Unique Identifier, or UUID, which serves as an identifier that is (with very high probability) globally unique. Java in a Nutshell 5th Edition
The scheme Cassandra uses is a version one UUID, which means that it is generated based on conflating the computer’s MAC address and the number of 100-nanosecond intervals since the beginning of the Gregorian calendar. Cassandra The Definitive Guide
Although the correlation ID in this case uses a random UUID, just about any ID generator can be used. ActiveMQ in Action
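In Java the java.util.UUID class covers the common cases above; randomUUID() produces the version 4 (random) flavor.

    import java.util.UUID;

    public class UuidSketch {
        public static void main(String[] args) {
            UUID id = UUID.randomUUID();            // 128 random bits; collisions are negligible
            System.out.println(id);                 // textual form: 8-4-4-4-12 hex digits
            UUID parsed = UUID.fromString(id.toString()); // round-trips through the text form
            System.out.println(parsed.version());   // prints 4 for randomUUID()
        }
    }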
This file is organized into different contexts: jvm, rpc, hdfs, mapred, and hbase. Monitoring with Ganglia
File descriptions: hadoop-env.sh configures the environment variables used by Hadoop; core-site.xml configures parameters for the whole Hadoop cluster; hdfs-site.xml configures parameters for HDFS and its clients; mapred-site.xml configures parameters for MapReduce. Hadoop Operations and Cluster Management Cookbook
The MapReduce community has made sure that applications written against 1.x can run on top of YARN in a fully binary compatible manner (mapred APIs) or just by recompiling (source compatibility for mapreduce APIs) Apache Hadoop YARN Yet Another Resource Negotiator
Each Hadoop component uses information in core-site.xml and either mapred-site.xml or hdfs-site.xml. Cassandra High Performance Cookbook
For example, if your cluster has 10 nodes running a task tracker and the mapred. Hadoop RealWorld Solutions Cookbook
Inside each node's HADOOP_HOME/conf directory, add the following to the core-site.xml, hdfs-site.xml and mapred-site.xml. Hadoop MapReduce Cookbook
Solution Working with compressed files in MapReduce involves updating the MapReduce configuration file mapred-site.xml and registering the compression codec you are using. Hadoop in Practice
Clicking on the link brings one to a Hadoop Map/Reduce Tutorial (http://hadoop.apache.org/core/docs/current/mapred_tutorial.html) explaining the Map/Reduce framework. Hadoop in Action
There are many configuration files, but the ones we need to modify are core-site.xml, hdfs-site.xml and mapred-site.xml. Hadoop Beginners Guide
Child tasks run as the mapred user; in secure mode, as the user that submitted the job. Hadoop Operations
Tools (such as Eclipse PDE, Bnd, and Bundlor) can help you easily and efficiently manage the manifest file, and they can easily be integrated with building tools like Ant or Maven. Spring Dynamic Modules in Action
This becomes important in more complex add-ons, because the Roo shell will take BND’s defaults and export all packages found in the source code. Spring Roo in Action
You can then use this to define an operation, which captures internal iteration as a combinator. Java 8 in Action Lambdas, Streams and Functional-style Programming
To figure out what to do, you need to know first what the individual parsers in the combinator frameworks return as a result (provided they succeed in parsing the input) Programming in Scala 2nd Edition
What we need is a new combinator, map2, that can combine two RNG actions into one using a binary rather than unary function. Functional Programming in Scala
However, the Scala library includes a powerful parser combinator library that can be used for parsing most external DSLs that have a context-free grammar. Programming Scala
Chapter 8 rolls the pieces together with an exploration of Scala’s parser combinator library. Beginning Scala
If we want to specify repetition, we need to use the rep parser combinator. Steps in Scala An Introduction to Object-Functional Programming
Jetty (http://www.mortbay.org/jetty/) is an embeddable servlet container, which serves a function similar to that of Tomcat or Resin. Mahout in Action
G2: Low memory footprint Tika should be embeddable within Java applications at low memory cost so that it’s as easy to use Tika in a desktop-class environment with capacious network and memory as it is within a mobile PDA with limited resources on which to operate. Tika in Action
Underneath, Storm makes use of zeromq (0mq, zeromq), an advanced, embeddable networking library that provides wonderful features that make Storm possible. Getting Started with Storm
Embeddable API The Embeddable EJB API allows client code and its corresponding enterprise beans to run within the same JVM and class loader. Java EE 7 Essentials
But even complex systems, such as an embeddable spreadsheet application, can function as individual beans. Java in a Nutshell 5th Edition
Berkeley XML DB is embeddable, so it can run as a JAR within your application. Cassandra The Definitive Guide
Not only is it written in 100% Java, but it’s also designed to be embeddable. ActiveMQ in Action
An embedded type is marked as such by adding the @Embeddable annotation to the class definition. Pro JPA2
In terms of persistence mapping, the EmailAddress class is an @Embeddable, which will cause the persistence provider to flatten out all properties of it into the table of the surrounding class. Spring Data Modern Data Access for Enterprise Java
The async mode allows the client to batch small random messages into larger data chunks before sending it over the network. Building LinkedIn’s Real-time Activity Data1
The first takes the String argument which can be one of the following values: sync, async, or default. Mastering ElasticSearch
To achieve this you first have to turn the getPrice method into a getPriceAsync method and change its return value: Java 8 in Action Lambdas, Streams and Functional-style Programming
Async allows you to build parallel data-processing pipelines by chaining successive actions onto asynchronous computations. HBase in Action
The createTable() method, for example, wraps the createTableAsync() method, while adding a loop that waits for the table to be created on the remote servers before yielding control back to the caller. HBase:The Definitive Guide
There is also AsyncListener, which is used to manage async events such as completed, timed out, or an error. Java EE 7 Essentials
If you set replication to async, then it will return success to the client as soon as the request has been executed on the primary shard. Elasticsearch The Definitive Guide
The SettableFuture, AsyncFunction, and FutureFallback classes are useful utility classes we can use when working with Future instances and doing asynchronous transformation of objects. Getting Started with Google Guava
But you can also get the same performance gain for persistent messages by setting the useAsyncSend property on the message producer’s connection factory, as shown next. ActiveMQ in Action
However, the doInOrderAsync returns right away and does the work in a separate thread, so it is not possible to have the method rethrow the exception. Java SE 8 for the Really Impatient
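The getPriceAsync shape from the quote above, sketched with CompletableFuture: the caller gets a future immediately and blocks only when the value is finally needed. The product name and price are invented.

    import java.util.concurrent.CompletableFuture;

    public class AsyncSketch {
        static CompletableFuture<Double> getPriceAsync(String product) {
            return CompletableFuture.supplyAsync(() -> {
                sleepQuietly(500); // stand-in for a slow remote call
                return 101.99;     // hypothetical price for the product
            });
        }

        public static void main(String[] args) {
            CompletableFuture<Double> price = getPriceAsync("widget");
            System.out.println("request sent; doing other work...");
            System.out.println("price: " + price.join()); // block only here
        }

        private static void sleepQuietly(long millis) {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }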
We bring that method into our Account class using include, JRuby's mixin facility. Programming Concurrency on the JVM
Chapter 4 covers new issues arising in object-oriented design due to Scala’s mixin inheritance. Scala in Depth
Mixins The term mixin seems to have acquired numerous meanings over time, but the fundamental concept is that of mixing in capabilities from multiple classes in order to produce a resulting class that represents all the types of the mixins. Thinking in Java 4th Edition
Modular mixin composition—This feature of Scala has traits in common with both Java interfaces and abstract classes. Scala in Action
Objects are constructed by mixin composition, which takes the members of a class and adds the members of a number of traits to them. Programming in Scala 2nd Edition
We’ll then use Scala’s self-type annotation and mixin inheritance to specify wiring in a typesafe manner. Functional Programming Patterns in Scala and Clojure
Scala improves upon Java’s support for OOP with the addition of traits, a clean way of implementing classes using mixin composition. Programming Scala
Ruby has a singleton mixin that provides the singleton pattern in Ruby programs. Beginning Scala
For example, Comparable is a mixin interface that allows a class to declare that its instances are ordered with respect to other mutually comparable objects. Effective Java 2nd Edition
Remember that all GUI related classes are actually wrappers around Java’s JFC/Swing classes, thus, trait SuperMixin is used to redirect certain calls from the peer to the wrapper and back. Steps in Scala An Introduction to Object-Functional Programming
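Java has no Scala-style mixins, but interfaces with default methods give a rough approximation of the idea: a class acquires reusable behavior by listing capabilities rather than by inheriting a chain of classes. The names are invented.

    interface Swimmer {
        default String swim() { return "swimming"; }
    }

    interface Flyer {
        default String fly() { return "flying"; }
    }

    class Duck implements Swimmer, Flyer { } // "mixes in" both capabilities

    public class MixinSketch {
        public static void main(String[] args) {
            Duck duck = new Duck();
            System.out.println(duck.swim() + " and " + duck.fly());
        }
    }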
These and many other tricks are part of a popular weighting method called TF-IDF (term frequency–inverse document frequency) Mahout in Action
In this section, we'll look at the default Apache Lucene scoring mechanism: the TF/IDF (term frequency/ inverse document frequency) algorithm and how it affects the returned document. Mastering ElasticSearch
He has been a software engineer in the Intelligence Corps, Israel Defense Force (IDF), 2005-2008, working on a variety of military IT systems, and later a software engineer and team leader at a web-based startup named AnyClip, making the dream of "Find any moment from any film, instantly" come true. Cassandra High Performance Cookbook
You will get scores that are of lesser quality due to sub-optimal document frequency values, a component of the IDF part of the score. Solr 3 Enterprise Search Server
Introduction This chapter will present the implementation of a very well-known data processing algorithm, Term Frequency–Inverse Document Frequency (TF-IDF), using Storm's Trident API. Storm Real-Time Processing Cookbook
The Term frequency-inverse document frequency (TF-IDF) model solves this issue by utilizing the inverted document frequencies (IDF) to scale the term frequencies (TF) Hadoop MapReduce Cookbook
Calculate the sum of TF/IDF values for each word, for each category, Hadoop in Practice
The standard similarity algorithm used in Elasticsearch is known as TF/IDF, or Term Frequency/Inverse Document Frequency, which takes the following factors into account. Elasticsearch The Definitive Guide
At a high level, the important concepts are demonstrated by the high-level formula – namely, Term Frequency (tf), Inverse Document Frequency (idf), Term Boosts (t.getBoost), the Field Normalization (norm), the Coordination Factor (coord), and the Query Normalization (queryNorm) Solr in action
Very common terms have a low idf; very rare terms have a high idf. Lucene in Action 2nd Edition
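The quotes agree on the shape of the weighting, so it is worth writing out. In its simplest textbook form (the exact damping and normalization differ between Mahout, Lucene, and other implementations):

    w_{t,d} = \mathrm{tf}(t,d) \times \mathrm{idf}(t),
    \qquad
    \mathrm{idf}(t) = \log \frac{N}{\mathrm{df}(t)}

where N is the total number of documents and df(t) is the number of documents containing term t. Rare terms thus get a high idf and very common terms a low one, exactly as the Lucene in Action quote states.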
Instead of the default UDP multicast communication method we use between Gmonds, we have configured it to use unicast messages. HBase Administration Cookbook
Clusters may use a unicast address instead, which is more compatible with various types of network hardware, and has performance benefits, at the cost of additional per-host configuration. Monitoring with Ganglia
Hadoop supports network communication through both unicast (with normal IP addresses, which is the one we use here) and multicast, which uses multicast addresses, such as 239.2.11.71 Hadoop Operations and Cluster Management Cookbook
The bootstrap process When the ElasticSearch node starts, it uses multicast (or unicast, if configured) to find the other nodes in the same cluster (the key here is the cluster name defined in the configuration) and connect to them. Mastering ElasticSearch
Multicast Versus Unicast While the default communication method between monitoring daemons (gmond) is UDP multicast messages, you may encounter environments where multicast is either not possible or a limiting factor. HBase:The Definitive Guide
The term “unicast” refers to the fact that the remote object is located by making a call to a single IP address and port. Core Java Volume II Advanced Features 9th Edition
A unicast event is a rare kind of event for which there can be only a single registered listener object. Java in a Nutshell 5th Edition
The sockets in the previous chapters are unicast: they provide point-to-point communication. Java Network Programming 4th Edition
For this latter scenario, a unicast-based model of message queuing, organized into a hub-and-spoke model, is more efficient than multicast. Java Performance Tuning 2nd Edition
Instead of sending out unicast requests by cycling through a range of addresses where the service might be, the client can send out a single broadcast or multicast request. Fundamental Networking in Java
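To make the unicast/multicast contrast concrete, here is a minimal Java multicast receiver; the group address is the example one from the Hadoop quote above, and the port is arbitrary:

    import java.net.DatagramPacket;
    import java.net.InetAddress;
    import java.net.MulticastSocket;

    public class MulticastReceiver {
        public static void main(String[] args) throws Exception {
            InetAddress group = InetAddress.getByName("239.2.11.71");
            try (MulticastSocket socket = new MulticastSocket(4446)) {
                socket.joinGroup(group);   // one datagram reaches every member
                byte[] buf = new byte[1024];
                DatagramPacket packet = new DatagramPacket(buf, buf.length);
                socket.receive(packet);    // blocks until a datagram arrives
                System.out.println(new String(packet.getData(), 0, packet.getLength()));
                socket.leaveGroup(group);
            }
        }
    }

A unicast receiver would differ only in using a plain DatagramSocket bound to a single address, which is exactly the point-to-point behavior the Java Network Programming quote describes.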
Commit: Use Paxos to achieve consensus for appending that entry to the log. Megastore - Providing Scalable, Highly Available Storage for Interactive Services
This allows different processes to access old versions while the new version is being inserted, and also means that updating the information is really just a case of appending the new data, not reading, updating, and writing back a new version. Getting Started with CouchDB
Log file rotation is a function that periodically rotates a log file to a different name, for example, by appending the date to the filename, so that the original logfile name can be used as an empty file. Hadoop Operations and Cluster Management Cookbook
Otherwise, we merely adjust the record structure without appending any value—which needs to be done if empty records are present. Dremel - Interactive Analysis of WebScale Datasets
The function getReverseTime() achieves this by appending a reverse timestamp for the given item to the already calculated Z-order curve, separated by an underscore token. Hadoop RealWorld Solutions Cookbook
Within the group, appending the postdate orders the twits chronologically. HBase in Action
Pig supports a repartition map-side join by appending USING 'replicated' to the end of the JOIN statement. Hadoop in Practice
Configuration properties apply to a context name and are specified by appending the property name to the context name (separated by a dot) Hadoop:The Definitive Guide 3rd Edition
Commit Logs You can set the value for how large the commit log is allowed to grow before it stops appending new writes to a file and creates a new one. Cassandra The Definitive Guide
Appending a task’s actions one at a time: task hello. Building and Testing with Gradle
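In the file-system sense, appending is an explicit open mode rather than a read-modify-write cycle, which is why the CouchDB quote above can treat updates as cheap. A minimal Java sketch, with a hypothetical app.log:

    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class AppendDemo {
        public static void main(String[] args) throws Exception {
            Path log = Paths.get("app.log");
            // CREATE + APPEND: add to the end, creating the file if needed,
            // without reading, updating, and writing back a new version.
            Files.write(log, "one more record\n".getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }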
The args parameters can be used to supply formatting arguments, as described in the next section. Core Java Volume I Fundamentals 9th Edition
The command-line arguments accepted in args are listed in Table 8-15 Pro Hadoop
While you’re at it, you’ll also fine-tune the JVM memory settings by calling the method jvmArgs to avoid potential OutOfMemoryErrors. Gradle in Action
The paths for the posts and comments data are provided via the command line, and the program retrieves them from the args array. MapReduce Design Patterns
For security reasons, be sure to change this magic cookie value to a new value (using the same new value on each node) in each node’s vm.args file. Scaling CouchDB
Args: peerid (ID of the peer used for verification; must match the one given for replication); tablename (name of the table to verify). HBase in Action
Some older documentation shows file utilities in the form of hadoop dfs -cmd <args> Hadoop in Action
You will see the following output: usage: sqoop COMMAND [ARGS] Hadoop Beginners Guide
The Consumer is executed using the maven-exec-plugin (http://mng.bz/bf7g) by passing it some system properties as arguments using the exec.args property. ActiveMQ in Action
The args parameter expects a List, so we use Groovy list literal syntax to wrap the poet variable. Building and Testing with Gradle
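The common thread above is that args is simply the String array handed to main. A minimal sketch of the MapReduce Design Patterns usage, where two input paths arrive on the command line (the path names are hypothetical):

    public class DriverArgs {
        public static void main(String[] args) {
            if (args.length != 2) {
                System.err.println("usage: DriverArgs <posts-path> <comments-path>");
                System.exit(1);
            }
            String postsPath = args[0];     // first command-line argument
            String commentsPath = args[1];  // second command-line argument
            System.out.println("posts=" + postsPath + ", comments=" + commentsPath);
        }
    }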
In step 4, we added the HBase jar to our classpath and then ran the javac command to compile the Java code. HBase Administration Cookbook
You should get a display such as this one: javac 1.7.0_02 Core Java Volume I Fundamentals 9th Edition
Look for the string '/bin/javac$', using the egrep program, which searches for simple regular expressions in its input stream: Pro Hadoop
The following code snippet shows the usage of the javac Ant task for compiling Java source code within your build script: Gradle in Action
Internally, the core Java plug-in does this with compile-time dependencies when executing the javac compiler (Example 1-21) Gradle Beyond The Basics
You can specify the character encoding of your source files with the -encoding flag, for example, javac -encoding Big5 Myfile.java. Core Java Volume II Advanced Features 9th Edition
First, check what's already available by opening up a terminal and typing the following: $ javac. Hadoop Beginners Guide
This code can be compiled and run using javac dns.java followed by java dns. Hadoop Operations
Alternately, the javac tool provides an option for specifying resources that need to be loaded for an application. Java 7 Recipes A Problem-Solution Approach
When I started writing Java code in 1996, there weren’t many options for compiling and building a project outside of javac and a custom build script. Building and Testing with Gradle
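Besides invoking javac from a shell or a build script, the JDK exposes the compiler programmatically, which is how build tools can compile without forking a process. A minimal sketch using the standard javax.tools API (Myfile.java is a placeholder; note that getSystemJavaCompiler returns null when running on a JRE without the compiler):

    import javax.tools.JavaCompiler;
    import javax.tools.ToolProvider;

    public class CompileDemo {
        public static void main(String[] args) {
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            // Same arguments command-line javac accepts, including -encoding;
            // returns 0 on success, nonzero on compilation errors.
            int result = javac.run(null, null, null,
                                   "-encoding", "UTF-8", "Myfile.java");
            System.out.println(result == 0 ? "compiled" : "failed");
        }
    }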
All @param tags for one method must be kept together. Core Java Volume I Fundamentals 9th Edition
Multiple parameters can be passed to the module’s initialization function by including one or more param sections. Monitoring with Ganglia
Parameters can be specified when launching Pig, using the -param option, one for each parameter: Hadoop:The Definitive Guide 3rd Edition
As explained earlier, @WebInitParam is used to specify init params for a servlet: Java EE 7 Essentials
The third parameter can contain an ImageWriteParam object to set image writing details such as tiling and compression; use null for default values. Core Java Volume II Advanced Features 9th Edition
If you run this script using the pig command, you specify the parameters using the -param name=value argument. Hadoop in Action
Also, preconditions can be specified along with the affected parameters in their @param tags. Effective Java 2nd Edition
For example, the @param tag lets you specify the name and meaning of a single parameter for a method. Java in a Nutshell 5th Edition
The <param> clause can be used to pass one or more name/value pairs as parameters to the included resource. Java Performance
The @WebInitParam annotation accepts a name/value pair as an initialization parameter. Java 7 Recipes A Problem-Solution Approach
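Pulling the Javadoc conventions above together: @param tags kept together, one per parameter, with preconditions stated alongside, as Effective Java suggests. A small illustrative example:

    public class Messages {
        /**
         * Formats a message with positional arguments.
         *
         * @param pattern the format pattern; must not be null
         * @param args    the formatting arguments, one per placeholder
         * @return the formatted message
         */
        public static String format(String pattern, Object... args) {
            return String.format(pattern, args);
        }
    }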
In fact, the opposite is true: Subclasses have more functionality than their superclasses. Core Java Volume I Fundamentals 9th Edition
This callback interceptor method may be defined only on interceptor classes and/or superclasses of interceptor classes and cannot be defined on the target class. Java EE 7 Essentials
This method saves the class of the object, the signature of the class, and the values of any nonstatic, nontransient fields of the class and its superclasses. Core Java Volume II Advanced Features 9th Edition
This is a very powerful concept because interfaces, superclasses, and interfaces implemented by superclasses are included, meaning we can easily make our event handlers as coarse- or fine-grained as we want, simply by changing the type accepted by the event-handling method. Getting Started with Google Guava
If implementing the Cloneable interface is to have any effect on a class, the class and all of its superclasses must obey a fairly complex, unenforceable, and thinly documented protocol. Effective Java 2nd Edition
In discussing nonstatic member classes, we saw that a member class can access any members inherited from superclasses and any members defined by its containing classes. Java in a Nutshell 5th Edition
In this mirror image hierarchy, two abstract superclasses define the basic API for reading and writing characters. Java Network Programming 4th Edition
The simplest way to handle all this is to have a flag on which objects are conditionally passed to the monitor from the Object constructor and to have this flag in a simple class with no superclasses so that classloading does not impose extra calls to superclasses. Java Performance Tuning 2nd Edition
Loading a class requires loading all Java superclasses and all Java superinterfaces. Java Performance
For example, self-calls in "superclasses" are not automatically bound to the versions of methods that have been "overridden" in delegation-based "subclasses" Concurrent Programming in Java Design Principles and Pattern
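Several of these quotes rely on the superclass chain being inspectable at runtime. A minimal sketch that walks the chain with reflection, mirroring what class loading and serialization traverse:

    public class SuperclassWalk {
        public static void main(String[] args) {
            // Prints ArrayList, AbstractList, AbstractCollection, Object.
            for (Class<?> c = java.util.ArrayList.class;
                 c != null; c = c.getSuperclass()) {
                System.out.println(c.getName());
            }
        }
    }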
The standard name for a Kerberos realm is your domain name in uppercase. HBase Administration Cookbook
If you get a complaint about welcome (with a lowercase w), then you should reissue the java Welcome command with an uppercase W. Core Java Volume I Fundamentals 9th Edition
Often the host appears twice, with the name in uppercase and lowercase, or some other differing variations of the hostname. Monitoring with Ganglia
Bamboo also expects a short identifier in uppercase characters, as the project key, in the Project Key field. Gradle Effective Implementation Guide
This recipe shows how to use a map-only job to locate any key with a specific column and convert the value of the column to uppercase. Cassandra High Performance Cookbook
Prior experience has told you that the combination of words, and the fact that they’re uppercase, means that this email is most likely spam. Hadoop in Practice
The Pig Latin functions are case-insensitive, though commonly they are written in uppercase. HBase:The Definitive Guide
For example, there’s an UPPER function in PiggyBank that transforms a string to uppercase: Hadoop in Action
By convention, SQL statements use uppercase for SQL language keywords and we will generally follow this when using HiveQL within files, as shown later. Hadoop Beginners Guide
Throughout the book, we will follow the SQL convention of showing Hive keywords in uppercase (e.g., CREATE, TABLE, SELECT and FROM), even though case is ignored by Hive, following SQL conventions. Programming Hive
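One practical detail behind the uppercasing quotes: String.toUpperCase is locale-sensitive, so data pipelines that uppercase column values (as in the Cassandra recipe above) usually pin the locale. A tiny sketch:

    import java.util.Locale;

    public class UpperDemo {
        public static void main(String[] args) {
            // Locale.ROOT avoids surprises such as the Turkish dotless i
            // that the platform default locale can introduce.
            System.out.println("select".toUpperCase(Locale.ROOT));  // SELECT
        }
    }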
If your input is comprised of tailed logfiles, the array is most likely a UTF-8 encoded String containing a line of text. Apache Flume Distributed Log Collection for Hadoop
Hadoop logging file naming conventions Hadoop logfiles are kept under the directory $HADOOP_HOME/logs. Hadoop Operations and Cluster Management Cookbook
In environments with only a few systems, connection to the server over SSH and using command-line tools to examine logfiles is usually sufficient. Cassandra High Performance Cookbook
We can use the debug scripts to copy all the logfiles to a single location, e-mail them to a single e-mail account, or perform some analysis. Hadoop MapReduce Cookbook
A web server log is a good example of a set of records that is not normalized (for example, the client hostnames are specified in full each time, even though the same client may appear many times), and this is one reason that logfiles of all kinds are particularly well-suited to analysis with MapReduce. Hadoop:The Definitive Guide 3rd Edition
The logging framework used by HBase is creating the directory and logfiles dynamically. HBase:The Definitive Guide
Like regular logfiles, commit logs have a configurable rollover threshold, and once this file size threshold is reached, the log will roll over, carrying with it any extant dirty bit flags. Cassandra The Definitive Guide
The reason is that the local standalone mode prints information about each individual task execution to the screen, whereas in the other modes this information is written only to logfiles on the running hosts. Hadoop Beginners Guide
Most of the real work is done on the data devices, while the OS device usually only has to contend with logfiles in /var/log. Hadoop Operations
Suppose our Extract, Transform, and Load (ETL) process ingests and aggregates logfiles in our environment, converting each log message to a tab-delimited record and also decomposing the timestamp into separate year, month, and day fields, and a combined hms field for the remaining hour, minute, and second parts of the timestamp, for reasons that will become clear in a moment. Programming Hive
Carla Schroer and her teams of compatibility testers in Cupertino, California, and Novosibirsk, Russia (with special thanks to Leonid Arbouzov and Alexei Kaigorodov) painstakingly wrote compatibility tests for each testable assertion in the First Edition. The Java Virtual Machine Specification Java SE 7 Edition
Because you’re dealing with an actual class, any method is fully testable through unit tests. Gradle in Action
Chapter 10 focuses on testing Scala applications and how you can use patterns to make your code more testable. Scala in Action
Testable JavaScript, the image of a Doctor fish, and related trade dress are trademarks of O’Reilly Media, Inc. Testable JavaScript
The usage of the Tool interface also makes driver implementation more testable by allowing you to inject arbitrary configurations using Configured’s setConf() method. Professional Hadoop Solutions
It assumes that first is a valid word, which is an easily testable condition prior to the call. Data Structures and Algorithm Analysis in Java 3rd Edition
In particular, the schema-free nature of the graph data model, coupled with the testable nature of a graph database’s application programming interface (API) and query language, empower us to evolve an application in a controlled manner. Graph Databases
This makes the application more testable and conforms to accessibility standards. Selenium Testing Tools Cookbook
This change in thinking has led to lower defect rates, more modular code, and more testable code. Beginning Scala
The first technique we will demonstrate to make this class testable is to introduce a subclass that eliminates entity manager calls. Pro JPA2
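The Pro JPA2 quote describes a classic testability seam: subclass the class under test and stub out the collaborator call. A minimal sketch, with a hypothetical AccountService whose entity-manager lookup is overridden in the test subclass:

    public class AccountService {
        public double balanceWithBonus(long accountId) {
            return findBalance(accountId) * 1.05;
        }
        // Seam: the real version would query the entity manager.
        protected double findBalance(long accountId) {
            throw new UnsupportedOperationException("requires a database");
        }
    }

    // In the test sources: same logic under test, no entity manager needed.
    class TestableAccountService extends AccountService {
        @Override protected double findBalance(long accountId) { return 100.0; }
    }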
Moreover, there are many file formats that support text, such as PDF, Microsoft Word, Excel, and PowerPoint, simple text files, XML, HTML, XHTML, and so forth. Pro Java 7 NIO.2
The first TestSuite.xhtml defines the suite having HTML links to the tests. Jenkins Continuous Integration Cookbook
Similarly, NotFound is a case class representing HTTP status code 404; when calling the xhtml method, it implicitly gets converted to a Scalaz Response object with an HTTP header value for a content-type of "application/xhtml+xml." This is a good example of how you can use higher-order functions and combine functions to create nice APIs like the preceding example. Scala in Action
I am very grateful to Dmitry Kirsanov and Alina Kirsanova who turned my manuscript from XHTML into a beautiful book, allowing me to concentrate on the content instead of fussing with the format. Scala for the Impatient
By convention, web pages built with XHTML have a .xhtml extension. Java EE 7 Essentials
However, most pages authored by the W3C are written in XHTML (an HTML dialect that is proper XML) Core Java Volume II Advanced Features 9th Edition
Currently, no browser natively supports XForms, though plug-ins are available for Firefox, and libraries exist to allow serverside translation of XForms into XHTML. Cassandra The Definitive Guide
Also, we show how to produce XHTML content with Scala. Steps in Scala An Introduction to Object-Functional Programming
Tika uses the XHTML (Extensible Hypertext Markup Language) standard to model all documents, regardless of their original format. Lucene in Action 2nd Edition
Dmitry Kirsanov and Alina Kirsanova once again turned an XHTML manuscript into an attractive book with amazing speed and attention to detail. Java SE 8 for the Really Impatient
Through using a large set of metadata about artists, releases, and tracks courtesy of the MusicBrainz.org project, you will have a testing ground for Solr, and will learn how to import this data in various ways. Solr 3 Enterprise Search Server
Pressing the space bar when the focus is in the checkbox also toggles the checkmark. Core Java Volume I Fundamentals 9th Edition
Handling checkboxes Capturing checkbox submission is similar to that of capturing textbox submissions, which we encountered earlier in Chapter 2, Recording Your First Test. Performance Testing With JMeter 2.9
Make sure the checkbox next to the plugin name is checked and press the Install button to initiate the installation process. Gradle in Action
When configuring a Job, click on Discard Old Builds, and then the Advanced checkbox, define the Max # of builds to keep with the artifacts. Jenkins Continuous Integration Cookbook
If we want to use a locally installed instance of Gradle, we must uncheck the Install automatically checkbox. Gradle Effective Implementation Guide
To choose this option, just tick the Install automatically checkbox. Jenkins The Definitive Guide
If you want to present a set of choices to a user, and a radio button or checkbox set consumes too much space, you can use a combo box or a list. Core Java Volume II Advanced Features 9th Edition
Wait until the status, initially Starting, changes to Complete; then click its checkbox to see details of the job flow, as shown in the following screenshot: Hadoop Beginners Guide
This interface has a text field to enter the name to be searched for and a checkbox to determine whether the match should be exact or partial. Java Network Programming 4th Edition
For example, perhaps your documents have Country as a field, and your search interface presents a checkbox allowing the user to pick and choose which countries to include in the search. Lucene in Action 2nd Edition
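Tying the GUI-side quotes together, a minimal Swing sketch: a checkbox whose state toggles an exact-match flag, in the spirit of the Java Network Programming example above:

    import javax.swing.JCheckBox;
    import javax.swing.JFrame;

    public class CheckboxDemo {
        public static void main(String[] args) {
            JFrame frame = new JFrame("Search options");
            JCheckBox exact = new JCheckBox("Exact match");
            // Fires on every toggle, including the space bar when focused.
            exact.addItemListener(
                e -> System.out.println("exact = " + exact.isSelected()));
            frame.add(exact);
            frame.pack();
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setVisible(true);
        }
    }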
This will ensure that the mvn command is always available. Mahout in Action
Starting it up via mvn is as easy as shown in Listing 12.8 Netty in Action MEAP V08
The install command will run all the tests by default; we can ignore the tests to speed up the installation process with command sudo mvn -DskipTests install. Hadoop Operations and Cluster Management Cookbook
Goals bundle phases, for example, for the site goal there are four phases: pre-site, site, post-site, and site-deploy, all of which will be called in order by mvn site or directly by using the syntax mvn site:phase. Jenkins Continuous Integration Cookbook
For example, in the following pom.xml file, the project URL will point to the Jenkins build job that ran the mvn site build: Jenkins The Definitive Guide
For example, if you want to compile a project, you run Maven's mvn command with the compile goal as a parameter. Sonar Code Quality Testing Essentials
Change into the trunk folder and run the build: $ mvn compile. Hadoop RealWorld Solutions Cookbook
To compile all the source, run the following command: $ mvn clean install This command will compile and package the source so that it’s ready to be run. ActiveMQ in Action
Then run "mvn install" from within the Tika source directory you unpacked above. Lucene in Action 2nd Edition
This install process, equivalent to mvn install, is guaranteed to be compatible with Maven because it actually uses the Maven Ant Tasks produced by the Maven team and hosted at Apache. Building and Testing with Gradle
This approach allows Ceph to more effectively leverage the intelligence (CPU and memory) present on each OSD to achieve reliable, highly available object storage with linear scaling. Ceph A Scalable, High-Performance Distributed File System
As in conventional storage systems based on local or network-attached (SAN) disk drives or those embracing the proposed T10 OSD standard, devices passively respond to read and write commands, despite their potential to encapsulate significant intelligence. RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
By the end of this chapter, you’ll write tests with JUnit, TestNG, and Spock and execute them as part of the build lifecycle. Gradle in Action
JUnit (http://www.junit.org/) is a popular unit test framework for Java; TestNG (http://testng.org/doc/index.html) is another. Jenkins Continuous Integration Cookbook
Using TestNG for testing We have written a test with the JUnit test framework. Gradle Effective Implementation Guide
This format is also used by many other Java testing tools, such as TestNG, Spock and Easyb. Jenkins The Definitive Guide
Some basic familiarity with the TestNG testing API will help make sense of the unit tests. Hadoop RealWorld Solutions Cookbook
You have many options for unit testing in Scala, from established Java tools, such as JUnit and TestNG, to new tools written in Scala, such as ScalaTest, specs, and ScalaCheck. Programming in Scala 2nd Edition
JUnit and TestNG are both good starting points for unit testing Scala. Programming Scala Tackle Multi-Core Complexity on the Java Virtual Machine
You can drive your tests using the built-in Runner or use the provided integration with JUnit or TestNG. Programming Scala
There are test goals for JUnit, TestNG, and so on. Beginning Scala
This facilitates compiled tests being able to reference JUnit or TestNG annotations or base classes. Building and Testing with Gradle
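For concreteness, the smallest JUnit 4 test of the kind these quotes assume; the TestNG equivalent differs mainly in the annotation's package:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class StringUtilTest {
        @Test
        public void upperCasesAsciiInput() {
            assertEquals("HADOOP", "hadoop".toUpperCase());
        }
    }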
If you do want to write a method to change numeric parameters, you can use one of the holder types defined in the org.omg.CORBA package: IntHolder, BooleanHolder, and so on. Core Java Volume I Fundamentals 9th Edition
With its choice of two distributed systems technologies (RMI and CORBA), developers can create systems that span many computers. Java Network Programming and Distributed Computing
Java EE applications can use RMI-IIOP protocol support to access CORBA services that are compatible with the platform. Java EE 7 Specification
This mechanism can only be used to replace certain standard Java libraries (such as those for XML and CORBA support) with newer versions. Core Java Volume II Advanced Features 9th Edition
As you can see, CORBA has more overhead than RMI, which in turn has more overhead than the proprietary system. Java Performance Tuning 2nd Edition
Technologies such as those discussed in chapter 2 (COM, CORBA, DCE, and EJB) using RPC are considered to be tightly coupled. ActiveMQ in Action
Request objects are used in distributed object support systems such as those in java.rmi and org.omg.corba. Concurrent Programming in Java Design Principles and Pattern
Finally, entity beans were modeled as remote objects that used RMI and CORBA, introducing network overhead and restrictions that should never have been added to a persistent object to begin with. Pro JPA2
Before RMI, the only remoting options available to Java programmers were CORBA (which at the time required the purchase of a third-party object request broker [ORB]) and handwritten socket programming. Spring in Action 4th Edition
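The holder types from the Core Java quote merit two lines of code: Java passes everything by value, so an int parameter cannot be changed in place, but a holder's public value field can. A minimal sketch (org.omg.CORBA was removed from the JDK in Java 11, so this compiles only on older JDKs or with a standalone ORB):

    import org.omg.CORBA.IntHolder;

    public class HolderDemo {
        static void increment(IntHolder h) {
            h.value++;  // the caller observes this change
        }
        public static void main(String[] args) {
            IntHolder counter = new IntHolder(41);
            increment(counter);
            System.out.println(counter.value);  // 42
        }
    }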
However, great books such as Joshua Bloch’s Effective Java [Blo08] have advised us to promote immutability. Programming Concurrency on the JVM
We first examine the concepts behind functional programming such as side effects, immutability, declarative programming, and referential transparency and relate these to Java 8 Java 8 in Action Lambdas, Streams and Functional-style Programming
Final fields must be used correctly to provide a guarantee of immutability. JSR-133 Java Memory Model and Thread
However, changing modules explicitly disclaim immutability, so cached copies of those modules must be expired after some time. Gradle Beyond The Basics
Immutability One of the most surprising problems in the original Java Memory Model was that objects that were declared final were sometimes not in fact final (immutable) at all. Oracle JRockit The Definitive Guide
The functional approach may appear unnatural if you’re not familiar with it, but it enables immutability, which has many advantages. Effective Java 2nd Edition
Due to the immutability of files in HDFS, it’s uncommon to place a quota on them. Hadoop Operations
You can take advantage of immutability in your own classes. Java Network Programming 4th Edition
Finally, the immutability of SSTables enables us to split tablets quickly. Bigtable - A Distributed Storage System for Structured Data
However, selective use of immutability is a basic tool in concurrent OO programming. Concurrent Programming in Java Design Principles and Pattern
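Distilled into code, the advice in these quotes is: final class, final fields, no setters, and "mutators" that return new instances. A minimal sketch:

    public final class Point {
        private final int x;
        private final int y;

        public Point(int x, int y) { this.x = x; this.y = y; }

        public int x() { return x; }
        public int y() { return y; }

        // Returns a new instance instead of changing state, so
        // instances can be shared freely across threads.
        public Point translate(int dx, int dy) {
            return new Point(x + dx, y + dy);
        }
    }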
Further information about sFlow interoperability is provided in Chapter 8 Monitoring with Ganglia
However, the CouchDB team recommends sticking with JavaScript as it provides the highest level of support and interoperability, though other options are available. CouchDB The Definitive Guide
Cross-language interoperability is achieved using a standard binary on-the-wire representation of records, in which field values are laid out sequentially as they occur in the record. Dremel - Interactive Analysis of WebScale Datasets
BigTop — This is a formal process and framework for packaging and interoperability testing of Hadoop’s sub-projects and related components. Professional Hadoop Solutions
The Protocol Buffers format has been used heavily by Google for interoperability. Hadoop in Practice
The Apache Bigtop project (http://incubator.apache.org/bigtop/) runs interoperability tests on stacks of Hadoop components and provides Linux packages (RPMs and Debian packages) for easy installation. Hadoop:The Definitive Guide 3rd Edition
The long-term goal of HCatalog is to enable interoperability between tools such as Apache Hive and Pig so that they can share dataset metadata information. Hadoop Operations
You create data files with a wide choice of formats for convenience, compactness, or interoperability with different Apache Hadoop components. Cloudera Impala
But the listed constraints serve to eliminate large classes of safety and liveness problems while also satisfying common reusability and performance goals: unidirectional flow avoids deadlock, connectivity management avoids unwanted interleavings across different flows, transfer protocols avoid safety problems due to inadvertent sharing without the need for extensive dynamic synchronization, and interface conformance assures type safety while still permitting interoperability among components. Concurrent Programming in Java Design Principles and Pattern
Importing an Entire Ant Build File Up to this point, we’ve dealt with interoperability of Ant and Gradle on a fine-grained level, either calling existing Ant tasks, creating new Ant tasks on the fly, or manipulating the results of Ant task calls with Groovy. Building and Testing with Gradle
Java even has built-in support for advanced programming tasks, such as network programming, database connectivity, and multithreading. Core Java Volume I Fundamentals 9th Edition
This book's coverage of advanced topics such as input/output streaming and multithreading allows even the most experienced Java developers to sharpen their skills. Java Network Programming and Distributed Computing
This allows the application developer to focus on the business logic and not worry about lowlevel transaction and state management details, remoting, concurrency, multithreading, connection pooling, or other complex low-level APIs. Java EE 7 Essentials
On the server side, columns are immutable in order to prevent multithreading issues. Cassandra The Definitive Guide
Chapter 3, Threads, explores multithreading and synchronization, with a special emphasis on how they can be used for asynchronous I/O and network servers. Java Network Programming 4th Edition
In Java, the key to making an application responsive is multithreading. Java Performance Tuning 2nd Edition
A note on multithreading in JMS applications The JMS spec specifically defines concurrency for various objects in the JMS API and requires that only a few objects support concurrent access. ActiveMQ in Action
The introduction of the SPARC T-series processor brought chip multiprocessing and chip multithreading to Oracle’s offering of processors. Java Performance
Both simultaneous multithreading (SMT) and chip multiprocessor (CMP) architectures target thread-level parallelism and should improve the performance of many of our servers. Web search for a planet the google cluster architecture
With his experience in multithreading and Java Swing he has managed to create Java clients that can consume tens of thousands of market events per second. Java 7 Recipes A Problem-Solution Approach
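As a minimal counterpart to the multithreading quotes, the executor idiom that most of the cited books converge on, sketched in plain Java:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class PoolDemo {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(4);
            for (int i = 0; i < 10; i++) {
                final int task = i;
                pool.submit(() -> System.out.println(
                    "task " + task + " on " + Thread.currentThread().getName()));
            }
            pool.shutdown();                        // accept no new tasks
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }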
Chapter 5, Monitoring and Diagnosis: In this chapter, we will describe how to monitor and diagnose HBase cluster with Ganglia, OpenTSDB, Nagios, and other tools. HBase Administration Cookbook
StumbleUpon uses OpenTSDB to monitor all of its infrastructure and software, including its HBase clusters. HBase in Action
Use Case: OpenTSDB The OpenTSDB‡ project provides a time series database used to store metrics about servers and services, gathered by external collection agents. HBase:The Definitive Guide
Since HBase 0.94, the Import utility can generate the HBase internal files (HFiles) for bulk load. HBase Administration Cookbook
This is the number of blocks of StoreFiles (HFiles) in the cache. Monitoring with Ganglia
The implementation of HFile in a region server is responsible for reading and writing HFiles to and from HDFS. Professional Hadoop Solutions
A column family can have multiple HFiles. HBase in Action
The importtsv tool supports loading data to HBase directly using the Put operations as well as by generating the HBase internal HFiles. Hadoop MapReduce Cookbook
The data is stored in store files, called HFiles, which are persistent and ordered immutable maps from keys to values. HBase:The Definitive Guide
HFiles are immutable HDFS files containing key/value pairs in sorted order. Apache Hadoop Goes Realtime at Facebook
As most HFiles are hosted on HDFS, it would be impossible anyway because HDFS files are immutable as well. Hadoop in Action
The Simple Build Tool (http://mng.bz/2f7Q) provides the cc target for continuous compilation. Scala in Depth
If you’re having difficulty understanding what’s going on with function definition or argument passing, the Python language tutorial offers a good introduction to what’s going on, and you can jump right to the particular section by visiting this shortened URL: http://mng.bz/KM5x. Redis in Action
Make sure you explore it in the Akka documentation( http://mng.bz/wc7D) Scala in Action
In the great book Java Concurrency in Practice (http://mng.bz/979c), Brian Goetz and coauthors give some advice to find the optimal size for a thread pool. Java 8 in Action Lambdas, Streams and Functional-style Programming
You can read more about the built-in PropertyEditors at http://mng.bz/7CO9 Spring in Practice
This versioning scheme is roughly outlined in the Apache Release Management guide: http://mng.bz/6uvM. HBase in Action
I created a sample log file in Apache Common Log Format (see http://mng.bz/L4S3) with some fake Class E IP addresses and some dummy resources and response codes. Hadoop in Practice
That means the channel adapter not only handles the communication via the particular transport and protocol being used but also must provide a Messaging Mapper (http://mng.bz/Fl0P) so that whatever data representation is used by the external system is converted to and from simple Spring Integration messages. Spring Integration in Action
The Consumer is executed using the maven-exec-plugin (http://mng.bz/bf7g) by passing it some system properties as arguments using the exec.args property. ActiveMQ in Action
For an up-to-date list of MongoDB production deployments, see http://mng.bz/z2CH. MongoDB in Action
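The thread-pool sizing advice cited above from Java Concurrency in Practice is usually summarized by one formula; quoted here from memory, so treat it as a rule of thumb rather than the book's exact wording:

    N_{\mathrm{threads}} = N_{\mathrm{cpu}} \times U_{\mathrm{cpu}} \times \left(1 + \frac{W}{C}\right)

where N_cpu is the number of cores, U_cpu the target CPU utilization (between 0 and 1), and W/C the ratio of wait time to compute time per task; heavily I/O-bound tasks therefore justify far more threads than cores.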
This is because the score depends on multiple factors, not only the boosts and query structure, but also on how many terms were matched, in which fields, and the type of matching that was used on query normalization, and so on. Mastering ElasticSearch
By default, a field in Lucene is indexed with its norm, a product of the document’s boost, the field’s boost, and the field’s length normalization factor. Professional Hadoop Solutions
As of today, HBase doesn’t provide the features that make normalization easy to do. HBase in Action
Normalization poses problems for MapReduce because it makes reading a record a nonlocal operation, and one of the central assumptions that MapReduce makes is that it is possible to perform (high-speed) streaming reads and writes. Hadoop:The Definitive Guide 3rd Edition
The usual database normalization rules do not apply directly to HBase, and therefore the number of tables is usually very low. HBase:The Definitive Guide
This process of tokenization and normalization is called analysis, which we discuss in the next section. Elasticsearch The Definitive Guide
Denormalization In relational database design, we are often taught the importance of normalization. Cassandra The Definitive Guide
These are useful for tasks such as normalization of data and doing complex computation involving groups of rows. Tenzing - A SQL Implementation On The MapReduce Framework
In such cases, it's common to have a precursor MapReduce job that applies a normalization or clean-up strategy to the data set. Hadoop Beginners Guide
Unique Keys and Normalization Relational databases typically use unique keys, indexes, and normalization to store data sets that fit into memory or mostly into memory. Programming Hive
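The Professional Hadoop Solutions quote spells the classic Lucene norm out as a product; written out, with the default length normalization of the classic TF/IDF similarity (given from memory, so verify against your Lucene version):

    \mathrm{norm}(t,d) = \mathrm{boost}(d) \times \mathrm{boost}(\mathrm{field})
                         \times \frac{1}{\sqrt{\mathrm{numTerms}(\mathrm{field})}}

so longer fields score lower for the same term match, which is the length normalization the Elasticsearch quotes fold into analysis and scoring.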
It is easy to take the effects of modern websites for granted, such as the functionality of pop-up lists during searches, customization, and the in-page experience (traditionally referred to as AJAX) of a dynamic website. Getting Started with CouchDB
This is a lot of configuration that the Gradle plugins provide out of the box without any additional customization from your side. Gradle in Action
A rule set file contains several rules and allows customization of the rules. Gradle Effective Implementation Guide
The most common use case for this is when a build needs to access a custom version of some other module, especially when that customization is itself under development and is not ready to publish to the outside world. Gradle Beyond The Basics
We focused on understanding how Hive manages data in internal and external tables, and also looked at advanced topics such as join customization and user-defined functions. Hadoop in Practice
Crucially, they also provide hooks to run scripts at the end of the process, which are invaluable for doing the final system tweaks and customization that are not covered by the standard installer. Hadoop:The Definitive Guide 3rd Edition
But when performing this customization, quite often the administrators don’t edit the corresponding HTML meta headers. Hadoop in Action
Do not expect to understand all of these now, but do spend a little time browsing to get a flavor for the type of customization allowed by Hadoop. Hadoop Beginners Guide
This allows for easy configuration and customization via the corresponding element of the ActiveMQ XML configuration file. ActiveMQ in Action
Let’s take a look at the different ways of defining a task, the two key aspects of task definitions, and the task API we can use to perform our own customization. Building and Testing with Gradle
Currently, it provides various row/columnar store file formats, such as CSVFile, RowFile, RCFile, and Trevni (still unstable) Introduction to Tajo
In this paper, we present a big data placement structure called RCFile (Record Columnar File) and its implementation in the Hadoop system. RCFile A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems
Apart from the two formats mentioned above, Hive also provides an RCFileInputFormat which stores the data in a column oriented manner. Hive A Petabyte Scale Data Warehouse Using Hadoop
Hive provides another binary storage format called RCFile, short for Record Columnar File. Hadoop:The Definitive Guide 3rd Edition
Some types of storage formats store their data by fields instead of by records (for example, Hive’s RCFile) Programming Pig
Use any of the Hive data loading techniques, especially for tables using the Avro, SequenceFile, or RCFile formats. Cloudera Impala
You can replace TEXTFILE with one of the other built-in file formats supported by Hive, including SEQUENCEFILE and RCFILE, both of which optimize disk space usage and I/O bandwidth performance using binary encoding and optional compression. Programming Hive
Note A principal named "apress" is used in the examples in this section, but this principal will not be available on your machine. Pro Java 7 NIO.2
Apress and friends of ED books may be purchased in bulk for academic, corporate, or promotional use. Pro Hadoop
If you need to study the basics of database design, we suggest Clare Churcher’s book Beginning Database Design (Apress) High Performance MySQL 3rd Edition
For a detail description of RESTful web services, we recommend the book Ajax and REST Recipes, A Problem-Solution Approach (Apress, 2006) Pro Spring 3
For information on translations, please e-mail info@apress.com, or visit http://www.apress.com. Beginning CouchDB
Tip If you’re interested in Git, we strongly recommend Scott Chacon’s excellent book Pro Git – also published by Apress. Pro Puppet
Although every precaution has been taken in the preparation of this work, neither the author(s) nor Apress shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in this work. Beginning Scala
I would also like to express my gratitude towards all the people at Apress for all the work they have done to get this book out. The Definitive Guide to MongoDB
For your convenience Apress has placed some of the front matter material after the index. Java 7 Recipes A Problem-Solution Approach
For more information, reference our Special Bulk Sales–eBook Licensing web page at http://www.apress.com/info/bulksales. Pro JPA2
In Flume 0.9, the input thread also did the writing to the output (except for failover retries) Apache Flume Distributed Log Collection for Hadoop
This property configures the maximum number of reducer retries to fetch map outputs in case of fetch failure. Hadoop Operations and Cluster Management Cookbook
It turns out that most of these overwrites came from client retries due to errors or timeouts. GFS - The Google File System
The number of these retries can be defined by this method. Mastering ElasticSearch
There's more Initializing connections in this manner does not account for server fail-over or retries. Cassandra High Performance Cookbook
When the task fails, the tasktracker retries the task, skipping the records that caused the failure. Hadoop:The Definitive Guide 3rd Edition
For example, when dealing with errors, it favors retries or wait over fast failures. Apache Hadoop Goes Realtime at Facebook
Should a task fail, the JobTracker will automatically relaunch the task, possibly on a different node, up to a predefined limit of retries. Hadoop in Action
When a task fails, Hadoop retries it on the same block but attempts to work around the invalid records. Hadoop Beginners Guide
MapReduce is dead simple for developers: no socket programming, no threading or fancy synchronization logic, no management of retries, no special techniques to deal with enormous amounts of data. Hadoop Operations
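Most of the retry behavior described above reduces to the same loop: bounded attempts, backoff between them, and only the final failure treated as fatal. A generic sketch (the backoff constant is arbitrary):

    import java.util.concurrent.Callable;

    public class Retry {
        static <T> T withRetries(Callable<T> task, int maxRetries) throws Exception {
            Exception last = null;
            for (int attempt = 0; attempt <= maxRetries; attempt++) {
                try {
                    return task.call();
                } catch (Exception e) {
                    last = e;                            // remember and retry
                    Thread.sleep(100L * (attempt + 1));  // crude linear backoff
                }
            }
            throw last;  // out of retries: surface the last failure
        }
    }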
In the Java space, you can choose from a wide range of open source and commercial solutions, such as Checkstyle, PMD, Cobertura, FindBugs, and Sonar. Gradle in Action
This recipe uses Cobertura (http://cobertura.sourceforge.net/) to find the code that is not covered by unit tests. Jenkins Continuous Integration Cookbook
We can define the location of the Clover or Cobertura coverage XML result file and much more. Gradle Effective Implementation Guide
To see how the plugins work, we are going to integrate code coverage metrics using the Cobertura plugin. Jenkins The Definitive Guide
Now that we have the aggregated lcov.info file, we need to convert it to the Cobertura XML format. Testable JavaScript
Cobertura and Clover Cobertura, based on the jcoverage Java library, is used to calculate the percentage of code accessed by tests and identify which parts of your source code lack test coverage. Sonar Code Quality Testing Essentials
Setting up a code-coverage tool like Clover, EMMA, or Cobertura can be a big help with managing this issue. Spring in Practice
Install the Maven Reporting plug-in, configuring the Cobertura code coverage report as well as the Surefire Testing report. Spring Roo in Action
As of Ganglia 3.4.0, the web interface is a separate distribution tarball maintained in a separate source code repository. Monitoring with Ganglia
If you prefer, a TAR file is available as well: http://github.com/couchapp/couchapp/tarball/master. CouchDB The Definitive Guide
The locations of installed files will be different from the tarball method, and we can check the file layout with the following command: Hadoop Operations and Cluster Management Cookbook
The following commands will download and unpack the tarball for CouchDB 0.10.0 Beginning CouchDB
Appendix A includes the installation instructions for both the CDH tarball and the Apache Hadoop tarball. Hadoop in Practice
Download and extract the tarball into a directory that we call HIVE_HOME. Hadoop in Action
Just click the link on the home page to download the latest release version as a gzipped tarball. Cassandra The Definitive Guide
What just happened? Setting up Avro is a little involved; it is a much newer project than the other Apache tools we'll be using, so it requires more than a single download of a tarball. Hadoop Beginners Guide
For tarball-based installs, this leaves quite a bit of flexibility but also an equal amount of ambiguity. Hadoop Operations
Click on the link to the 5.4.1 release and you’ll find both tarball and zip formats available (the tarball is for Linux and Unix; the zip is for Windows) ActiveMQ in Action
Using interfaces for the callbacks (see the next chapter as well) leads to code that runs faster and is a lot more maintainable. Core Java Volume I Fundamentals 9th Edition
The build tool of choice shouldn’t stand in the way of this effort; rather, it should provide you with a flexible and maintainable way to model your automation needs. Gradle in Action
Build logic is described with a powerful DSL and empowers developers to create reusable and maintainable build logic. Gradle Effective Implementation Guide
With MapReduce we’re not only battling with clean and maintainable code, but also with the performance of a job that will be distributed across hundreds of nodes to compute over terabytes and even petabytes of data. MapReduce Design Patterns
Dealing with Imperative Rule Code Our current rule definition relies on 18 lines of doLast code, which is probably too much imperative code for a well-factored, maintainable build. Gradle Beyond The Basics
I was at that point in late 2007 when I was facing the task of storing millions of documents in a system that needed to be fault-tolerant and scalable while still being maintainable by just me. HBase:The Definitive Guide
With careful design6 and a relatively moderate amount of data, such a system is flexible and maintainable. Hadoop in Action
It is about writing programs that are clear, correct, usable, robust, flexible, and maintainable. Effective Java 2nd Edition
How we build this complex network in a maintainable fashion is the focus of this chapter. Hadoop Beginners Guide
While you are always free to solve problems in a locally optimized way by writing Groovy code inside your build, the most maintainable builds will eventually refactor this code into organized plug-ins that expose new build language idioms to the build files that use them. Building and Testing with Gradle
Temporary tables created to hold the results of subqueries don’t have indexes, so this seems plausible. High Performance MySQL 3rd Edition
The language supports nested subqueries, inter and intra-record aggregation, top-k, joins, user-defined functions, etc; some of these features are exemplified in the experimental section. Dremel - Interactive Analysis of WebScale Datasets
If necessary, any node can send subqueries in parallel to other nodes and merge the responses to return the full response to the user. Mastering ElasticSearch
Traditional SQL features like from clause subqueries, various types of joins – inner, left outer, right outer and outer joins, cartesian products, group bys and aggregations, union all, create table as select and many useful functions on primitive and complex types make the language very SQL like. Hive A Petabyte Scale Data Warehouse Using Hadoop
Hive doesn’t support IN subqueries (at the time of this writing), but you can use a LEFT SEMI JOIN to do the same thing. Hadoop:The Definitive Guide 3rd Edition
Use the UNION operation only when the columns in both subqueries are compatible. SQL Antipatterns
Even if you have to use several complex subqueries or join statements, this is usually true. Cassandra The Definitive Guide
We use parentheses to form subqueries, enabling advanced construction of BooleanQuerys: Lucene in Action 2nd Edition
A broad selection of SQL features are directly supported, including subqueries, aggregate queries, update and delete statements, numerous SQL functions, and more. Pro JPA2
However, it still lacked support for transactions, subqueries, foreign keys, stored procedures, and views. Understanding MySQL Internals
Specifically, a per-cluster ResourceManager (RM) tracks resource usage and node liveness, enforces allocation invariants, and arbitrates contention among tenants. Apache Hadoop YARN Yet Another Resource Negotiator
When creating a representative dataset, we try to reproduce any domain invariants we have identified: the minimum, maximum, and average number of relationships per node, the spread of different relationship types, property value ranges, and so on. Graph Databases
It is now possible to account for cases where the masters are accessed directly and guarantee important system invariants. Dapper a Large-Scale Distributed Systems Tracing Infrastructure
This data uses the sequence file format, first because it’s processed sequentially, and second because we couldn’t satisfy the map file invariants of sorted keys. Hadoop:The Definitive Guide 3rd Edition
Several important prerequisites for code optimization, such as identifying loop invariants and loop bodies, require expensive analysis in a control flow graph. Oracle JRockit The Definitive Guide
Like a constructor, a builder can impose invariants on its parameters. Effective Java 2nd Edition
Both scenarios break some of the invariants assumed by the security model and can inadvertently expose data to malicious applications. Hadoop Operations
You should use this pattern for debug and trace statements and assertion preconditions, postconditions, and invariants. Java Performance Tuning 2nd Edition
One path is first to establish conceptual-level invariants, for example the rule that water tank volumes must always be between zero and their capacities. Concurrent Programming in Java Design Principles and Pattern
The indexing system starts by crawling every page on the web and processing them while maintaining a set of invariants on the index. Percolator - Large-scale Incremental Processing Using Distributed Transactions and Notifications
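Effective Java's point that "a builder can impose invariants on its parameters" looks like this in miniature; the water-tank rule from the Concurrent Programming quote makes a convenient invariant:

    public final class WaterTank {
        private final double capacity;
        private final double volume;

        private WaterTank(Builder b) {
            this.capacity = b.capacity;
            this.volume = b.volume;
        }

        public static final class Builder {
            private double capacity;
            private double volume;

            public Builder capacity(double c) { this.capacity = c; return this; }
            public Builder volume(double v)   { this.volume = v;   return this; }

            public WaterTank build() {
                // Invariant checked once, centrally: 0 <= volume <= capacity.
                if (capacity < 0 || volume < 0 || volume > capacity)
                    throw new IllegalStateException(
                        "volume must be between zero and capacity");
                return new WaterTank(this);
            }
        }
    }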
Pointcuts: A pointcut is a collection of joinpoints that you use to define when advice should be executed. Pro Spring 3
Consult the Spring Security reference documentation for information on using security pointcuts and AOP to secure Java methods. Spring in Practice
Note that the new pointcut we added uses some more advanced AspectJ syntax, illustrating Boolean logic and the other supported pointcuts, args, that can be used to specify the type declaration of arguments. Spring Security 3.1
The AspectJ expression language for pointcuts offers a lot of flexibility to determine which methods to match against. Pro Spring Security
An aspect encapsulates pointcuts and advices, much the way a class encapsulates member fields and methods. Programming Scala
Aspects are often described in terms of advice, pointcuts, and join points. Spring in Action 4th Edition
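In AspectJ's annotation style (plain Java, as used by Spring AOP), the pointcut/advice split described above looks roughly like this; the package in the expression is hypothetical:

    import org.aspectj.lang.annotation.Aspect;
    import org.aspectj.lang.annotation.Before;
    import org.aspectj.lang.annotation.Pointcut;

    @Aspect
    public class LoggingAspect {
        // Pointcut: the collection of join points we care about,
        // here every public method in a hypothetical service package.
        @Pointcut("execution(public * com.example.service..*(..))")
        public void serviceMethods() {}

        // Advice: what runs when a matched join point executes.
        @Before("serviceMethods()")
        public void logEntry() {
            System.out.println("entering a service method");
        }
    }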
Upon instantiation, the class expects you to provide the account credentials as well as the API’s URL, format, and version. Gradle in Action
Using an in-mapper combiner is more efficient than using actual combiners — there is no additional overhead of reading and instantiating key/value pairs. Professional Hadoop Solutions
It is reset and reused every time the vertex needs to send a message, and helps to avoid unnecessary instantiation. Hadoop RealWorld Solutions Cookbook
It is instantiated in parallel on many machines, with each instantiation processing one file or perhaps GFS chunk. Interpreting the Data Parallel Analysis with Sawzall
You can set the block size during table instantiation like this: HBase in Action
Instantiation of Generic Types It is illegal to create an instance of a generic type. Data Structures and Algorithm Analysis in Java 3rd Edition
You can create an instance of FilterList while providing various parameters at instantiation time, using one of these constructors: HBase:The Definitive Guide
A Supplier interface also gives you the flexibility to use lazy instantiation by not constructing an instance until the get method is called. Getting Started with Google Guava
Its default instantiation is based on the resource configuration for your HDFS system. Hadoop in Action
This approach is functionally equivalent to the public field approach, except that it is more concise, provides the serialization machinery for free, and provides an ironclad guarantee against multiple instantiation, even in the face of sophisticated serialization or reflection attacks. Effective Java 2nd Edition
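The "ironclad guarantee against multiple instantiation" in the Effective Java quote is the single-element enum; the whole pattern fits in a few lines:

    public enum Elvis {
        INSTANCE;  // the JVM guarantees exactly one instance, even against
                   // sophisticated serialization or reflection attacks

        public void leaveTheBuilding() {
            System.out.println("Elvis has left the building.");
        }
    }

    // Usage: Elvis.INSTANCE.leaveTheBuilding();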
You will need to include the flume-ng-sdk JAR in the classpath of your Java application at runtime to use Flume's Log4J Appender. Apache Flume Distributed Log Collection for Hadoop
With the master plan in place, you’ll get started by writing some tasks to interact with the CloudBees client SDK. Gradle in Action
We can also use the options property to define a lot of properties we know from the Java SDK javadoc tool. Gradle Effective Implementation Guide
The examples presented here use the Java SDK to show how to implement S3 data manipulation. Professional Hadoop Solutions
OSNaming JNDI service provider is a part of the Java SE 7 SDK and JRE. Java EE 7 Specification
Flume: highly reliable, configurable streaming data collection; ZooKeeper: coordination service for distributed applications; Hue: user interface framework and SDK for visual Hadoop applications; Whirr: library for running Hadoop, and HBase, in the cloud. In regard to HBase, CDH solves the issue of running a truly reliable cluster setup, as it has all the required HDFS patches to enable durability. HBase:The Definitive Guide
In the Java SDK documentation, you may see it referred to as Java 1.7 Java 7 New Features Cookbook
Details on downloading and using the SDK are available in the Amazon EMR documentation. Programming Hive
As of this writing, a similar class is scheduled to be supported in an upcoming SDK release. Concurrent Programming in Java Design Principles and Pattern
Once you have launched the JavaFX SDK setup executable you will see the start of the wizard in Figure 15-2 Java 7 Recipes A Problem-Solution Approach
The libvirt project has created a common set of tools for managing virtualization resources on different virtualization platforms, currently including: Xen, QEMU, KVM, LXC, OpenVZ, User Mode Linux, VirtualBox, and VMware ESX and GSX. Monitoring with Ganglia
The nova-compute process is primarily a worker daemon that creates and terminates virtual machine instances via hypervisor's APIs (XenAPI for XenServer/XCP, libvirt for KVM or QEMU, VMwareAPI for VMware, etc.) bk-compute-adminguide-grizzly
Real hardware, VMWare virtualization, or KVM virtualization will generally allow us to create a fork of a Redis process at roughly 10–20ms per gigabyte of memory. Redis in Action
An example of this would be nova-compute accepting a message from the queue to create a new instance and then using the libvirt library to start a new KVM instance. Deploying OpenStack
The process by which it does so is fairly complex but the basics are simple: Accept actions from the queue and perform a series of system commands, like launching a KVM instance, to carry them out while updating state in the database. bk-admin-guide-cloud-latest
For performance or scalability testing, you would use real hardware in a cluster configuration.) You run the VM in VMWare, KVM, or VirtualBox, start the Impala service through the Cloudera Manager web interface, and then interact with Impala through the impala-shell interpreter or the ODBC and JDBC interfaces. Cloudera Impala
This deployment ran central services on a set of servers running KVM. OpenStack Operations Guide
Something to look at is the number of dnsmasq processes that are running. bk-compute-adminguide-grizzly
Tracks IP address leases and records them in the database by using the dnsmasq dhcp-script facility. bk-admin-guide-cloud-latest
Debugging DHCP Issues One common networking problem is that an instance boots successfully but is not reachable because it failed to obtain an IP address from dnsmasq, which is the DHCP server that is launched by the nova-network service. OpenStack Operations Guide
This script comes with OpenTSDB; it creates two tables—tsdb and tsdb-uid. HBase Administration Cookbook
The serialVersionUID field is used to determine whether a compatible class definition was found during deserialization that matches that of a serialized object. Java Network Programming and Distributed Computing
A Solaris subject that contains a principal with a UID of 0, for example, implies a Solaris principal with any other UID. Java Security 2nd Edition
The tsdb-uid table maintains an index of globally unique values for use as metric tags. HBase in Action
By default, the _uid field is stored (can be retrieved) and indexed (searchable) Elasticsearch The Definitive Guide
(Figure: the graphical version of the serialver program.) All later versions of the class must define the serialVersionUID constant to the same fingerprint as the original. Core Java Volume II Advanced Features 9th Edition
If you do not specify this number explicitly by declaring a static final long field named serialVersionUID, the system automatically generates it at runtime by applying a complex procedure to the class. Effective Java 2nd Edition
If the specified class declares a long serialVersionUID constant, the value of that field is displayed. Java in a Nutshell 5th Edition
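To make the serialVersionUID advice above concrete, a minimal sketch; the UserRecord class is hypothetical:

    import java.io.Serializable;

    // Declaring serialVersionUID explicitly, so the runtime does not have to
    // generate one from the class structure, and compatible later versions of
    // the class can keep deserializing previously serialized instances.
    public class UserRecord implements Serializable {
        private static final long serialVersionUID = 1L; // change only on incompatible edits

        private final String name;

        public UserRecord(String name) {
            this.name = name;
        }
    }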
The easiest way to accomplish this is to ensure that the uid of the user the namenode process runs as is the same on all namenodes. Hadoop Operations
The “UID” column reports the user id responsible for performing the disk access. Java Performance
Note that the revision ID (rev) is prefixed with an underscore. Getting Started with CouchDB
As we’ve seen, design documents are normal JSON documents, denoted by the fact that their DocID is prefixed with _design/ CouchDB The Definitive Guide
The properties we just mentioned should be prefixed with index.merge. Mastering ElasticSearch
Note Up until this point, all UNIX commands have been prefixed with the prompt symbol ($) Beginning CouchDB
Quantifier variables are declared like regular variables, but the base type (usually int) is prefixed by a keyword specifying the form of quantifier. Interpreting the Data Parallel Analysis with Sawzall
We can specify all the part-prefixed subdirectories as input to the dependent job using wildcards. Hadoop MapReduce Cookbook
Properties in the file are prefixed with whirr., but if they are passed as arguments on the command line, the prefix is dropped. Hadoop:The Definitive Guide 3rd Edition
A job ID is a string prefixed with job_, followed by the cluster ID (which is a timestamp of when the cluster was started), followed by an auto-incremented job number. Hadoop in Action
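Given the job_ naming scheme just described, a small sketch of pulling the cluster timestamp and job number back out of a job ID string (the ID value here is made up):

    // Splits a Hadoop job ID of the form job_<clusterTimestamp>_<jobNumber>.
    public class JobIdParser {
        public static void main(String[] args) {
            String jobId = "job_201301011234_0042"; // illustrative value only
            String[] parts = jobId.split("_");
            System.out.println("cluster started: " + parts[1]);
            System.out.println("job number: " + Integer.parseInt(parts[2]));
        }
    }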
Instead, use the Thrift API calls prefixed with system_ to configure your keyspaces and column families, as shown previously. Cassandra The Definitive Guide
Type the following commands; the output should look as shown next, where the commands are included on the lines prefixed by $: Hadoop Beginners Guide
Some languages give you portability, garbage collection, and the like, but they don’t have much of a library, forcing you to roll your own if you want fancy graphics or networking or database access. Core Java Volume I Fundamentals 9th Edition
Ganglia will not allow IPv6=>IPv4 mapping (for portability and security reasons) Monitoring with Ganglia
In addition to avoiding the Java API's portability limits and JVM commitment, the REST API lets you connect to different major ElasticSearch versions if you are aware of the differences in REST endpoints and the responses returned by ElasticSearch. Mastering ElasticSearch
Portability — Your application may not require all the extra features that an RDBMS provides (for example, typed columns, secondary indexes, transactions, advanced query languages, and so on) Professional Hadoop Solutions
You learned about the advantages that a 100 percent client-side application can offer in terms of flexibility and portability. Beginning CouchDB
The project was created by Doug Cutting (the creator of Hadoop) to address the major downside of Hadoop Writables: lack of language portability. Hadoop:The Definitive Guide 3rd Edition
Java encourages portability of entire applications at the expense of incremental adoption by making it somewhat irksome to link against other languages. Chubby - The lock service for loosely-coupled distributed systems
The JMS API minimizes the amount of enterprise messaging knowledge a Java programmer is required to possess in order to develop complex messaging applications, while still maintaining a certain amount of portability across JMS provider implementations. ActiveMQ in Action
For script portability, it’s typical to omit the authority, only specifying it when referring to another distributed filesystem instance (including S3 buckets) Programming Hive
This enhances portability, at the expense of needing to accommodate differences in the quality of implementation of these mappings. Concurrent Programming in Java Design Principles and Pattern
Usually, each mapper is assigned a single block of a file as input (unless you have used certain compression codecs) Apache Flume Distributed Log Collection for Hadoop
The framework supports the gzip, zip, sometimes LZO, and bzip2 compression codecs. Pro Hadoop
Using codecs: One of the most significant changes introduced by Apache Lucene 4.0 was the ability to alter how index files are written. Mastering ElasticSearch
With the addition of new cryptographic codecs, a new token-based authentication mechanism that supports more authentication protocols, a unified authorization system that supports Attribute Based Access Control (ABAC) and supports policy enforcement using open standards and XACML, and changes to HBase to allow cell-level authorization, Hadoop will be able to move from isolated cluster environments with perimeter-level security to very secure deployments that can meet the requirements of highly secure environments. Professional Hadoop Solutions
This means that once a file is compressed using any of the codecs that Hadoop provides, the file cannot be decompressed without the whole file being read. Hadoop RealWorld Solutions Cookbook
If you’ve set up your cluster with LZO compression, you’ll need to create a flume-env.sh file and set the directory that contains the native compression codecs: Hadoop in Practice
Table 4-2 lists the codecs that are available for Hadoop. Hadoop:The Definitive Guide 3rd Edition
Hadoop supports a number of compression codecs (see table 6.4) Hadoop in Action
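A compact illustration of selecting one of these codecs at runtime, following the StreamCompressor pattern from Hadoop:The Definitive Guide; the codec class is named on the command line, e.g. org.apache.hadoop.io.compress.GzipCodec:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionOutputStream;
    import org.apache.hadoop.util.ReflectionUtils;

    // Compresses stdin to stdout with whichever codec class is named in args[0].
    public class StreamCompressor {
        public static void main(String[] args) throws Exception {
            Class<?> codecClass = Class.forName(args[0]);
            Configuration conf = new Configuration();
            CompressionCodec codec =
                (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
            CompressionOutputStream out = codec.createOutputStream(System.out);
            IOUtils.copyBytes(System.in, out, 4096, false);
            out.finish(); // flush the codec's internal state without closing stdout
        }
    }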
This leaves administrators and developers to select more or less aggressive compression codecs based on the resource consumption and state of the cluster. Hadoop Operations
The code itself declares a single class called Encoder, which exposes a method that calls both the Metaphone and Base64 codecs on a string argument. Building and Testing with Gradle
If you are certain the spikes are not normal, you can remove them from the RRDtool database using the contributed script removespikes.pl, which is usually shipped in the Ganglia release tarball under contrib/ Monitoring with Ganglia
I once had a cluster that was using the Filesystem in Userspace (FUSE) contrib package to export HDFS as a read-only file system on a machine, which re-exported that file system via the Common Internet File System (CIFS) to a Windows server machine. Pro Hadoop
The standard Lucene distribution contains several Directory implementations, including filesystembased and memory-based, Berkeley DB-based (in the Lucene contrib module), and several others. Professional Hadoop Solutions
A tool to handle this is inside a contrib subproject. Cassandra High Performance Cookbook
They’re so common that Hadoop ships with a contrib JAR called hadoop-datajoin to make things easier. HBase in Action
The contrib package does all of the heavy lifting and only requires a handful of methods to be implemented. Hadoop in Practice
Hadoop’s Fuse-DFS contrib module allows any Hadoop filesystem (but typically HDFS) to be mounted as a standard filesystem. Hadoop:The Definitive Guide 3rd Edition
Hadoop has a contrib package called datajoin that works as a generic framework for data joining in Hadoop. Hadoop in Action
If you tune your JVM options and see some success using a load-testing tool or something like the Python stress test in contrib, don’t get too excited. Cassandra The Definitive Guide
Enabling alternative schedulers: Each of the alternative schedulers is provided as a JAR file in the capacityScheduler and fairScheduler directories within the contrib directory in the Hadoop installation. Hadoop Beginners Guide
The multiple Modifier and Reader tasks provide contention on the Accumulator object. Thinking in Java 4th Edition
This same accumulator will be used for each member of a given group, so any changes made to it can be persisted. MongoDB The Definitive Guide 2nd Edition
As you can see, you don’t need to implement how to filter using the attributes of a dish (for example, using an iterator and an accumulator) Java 8 in Action Lambdas, Streams and Functional-style Programming
We first created a UDF that extended the EvalFunc abstract class and implemented the Accumulator interface. Hadoop RealWorld Solutions Cookbook
If a function can’t be algebraic but can operate on subsets of the data, it’s recommended to implement the Accumulator interface, which can be seen in figure 11.6 Hadoop in Practice
Pig’s Algebraic or Accumulator interfaces for more efficient processing of the bag in chunks. Hadoop:The Definitive Guide 3rd Edition
Invokers do not use the Accumulator or Algebraic interfaces, and are thus likely to be much slower and to use much more memory than UDFs written specifically for Pig. Programming Pig
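A hedged sketch of the EvalFunc-plus-Accumulator pattern the quotes above describe: a hypothetical Pig UDF that counts the tuples in a bag, letting Pig feed the bag in chunks instead of materializing it all at once:

    import java.io.IOException;
    import org.apache.pig.Accumulator;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.DataBag;
    import org.apache.pig.data.Tuple;

    // Counts tuples in a bag; accumulate() is called repeatedly with chunks.
    public class IncrementalCount extends EvalFunc<Long> implements Accumulator<Long> {
        private long count = 0;

        @Override
        public Long exec(Tuple input) throws IOException { // non-accumulating path
            DataBag bag = (DataBag) input.get(0);
            return bag.size();
        }

        @Override
        public void accumulate(Tuple input) throws IOException {
            DataBag bag = (DataBag) input.get(0);
            for (Tuple t : bag) {
                count++;
            }
        }

        @Override
        public Long getValue() { return count; }

        @Override
        public void cleanup() { count = 0; } // reset between groups
    }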
To accomplish aggregation in an external application, declare an accumulator before the loop that reads from the input stream and output the sum after the completion of the input: Programming Hive
The LongAccumulator generalizes this idea to an arbitrary accumulation operation. Java SE 8 for the Really Impatient
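For the java.util.concurrent.atomic.LongAccumulator just mentioned, a short sketch using max as the accumulation operation:

    import java.util.concurrent.atomic.LongAccumulator;

    // LongAccumulator generalizes LongAdder to any associative operation;
    // here the operation is max and the identity is Long.MIN_VALUE.
    public class MaxTracker {
        public static void main(String[] args) {
            LongAccumulator max = new LongAccumulator(Long::max, Long.MIN_VALUE);
            max.accumulate(42);
            max.accumulate(7);
            System.out.println(max.get()); // prints 42
        }
    }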
An in-memory table can be used to store temporary results when executing a complex set of queries; as a fast data accumulator that gets periodically flushed to disk; as a fast cache for a portion of the data from some large disk-based table; and in many other ways. Understanding MySQL Internals
Under the covers, Akka uses JBoss Netty and Google Protocol Buffers libraries to make the remoting happen seamlessly. Programming Concurrency on the JVM
He has provided consultations and training to corporations around the world on various open source technologies such as Mule ESB, Activiti BPM, JBoss jBPM and Drools, Liferay Portal, Alfresco ECM, JBoss SOA, and cloud computing. Web Crawling and Data Mining with Apache Nutch
If you can take an extra dependency, JBoss Marshalling25 is the way to go. Netty in Action MEAP V08
This plugin can deploy across a range of server types and version ranges including Tomcat, GlassFish, and JBoss. Jenkins Continuous Integration Cookbook
You can run it either as a stand-alone application, or deployed on a conventional Java application server such as Tomcat or JBoss. Jenkins The Definitive Guide
He also knew the next company he started would be open source and build on the model just proved successful by JBoss and MySQL. RabbitMQ in Action
MBeans can be managed remotely by a wide variety of management consoles such as the JConsole GUI that comes with Java and the webbased JMX Console that comes with the JBoss application server. Solr 3 Enterprise Search Server
In these applications, JMX is simply one way of managing the container; JBoss Application Server, on the other hand, uses JMX as the primary way of interacting with the container. Cassandra The Definitive Guide
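To show the MBean mechanism the last two quotes refer to, a self-contained sketch (all names hypothetical) that registers a standard MBean with the platform MBean server, where a console such as JConsole can find it:

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    // Standard MBean pattern: the interface must be named <ClassName>MBean.
    public class JmxDemo {
        public interface CounterMBean { int getCount(); }

        public static class Counter implements CounterMBean {
            public int getCount() { return 42; } // illustrative fixed value
        }

        public static void main(String[] args) throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.registerMBean(new Counter(), new ObjectName("demo:type=Counter"));
            Thread.sleep(60_000); // keep the JVM alive long enough to attach JConsole
        }
    }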
Jeff Bleiel, our development editor, for his work liaising with Manning—without Jeff’s input, the book would not be what it is today; Gary Tully for his tireless technical reviews of the entire book; the Manning staff for their arduous work on the book; Filip Hanik for his assistance with Tomcat; Jan Bartel and Greg Wilkins for their assistance with Jetty; David Jencks and Kevan Miller for their assistance with Geronimo; and Jaikiran Pai for his assistance with JBoss. ActiveMQ in Action
NetBeans Profiler supports profiling of many popular application servers and Web servers right out of the box such as GlassFish, Tomcat, Weblogic, and JBoss. Java Performance
The Spittr application has two essential domain concepts: spitters (the users of the application) and spittles (the brief status updates that users publish) Spring in Action 4th Edition
Cloud-config format: Cloud-init supports a YAML-based config format that allows the user to configure a large number of options on a system. bk-compute-adminguide-grizzly
Over time, some developers became grumpy about this: it requires a lot of explicit bean wiring, and XML compares unfavorably to terser formats like JSON, YAML, and perhaps even Java. Spring in Practice
The listing below provides a minimal storm.yaml file with entries that you must override: Storm Blueprints Patterns for Distributed Real-time Computation
By changing options in the cassandra.yaml configuration file, you can control the specific directories that Cassandra creates and uses. Cassandra High Performance Cookbook
YAML, whose name is a recursive acronym for YAML Ain't Markup Language, is also supported for the same purpose, and you can activate it by adding the format=yaml parameter to the HTTP request. Elasticsearch in Action MEAP V07 cHAPTER 1
It is often used as a configuration file format; for example, the database configuration file used in Ruby on Rails applications, database.yml, is a YAML file. Pro Puppet
Modify the configuration file to add your Storm cluster configuration (you can see all default configurations on defaults.yaml in the Storm repository) Getting Started with Storm
The output from explain can be difficult to read in JSON, but it is easier when it is formatted as YAML. Elasticsearch The Definitive Guide
For this example, we’ll define the schema in YAML and then load it, although you could also use client code to define it. Cassandra The Definitive Guide
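As a small illustration of consuming YAML from Java, a sketch using the SnakeYAML library (an assumption; none of the quoted books prescribes this particular parser, and the keys below are merely reminiscent of cassandra.yaml):

    import java.util.Map;
    import org.yaml.snakeyaml.Yaml;

    // Parses a YAML fragment into a Map; key names are illustrative only.
    public class YamlConfigDemo {
        public static void main(String[] args) {
            String doc = "cluster_name: Test Cluster\nnum_tokens: 256\n";
            Map<?, ?> config = (Map<?, ?>) new Yaml().load(doc);
            System.out.println(config.get("cluster_name")); // Test Cluster
        }
    }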
What just happened? We have just got the Selenium Grid to load our original Selenium Grid item by passing in the original YAML file. Selenium 2 Testing Tools Beginner's Guide
Managing compactions: An HBase table has the following physical storage structure: HBase Administration Cookbook
These tombstones, along with the dead values, are cleaned up on major compactions. Professional Hadoop Solutions
Lowering this setting to three days allows compactions to reclaim the disk space sooner. Cassandra High Performance Cookbook
Future versions of CouchDB may include more advanced compaction features such as queuing compactions and processing multiple compactions with one API call. Beginning CouchDB
Minor compactions are designed to be minimally detrimental to HBase performance, so there is an upper limit on the number of HFiles involved. HBase in Action
There are two types of compaction: minor compactions and major compactions. HBase:The Definitive Guide
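Since several quotes here discuss triggering and tuning compactions, a hedged sketch against the 0.9x-era HBase client API (matching the vintage of the books quoted) that requests a major compaction by hand; the table name is made up:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    // Asks the cluster to major-compact a table; the call is asynchronous,
    // so it returns before the compaction actually finishes.
    public class CompactTable {
        public static void main(String[] args) throws Exception {
            HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
            try {
                admin.majorCompact("mytable"); // hypothetical table name
            } finally {
                admin.close();
            }
        }
    }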
To address this, we made compactions interruptible to favor responsiveness over completion. Apache Hadoop Goes Realtime at Facebook
Cassandra can use a lot of memory, and garbage collection and major compactions can have a big impact on performance. Cassandra The Definitive Guide
Incoming read and write operations can continue while compactions occur. Bigtable - A Distributed Storage System for Structured Data
In this experiment, we configured the benchmark with increasing numbers of customers and measured both the achieved performance and the number of cores used by all parts of the system including cores used for background maintenance such as Bigtable compactions. Percolator - Large-scale Incremental Processing Using Distributed Transactions and Notifications
Number of works cited: 167
1、A Java ForkJoin Framework
2、A Little Riak Book
3、ActiveMQ in Action
4、Apache Flume Distributed Log Collection for Hadoop
5、Apache Hadoop Goes Realtime at Facebook
6、Apache Hadoop YARN Yet Another Resource Negotiator
7、Apache Kafka
8、Apache Solr 4 Cookbook
9、Apache Sqoop Cookbook
10、Beginning CouchDB
11、Beginning Scala
12、Bigtable - A Distributed Storage System for Structured Data
13、Building LinkedIn’s Real-time Activity Data Pipeline
14、Building and Testing with Gradle
15、Cassandra A Decentralized Structured Storage System
16、Cassandra High Performance Cookbook
17、Cassandra The Definitive Guide
18、Ceph A Scalable, High-Performance Distributed File System
19、Chubby - The lock service for loosely-coupled distributed systems
20、Cloudera Impala
21、Concurrent Programming in Java Design Principles and Pattern
22、Core Java Volume I Fundamentals 9th Edition
23、Core Java Volume II Advanced Features 9th Edition
24、CouchDB The Definitive Guide
25、Dapper a Large-Scale Distributed Systems Tracing Infrastructure
26、Data Structures and Algorithm Analysis in Java 3rd Edition
27、Deploying OpenStack
28、Dremel - Interactive Analysis of WebScale Datasets
29、Effective Java 2nd Edition
30、Elasticsearch The Definitive Guide
31、Elasticsearch in Action MEAP V07 cHAPTER 1
32、FlumeJava - Easy, Efficient Data-Parallel Pipelines
33、Functional Programming Patterns in Scala and Clojure
34、Functional Programming in Scala
35、Fundamental Networking in Java
36、GFS - The Google File System
37、Getting Started with CouchDB
38、Getting Started with Google Guava
39、Getting Started with Storm
40、Gradle Beyond The Basics
41、Gradle Effective Implementation Guide
42、Gradle in Action
43、Graph Databases
44、HBase Administration Cookbook
45、HBase in Action
46、HBase:The Definitive Guide
47、HDFS scalability the limits to growth
48、Hadoop Beginners Guide
49、Hadoop MapReduce Cookbook
50、Hadoop Operations
51、Hadoop Operations and Cluster Management Cookbook
52、Hadoop RealWorld Solutions Cookbook
53、Hadoop in Action
54、Hadoop in Practice
55、HadoopDB An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
56、Hadoop:The Definitive Guide 3rd Edition
57、High Performance MySQL 3rd Edition
58、Hive A Petabyte Scale Data Warehouse Using Hadoop
59、Incoop MapReduce for Incremental Computations
60、Interpreting the Data Parallel Analysis with Sawzall
61、Introduction to Tajo
62、JSR-133 Java Memory Model and Thread
63、Java 7 Concurrency Cookbook
64、Java 7 New Features Cookbook
65、Java 7 Recipes A Problem-Solution Approach
66、Java 8 in Action Lambdas, Streams and Functional-style Programming
67、Java EE 7 Essentials
68、Java EE 7 Specification
69、Java Network Programming 4th Edition
70、Java Network Programming and Distributed Computing
71、Java Performance
72、Java Performance Tuning 2nd Edition
73、Java SE 8 for the Really Impatient
74、Java Security 2nd Edition
75、Java Threads 3rd Edition
76、Java in a Nutshell 5th Edition
77、Jenkins Continuous Integration Cookbook
78、Jenkins The Definitive Guide
79、Kafka a Distributed Messaging System for Log Processing
80、Lucene in Action 2nd Edition
81、Mahout in Action
82、Managing Infrastructure with Puppet
83、MapReduce - Simplied Data Processing on Large Clusters
84、MapReduce Design Patterns
85、Mastering ElasticSearch
86、Megastore - Providing Scalable, Highly Available Storage for Interactive Services
87、Mesos A Platform for Fine-Grained Resource Sharing in the Data Center
88、MongoDB The Definitive Guide 2nd Edition
89、MongoDB in Action
90、Monitoring with Ganglia
91、Namenode High Availability
92、Netty in Action MEAP V08
93、NoSQL Distilled
94、Omega - flexible, scalable schedulers for large compute clusters
95、OpenStack Operations Guide
96、Oracle JRockit The Definitive Guide
97、Percolator - Large-scale Incremental Processing Using Distributed Transactions and Notifications
98、Performance Testing With JMeter 2.9
99、PowerDrill - Processing a Trillion Cells per Mouse Click
100、Pregel - A System for Large-Scale Graph Processing
101、Pro Hadoop
102、Pro JPA2
103、Pro Java 7 NIO.2
104、Pro Puppet
105、Pro Spring 3
106、Pro Spring MVC
107、Pro Spring Security
108、Professional Hadoop Solutions
109、Programming Concurrency on the JVM
110、Programming Hive
111、Programming Pig
112、Programming Scala
113、Programming Scala Tackle Multi-Core Complexity on the Java Virtual Machine
114、Programming in Scala 2nd Edition
115、RADOS A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters.
116、RCFile A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems
117、RabbitMQ in Action
118、Redis Cookbook
119、Redis in Action
120、Resilient Distributed Datasets - A Fault-Tolerant Abstraction for In-Memory Cluster Computing
121、SQL Antipatterns
122、Scala for the Impatient
123、Scala in Action
124、Scala in Depth
125、Scaling Big Data with Hadoop and Solr Chapter-02 Understanding Solr
126、Scaling CouchDB
127、Selenium 2 Testing Tools Beginner's Guide
128、Selenium Testing Tools Cookbook
129、Solr 3 Enterprise Search Server
130、Solr 3.1 Cookbook
131、Solr in action
132、Sonar Code Quality Testing Essentials
133、Spanner - Google’s Globally-Distributed Database
134、Spring Batch in Action
135、Spring Data Modern Data Access for Enterprise Java
136、Spring Dynamic Modules in Action
137、Spring Integration in Action
138、Spring Roo in Action
139、Spring Security 3.1
140、Spring in Action 4th Edition
141、Spring in Practice
142、Steps in Scala An Introduction to Object-Functional Programming
143、Storage Architecture and Challenges
144、Storm Blueprints Patterns for Distributed Real-time Computation
145、Storm Real-Time Processing Cookbook
146、Tenzing - A SQL Implementation On The MapReduce Framework
147、Testable JavaScript
148、The Definitive Guide to MongoDB
149、The Hadoop Distributed File System
150、The Java Language Specification Java SE 7 Edition
151、The Java Virtual Machine Specification Java SE 7 Edition
152、The Little Redis Book
153、The anatomy of a large-scale hypertextual Web search engine
154、The anatomy of a large-scale hypertextual Web search engine short
155、Thinking in Java 4th Edition
156、Tika in Action
157、Understanding MySQL Internals
158、Web Crawling and Data Mining with Apache Nutch
159、Web search for a planet the google cluster architecture
160、Zab High-performance broadcast for primary-backup systems
161、ZooKeeper Distributed Process Coordination
162、admin-user-guide-trunk
163、bk-admin-guide-cloud-latest
164、bk-compute-adminguide-grizzly
165、elasticsearch. The Road to a Distributed, (Near) Real Time, Search Engine
166、high-availability-guide
167、user-guide