Summaries of Ten Selected AI Papers from 2018

https://mp.weixin.qq.com/s/4vD67EpxFTSLtUmiSmBNvw

NO TIME TO READ AI RESEARCH? WE SUMMARIZED TOP 2018 PAPERS FOR YOU

Posted by Mariya Yao | Nov 27, 2018

Trying to keep up with AI research papers can feel like an exercise in futility given how quickly the industry moves. If you’re buried in papers to read that you haven’t quite gotten around to, you’re in luck.

To help you catch up, we’ve summarized 10 important AI research papers from 2018 to give you a broad overview of machine learning advancements this year. There are many more breakthrough papers worth reading as well, but we think this is a good list for you to start with.

We’ve done our best to summarize these papers correctly, but if we’ve made any mistakes, please contact us to request a fix.

If these summaries of scientific AI research papers are useful for you, you can subscribe to our AI Research mailing list at the bottom of this article to be alerted when we release new summaries. We’re planning to release summaries of important papers in natural language processing (NLP) and computer vision in a few weeks.

If you’d like to skip around, here are the papers we featured:

  1. Universal Language Model Fine-tuning for Text Classification

  2. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

  3. Deep Contextualized Word Representations

  4. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

  5. Delayed Impact of Fair Machine Learning

  6. World Models

  7. Taskonomy: Disentangling Task Transfer Learning

  8. Know What You Don’t Know: Unanswerable Questions for SQuAD

  9. Large Scale GAN Training for High Fidelity Natural Image Synthesis

  10. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

 

10 IMPORTANT AI RESEARCH PAPERS OF 2018

 

1. UNIVERSAL LANGUAGE MODEL FINE-TUNING FOR TEXT CLASSIFICATION, BY JEREMY HOWARD AND SEBASTIAN RUDER (2018)

 

ORIGINAL ABSTRACT

Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a language model. Our method significantly outperforms the state-of-the-art on six text classification tasks, reducing the error by 18-24% on the majority of datasets. Furthermore, with only 100 labeled examples, it matches the performance of training from scratch on 100x more data. We open source our pretrained models and code.

OUR SUMMARY

Howard and Ruder suggest using pre-trained models for solving a wide range of NLP problems. With this approach, you don’t need to train your model from scratch, but only fine-tune a pre-trained model. Their method, called Universal Language Model Fine-tuning (ULMFiT), outperforms state-of-the-art results, reducing the error by 18-24%. What’s more, with only 100 labeled examples, ULMFiT matches the performance of models trained from scratch on 10,000 labeled examples.

 

WHAT’S THE CORE IDEA OF THIS PAPER?

  • To address the lack of labeled data and to make NLP classification easier and less time-consuming, the researchers suggest applying transfer learning to NLP problems. Thus, instead of training a model from scratch, you can take a model that has already been trained to solve a similar problem and then fine-tune it for your specific problem.

  • However, to be successful, this fine-tuning should take into account several important considerations (sketched in code after this list):

    • Different layers should be fine-tuned to different extents as they capture different kinds of information.

    • Adapting the model’s parameters to task-specific features is more efficient when the learning rate is first increased linearly and then decayed linearly (the paper’s slanted triangular learning rates).

    • Fine-tuning all layers at once is likely to result in catastrophic forgetting; thus, it would be better to gradually unfreeze the model starting from the last layer.
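
Below is a minimal sketch of these three techniques in PyTorch-style Python. The `layers` argument (a list of modules ordered from input to output) and the hyperparameter defaults are illustrative assumptions rather than the authors’ exact API, though the learning-rate schedule follows the formula in the paper.

```python
def slanted_triangular_lr(step, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Slanted triangular LR: linear increase, then linear decay."""
    cut = int(total_steps * cut_frac)
    if step < cut:
        p = step / cut
    else:
        p = 1 - (step - cut) / (cut * (1 / cut_frac - 1))
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_param_groups(layers, lr_top=0.01, decay=2.6):
    """Discriminative fine-tuning: each layer below gets the LR of the layer above / 2.6."""
    return [{"params": layer.parameters(), "lr": lr_top / decay ** depth}
            for depth, layer in enumerate(reversed(list(layers)))]


def gradual_unfreezing(layers, epoch):
    """Unfreeze one additional layer per epoch, starting from the last layer."""
    n = len(layers)
    for i, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad = (i >= n - 1 - epoch)
```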

 

WHAT’S THE KEY ACHIEVEMENT?

  • Significantly outperforming state-of-the-art: reducing the error by 18-24%.

  • Much less labeled data needed: with only 100 labeled examples and 50K unlabeled, ULMFiT matches the performance of learning from scratch on 100x more data.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The availability of pre-trained ImageNet models has transformed the field of computer vision; ULMFiT could be of the same importance for NLP problems.

  • This method can be applied to any NLP task in any language. Reports of significant improvements over the state of the art are coming in from around the world for multiple languages, including German, Polish, Hindi, Indonesian, Chinese, and Malay.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • Improving language model pretraining and fine-tuning.

  • Applying this new method to novel tasks and models (e.g., sequence labeling, natural language generation, entailment or question answering).

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • ULMFiT can more efficiently solve a wide range of NLP problems, including:

    • identifying spam, bots, offensive comments;

    • grouping articles by a specific feature;

    • classifying positive and negative reviews;

    • finding relevant documents, etc.

  • Potentially, this method can also help with sequence-tagging and natural language generation.

 

2. OBFUSCATED GRADIENTS GIVE A FALSE SENSE OF SECURITY: CIRCUMVENTING DEFENSES TO ADVERSARIAL EXAMPLES, BY ANISH ATHALYE, NICHOLAS CARLINI, DAVID WAGNER (2018)

 

ORIGINAL ABSTRACT

We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.

OUR SUMMARY

The researchers found that defenses against adversarial examples commonly use obfuscated gradients, which create a false sense of security because they can be easily circumvented. The study describes three ways in which defenses obfuscate gradients and shows which techniques can circumvent the defenses. The findings can help organizations that use defenses relying on obfuscated gradients to fortify their current methods.


WHAT’S THE CORE IDEA OF THIS PAPER?

  • There are three common ways in which defenses obfuscate gradients, and the authors develop an attack technique to overcome each (one is sketched after the lists below):

    • shattered gradients are nonexistent or incorrect gradients caused by the defense either intentionally (through non-differentiable operations) or unintentionally (through numerical instability);

    • stochastic gradients are caused by randomized defenses;

    • vanishing/exploding gradients are caused by extremely deep neural network evaluation.

  • There are a number of clues that something is wrong with the gradient, including:

    • one-step attacks performing better than iterative attacks;

    • black-box attacks working better than white-box attacks;

    • unbounded attacks not reaching 100% success;

    • random sampling finding adversarial examples;

    • increasing distortion bound not leading to increased success.
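
As an illustration, here is a sketch of one of the paper’s attack techniques, BPDA (Backward Pass Differentiable Approximation), which targets shattered gradients: the non-differentiable defense g(x) is applied on the forward pass, while the backward pass approximates its gradient with the identity, since g(x) ≈ x for input-purification defenses. The `defense_fn` argument is a placeholder for any such defense.

```python
import torch

class BPDAIdentity(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, defense_fn):
        # Apply the (possibly non-differentiable) defense on the forward pass.
        return defense_fn(x.detach())

    @staticmethod
    def backward(ctx, grad_output):
        # Pretend dg/dx = I on the backward pass.
        return grad_output, None

# Usage inside a standard iterative attack such as PGD:
#   logits = classifier(BPDAIdentity.apply(x_adv, defense_fn))
```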

 

WHAT’S THE KEY ACHIEVEMENT?

  • Demonstrating that most of the defense techniques used these days are vulnerable to attacks, namely:

    • 7 out of 9 defense techniques accepted at ICLR 2018 cause obfuscated gradients;

    • new attack techniques developed by researchers were able to successfully circumvent 6 defenses completely and 1 partially.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The paper won the Best Paper Award at ICML 2018, one of the key machine learning conferences.

  • The paper highlights the strengths and weaknesses of current technology.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • To construct defenses with careful and thorough evaluation so that they can defend against not only existing attacks but also future attacks that may be developed.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • By using the guidance provided in the research paper, organizations can identify if their defenses rely on obfuscated gradients and switch to more robust methods.

 

3. DEEP CONTEXTUALIZED WORD REPRESENTATIONS, BY MATTHEW E. PETERS, MARK NEUMANN, MOHIT IYYER, MATT GARDNER, CHRISTOPHER CLARK, KENTON LEE, LUKE ZETTLEMOYER (2018)

 

ORIGINAL ABSTRACT

We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.

OUR SUMMARY

The team from the Allen Institute for Artificial Intelligence introduces a new type of deep contextualized word representation – Embeddings from Language Models (ELMo). In ELMo-enhanced models, each word is vectorized on the basis of the entire context in which it is used. Adding ELMo to existing NLP systems results in 1) relative error reductions ranging from 6% to 20%, 2) a significantly lower number of epochs required to train the models, and 3) a significantly reduced amount of training data needed to reach baseline performance.

 

WHAT’S THE CORE IDEA OF THIS PAPER?

  • To generate word embeddings as a weighted sum of the internal states of a deep bidirectional language model (biLM) pre-trained on a large text corpus (a sketch of this weighted sum follows the list).

  • To include representations from all layers of a biLM as different layers represent different types of information.

  • To base ELMo representations on characters so that the network can use morphological clues to “understand” out-of-vocabulary tokens unseen in training.
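
A minimal sketch of this weighted sum in PyTorch, assuming the biLM layer activations have already been computed (the layer-mixing equation follows the paper; everything else is illustrative):

```python
import torch
import torch.nn as nn

class ScalarMix(nn.Module):
    """ELMo_k = gamma * sum_j softmax(s)_j * h_{k,j} (Eq. 1 in the paper)."""
    def __init__(self, num_layers):
        super().__init__()
        self.s = nn.Parameter(torch.zeros(num_layers))  # per-layer mixing weights
        self.gamma = nn.Parameter(torch.ones(1))        # task-specific scale

    def forward(self, layer_states):
        # layer_states: list of (batch, seq_len, dim) tensors, one per biLM layer
        weights = torch.softmax(self.s, dim=0)
        return self.gamma * sum(w * h for w, h in zip(weights, layer_states))
```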

 

WHAT’S THE KEY ACHIEVEMENT?

  • Adding ELMo to a model leads to new state-of-the-art results, with relative error reductions ranging from 6% to 20% across such NLP tasks as question answering, textual entailment, semantic role labeling, coreference resolution, named entity extraction, and sentiment analysis.

  • Enhancing the model with ELMo results in a significantly lower number of updates required to reach state-of-the-art performance. Thus, the Semantic Role Labeling (SRL) model with ELMo needs only 10 epochs to exceed the baseline maximum reached after 486 epochs of training.

  • Introducing ELMo to the model also significantly reduces the amount of training data needed to achieve the same level of performance. For example, for the SRL task, the ELMo-enhanced model needs only 1% of the training set to achieve the same performance as the baseline model with 10% of the training data.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The paper received an Outstanding Paper award at NAACL 2018, one of the most influential NLP conferences in the world.

  • The ELMo method introduced in the paper is considered one of the greatest breakthroughs of 2018 and a staple of NLP for years to come.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • Incorporating this method into specific tasks by concatenating ELMos with context-independent word embeddings.

  • Experimenting with concatenating ELMos with the output as well.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • ELMo significantly improves the performance of existing NLP systems and thus enhances:

    • chatbots that are better at understanding humans and answering questions;

    • classification of positive and negative customer reviews;

    • retrieval of relevant information and documents, etc.

 

4. AN EMPIRICAL EVALUATION OF GENERIC CONVOLUTIONAL AND RECURRENT NETWORKS FOR SEQUENCE MODELING, BY SHAOJIE BAI, J. ZICO KOLTER, VLADLEN KOLTUN (2018)

 

ORIGINAL ABSTRACT

For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at http://github.com/locuslab/TCN.

OUR SUMMARY

The authors of this paper question the common assumption that recurrent architectures should be a default starting point for sequence modeling tasks. Their results suggest that generic temporal convolutional networks (TCNs) convincingly outperform canonical recurrent architectures such as long short-term memory networks (LSTMs) and gated recurrent unit networks (GRUs) across a broad range of sequence modeling tasks.

 

WHAT’S THE CORE IDEA OF THIS PAPER?

  • Temporal convolutional networks (TCNs), designed using recently introduced best practices such as dilated convolutions and residual connections, significantly outperform generic recurrent architectures across a comprehensive suite of sequence modeling tasks.

  • TCNs exhibit substantially longer memory than recurrent architectures and are thus more suitable for tasks where a long history is required (a sketch of the basic TCN building block follows this list).
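
A simplified sketch of the paper’s residual block in PyTorch (weight normalization and dropout from the original are omitted for brevity); stacking such blocks with dilations 1, 2, 4, … grows the receptive field exponentially:

```python
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Conv1d):
    """1D convolution padded only on the left, so outputs never see the future."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__(in_ch, out_ch, kernel_size, dilation=dilation)
        self.left_pad = (kernel_size - 1) * dilation

    def forward(self, x):
        return super().forward(F.pad(x, (self.left_pad, 0)))

class TCNBlock(nn.Module):
    """Two dilated causal convolutions plus a residual connection."""
    def __init__(self, in_ch, out_ch, kernel_size=3, dilation=1):
        super().__init__()
        self.conv = nn.Sequential(
            CausalConv1d(in_ch, out_ch, kernel_size, dilation), nn.ReLU(),
            CausalConv1d(out_ch, out_ch, kernel_size, dilation), nn.ReLU(),
        )
        self.residual = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):  # x: (batch, channels, time)
        return F.relu(self.conv(x) + self.residual(x))
```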

 

WHAT’S THE KEY ACHIEVEMENT?

  • Providing an extensive systematic comparison of convolutional and recurrent architectures on sequence modeling tasks.

  • Designing a convolutional architecture that can serve as a convenient and still powerful starting point for sequence modeling tasks.

 

WHAT DOES THE AI COMMUNITY THINK?

  • “Always start with a CNN before reaching for an RNN. You’ll be surprised with how far you can get.” – Andrej Karpathy, Director of AI at Tesla.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • Further architectural and algorithmic elaborations are needed to advance TCNs’ performance across different sequence modeling tasks.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • Introduction of TCNs can improve performance of AI systems relying on recurrent architectures for sequence modeling. This includes, among others, such tasks as:

    • machine translation;

    • speech recognition;

    • music and voice generation.

 

5. DELAYED IMPACT OF FAIR MACHINE LEARNING, BY LYDIA T. LIU, SARAH DEAN, ESTHER ROLF, MAX SIMCHOWITZ, MORITZ HARDT (2018)

 

ORIGINAL ABSTRACT

Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect.

We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not. We completely characterize the delayed impact of three standard criteria, contrasting the regimes in which these exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably.

Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, suggesting a range of new challenges and trade-offs.

OUR SUMMARY

The goal is to ensure fair treatment across different demographic groups when using a score-based machine learning algorithm to decide who gets an opportunity (e.g., loan, scholarship, job) and who does not. Researchers from Berkeley’s Artificial Intelligence Research lab show that using common fairness criteria may in fact harm underrepresented or disadvantaged groups due to certain delayed outcomes. Thus, they encourage looking at the long-term outcomes when designing a “fair” machine-learning system.

 

WHAT’S THE CORE IDEA OF THIS PAPER?

  • Considering the delayed outcomes of imposing fairness criteria reveals that these criteria may have an adverse impact on the long-term well-being of the groups they aim to protect (e.g., worsening the credit score of a borrower who defaults on a loan that would not have been granted under the unconstrained policy).

  • Since fairness criteria may actively harm disadvantaged groups, a better solution can be a decision rule that explicitly maximizes the outcomes for those groups, i.e., an outcome model (a toy illustration follows this list).
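
A toy numeric illustration of the one-step feedback model (the score dynamics and numbers here are made up; the paper’s analysis uses real FICO data): accepted individuals’ scores rise on repayment and fall on default, so a lenient threshold chosen, say, to equalize acceptance rates can produce a negative mean score change (active harm), while a stricter profit-maximizing threshold does not.

```python
import numpy as np

def mean_score_change(scores, repay_prob, threshold, gain=10.0, loss=-30.0):
    """Mean score change for a group under a given acceptance threshold."""
    accepted = scores >= threshold
    delta = np.where(accepted, repay_prob * gain + (1 - repay_prob) * loss, 0.0)
    return delta.mean()

scores = np.linspace(300, 850, 1000)
repay_prob = (scores - 300) / 550       # higher score -> higher repayment probability
print(mean_score_change(scores, repay_prob, threshold=400))  # lenient: negative (harm)
print(mean_score_change(scores, repay_prob, threshold=700))  # strict: positive
```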

 

WHAT’S THE KEY ACHIEVEMENT?

  • Showing that such fairness criteria as demographic parity and equal opportunity fairness can lead to any possible outcomes for the disadvantaged group, including improvement, stagnation, and decline, while following the institution’s optimal unconstrained selection policy (e.g., profit maximization) will never lead to decline (active harm) for the disadvantaged group.

  • Supporting theoretical predictions with experiments on FICO credit score data.

  • Considering alternatives to hard fairness constraints.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The paper won the Best Paper Award at ICML 2018, one of the key machine learning conferences.

  • The study reveals that positive discrimination can sometimes backfire.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • Considering the other characteristics of impact beyond the change in population mean (e.g., variance, individual-level outcomes).

  • Researching the robustness of outcome optimization to modeling and measurement errors.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • By switching from constraints imposed by fairness criteria to outcome modeling, companies might develop ML systems for lending or recruiting that will be more profitable and yet “fairer”.

 

6. WORLD MODELS, BY DAVID HA AND JÜRGEN SCHMIDHUBER (2018)

 

ORIGINAL ABSTRACT

We explore building generative neural network models of popular reinforcement learning environments. Our world model can be trained quickly in an unsupervised manner to learn a compressed spatial and temporal representation of the environment. By using features extracted from the world model as inputs to an agent, we can train a very compact and simple policy that can solve the required task. We can even train our agent entirely inside of its own hallucinated dream generated by its world model, and transfer this policy back into the actual environment.

An interactive version of this paper is available at https://worldmodels.github.io.

OUR SUMMARY

Ha and Schmidhuber develop a world model that can be quickly trained in an unsupervised manner to learn spatial and temporal representations of the environment. The agent succeeded in navigating the race track in the Car Racing task and in avoiding the fireballs shot by monsters in the VizDoom experiment – tasks that were too challenging for previous methods.

 

WHAT’S THE CORE IDEA OF THIS PAPER?

  • The solution consists of three distinct parts (sketched in code after this list):

    • A variational autoencoder (VAE) responsible for capturing visual information. It condenses an RGB input image into a 32-dimensional latent vector that follows a Gaussian distribution, letting the agent work with a much smaller representation of the environment and thus learn more efficiently.

    • A recurrent neural network (RNN) responsible for forward thinking. This memory component tries to predict what the next picture captured by the visual component might look like, given the previous picture and the previous action.

    • A controller responsible for choosing an action. This is a simple neural network that concatenates the output of the VAE with the hidden state of the RNN and selects a good action.
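
A schematic sketch of one control step through the three components, assuming they are already trained; `vae.encode` and `rnn.step` are placeholder interfaces, while the linear form of the controller follows the paper:

```python
import numpy as np

def act(obs, vae, rnn, W_c, b_c, h):
    z = vae.encode(obs)                              # V: frame -> 32-dim latent z
    a = np.tanh(W_c @ np.concatenate([z, h]) + b_c)  # C: linear policy on [z, h]
    h = rnn.step(z, a, h)                            # M: update memory, predict next z
    return a, h
```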

 

WHAT’S THE KEY ACHIEVEMENT?

  • This is the first known agent to solve the popular ‘Car Racing’ reinforcement learning environment.

  • The study demonstrates the possibility of training an agent to perform tasks entirely inside of its simulated latent space dream world.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The paper was widely discussed in the AI community as a beautiful work on using neural networks in reinforcement learning and training agents in their own “hallucinated” worlds.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • Enabling the agent to explore more complicated worlds by replacing the small RNN with higher capacity models or incorporating an external memory module.

  • Experimenting with more general approaches that allow for hierarchical planning, instead of the “time step by time step” approach presented here.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • When running computationally intensive game engines, it is now possible to train the agent as many times as needed inside its simulated environment instead of wasting heavy compute resources training an agent in the actual environment.

 

7. TASKONOMY: DISENTANGLING TASK TRANSFER LEARNING, BY AMIR R. ZAMIR, ALEXANDER SAX, WILLIAM SHEN, LEONIDAS J. GUIBAS, JITENDRA MALIK, AND SILVIO SAVARESE (2018)

 

ORIGINAL ABSTRACT

Do visual tasks have a relationship, or are they unrelated? For instance, could having surface normals simplify estimating the depth of an image? Intuition answers these questions positively, implying existence of a structure among visual tasks. Knowing this structure has notable values; it is the concept underlying transfer learning and provides a principled way for identifying redundancies across tasks, e.g., to seamlessly reuse supervision among related tasks or solve many tasks in one system without piling up the complexity.

We propose a fully computational approach for modeling the structure of space of visual tasks. This is done via finding (first and higher-order) transfer learning dependencies across a dictionary of twenty six 2D, 2.5D, 3D, and semantic tasks in a latent space. The product is a computational taxonomic map for task transfer learning. We study the consequences of this structure, e.g. nontrivial emerged relationships, and exploit them to reduce the demand for labeled data. For example, we show that the total number of labeled datapoints needed for solving a set of 10 tasks can be reduced by roughly 2/3 (compared to training independently) while keeping the performance nearly the same. We provide a set of tools for computing and probing this taxonomical structure including a solver that users can employ to devise efficient supervision policies for their use cases.

OUR SUMMARY

Assertions of the existence of a structure among visual tasks have been made by many researchers since the early years of modern computer science, and now Amir Zamir and his team make an attempt to actually find this structure. They model it using a fully computational approach and discover many useful relationships between different visual tasks, including nontrivial ones. They also show that by taking advantage of these interdependencies, it is possible to achieve the same model performance with labeled data requirements reduced by roughly two-thirds.

 


WHAT’S THE CORE IDEA OF THIS PAPER?

  • A model aware of the relationships among different visual tasks demands less supervision, uses less computation, and behaves in more predictable ways (a toy sketch of exploiting such relationships follows this list).

  • A fully computational approach to discovering the relationships between visual tasks is preferable because it avoids imposing prior, and possibly incorrect, assumptions: the priors are derived from either human intuition or analytical knowledge, while neural networks might operate on different principles.
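
As a toy stand-in for the paper’s approach (which estimates pairwise transfer affinities and then solves a Binary Integer Program to devise supervision policies), here is a greedy sketch of choosing source tasks under a labeling budget; the affinity matrix is assumed to be given:

```python
import numpy as np

def pick_sources(affinity, budget):
    """affinity[s, t]: transfer performance from source task s to target task t.
    Greedily choose `budget` source tasks; each target uses its best chosen source."""
    chosen = []
    for _ in range(budget):
        candidates = set(range(affinity.shape[0])) - set(chosen)
        best = max(candidates,
                   key=lambda s: np.max(affinity[chosen + [s]], axis=0).sum())
        chosen.append(best)
    return chosen
```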

 

WHAT’S THE KEY ACHIEVEMENT?

  • Identifying relationships between 26 common visual tasks, such as object recognition, depth estimation, edge detection, and pose estimation. 

  • Showing how this structure helps in discovering types of transfer learning that will be most effective for each visual task.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The paper won the Best Paper Award at CVPR 2018, the key conference on computer vision and pattern recognition.

  • The results are very important since for most real-world tasks large-scale labeled datasets are not available.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • To move from a model where common visual tasks are entirely defined by humans toward an approach where human-defined visual tasks are viewed as observed samples composed of computationally found latent subtasks.

  • Exploring the possibility of transferring these findings to tasks that are not entirely visual, e.g., robotic manipulation.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • Relationships discovered in this paper can be used to build more effective visual systems that will require less labeled data and lower computational costs.

 

8. KNOW WHAT YOU DON’T KNOW: UNANSWERABLE QUESTIONS FOR SQUAD, BY PRANAV RAJPURKAR, ROBIN JIA, AND PERCY LIANG (2018)

 

ORIGINAL ABSTRACT

Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuAD 2.0, the latest version of the Stanford Question Answering Dataset (SQuAD). SQuAD 2.0 combines existing SQuAD data with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD 2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuAD 2.0 is a challenging natural language understanding task for existing models: a strong neural system that gets 86% F1 on SQuAD 1.1 achieves only 66% F1 on SQuAD 2.0.

OUR SUMMARY

A Stanford University research group extends the famous Stanford Question Answering Dataset (SQuAD) with over 50,000 unanswerable questions. The answers to these questions cannot be found in the supporting paragraphs, yet the questions look very similar to answerable ones. What’s more, the supporting paragraphs contain plausible (but incorrect) answers to these questions. This makes the new SQuAD 2.0 extremely challenging for existing state-of-the-art models: a strong neural system that achieves 86% F1 on the previous version of SQuAD gets only 66% F1 after the unanswerable questions are introduced.

 

WHAT’S THE CORE IDEA OF THIS PAPER?

  • Current Natural Language Understanding (NLU) systems are far from true language understanding, and one of the root causes is that existing Q&A datasets focus on questions for which a correct answer is guaranteed to exist in the context document, so models never learn when to abstain (a minimal abstention sketch follows the list below).

  • To be really challenging, unanswerable questions should be created so that:

    • they are relevant to the supporting paragraph;

    • the paragraph contains a plausible answer, which provides information of the same type as the question asks for but is nevertheless incorrect.
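
A minimal sketch of what abstention can look like at prediction time, assuming a reading-comprehension model that scores both its best answer span and a “no answer” option (this thresholding scheme is common practice, not a detail prescribed by the dataset paper):

```python
def predict_or_abstain(best_span_text, span_score, null_score, tau=0.0):
    """Return the empty string (abstain) when 'no answer' beats the span by margin tau."""
    return "" if null_score - span_score > tau else best_span_text
```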

 

WHAT’S THE KEY ACHIEVEMENT?

  • Extending SQuAD with 53,777 new, unanswerable questions, and thus building a challenging, large-scale dataset that forces the NLU systems to understand when a question cannot be answered given the context.

  • Creating a new challenge for NLU systems by showing that existing models (66% F1) are closer to a baseline that always abstains (48.9 F1) than to human performance (89.5 F1).

  • Showing that plausible answers do indeed act as effective distractors for NLU systems.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The paper won the Best Short Paper Award at ACL 2018, the annual meeting of the Association for Computational Linguistics.

  • The new dataset adds complexity to the NLU field and should spur meaningful progress in this research area.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • Development of new models that “know what they don’t know,” and thus get a better understanding of natural language.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • Training reading comprehension models on this new dataset should improve their performance in the real-world scenarios where the answers are often not directly available.

 

9. LARGE SCALE GAN TRAINING FOR HIGH FIDELITY NATURAL IMAGE SYNTHESIS, BY ANDREW BROCK, JEFF DONAHUE, AND KAREN SIMONYAN (2018)

 

ORIGINAL ABSTRACT

Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple “truncation trick”, allowing fine control over the trade-off between sample fidelity and variety by truncating the latent space. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128×128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.3 and Frechet Inception Distance (FID) of 9.6, improving over the previous best IS of 52.52 and FID of 18.65.

OUR SUMMARY

A DeepMind team finds that current techniques are sufficient for synthesizing high-resolution, diverse images from available datasets such as ImageNet and JFT-300M. In particular, they show that Generative Adversarial Networks (GANs) can generate images that look very realistic if they are trained at a very large scale, i.e., using two to four times as many parameters and eight times the batch size compared to prior experiments. These large-scale GANs, or BigGANs, are the new state of the art in class-conditional image synthesis.

 


WHAT’S THE CORE IDEA OF THIS PAPER?

  • GANs perform much better with increased batch size and number of parameters.

  • Applying orthogonal regularization to the generator makes the model amenable to a simple “truncation trick”, which provides control over the trade-off between sample fidelity and variety (sketched after this list).
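
A sketch of the truncation trick itself: sample the latent vector from a standard normal and resample every component whose magnitude exceeds a threshold; lower thresholds trade variety for fidelity.

```python
import numpy as np

def truncated_normal(batch, dim, threshold=0.5, rng=None):
    """Sample z ~ N(0, I), resampling components with |z| > threshold."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal((batch, dim))
    out_of_range = np.abs(z) > threshold
    while out_of_range.any():
        z[out_of_range] = rng.standard_normal(out_of_range.sum())
        out_of_range = np.abs(z) > threshold
    return z
```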

 

WHAT’S THE KEY ACHIEVEMENT?

  • Demonstrating that GANs can benefit significantly from scaling.

  • Building models that allow explicit, fine-grained control of the trade-off between sample variety and fidelity.

  • Discovering instabilities of large-scale GANs and characterizing them empirically.

  • BigGANs trained on ImageNet at 128×128 resolution achieve:

    • an Inception Score (IS) of 166.3, compared with the previous best IS of 52.52;

    • a Frechet Inception Distance (FID) of 9.6, compared with the previous best FID of 18.65.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The paper is under review for ICLR 2019.

  • Since BigGAN generators became available on TF Hub, AI researchers from all over the world have been playing with BigGANs to generate images of dogs, watches, bikinis, the Mona Lisa, seashores, and many other subjects.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • Moving to larger datasets to mitigate GAN stability issues.

  • Exploring ways to reduce the number of weird samples generated by GANs.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • Replacing expensive manual media creation for advertising and e-commerce purposes.

 

10. BERT: PRE-TRAINING OF DEEP BIDIRECTIONAL TRANSFORMERS FOR LANGUAGE UNDERSTANDING, BY JACOB DEVLIN, MING-WEI CHANG, KENTON LEE, AND KRISTINA TOUTANOVA (2018)

 

ORIGINAL ABSTRACT

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications.

BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7 (5.6% absolute improvement) and the SQuAD v1.1 question answering Test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%.

OUR SUMMARY

The Google AI team presents a new cutting-edge model for Natural Language Processing (NLP) – BERT, or Bidirectional Encoder Representations from Transformers. Its design allows the model to consider the context from both the left and the right sides of each word. While conceptually simple, BERT obtains new state-of-the-art results on eleven NLP tasks, including question answering, named entity recognition, and other tasks related to general language understanding.

 

WHAT’S THE CORE IDEA OF THIS PAPER?

  • Training a deep bidirectional model by randomly masking a percentage of input tokens, thus avoiding cycles where words can indirectly “see themselves” (both pre-training tasks are sketched after this list).

  • Also pre-training a sentence relationship model by building a simple binary classification task to predict whether sentence B immediately follows sentence A, thus allowing BERT to better understand relationships between sentences.

  • Training a very big model (24 Transformer blocks, 1024 hidden units, 340M parameters) with lots of data (a 3.3-billion-word corpus).
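
A simplified sketch of how training examples for the two objectives can be constructed (the 15% masking rate and the 80/10/10 replacement split follow the paper; tokenization details are omitted):

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15):
    """Masked LM: pick ~15% of tokens as targets and corrupt the input copy."""
    inputs, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if random.random() < mask_prob:
            labels[i] = tok                       # the model must recover this token
            r = random.random()
            if r < 0.8:
                inputs[i] = "[MASK]"              # 80%: replace with [MASK]
            elif r < 0.9:
                inputs[i] = random.choice(vocab)  # 10%: replace with a random token
            # remaining 10%: keep the original token
    return inputs, labels

def next_sentence_example(sent_a, next_sent, corpus):
    """Next-sentence prediction: 50% true next sentence, 50% random sentence."""
    if random.random() < 0.5:
        return (sent_a, next_sent), True
    return (sent_a, random.choice(corpus)), False
```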

 

WHAT’S THE KEY ACHIEVEMENT?

  • Advancing the state-of-the-art for 11 NLP tasks, including:

    • achieving a GLUE score of 80.4%, a 7.6% absolute improvement over the previous best result;

    • achieving 93.2 F1 on SQuAD v1.1, outperforming human performance by 2%.

  • Suggesting a pre-trained model that doesn’t require any substantial architecture modifications to be applied to specific NLP tasks.

 

WHAT DOES THE AI COMMUNITY THINK?

  • The BERT model marks a new era of NLP.

  • In a nutshell, two unsupervised tasks together (“fill in the blank” and “does sentence B come after sentence A?”) provide great results for many NLP tasks.

  • Pre-training of language models becomes a new standard.

 

WHAT ARE FUTURE RESEARCH AREAS?

  • Testing the method on a wider range of tasks.

  • Investigating the linguistic phenomena that may or may not be captured by BERT.

 

WHAT ARE POSSIBLE BUSINESS APPLICATIONS?

  • BERT may assist businesses with a wide range of NLP problems, including:

    • chatbots for better customer experience;

    • analysis of customer reviews;

    • search for relevant information, etc.

  

 

 
