ylzhj02

How does LinkedIn's recommendation system work?

http://www.quora.com/How-does-LinkedIns-recommendation-system-work

I gave this talk earlier this week at Hadoop World(http://www.hadoopworld.com/sessi...), a conference that is evangelizing Hadoop by way of highlighting how people across the industry are solving big business challenges by leveraging Hadoop. I am posting here the slides with an approximate transcript of my talk.

Ever since I studied Machine Learning and Data Mining at Stanford 3 years ago., I have been enamored by the idea that it is now possible to write programs that can sift through TBS of data to recommend useful things.

So here I am with my colleague Adil Aijaz, for a talk on some of the lessons we learnt and challenges we faced in building large-scale recommender system

At LinkedIn we believe in building platform, not verticals. Our talk is divided into 2 parts. In the first part of this talk, I will talk about our motivation for building the recommendation platform, followed by a discussion on how we do recommendations. No analytics platform is complete without Hadoop. So, in he next part of our talk, Adil will talk about leveraging Hadoop for scaling our products.

‘Think Platform, Leverage Hadoop’ is our core message.

Throughout our talk, we will provide examples that highlight how these two ideas have helped us ‘scale innovation’.

With north of 135 million members, we’re making great strides toward our mission of connecting the world’s professionals to make them more productive and successful. For us this not only means helping people to find their dream jobs, but also enabling them to be great at the jobs they’re already in.

With terabytes of data flowing through our systems, generated from member’s profile, their connections and their activity on LinkedIn, we have amassed rich and structured data of one of the most influential, affluent and highly-educated audience on the web.

This huge semi-structured data is getting updated in real-time and growing at a tremendous pace, we are all very excited about the data opportunity at LinkedIn.

For an average user, there is so much data, there is no way users can leverage all the data on their own.

We need to put the right information, to the right user at the right time.

With such rich data of members, jobs, groups, news, companies, schools, discussions and events. We do all kinds of recommendations in a relevant and engaging way.

We have products like
‘Job Recommendation’: here using profile data we suggest top top jobs that our member might be interested in. The idea is to let our users be aware of the possibilities out there for them.

‘Talent Match’: When recruiters post jobs, we in real-time suggest top candidates for the job.

‘News Recommendation’: Using articles shared per industry, we suggest top news that our users need to keep them updated with the latest happenings.

‘Companies You May Want to Follow’: Using a combination of content matching and collaborative filtering, we recommend companies a user might be interested in keeping up-to date with.

‘PYMK’: based on common attributes like connections, schools, companies and some activity based features, we suggest people that you may know outside of LinkedIn and may want to connect with at LinkedIn.

‘Similar Profiles’: Finally our latest offering which we released a few months ago for recruiters, Similar Profiles. Given a candidate profile, we suggest top Similar candidates for hiring based on overall background, experience and skills.

We have recommendation solutions for everyone, for individuals, recruiters and advertisers

In our view, recommendations are ubiquitous and they permeate the whole site.

Before we discuss our motivation for building a recommendation platform or a discussion on how we do recommendation or how we leverage Hadoop. Let’s first answer a basic question: Are recommendations really important? To put things in perspective, 50% of total job applications and job views by members are a direct result of recommendations. Interestingly, in the past year and half it has risen from 6% to 50%.

This kind of contributio n is observed across all our recommendation products and is growing by the day.

Let us start with an example of the kind of data we have.
For a member, we have positions, education, Summary, Specialty, Experience and skills of the user from the profile itself. Then, from member’s activity we have data about members connections, the groups that the member has joined, the company that the member follows amongst others.

Before we can start leveraging data for recommendation, we first need to clean and canonicalize it.

In order to accurately match members to jobs, we need to understand that all the ways of listing these titles refer to the same entity ‘Software Engineer’
‘Technical Yahoo’ is a Software Engineer at Yahoo
‘Member Technical Staff’ is a Software Engineer at Oracle
‘Software Development Engineer’ is a Software Engineer at Microsoft
‘SDE’ is a Software Engineer at Amazon

Solving this problem is itself a research topic broadly referred to as ‘Entity Resolution’.

As another example, How many variations do you think, we have for the company name ‘IBM’?

When I joined LinkedIn, I was surprised to find that we had close to 8000+ user entered variations of the same company IBM.

We apply machine learnt classifiers for the entity resolution using a host of features for company standardization.

In summary, data canonicalization is the key to accurate matching and is one of the most challenging aspects of our recommendation platform.

Now, we will discuss our motivation behind building a common platform by way of 3 key example trade-offs that we’ve encountered. In LinkedIn ecosystem, one t rade- off t hat we encou ntered is that of real-time Vs time independent recommendations.

Lets look at ‘News Recommendation’, which finds relevant news for our users. Relevant news today might be old news tomorrow. Hence, News recommendation has to have a strong real-time component.

On the other hand, we have another product called ‘Similar Profiles’. The motivation here is that if a hiring manager, already know the kind of person that he wants to hire. He could be like person in his team already or like one of his connections on LinkedIn, then using that as the source profile, we suggest top Similar Profiles for hiring. Since, ‘people don’t reinvent themselves everyday’. ‘People similar today are most likely similar today’. So, we can potentially do this computation completely offline with a more sophisticated model.

These are the 2 extreme cases in terms of ‘freshness’ : Most examples fall into an intermediate category. For E.g. new jobs gets posted by the hour and they expire when they get filled up. All jobs posted today don’t expire the same day. Hence, we cache job recommendation for members for some time as it is OK to not recommend the absolute latest jobs instantly to all members.
In solving the completely real-time Vs completely offline problem, we could have gone down the route of creating separate solutions optimized for the use-case. In the short run, that would have been a quicker solution.

But we went down the platform route because we realized that we would churn out more and more such verticals as LinkedIn grows. Now, as a result of which we have the same code that computes recommendations online as well as offline. Moreover, in the production system caching and an expiry policy allows us to keep recommendations fresh irrespective of how we compute the recommendations. Now, as a result for newer verticals. We can easily get ‘freshness’ of recommendations irrespective of whether we compute recommendations online or offline.

Another interesting trade-off, is choosing between content analysis and collaborative filtering.

Historically speaking: ‘job posting has been a post-a nd-prey’ model. Where job posters post a job and pray that hopefully someone will apply for the job. But we at LinkedIn, believe in ‘post-and-produce’ so we go ahead and produce matches to the job poster in real-time right after the job gets posted. When someone posts a job, the job poster naturally expects the candidates to have a strong match between job and the profile. Hence, this type of recommendation is heavy on content analysis.

On the other hand, we have a product called ‘Viewers of the profile also viewed ..’. When a member views more than 1 profile within a single session. We record it as a co-view. Then on aggregating these co-views for every member. We get the data for all profiles that get co-viewed, when someone visits any given profile. This is a classical collaborative filtering based recommendation much like Amaozon’s ‘people who viewed this item also viewed’

Most other recommendations are hybrid. For e.g. for ‘SimilarJobs’ jobs that have high content overlap with each other are similar. Interestingly, jobs that get applied to or viewed by the same members are also Similar. So, Similar Jobs is a nice mix of content and collaborative filtering.

Again, Because of a platform approach, we can re-use the content matching and the collaborative filtering components to come up with newer verticals without re-inventing the wheel.

Finally, the last key trade-off is of precision vs recall.

On our homepage, we suggest jobs that are good fit for our users with the motivation for them to be more aware of the possibilities out there. In some sense, we are pushing recommendations to you , as opposed to you actively looking for them.

Even if a single job recommendation looks bad to the user either because of a lower seniority of the job or because the recommendation is for the company that the user is not fond of, our users might feel less than pleased.

Here, getting the absolute best 3 jobs even at the cost of aggressively filtering out a lot of jobs is acceptable.

On the other hand, We have another recommendation product called ‘Similar Profiles’ for hiring managers who are actively looking for candidates. Here, if one finds a candidate we suggest other candidates like the original one in terms of overall experience, specialty, education background and a host of other features.

Since, the hiring manager is actively looking so in this case they are more open to getting a few bad ones as long as they get a lot of good ones too. So in essence, recall is more important here.

Again, because of a platform approach and because we can re-use features, filters and code-base across verticals. So, tuning the knob of more precision vs more recall is mostly a matter of figuring out 1. how complicated the matching model should be and 2. how aggressively we want to apply filters for all recommendation verticals. Hence, our core message ‘Think Platform’. Now, we will discuss in some detail how our recommendations work.

L ets see how we do recommendations by taking an example of ‘Similar Profiles’ that we just discussed. Given a member profile, the goal is to find other similar people for hiring. Lets try to find profiles similar to me.
Here, we look at host of different features, such as
1. User provided features like ‘Title, Specialty, Education, experience amongst others’
2. Complex derived features like ‘ Seniority ’ and ‘Skills’, computed using machine learnt classifiers.
3. Both these kinds of features help in precision, we also have features like ‘Related Titles’ and ‘Related Companies’ that help increase the recall.

Intuitively, one might imagine that we use the following pair of features to compute Similar Profiles. In the next slide, we will discuss a more principled approach to figuring out pair of features to match against. Here in order to compute overall of similarity between me and Adil, we are first computing similarity between our specialties, our skills, our titles and other attribute.

With this we get a ‘Similarity score vector’ for how Similar Adil is to me, Similarly we can get such a vector for other profiles.

Now we somehow need to combine the similarity score in the vector to a single number s.t. profiles that have higher similarity score across more dimensions get ranked higher, as similar profiles for me.
Moreover, the fact that our skills match might matter more for hiring than whether our education matches. Hence, there should be relative importance of one feature over the others.

Once we g et the topK recommendations, we also apply application specific filtering with the goal of leveraging domain knowledge

For example, it could be the case that for a ‘Data Engineer role’ you as a hiring manager are looking for a candidate like one of your team member but who is local. Whereas, for all you know the ideal ‘Data Engineer most similar to the one you are looking for in terms of skills might be working somewhere in INDIA’ .

To ensure our recommendation quality keeps improving as more and more people use our products, we use explicit and implicit user feedback combined with crowd-sourcing, to construct high quality training and test sets for constructing the ‘Importance weight vector’. Moreover, classifier with L1 regularization helps prune out the weakly correlated features. We use this for figuring out which features to match profiles against.
We just discussed an example. However, the same concepts apply to all the recommendation verticals.

And now the technologies that drives it all.
The core our matching algorithm uses Lucene with our custom query implementation.

We use Hadoop to scale our platform. It serves a variety of needs from computing Collaborative filtering features, building Lucene indices offline, doing quality analysis of recomme ndation and host of other exciting things that Adil will talk about in a bit.

Lucene does not provide fast real-time indexing . To keep our indices up-to date, we use a real-time indexing library on top of Lucene called Zoie .

We provide facets to our members for drilling down and exploring recommendation results. T his is made possible by a Faceting Search library called Bobo.

For storing features and for caching recommendation results, we use a key-value store Voldemort.
For analyzing tracking and reporting data, we use a distributed messaging system called Kafka.

Out of these Bobo, Zoie, Voldemort and Kafka are developed at LinkedIn and are open sourced. In fact, Kafka is an apache incubator project.

Historically, we have used R for model training. We have recently started experimenting with Mahout for model training and are excited about it.

All the above technologies, combined with great engineers powers LinkedIn’s Recommendation platform.

Now Adil will talk about how we leverage Hadoop.

In the second half of our talk we will present case studies on how Hadoop has helped us scale innovation for our Recommendation Platform. We will use the ‘Similar Profiles’ vertical which was discussed earlier as the example for each case study. As a quick reminder, similar profiles recommends profiles that are similar to the one a user is interested in. Some of its biggest customers are hiring managers and recruiters. For each of the case studies, we will lay out the solutions we tried before turning to Hadoop, an alyze the pros and cons of the approaches before and after Hadoop, and finally derive some lessons that are applicable to folks working on large scale recommendation systems.

When it comes to re commendations, relevance is the most important consideration. However, with over 120M members, and billions of recommendation computations, the latency of our recommendations becomes equally important. No mater how great our recommendations are, they wont be of utility to our members if we take too long to return recommendations. Among our many products, ‘Similar Profiles’ is a particularly challenging product to speed up. Our plain vanilla solution involved using a large number of features to mine the entire member index for the best matches. The latency of this solution was in the order of seconds. Clearly, no matter how relevant our recommendations were, with that type of latency, our members would not even wait for the results. So, something had to be done. We needed a solution that could pre-filter most of the irrelevant results while maintaining a high precision on the documents that survived the filter. One technique that meets these conditions is minhashing. On a very high level, minhashing involves running each document, in our case member profiles, through k hash functions to to construct a bit vector. One can play with ANDING/Oring of subsets of the bit vector, to get the right balance between recall and precision. As our second pass solution, we minhashed each document and stored the resulting bit array in the member index. At query time, we minhashed the query into a bit array, filtered out documents that did not have the exact same subsets of the bit array, and finally did advanced matching on documents that survived the filtration. This solution brought down the latency well below a second, however, minhashing did not give us the recall we had hoped for. This was a really disappointing result since we had spent significant engineering resources in productionalizing minhashing, yet it was all for nought.

So, we went back to the drawing board and started thinking about how we can use Hadoop to solve this problem. The key breakthrough was when we realized that people do not reinvent themselves everyday. The folks I was similar to yesterday, are likely to be the same folks I am similar to today. This meant that we could serve ‘similar profiles’ recommendations from cache. When the cache would expire, we could compute fresh recommendations online and repopulate the cache. This meant that the user would almost always be served from cache. Great: but we still have to populate the cache somehow. This is where Hadoop comes into the picture. By opening an index shard per mapper, we can generate a portion of the recommendations in each mapper, and combine them into a final recommendation set in the reducers. With the distributed computation of Hadoop, we easily generate similar profiles for each member, and then copy over the results to online caches. So the three elements of:
a) Offline batch computations on Hadoop copied to online caches.
b) Aggressive online caching policy
c) Online computation when the cache e xpires

Have scaled our similar profiles recommendations while maintaining a high precision of recommendations.

So, the key takeaway from this case study of scaling is that if one is facing the problem of:
a) High latency computation
b )High qps
c) And not so stringent freshness requirements

Then one should leverage Hadoop and caching to scale the recommendation computation.

With our scaling problems solved, we rolled out Similar Profiles to our members. The reception was amazing. However, we felt that we could do even better by going beyond content based features alone. One such feature that we wanted to experiment w ith was collaborative filtering. More specifically, if a member browses multiple member profiles within a single session, aka coviews, it is quite likely that those member profiles are very similar to each other. How we blend collaborative filtering with existing content based recommendations is the subject of our second case study: “blending multiple recommendation algorithms”.

Our basic blending solution is this: While constructing the query for content based similar profiles, we fetch collaborative filtering recommendations and their scores, and attach them to the query. In the scoring of content based recommendations, we can use the collaborative filtering score as a boost. An alternative approach is a bag of models approach with content and collaborative filtering serving as two of the models in the bag.
In either solution, we need a way to keep collaborative filtering results fresh. If two member profiles were coviewed yesterday, we should be able to use that knowledge today.

W e first sketched out a completely online solution. This online solution involved keeping track of the state of each member session, accumulating all the profile views within that session. At the end of each session, we updated the counts of the various coview pairs. As you can appreciate, such a stateful solution can get very complicated very quickly. As an example, we have to worry about machine failures, multi data center coordination just to name two challenges. In essence such a solution could introduce more problems than it solves. So, we scratched this solution even before implementing it.
We thought more about this problem and realized two important aspects: 1) Coview counts can be updated in batch mode. 2) We can tolerate delay in updating our collaborative filtering results.

These two properties, batch computation and tolerance for delay in impacting the online world, led us to leverage Hadoop to solve this problem. Our production servers can produce tracking events everytime a member profile is viewed. These tracking events are copied over to hdfs where everyday, we use these tracking events to batch compute a fresh set of collaborative filtering recommendations. These recommendations are then copied to online key value stores where we use the blending approaches outlined earlier to blend collaborative filtering and content based recommendations.

Compared to the purely online solution, the Hadoop solution is simpler in complexity, less error prone, but introduces a lag between the time two profiles are coviewed and the time that coview has an impact on similar profiles. For us, this solution works great. The other great thing about this solution is that it can be easily extended to blend social or globally popular recommendations in addition to collaborative filtering.

The lesson we derive from this case study is that by leveraging Hadoop, we were able to experiment with collaborative filtering in similar profiles without significant investment in an online system to keep collaborative filtering results fresh. Once our proof of concept was successful, we could always go back and see if reducing the lag between a profile coview and its impact on similar profile by building an online system would be useful. If it is, we could invest in a non-Hadoop system. However, by leveraging Hadoop, we were able to defer that decision till the point when we had data to backup our assumptions.

A consistent feedback from hiring managers using Similar Profiles was that while the recommendations were highly relevant, often times the recommended members were not ready to take the next steps in their professional career. Such recommended members would thus respond negatively to a contact from the hiring manager, lead ing to a bad experience for the hiring manager. This feedback indicated a strong preference from our users that they would like a tradeoff between relevance of recommendations and the responsiveness of those recommendations. One can imagine a similar scenario playing out for a sales person looking for recommendations for potential clients.

As our next case study, let’s take a look at how we approached solving this problem for our users. Let’s say we come up with an algorithm that assigns each LinkedIn member a ‘JobSeeker’ score which indicates how open is she to taking the next step in her career. As we said already, this feature would be very useful for ‘Similar Profiles’. However, the utility of this feature would be directly related to how many members have this score: aka coverage. The key challenge we faced was that since ‘Similar Profiles’ was already in production, we had to add this new feature while continuing to serve recommendations. We call this problem “grandfathering”.

A naïve solution , could be to assign a ‘jobseeker’ score to a member next time she updates her profile. This approach will have minimal impact on the system serving traffic, however, we will not have all members tagged with such a score for a very long time, which impacts the utility of this feature for ‘Similar Profiles’.

So, we scratch the naïve solution and look for a solution that will batch update all members with this score in all data centers while serving traffic.

A second pass solution is to run a ‘batch’ feature extraction pipeline in parallel to the production feature extraction pipeline. This batch pipeline will query the db for all members and add a ‘job seeker score’ to every member. This solution ensures that we have an upper bound on the time it takes to grandfather all members with job seeker score. It will work great for small startups whose member base is in a few million range.

However, the downside of this solution at LinkedIn scale is:
1) It adds load on the production databases serving live traffic.
2) To avoid the load, we end up throttling the batch solution which in turn makes the batch pipeline run for days or weeks. This slows down rate of batch update.
3) Lastly, the two factors above combine to make grandfathering a ‘dreaded word’. You only end up grandfathering ‘once a quarter’ which is clearly not helpful in innovating faster.

So, we clearly cannot use that solution either. However, one good aspect of this solutio n is the batch update which leads us into a Hadoop based solution.

Using Hadoop, we take a snapshot of member profiles in production, move it to hdfs, grandfather members with a ‘job seeker score’ in a matter of hours, and copy this data back online. This way we can grandfather all members with a job seeker score in hours.
The biggest advantage of using Hadoop here is that grandfathering is no longer a ‘dreaded word’. We can grandfather when ready instead of grandfather every quarter which speeds up innovation.

So, in a nutshell, if one finds oneself slowed down due to constraints of updating features in the online world, consider batch updating the features offline using Hadoop and swapping them out online.

With the first few versions of similar profiles out the door, we began to simultaneously investigate a number of avenues for improvement. Some of us investigated different model families, say logistic regression vs SVM, others investigated new features with the existing model. In this case study, we will talk about how we decided which one of these experiments would actually improve the online relevance of similar profiles so we could double down on getting them out to production. We are not concerned with ‘how’ we come up with these new models. For all that matters, we hand tuned a common sense model. The question is how to decide whether or not to mo ve that new model to production.

Now, as a base solution, we can always move every model to production. We can A/B test the model with real traffic and see which models sink and which ones float. The simplicity of this approach is very attractive, however, there are some major flaws with it:
1) For each models we have to push code to production. This takes up valuable development resources.
2) There is an upper limit on the number of A/B tests one can run at the same time. This can be due to user experience concerns and/or revenue concerns.
3) Since online tests need to run for days before enough data is accumulated to make a call, this approach slows down rate of progress.

Ideally, we would like to try all our ideas offline, evaluate them, and only push to production the best ones. Hadoop proves critical in the evaluation step. Overtime, using implicit and explicit feedback combined with crowdsourcing, we have accumulated a huge gold test set for ‘similar profiles’. We rank the gold set with each model on Hadoop, and use standard ranking metrics to evaluate which one performs best.

As you c an guess, Hadoop provides a very good sandbox for our ideas. We are able to filter many of the craziest ideas, and double down on only those few that show promise. Plus, it allows us to have relatively large gold sets which gives us strong confidence in our evaluation results.

Now that we have learned a new model for ‘Similar Profiles’ that performs well in our offline evaluation framework, we need to test it online. An industry standard approach to this problem is known as A/B testing or bucket testing. Formally, AB testing involves partitioning real traffic between alternatives and then evaluating which alternative maximizes the desired objective. Typical desired objectives are CTR, revenue, or number of views.

The key requirements of AB testing is that: time to evaluate which bucket to send traffic to should ideally be < 1ms, at-worst a few ms.

Let’s discuss how we would do A/B testing for our new model. For simple partitioning requirements, one can use a mod-based scheme. This is very fast and very simple and can satisfy most use cases. However, if one wishes to partition traffic based on profile and member activity criteria for e.g. “Send 10% of members who have greater than 100 connections AND who have logged in the last one week AND who are based in Europe” then doing this online is too expensive. Keep in mind that deciding which bucket to send traffic to should be very fast, ideally less than a millisecond. In worst case scenario a few milliseconds. I am not going to even attempt an online solution for this problem.
So, we go straight to Hadoop. For complex criteria like this, we run over our entire member base on Hadoop every couple of hours, assigning them to the appropriate bucket for each test. The results of this computation are pushed online, where the problem of A/B testing reduces to given a member and a test, fetching which bucket to send the traffic to from cache.

The take home message here is: If you need complex targeting and A/B testing, leverage Hadoop.

Our last case study involves the last step of a model deployment process: tracking and reporting. These two steps allow us to have an unbiased, data-driven way of saying whether or not a new model is successful in lifting our desired metrics: ctr or revenue or engagement or whatever else one is interested in. Our production servers generate tracking events every time a recommendation is impressed, clicked or rejected by the member.

Before Hadoop, we used to have an online reporting tool that would listen to tracking events over a moving window of time, doing in-memory joins of different streams of events and reporting up-to-the minute performance of our models. The clear advantages of this approach were that we could see exactly how the model was performing online at this moment. However, there are a few downsides such as
a) One cannot look at greater than certain amount of time window in the past..
b) As the number of tracking streams increases, it becomes harder and harder to join them online.

To increase the time window, we will have to spend significant engineering resources in architecting a scalable reporting system which would be an overkill. Instead we placed our bet on Hadoop. All tracking events at LinkedIn are stored on HDFS. Add to this data Pig or plain Map-Reduce, we can do arbitrary k-way joins across billions of rows to come up with reports that can look as far in the past as we want to.

The advantages of this approach are quite clear. Complex joins are easy to compute and reporting is flexible on time windows. However, we cannot have up-to-the minute reports since we copy over tracking events in batch to hdfs. If we ever need that level of reporting, we can always use our online solution.

We can say without any hesitation that Hadoop has now become an integral part of the whole life-cycle of our workflow starting from prototyping a new idea to eventually tracking the impact of that idea.

By thinking about platform and not verticals, we are able to to come with newer verticals at a fast pace.
By leveraging Hadoop, we were able to continuously improve the quality and scale the computations.

Hence, these 2 ideas helped us ‘scale innovation’ at Linkedin.

To conclude, we want to say that the data opportunity at LinkedIn is HUGE and so come work with us at LinkedIn!

你可能感兴趣的:(System)

android系统selinux中添加新属性property 辉色投像
1.定位/android/system/sepolicy/private/property_contexts声明属性开头：persist.charge声明属性类型：u:object_r:system_prop:s0图12.定位到android/system/sepolicy/public/domain.te删除neverallow{domain-init}default_prop:property
C#中使用split分割字符串互联网打工人no1 c#
1、用字符串分隔：usingSystem.Text.RegularExpressions;stringstr="aaajsbbbjsccc";string[]sArray=Regex.Split(str,"js",RegexOptions.IgnoreCase);foreach(stringiinsArray)Response.Write(i.ToString()+"");输出结果：aaabbbc
SpringBlade dict-biz/list 接口 SQL 注入漏洞文章永久免费只为良心 oracle 数据库
SpringBladedict-biz/list接口SQL注入漏洞POC:构造请求包查看返回包你的网址/api/blade-system/dict-biz/list?updatexml(1,concat(0x7e,md5(1),0x7e),1)=1漏洞概述在SpringBlade框架中，如果dict-biz/list接口的后台处理逻辑没有正确地对用户输入进行过滤或参数化查询（PreparedSta
openssl+keepalived安装部署 _小亦_ 项目部署 keepalived openssl
文章目录OpenSSL安装下载地址编译安装修改系统配置版本Keepalived安装下载地址安装遇到问题安装完成配置文件keepalived运行检查运行状态查看系统日志修改服务service重新加载systemd检查配置文件语法错误OpenSSL安装下载地址考虑到后面设备可能没法连接到外网，所以采用安装包的方式进行部署，下载地址：https://www.openssl.org/source/old/
CentOS的根目录下，/bin 和 /sbin 用途和权限 Energet!c Linux日常 centos linux 运维
CentOS的根目录下，/bin和/sbin用途和权限一、/bin(Binary)二、/sbin(SystemBinary)三、总结在CentOS的根目录下，/bin和/sbin目录有不同的用途和权限一、/bin(Binary)用途:存放系统的基本命令，这些命令对所有用户都是可用的。例如：ls、cp、mv、rm等。权限:普通用户和系统管理员都可以使用这些命令。二、/sbin(SystemBinar
[Unity]在场景中随机生成不同位置且不重叠的物体 Bartender_Jill Graphics图形学笔记 unity 游戏引擎动画
1.前言最近任务需要用到Unity在场景中随机生成物体，且这些物体不能重叠，简单记录一下。参考资料:Howtoensurethatspawnedtargetsdonotoverlap?2.结果与代码结果如下所示：代码如下所示：usingSystem.Collections.Generic;usingUnityEngine;namespaceAssets.Scripts{publicclassNew
ubuntu安装wordpress lissettecarlr
1安装nginx网上安装方式很多，这就就直接用apt-get了apt-getinstallnginx不用启动啥，然后直接在浏览器里面输入IP:80就能看到nginx的主页了。如果修改了一些配置可以使用下列命令重启一下systemctlrestartnginx.service2安装mysql输入安装前也可以更新一下软件源，在安装过程中将会让你输入数据库的密码。sudoapt-getinstallmy
最简单将静态网页挂载到服务器上(不用nginx) 全能全知者服务器 nginx 运维前端 html 笔记
最简单将静态网页挂载到服务器上(不用nginx)如果随便弄个静态网页挂在服务器都要用nignx就太麻烦了，所以直接使用Apache来搭建一些简单前端静态网页会相对方便很多检查Web服务器服务状态：sudosystemctlstatushttpd#ApacheWeb服务器如果发现没有安装web服务器：安装Apache：sudoyuminstallhttpd启动Apache：sudosystemctl
新能源汽车 BMS 学习笔记篇—BMS 基本定义及分类 WPG大大通其他笔记汽车 BMS 经验分享新能源电池
一、BMS定义1、概念：BMS（BatteryManagementSystem）即电池管理系统，其管理对象是二次电池（充电电池或蓄电池），其主要目的是电池的利用率，防止电池出现过度充电和过度放电，可应用于电动汽车、电瓶车、机器人、无人机等图片来源：腾讯网https://new.qq.com《标准普尔警告，电动汽车电池生产面临供应链和地缘政治风险》2、四大功能①感知和测量：检测电池的电压、电流、温度
代码的执行效果高天
packagecom20210409;publicclassdemo04{publicstaticvoidmain(String[]args){//////&&当前的条件不满足,则最后结果一定不满足,后面的条件不再执行////&不管条件是否满足所有条件均作判断//intx=1,y=1;//if(++y==2&&x++==2){//x=7;//}//System.out.println("x="+x
Hadoop 傲雪凌霜，松柏长青后端大数据 hadoop 大数据分布式
ApacheHadoop是一个开源的分布式计算框架，主要用于处理海量数据集。它具有高度的可扩展性、容错性和高效的分布式存储与计算能力。Hadoop核心由四个主要模块组成，分别是HDFS（分布式文件系统）、MapReduce（分布式计算框架）、YARN（资源管理）和HadoopCommon（公共工具和库）。1.HDFS（HadoopDistributedFileSystem）HDFS是Hadoop生
k8s证书过期问题处理 olina_qin kubernetes 容器云原生
k8s证书过期问题处理opensslx509-in/etc/kubernetes/pki/apiserver.crt-noout-dateskubeadmcertsrenewallsystemctlrestartkubeleopensslx509-in/etc/kubernetes/pki/apiserver.crt-noout-text|grep"NotAfter"cp/etc/kubernet
如何重启Linux服务器？老男孩IT教育 git linux 运维
在Linux操作系统中，提供了多种方法用于重启服务器，那么Linux服务器如何重启?以下列举了常用的几种方法，希望对大家有所帮助，快来看看吧。重启Linux服务器有以下几种方法：1、使用命令行使用reboot命令reboot使用shutdown命令shutdown-rnow2、使用systemctl使用以下命令：systemctlreboot3、使用web界面大多数现代Linux发行版本都提供一个
PAT Advanced 1015. Reversible Primes (C语言实现) OliverLew
我的PAT系列文章更新重心已移至Github，欢迎来看PAT题解的小伙伴请到GithubPages浏览最新内容。此处文章目前已更新至与GithubPages同步。欢迎star我的repo。题目Areversibleprimeinanynumbersystemisaprimewhose"reverse"inthatnumbersystemisalsoaprime.Forexampleinthedec
linux挂载文件夹小码快撩 linux
1.使用NFS（NetworkFileSystem）NFS是一种分布式文件系统协议，允许一个系统将其文件系统的一部分共享给其他系统。检查是否安装NFSrpm-qa|grepnfs2.启动和启用NFS服务假设服务名称为nfs-server.service，你可以使用以下命令启动和启用它：sudosystemctlstartnfs-server.servicesudosystemctlenablenf
Python实现mysql命令行 xu-jssy python mysql adb
一、源码importosimportpymysqldefsql_shell():password=input("EnterPassword:")#访问密码ifpassword.strip()!="yyds":print("Bye")return#清空控制台输出os.system("cls"ifos.name=="nt"else"clear")try:#连接到MySQL数据库conn=pymysql
Tomcat 中 catalina.out、catalina.log、localhost.log 和 access_log 的区别金色888
打开Tomcat安装目录中的log文件夹，我们可以看到很多日志文件，这篇文章就来介绍下这些日记文件的具体区别。catalina.out日志#catalina.out日志文件是Tomcat的标准输出（stdout）和标准出错（stderr）输出的“目的地”。我们在应用里使用System.out打印的内容都会输出到这个日志文件中。另外，如果我们在应用里使用其他的日志框架，配置了向Console输出日志
华为坤灵路由器配置SSH redmond88 网络技术华为 ssh 运维
配置SSH服务器的管理网口IP地址。system-view[HUAWEI]sysnameSSHServer[SSHServer]interfacemeth0/0/0[SSHServer-MEth0/0/0]ipaddress10.248.103.194255.255.255.0[SSHServer-MEth0/0/0]quit在SSH服务器端生成本地密钥对。[SSHServer]rsalocal-
华为坤灵路由器初始化开局的注意事项，含NAT配置 redmond88 网络技术华为服务器运维
坤灵路由器比较坑，无web界面，全程命令行配置，但是版本更新导致和华为企业路由器配置很多不一样的地方，今天介绍下1、aaa密码复杂度修改：#使能设备对密码进行四选三复杂度检查功能。system-view[HUAWEI]aaa[HUAWEI-aaa]local-aaa-userpasswordpolicyadministrator[HUAWEI-aaa-lupp-admin]passwordcomp
day12 控制流程 if switch while do...while 猜数字游戏卓越小Y JAVA学习日志游戏 java 开发语言
控制流程顺序结构所有的程序都是按顺序执行if语句选择结构单选择语句if(a>0){System.out.println(“hello”);}packagecom.ckw.blog.select;importjava.util.Scanner;publicclassdemo01{publicstaticvoidmain(String[]args){intscore=0;Scannerscanner=
C#文件被占用的解决方案花北城 C#项目文件占用
问题打更新包时，提示文件被占用。System.IO.IOException:文件“D:\RS\RS_CCVI20111210.exe”正由另一进程使用，因此该进程无法访问该文件。在System.IO.__Error.WinIOError(Int32errorCode,StringmaybeFullPath)在System.IO.FileStream.Init(Stringpath,FileMode
FPGA器件在线配置方法概述 fpga和matlab FPGA 其他 fpga开发 FPGA 在线配置
目录1.配置电路结构和原理2.ICR控制电路软件3.几种常见的FPGA在线配置方法3.1动态部分重配置（PartialReconfiguration,PR）3.2在系统编程（In-SystemProgramming,ISP）3.3多比特流配置（Multi-BitstreamConfiguration）3.4远程更新与配置3.5使用OpenCL或HLS工具FPGA（Field-Programmabl
C# 自动化 TineAine C#代码片段自动化 c#自动化模拟操作
实现的方法可能很笨，但是确实很好用usingSystem;usingSystem.Collections.Generic;usingSystem.Linq;usingSystem.Runtime.InteropServices;usingSystem.Text;usingSystem.Threading;usingSystem.Threading.Tasks;/******************
数组拷贝Arraycopy xing2516 Arraycopy java
packageqing;//数组拷贝publicclassArraycopy{publicstaticvoidmain(String[]args){//一维数组拷贝Stringa[]={"小米","华为","阿里","腾讯","百度"};String[]aBak=newString[6];//从a数组第0个copy到数组aBak0个开始，长度是a数组长度System.arraycopy(a,0,a
Java – 数组Copy的几种方式 hooc java web
目前在Java中数据拷贝提供了如下方式：cloneSystem.arraycopyArrays.copyOfArrays.copyOfRange1、clone方法clone方法是从Object类继承过来的，基本数据类型（String，boolean，char，byte，short，float，double，long）都可以直接使用clone方法进行克隆，注意String类型是因为其值不可变所以才可
java数组拷贝的方法千锋IT教育 java java 数组
小千在给大家讲解数组扩容时，涉及到了数组中数据元素的拷贝复制。那么除了上面的拷贝方式之外，数组还有哪些拷贝方式呢？1.拷贝方式在Java中，数组的拷贝主要有三种实现方式：1.通过循环语句，将原数组中的各个元素拷贝到新数组中(即数组扩容案例中使用的方法)；2.System类提供的数组拷贝方法；3.Arrays类提供的数组拷贝方法。接下来小千就设计几个案例，来给大家展示这几种方式都是怎么进行数组拷贝的
Java中四种常用的数组复制的方法copyOf(),arraycop()，clone（）和copyOfRange()的使用与区别方九九 java知识点总结 java
所谓复制数组，是指将一个数组中的元素在另一个数组中进行复制。本文主要介绍关于Java里面的数组复制（拷贝）的几种方式和用法。在Java中实现数组复制分别有以下4种方法：1.Arrays类的copyOf()方法2.Arrays类的copyOfRange()方法3.System类的arraycopy()方法4.Object类的clone()方法下面来详细介绍这4种方法的使用。使用copyOf()方法和
springboot与日志最后的夏t
日志1、日志框架小张；开发一个大型系统；1、System.out.println("")；将关键数据打印在控制台；去掉？写在一个文件？2、框架来记录系统的一些运行时信息；日志框架；zhanglogging.jar；3、高大上的几个功能？异步模式？自动归档？xxxx？zhanglogging-good.jar？4、将以前框架卸下来？换上新的框架，重新修改之前相关的API；zhanglogging-p
TinyReplaySystem回放系统设计和开发 W8023Y2014 Unity Unity
TinyReplaySystem回放系统设计和开发简单探讨和分析下游戏回放系统的设计和针对特定需求回放功能的TinyReplaySystem设计和具体实现需求分析在屏幕舞台中，玩家操控动画角色通过手势缩放，移动，修改角色颜色等属性，用户操控所需要的角色进行PlayAnimation，角色扮演。扮演结束，保存到本地，可以回放用户所扮演的动画。相当于录制屏幕指定区域，存储成视频，加载回放。记录用户通过
操作系统简介像风一样自由2020 操作系统 linux ubuntu windows harmonyos
操作系统简介操作系统（OperatingSystem，简称OS）是管理计算机硬件和软件资源的系统软件，为用户和应用程序提供接口。现今，操作系统种类繁多，主要分为桌面操作系统、服务器操作系统和移动操作系统等。以下是对目前存在的主要操作系统的详细介绍。1.Windows简介：Windows是由微软公司开发的系列图形界面操作系统，是全球使用最广泛的桌面操作系统之一。最新版本：截至2023年10月，最新版
深入浅出Java Annotation(元注解和自定义注解） Josh_Persistence Java Annotation 元注解自定义注解
一、基本概述　　 Annontation是Java5开始引入的新特征。中文名称一般叫注解。它提供了一种安全的类似注释的机制，用来将任何的信息或元数据（metadata）与程序元素（类、方法、成员变量等）进行关联。　　更通俗的意思是为程序的元素（类、方法、成员变量）加上更直观更明了的说明，这些说明信息是与程序的业务逻辑无关，并且是供指定的工具或
mysql优化特定类型的查询 annan211 java 工作 mysql
本节所介绍的查询优化的技巧都是和特定版本相关的，所以对于未来mysql的版本未必适用。 1 优化count查询对于count这个函数的网上的大部分资料都是错误的或者是理解的都是一知半解的。在做优化之前我们先来看看真正的count()函数的作用到底是什么。 count()是一个特殊的函数，有两种非常不同的作用，他可以统计某个列值的数量，也可以统计行数。在统
MAC下安装多版本JDK和切换几种方式棋子chessman jdk
环境： MAC AIR,OS X 10.10,64位历史：过去 Mac 上的 Java 都是由 Apple 自己提供，只支持到 Java 6，并且OS X 10.7 开始系统并不自带（而是可选安装）（原自带的是1.6）。后来 Apple 加入 OpenJDK 继续支持 Java 6，而 Java 7 将由 Oracle 负责提供。在终端中输入jav
javaScript （1） Array_06 JavaScript java 浏览器
JavaScript 1、运算符　　运算符就是完成操作的一系列符号，它有七类：　　赋值运算符（=,+=,-=,*=,/=,%=,<<=,>>=,|=,&=）、算术运算符(+,-,*,/,++,--,%)、比较运算符(>,<,<=,>=,==,===,!=,!==)、逻辑运算符(||,&&,!)、条件运算(?:)、位
国内顶级代码分享网站袁潇含 java jdk oracle .net PHP
现在国内很多开源网站感觉都是为了利益而做的当然利益是肯定的,否则谁也不会免费的去做网站 &
Elasticsearch、MongoDB和Hadoop比较随意而生 mongodb hadoop 搜索引擎
IT界在过去几年中出现了一个有趣的现象。很多新的技术出现并立即拥抱了“大数据”。稍微老一点的技术也会将大数据添进自己的特性，避免落大部队太远，我们看到了不同技术之间的边际的模糊化。假如你有诸如Elasticsearch或者Solr这样的搜索引擎，它们存储着JSON文档，MongoDB存着JSON文档，或者一堆JSON文档存放在一个Hadoop集群的HDFS中。你可以使用这三种配
mac os 系统科研软件总结张亚雄 mac os
1.1 Microsoft Office for Mac 2011 大客户版，自行搜索。 1.2 Latex （MacTex）: 系统环境：https://tug.org/mactex/ &nb
Maven实战（四）生命周期 AdyZhang maven
1. 三套生命周期 Maven拥有三套相互独立的生命周期，它们分别为clean，default和site。每个生命周期包含一些阶段，这些阶段是有顺序的，并且后面的阶段依赖于前面的阶段，用户和Maven最直接的交互方式就是调用这些生命周期阶段。以clean生命周期为例，它包含的阶段有pre-clean, clean 和 post
Linux下Jenkins迁移 aijuans Jenkins
1. 将Jenkins程序目录copy过去源程序在/export/data/tomcatRoot/ofctest-jenkins.jd.com下面 tar -cvzf jenkins.tar.gz ofctest-jenkins.jd.com &
request.getInputStream()只能获取一次的问题 ayaoxinchao request Inputstream
问题：在使用HTTP协议实现应用间接口通信时，服务端读取客户端请求过来的数据，会用到request.getInputStream()，第一次读取的时候可以读取到数据，但是接下来的读取操作都读取不到数据原因： 1. 一个InputStream对象在被读取完成后，将无法被再次读取，始终返回-1； 2. InputStream并没有实现reset方法（可以重
数据库SQL优化大总结之百万级数据库优化方案 BigBird2012 SQL优化
网上关于SQL优化的教程很多，但是比较杂乱。近日有空整理了一下，写出来跟大家分享一下，其中有错误和不足的地方，还请大家纠正补充。这篇文章我花费了大量的时间查找资料、修改、排版，希望大家阅读之后，感觉好的话推荐给更多的人，让更多的人看到、纠正以及补充。 1.对查询进行优化，要尽量避免全表扫描，首先应考虑在 where 及 order by 涉及的列上建立索引。 2.应尽量避免在 where
jsonObject的使用 bijian1013 java json
在项目中难免会用java处理json格式的数据，因此封装了一个JSONUtil工具类。 JSONUtil.java package com.bijian.json.study; import java.util.ArrayList; import java.util.Date; import java.util.HashMap;
[Zookeeper学习笔记之六]Zookeeper源代码分析之Zookeeper.WatchRegistration bit1129 zookeeper
Zookeeper类是Zookeeper提供给用户访问Zookeeper service的主要API，它包含了如下几个内部类首先分析它的内部类，从WatchRegistration开始，为指定的znode path注册一个Watcher， /** * Register a watcher for a particular p
【Scala十三】Scala核心七：部分应用函数 bit1129 scala
何为部分应用函数？ Partially applied function: A function that’s used in an expression and that misses some of its arguments.For instance, if function f has type Int => Int => Int, then f and f(1) are p
Tomcat Error listenerStart 终极大法 ronin47 tomcat
Tomcat报的错太含糊了，什么错都没报出来，只提示了Error listenerStart。为了调试，我们要获得更详细的日志。可以在WEB-INF/classes目录下新建一个文件叫logging.properties，内容如下 Java代码 handlers = org.apache.juli.FileHandler, java.util.logging.ConsoleHa
不用加减符号实现加减法 BrokenDreams 实现
今天有群友发了一个问题，要求不用加减符号(包括负号)来实现加减法。分析一下，先看最简单的情况，假设1+1，按二进制算的话结果是10，可以看到从右往左的第一位变为0，第二位由于进位变为1。
读《研磨设计模式》-代码笔记-状态模式-State bylijinnan java 设计模式
声明：本文只为方便我个人查阅和理解，详细的分析以及源代码请移步原作者的博客http://chjavach.iteye.com/ /* 当一个对象的内在状态改变时允许改变其行为，这个对象看起来像是改变了其类状态模式主要解决的是当控制一个对象状态的条件表达式过于复杂时的情况把状态的判断逻辑转移到表示不同状态的一系列类中，可以把复杂的判断逻辑简化如果在
CUDA程序block和thread超出硬件允许值时的异常 cherishLC CUDA
调用CUDA的核函数时指定block 和 thread大小，该大小可以是dim3类型的（三维数组），只用一维时可以是usigned int型的。以下程序验证了当block或thread大小超出硬件允许值时会产生异常！！！GPU根本不会执行运算！！！所以验证结果的正确性很重要！！！在VS中创建CUDA项目会有一个模板，里面有更详细的状态验证。以下程序在K5000GPU上跑的。
诡异的超长时间GC问题定位 chenchao051 jvm cms GC hbase swap
HBase的GC策略采用PawNew+CMS, 这是大众化的配置，ParNew经常会出现停顿时间特别长的情况，有时候甚至长到令人发指的地步，例如请看如下日志： 2012-10-17T05:54:54.293+0800: 739594.224: [GC 739606.508: [ParNew: 996800K->110720K(996800K), 178.8826900 secs] 3700
maven环境快速搭建 daizj 安装 mavne 环境配置
一下载maven 安装maven之前，要先安装jdk及配置JAVA_HOME环境变量。这个安装和配置java环境不用多说。 maven下载地址：http://maven.apache.org/download.html，目前最新的是这个apache-maven-3.2.5-bin.zip，然后解压在任意位置，最好地址中不要带中文字符，这个做java 的都知道，地址中出现中文会出现很多
PHP网站安全，避免PHP网站受到攻击的方法 dcj3sjt126com PHP
对于PHP网站安全主要存在这样几种攻击方式:1、命令注入(Command Injection)2、eval注入(Eval Injection)3、客户端脚本攻击(Script Insertion)4、跨网站脚本攻击(Cross Site Scripting, XSS)5、SQL注入攻击(SQL injection)6、跨网站请求伪造攻击(Cross Site Request Forgerie
yii中给CGridView设置默认的排序根据时间倒序的方法 dcj3sjt126com GridView
public function searchWithRelated() { $criteria = new CDbCriteria; $criteria->together = true; //without th
Java集合对象和数组对象的转换 dyy_gusi java集合
在开发中，我们经常需要将集合对象（List，Set）转换为数组对象，或者将数组对象转换为集合对象。Java提供了相互转换的工具，但是我们使用的时候需要注意，不能乱用滥用。 1、数组对象转换为集合对象最暴力的方式是new一个集合对象，然后遍历数组，依次将数组中的元素放入到新的集合中，但是这样做显然过
nginx同一主机部署多个应用 geeksun nginx
近日有一需求，需要在一台主机上用nginx部署2个php应用，分别是wordpress和wiki，探索了半天，终于部署好了，下面把过程记录下来。 1. 在nginx下创建vhosts目录，用以放置vhost文件。 mkdir vhosts 2. 修改nginx.conf的配置，在http节点增加下面内容设置，用来包含vhosts里的配置文件 #
ubuntu添加admin权限的用户账号 hongtoushizi ubuntu useradd
ubuntu创建账号的方式通常用到两种：useradd 和adduser . 本人尝试了useradd方法，步骤如下： 1:useradd 使用useradd时，如果后面不加任何参数的话，如：sudo useradd sysadm 创建出来的用户将是默认的三无用户：无home directory ,无密码,无系统shell。顾应该如下操作：
第五章常用Lua开发库2-JSON库、编码转换、字符串处理 jinnianshilongnian nginx lua
JSON库在进行数据传输时JSON格式目前应用广泛，因此从Lua对象与JSON字符串之间相互转换是一个非常常见的功能；目前Lua也有几个JSON库，本人用过cjson、dkjson。其中cjson的语法严格（比如unicode \u0020\u7eaf），要求符合规范否则会解析失败（如\u002），而dkjson相对宽松，当然也可以通过修改cjson的源码来完成
Spring定时器配置的两种实现方式OpenSymphony Quartz和java Timer详解 yaerfeng1989 timer quartz 定时器
原创整理不易，转载请注明出处：Spring定时器配置的两种实现方式OpenSymphony Quartz和java Timer详解代码下载地址：http://www.zuidaima.com/share/1772648445103104.htm 有两种流行Spring定时器配置：Java的Timer类和OpenSymphony的Quartz。 1.Java Timer定时首先继承jav
Linux下df与du两个命令的差别？ pda158 linux
　一、df显示文件系统的使用情况，与du比較，就是更全盘化。　　最经常使用的就是 df -T，显示文件系统的使用情况并显示文件系统的类型。　　举比例如以下：　　[root@localhost ~]# df -T 　　Filesystem Type &n
[转]SQLite的工具类 ---- 通过反射把Cursor封装到VO对象 ctfzh VO android sqlite 反射 Cursor
在写DAO层时，觉得从Cursor里一个一个的取出字段值再装到VO(值对象)里太麻烦了，就写了一个工具类，用到了反射，可以把查询记录的值装到对应的VO里，也可以生成该VO的List。使用时需要注意：考虑到Android的性能问题，VO没有使用Setter和Getter，而是直接用public的属性。表中的字段名需要和VO的属性名一样，要是不一样就得在查询的SQL中
该学习笔记用到的Employee表 vipbooks oracle sql 工作
这是我在学习Oracle是用到的Employee表，在该笔记中用到的就是这张表，大家可以用它来学习和练习。 drop table Employee; -- 员工信息表 create table Employee( -- 员工编号 EmpNo number(3) primary key, -- 姓