爬虫Larbin解析(二)——sequencer()

分析的函数: void sequencer() 

//位置:larbin-2.6.3/src/fetch/sequencer.cc
void
sequencer() { bool testPriority = true; if (space == 0) //unit space = 0 { space = global::inter->putAll(); } int still = space; if (still > maxPerCall) //#define maxPerCall 100 still = maxPerCall; while (still) { if (canGetUrl(&testPriority)) { --space; --still; } else { still = 0; } } }

所在的文件

larbin-2.6.3/src/fetch/sequencer.h、larbin-2.6.3/src/fetch/sequencer.cc

// Larbin
// Sebastien Ailleret
// 15-11-99 -> 15-11-99

#ifndef SEQUENCER_H
#define SEQUENCER_H

/** Remaining url budget shared with sequencer() (granted by global::inter).
 * only for debugging, handle with care */
extern uint space;

/** Call the sequencer */
void sequencer ();

#endif
View Code
// Larbin
// Sebastien Ailleret
// 15-11-99 -> 04-01-02

#include 

#include "options.h"

#include "global.h"
#include "types.h"
#include "utils/url.h"
#include "utils/debug.h"
#include "fetch/site.h"

// Forward declaration: fetches the next url according to priorities (below).
static bool canGetUrl (bool *testPriority);
// Number of urls that may still be moved into ram (budget from global::inter).
uint space = 0;

// Upper bound on the number of urls handled by one sequencer() call.
#define maxPerCall 100

/** Start the sequencer: move urls (at most maxPerCall per run) from the
 * global fifos towards the per-site queues, honouring the shared "space"
 * budget managed by global::inter. */
void sequencer()
{
    bool testPriority = true;
    // Renew the budget once it is exhausted.
    if (space == 0)
        space = global::inter->putAll();
    // Never handle more than maxPerCall urls in a single invocation.
    int todo = (int) space;
    if (todo > maxPerCall)
        todo = maxPerCall;
    while (todo > 0)
    {
        if (!canGetUrl(&testPriority))
            break;  // nothing available right now
        --space;
        --todo;
    }
}

/* Get the next url
 * here is defined how priorities are handled
 * Takes urls, by priority, from the queues
 * (URLsDisk, URLsDiskWait or URLsPriority, URLsPriorityWait)
 * and hands each one to a NamedSite chosen by the url's host hash.
 */
static bool canGetUrl (bool *testPriority) 
{
    url *u;
    if (global::readPriorityWait)  // initialised to 0 in global.cc; refreshed by cron()
    {
        global::readPriorityWait--;
        u = global::URLsPriorityWait->get();
        global::namedSiteList[u->hostHashCode()].putPriorityUrlWait(u);
        return true;
    } 
    else if (*testPriority && (u=global::URLsPriority->tryGet()) != NULL) 
    {
        // We've got one url (priority)
        global::namedSiteList[u->hostHashCode()].putPriorityUrl(u);
        return true;
    } 
    else 
    {
        // Priority queue empty: skip testing it again during this run.
        *testPriority = false;
        // Try to get an ordinary url
        if (global::readWait) 
        {
          global::readWait--;
          u = global::URLsDiskWait->get();
          global::namedSiteList[u->hostHashCode()].putUrlWait(u);
          return true;
        } 
        else 
        {
            u = global::URLsDisk->tryGet();
            if (u != NULL) 
            {
                global::namedSiteList[u->hostHashCode()].putUrl(u);
                return true;
            }
            else 
            {
                return false;
            }
        }
    }
}
View Code

 

一、 对于space = global::inter->putAll();

1. inter 在global.cc(位置:/larbin-2.6.3/src/global.cc)中的定义为

inter = new Interval(ramUrls);   //#define ramUrls 100000  (位置:larbin-2.6.3/src/types.h)

批注:区别 inter = new Interval(ramUrls);  和 inter = new Interval[ramUrls];  前一个()内是参数,要传入构造函数的;后一个[]内是开辟数组的个数。

2. 类 Interval定义(位置:/larbin-2.6.3/src/fetch/site.h)

/** This class is intended to make sure the sum of the
 * sizes of the fifo included in the different sites
 * are not too big
 */
/** This class is intended to make sure the sum of the
 * sizes of the fifo included in the different sites
 * are not too big
 */
class Interval 
{
    public:
        /** Build an interval allowing at most `sizes` urls in ram. */
        Interval (uint sizes) : size(sizes), pos(0) {}
        ~Interval () {}
        /** How many urls can we put. Answer 0: if no urls can be put.
         * Reserves every remaining slot at once. */
        inline uint putAll ()
        {
            uint res = size - pos;  // remaining capacity
            pos = size;
            return res;
        }
        /** Warn an url has been retrieved (frees one slot). */
        inline void getOne ()
        {
            pos--;
        }
        /** only for debugging, handle with care */
        inline uint getPos ()
        {
            return pos;
        }
    private:
        /** Size of the interval */
        uint size;
        /** Position in the interval */
        uint pos;
};
View Code

批注:类内的函数定义为inline。对内联函数的几点说明:

  • 内联函数避免函数调用的开销。将函数指定为内联函数,(通常)就是将它在程序的每个调用点上“内联地”展开,消除调用函数进行的额外开销(调用前先保存寄存器,并在返回时恢复)。内联说明(在函数返回值前加inline)对编译器来说只是一个建议,编译器可以选择忽略。一般内联函数适用于优化小的、只有几行、经常被调用的函数。大多数编译器不支持递归函数的内联。
  • 把内联函数放在头文件。以便编译器能够在调用点展开同一个函数(保证编译器可见、所有的定义相同)。
  • 编译器隐式地将在类内定义的成员函数当作为内联函数.

 

二、 对于canGetUrl(&testPriority)

函数定义(位置larbin-2.6.3/src/fetch/sequencer.cc)

/* Get the next url
 * here is defined how priorities are handled
 按优先级从各个URL队列
 (比如URLsDisk,URLsDiskWait或URLsPriority,URLsPriorityWait)
 获取url保存到某个NameSite(通过url的hash值)

at "global.cc"
// FIFOs
URLsDisk         = new PersistentFifo(reload, fifoFile);
URLsDiskWait     = new PersistentFifo(reload, fifoFileWait);
URLsPriority     = new SyncFifo;
URLsPriorityWait = new SyncFifo;

 */
/* Pick the next url to process, by priority
 * (duplicate listing of canGetUrl, with the author's annotations). */
static bool canGetUrl (bool *testPriority) 
{
    url *u;
    if (global::readPriorityWait != 0)  // declared in global.cc: uint global::readPriorityWait=0;
    {
        global::readPriorityWait--;
        u = global::URLsPriorityWait->get();
        global::namedSiteList[u->hostHashCode()].putPriorityUrlWait(u);
        return true;
    } 
    else if (*testPriority && (u=global::URLsPriority->tryGet()) != NULL) 
    {
        // We've got one url (priority)
        global::namedSiteList[u->hostHashCode()].putPriorityUrl(u);
        return true;
    } 
    else 
    {
        // Priority queues exhausted: stop testing them this run.
        *testPriority = false;
        // Try to get an ordinary url
        if (global::readWait) 
        {
          global::readWait--;
          u = global::URLsDiskWait->get();
          global::namedSiteList[u->hostHashCode()].putUrlWait(u);
          return true;
        } 
        else 
        {
            u = global::URLsDisk->tryGet();
            if (u != NULL) 
            {
                global::namedSiteList[u->hostHashCode()].putUrl(u);
                return true;
            }
            else 
            {
                return false;
            }
        }
    }
}

1. 为什么 disk 和 priority 的队列都是成对出现的?因为可以认为每个 site 在 namedSiteList 当中都有一个小的队列来保存它的 url,这个 url 的个数是有限制的:当超过这个限制的时候就不能再把该 site 下的 url 放入,但也不能丢弃,而是放入对应的 wait 队列。Larbin 会控制一段时间在 disk 队列中取 url,一段时间在 diskWait 当中取 url。disk 和 priority 的区别只是优先级的区别。namedSiteList 的作用是实现了 DNS 缓存。

          爬虫Larbin解析(二)——sequencer()_第1张图片

2. global::readPriorityWait 的值由main.cc的cron()函数中变化得知

// see if we should read again urls in fifowait
// (fragment of cron() in main.cc: every 300 time units the wait queues'
//  lengths are latched, and 150 units later they are reset to 0)
if ((global::now % 300) == 0) {
    global::readPriorityWait = global::URLsPriorityWait->getLength();
    global::readWait = global::URLsDiskWait->getLength();
}
if ((global::now % 300) == 150) {
    global::readPriorityWait = 0;
    global::readWait = 0;
}

这里 global::now % 300 是判断这次是对 wait 队列里的 url 进行处理,还是对非 wait 队列里的进行处理。%300 等于 0 和等于 150 的概率各是 1/300,所以大约每 300 次切换一次。readPriorityWait 是 URLsPriorityWait 的长度(也就是其中 url 的数量);readWait 是 URLsDiskWait 中 url 的个数。

3. 在canGetUrl中,在对于每个站点,将相应的url放进去。putPriorityUrlWait, putPriorityUrl, putUrlWait, putUrl在site.h的定义如下

/** Put an url in the fifo
 * If there are too much, put it back in UrlsInternal
 * Never fill totally the fifo => call at least with 1 */
void putGenericUrl(url *u, int limit, bool prio);
// The four wrappers differ only by the headroom they require in the
// site's fifo (limit) and the overflow destination they select (prio).
inline void putUrl(url *u) {
    putGenericUrl(u, 15, false);
}
inline void putUrlWait(url *u) {
    putGenericUrl(u, 10, false);
}
inline void putPriorityUrl(url *u) {
    putGenericUrl(u, 5, true);
}
inline void putPriorityUrlWait(url *u) {
    putGenericUrl(u, 1, true);
}

 可以发现,每次都是调用函数putGenericUrl,其定义如下

/* Put an url in the fifo if there are not too many
 * (excerpt: the branch taken when the site's quota is already full).
 * NOTE(review): in the original listing the extraction had collapsed several
 * statements onto `//` comment lines, commenting the code out; reconstructed
 * here to match the complete listing of putGenericUrl further below. */
void NamedSite::putGenericUrl(url *u, int limit, bool prio) 
{
    if (nburls > maxUrlsBySite - limit)
    {
        // Already enough Urls in memory for this Site
        // first check if it can already be forgotten
        if (!strcmp(name, u->getHost()))
        {
            if (dnsState == errorDns)
            {
                nburls++;
                forgetUrl(u, noDNS);
                return;
            }
            if (dnsState == noConnDns)
            {
                nburls++;
                forgetUrl(u, noConnection);
                return;
            }
            if (u->getPort() == port && dnsState == doneDns
                && !testRobots(u->getFile()))
            {
                nburls++;
                forgetUrl(u, forbiddenRobots);
                return;
            }
        }
        // else put it back in the wait fifos (URLsDiskWait / URLsPriorityWait)
        refUrl();
        global::inter->getOne();
        if (prio)
        {
            global::URLsPriorityWait->put(u);
        }
        else
        {
            global::URLsDiskWait->put(u);
        }
    }
}

如果已经有足够多的 url 在内存里,执行这里 if 中的代码。strcmp(name, u->getHost()) 是判断这个主机是不是已经做过 DNS 方面的判断——对于一个站点只做一次 DNS 解析的判断,以后就按这个结果进行处理:dnsState 为 errorDns、noConnDns,还有 robots.txt 不允许的情况,都直接丢弃该 url;如果没有这些问题,就把它放回 URLsDiskWait(优先级 url 则放回 URLsPriorityWait)。

// (second half of putGenericUrl: the site still has room for this url)
else {
    nburls++;
    if (dnsState == waitDns || strcmp(name, u->getHost()) || port
           != u->getPort() || global::now > dnsTimeout) {
       // dns not done or other site (or dns info expired):
       // queue the url on this site and make sure a dns query is scheduled
       putInFifo(u);
       addNamedUrl();
       // Put Site in fifo if not yet in
       if (!isInFifo) {
           isInFifo = true;
           global::dnsSites->put(this);
       }
    } else
       switch (dnsState) {
       case doneDns:
           transfer(u);
           break;
       case errorDns:
           forgetUrl(u, noDNS);
           break;
       default: // noConnDns
           forgetUrl(u, noConnection);
       }
}

 如果需要判断dns能不能解析,就将它放到dnsSites里,这个会在fetchDns中判断。或是如果还能放到内存里,并且又是doneDns,表示可以解析,就调用transfer

/** Hand an url over to the IPSite selected by this site's ip hash,
 * unless robots.txt forbids its path. */
void NamedSite::transfer(url *u) {
    if (!testRobots(u->getFile())) {
        forgetUrl(u, forbiddenRobots);
        return;
    }
    // Without a proxy, cache the resolved address inside the url itself.
    if (global::proxyAddr == NULL) {
        memcpy(&u->addr, &addr, sizeof(struct in_addr));
    }
    global::IPSiteList[ipHash].putUrl(u);
}

这里是将url放入到IPSiteList的相应ipHash中。

 

附类的定义

类url定义(larbin-2.6.3/src/utils/url.h  larbin-2.6.3/src/utils/url.cc

// Larbin
// Sebastien Ailleret
// 15-11-99 -> 14-03-02

/* This class describes an URL */

#ifndef URL_H
#define URL_H

#include in.h>
#include 
#include 
#include 

#include "types.h"

/* normalize a file path in place; false when the url must be rejected */
bool fileNormalize (char *file);

/** An http url: host, port, file path and crawl depth.
 * Parsing and normalisation are implemented in url.cc. */
class url {
 private:
  char *host;
  char *file;
  uint16_t port; // the order of variables is important for physical size
  int8_t depth;
  /* parse the url */
  void parse (char *s);
  /** parse a file with base */
  void parseWithBase (char *u, url *base);
  /* normalize file name */
  bool normalize (char *file);
  /* Does this url starts with a protocol name */
  bool isProtocol (char *s);
  /* constructor used by giveBase */
  url (char *host, uint port, char *file);

 public:
  /* Constructor : Parses an url (u is deleted) */
  url (char *u, int8_t depth, url *base);

  /* constructor used by input */
  url (char *line, int8_t depth);

  /* Constructor : read the url from a file (cf serialize) */
  url (char *line);

  /* Destructor */
  ~url ();

  /* inet addr (once calculated) */
  struct in_addr addr;

  /* Is it a valid url ? */
  bool isValid ();

  /* print an URL */
  void print ();

  /* return the host */
  inline char *getHost () { return host; }

  /* return the port */
  inline uint getPort () { return port; }

  /* return the file */
  inline char *getFile () { return file; }

  /** Depth in the Site */
  inline int8_t getDepth () { return depth; }

  /* Set depth to max if we are at an entry point in the site
   * try to find the ip addr
   * answer false if forbidden by robots.txt, true otherwise */
  bool initOK (url *from);

  /** return the base of the url
   * give means that you have to delete the string yourself
   */
  url *giveBase ();

  /** return a char * representation of the url
   * give means that you have to delete the string yourself
   */
  char *giveUrl ();

  /** write the url in a buffer
   * buf must be at least of size maxUrlSize
   * returns the size of what has been written (not including '\0')
   */
  int writeUrl (char *buf);

  /* serialize the url for the Persistent Fifo */
  char *serialize ();

  /* very thread unsafe serialisation in a static buffer */
  char *getUrl();

  /* return a hashcode for the host of this url */
  uint hostHashCode ();

  /* return a hashcode for this url */
  uint hashCode ();

#ifdef URL_TAGS
  /* tag associated to this url */
  uint tag;
#endif // URL_TAGS

#ifdef COOKIES
  /* cookies associated with this page */
  char *cookie;
  void addCookie(char *header);
#else // COOKIES
  inline void addCookie(char *header) {}
#endif // COOKIES
};

#endif // URL_H
View Code
// Larbin
// Sebastien Ailleret
// 15-11-99 -> 16-03-02

/* This class describes an URL */

#include 
#include 
#include 
#include <string.h>
#include 
#include 
#include 

#include "options.h"

#include "types.h"
#include "global.h"
#include "utils/url.h"
#include "utils/text.h"
#include "utils/connexion.h"
#include "utils/debug.h"

#ifdef COOKIES
#define initCookie() cookie=NULL
#else // COOKIES
#define initCookie() ((void) 0)
#endif // COOKIES

/* small functions used later */
/* Hash a host name into an index of the namedSiteList table. */
static uint siteHashCode (char *host) {
  uint h = 0;
  for (uint i = 0; host[i] != 0; i++) {
    h = 37*h + host[i];
  }
  return h % namedSiteListSize;
}

/* return the int with correspond to a char
 * -1 if not an hexa char */
static int int_of_hexa (char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* normalize a file name : also called by robots.txt parser
 * return true if it is ok, false otherwise (cgi-bin)
 * Rewrites `file` in place: collapses "/./" and "//", resolves "/../",
 * and decodes %XX escapes that stand for printable characters.
 */
bool fileNormalize (char *file) {
  int i=0;
  while (file[i] != 0 && file[i] != '#') {
    if (file[i] == '/') {
      if (file[i+1] == '.' && file[i+2] == '/') {
        // suppress /./
        int j=i+3;
        while (file[j] != 0) {
          file[j-2] = file[j];
          j++;
        }
        file[j-2] = 0;
      } else if (file[i+1] == '/') {
        // replace // by /
        int j=i+2;
        while (file[j] != 0) {
          file[j-1] = file[j];
          j++;
        }
        file[j-1] = 0;
      } else if (file[i+1] == '.' && file[i+2] == '.' && file[i+3] == '/') {
        // suppress /../
        if (i == 0) {
          // the file name starts with /../ : error
          return false;
        } else {
          int j = i+4, dec;
          i--;
          // back up to the previous path component
          while (file[i] != '/') { i--; }
          dec = i+1-j; // dec < 0
          while (file[j] != 0) {
            file[j+dec] = file[j];
            j++;
          }
          file[j+dec] = 0;
        }
      } else if (file[i+1] == '.' && file[i+2] == 0) {
        // suppress /.
        file[i+1] = 0;
        return true;
      } else if (file[i+1] == '.' && file[i+2] == '.' && file[i+3] == 0) {
        // suppress /..
        if (i == 0) {
          // the file name starts with /.. : error
          return false;
        } else {
          i--;
          while (file[i] != '/') {
            i--;
          }
          file[i+1] = 0;
          return true;
        }
      } else { // nothing special, go forward
        i++;
      }
    } else if (file[i] == '%') {
      int v1 = int_of_hexa(file[i+1]);
      int v2 = int_of_hexa(file[i+2]);
      if (v1 < 0 || v2 < 0) return false;
      char c = 16 * v1 + v2;
      if (isgraph(c)) {
        // decode %XX to the character and shift the tail left by 2
        file[i] = c;
        int j = i+3;
        while (file[j] != 0) {
          file[j-2] = file[j];
          j++;
        }
        file[j-2] = 0;
        i++;
      } else if (c == ' ' || c == '/') { // keep it with the % notation
        i += 3;
      } else { // bad url
        return false;
      }
    } else { // nothing special, go forward
      i++;
    }
  }
  // truncate at '#' (fragment) or keep the terminator in place
  file[i] = 0;
  return true;
}

/**************************************/
/* definition of methods of class url */
/**************************************/

/* Constructor : Parses an url
 * Accepts absolute "http://..." urls, or relative urls resolved
 * against `base`; on failure host and file stay NULL (url invalid). */
url::url (char *u, int8_t depth, url *base) {
  newUrl();
  this->depth = depth;
  host = NULL;
  port = 80;
  file = NULL;
  initCookie();
#ifdef URL_TAGS
  tag = 0;
#endif // URL_TAGS
  if (startWith("http://", u)) {
    // absolute url
    parse (u + 7);
    // normalize file name
    if (file != NULL && !normalize(file)) {
      delete [] file;
      file = NULL;
      delete [] host;
      host = NULL;
    }
  } else if (base != NULL) {
    if (startWith("http:", u)) {
      parseWithBase(u+5, base);
    } else if (isProtocol(u)) {
      // Unknown protocol (mailto, ftp, news, file, gopher...)
    } else {
      parseWithBase(u, base);
    }
  }
}

/* constructor used by input
 * `line` may be prefixed by a numeric tag when URL_TAGS is set;
 * only absolute "http://" urls are accepted. */
url::url (char *line,  int8_t depth) {
  newUrl();
  this->depth = depth;
  host = NULL;
  port = 80;
  file = NULL;
  initCookie();
  int i=0;
#ifdef URL_TAGS
  // read the decimal tag preceding the url
  tag = 0;
  while (line[i] >= '0' && line[i] <= '9') {
    tag = 10*tag + line[i] - '0';
    i++;
  }
  i++;
#endif // URL_TAGS
  if (startWith("http://", line+i)) {
    parse(line+i+7);
    // normalize file name
    if (file != NULL && !normalize(file)) {
      delete [] file;
      file = NULL;
      delete [] host;
      host = NULL;
    }
  }
}

/* Constructor : read the url from a file (cf serialize)
 * Format: "<depth> [<tag> ]<host>:<port><file>[ <cookie>]"
 * NOTE(review): `line` is modified in place (':' overwritten) — matches
 * serialize() output, not for arbitrary input.
 */
url::url (char *line) {
  newUrl();
  int i=0;
  // Read depth
  depth = 0;
  while (line[i] >= '0' && line[i] <= '9') {
    depth = 10*depth + line[i] - '0';
    i++;
  }
#ifdef URL_TAGS
  // read tag
  tag = 0; i++;
  while (line[i] >= '0' && line[i] <= '9') {
    tag = 10*tag + line[i] - '0';
    i++;
  }
#endif // URL_TAGS
  int deb = ++i;
  // Read host
  while (line[i] != ':') {
    i++;
  }
  line[i] = 0;
  host = newString(line+deb);
  i++;
  // Read port
  port = 0;
  while (line[i] >= '0' && line[i] <= '9') {
    port = 10*port + line[i] - '0';
    i++;
  }
#ifndef COOKIES
  // Read file name
  file = newString(line+i);
#else // COOKIES
  char *cpos = strchr(line+i, ' ');
  if (cpos == NULL) {
    cookie = NULL;
  } else {
    *cpos = 0;
    // read cookies
    cookie = new char[maxCookieSize];
    strcpy(cookie, cpos+1);
  }
  // Read file name
  file = newString(line+i);
#endif // COOKIES
}

/* constructor used by giveBase */
// Takes ownership of the host and file buffers (no copy is made).
url::url (char *host, uint port, char *file) {
  newUrl();
  initCookie();
  this->host = host;
  this->port = port;
  this->file = file;
}

/* Destructor */
url::~url () {
  delUrl();
  delete [] host;
  delete [] file;
#ifdef COOKIES
  delete [] cookie;
#endif // COOKIES
}

/* Is it a valid url ?
 * valid = has host and file, and the serialized form fits the limits */
bool url::isValid () {
  if (host == NULL) return false;
  int lh = strlen(host);
  return file!=NULL && lh < maxSiteSize
    && lh + strlen(file) + 18 < maxUrlSize;
}

/* print an URL */
void url::print () {
  printf("http://%s:%u%s\n", host, port, file);
}

/* Set depth to max if necessary
 * try to find the ip addr
 * answer false if forbidden by robots.txt, true otherwise
 * Fast-path rejection: reuses the cached dns/robots state of the
 * NamedSite bucket when it already matches this host:port. */
bool url::initOK (url *from) {
#if defined(DEPTHBYSITE) || defined(COOKIES)
  if (strcmp(from->getHost(), host)) { // different site
#ifdef DEPTHBYSITE
    depth = global::depthInSite;
#endif // DEPTHBYSITE
  } else { // same site
#ifdef COOKIES
    // inherit the referrer's cookie within the same site
    if (from->cookie != NULL) {
      cookie = new char[maxCookieSize];
      strcpy(cookie, from->cookie);
    }
#endif // COOKIES
  }
#endif // defined(DEPTHBYSITE) || defined(COOKIES)
  if (depth < 0) {
    errno = tooDeep;
    return false;
  }
  NamedSite *ns = global::namedSiteList + (hostHashCode());
  if (!strcmp(ns->name, host) && ns->port == port) {
    switch (ns->dnsState) {
    case errorDns:
      errno = fastNoDns;
      return false;
    case noConnDns:
      errno = fastNoConn;
      return false;
    case doneDns:
      if (!ns->testRobots(file)) {
        errno = fastRobots;
        return false;
      }
    }
  }
  return true;
}

/* return the base of the url
 * (everything up to and including the last '/'; caller owns the result) */
url *url::giveBase () {
  int i = strlen(file);
  // file always starts with '/', so the backward scan terminates
  assert (file[0] == '/');
  while (file[i] != '/') {
    i--;
  }
  char *newFile = new char[i+2];
  memcpy(newFile, file, i+1);
  newFile[i+1] = 0;
  return new url(newString(host), port, newFile);
}

/** return a char * representation of the url
 * give means that you have to delete the string yourself
 */
char *url::giveUrl () {
  char *tmp;
  int i = strlen(file);
  int j = strlen(host);

  tmp = new char[18+i+j];  // 7 + j + 1 + 9 + i + 1
                           // http://(host):(port)(file)\0
  strcpy(tmp, "http://");
  strcpy (tmp+7, host);
  j += 7;
  if (port != 80) {
    // non-default port: append ":<port>"
    j += sprintf(tmp + j, ":%u", port);
  }
  // Copy file name (backwards, including the terminating '\0' at file[i])
  while (i >= 0) {
    tmp [j+i] = file[i];
    i--;
  }
  return tmp;
}

/** write the url in a buffer
 * buf must be at least of size maxUrlSize
 * returns the size of what has been written (not including '\0')
 */
int url::writeUrl (char *buf) {
  // The default http port (80) is omitted from the representation.
  if (port != 80)
    return sprintf(buf, "http://%s:%u%s", host, port, file);
  return sprintf(buf, "http://%s%s", host, file);
}

/* serialize the url for the Persistent Fifo
 * Format matches the url(char *line) constructor above. */
char *url::serialize () {
  // this buffer is protected by the lock of PersFifo
  static char statstr[maxUrlSize+40+maxCookieSize];
  int pos = sprintf(statstr, "%u ", depth);
#ifdef URL_TAGS
  pos += sprintf(statstr+pos, "%u ", tag);
#endif // URL_TAGS
  pos += sprintf(statstr+pos, "%s:%u%s", host, port, file);
#ifdef COOKIES
  if (cookie != NULL) {
    pos += sprintf(statstr+pos, " %s", cookie);
  }
#endif // COOKIES
  statstr[pos] = '\n';
  statstr[pos+1] = 0;
  return statstr;
}

/* very thread unsafe serialisation in a static buffer */
char *url::getUrl() {
  static char statstr[maxUrlSize+40];
  sprintf(statstr, "http://%s:%u%s", host, port, file);
  return statstr;
}

/* return a hashcode for the host of this url */
uint url::hostHashCode () {
  return siteHashCode (host);
}

/* return a hashcode for this url
 * (mixes port, host and file; reduced modulo hashSize) */
uint url::hashCode () {
  unsigned int h=port;
  unsigned int i=0;
  while (host[i] != 0) {
    h = 31*h + host[i];
    i++;
  }
  i=0;
  while (file[i] != 0) {
    h = 31*h + file[i];
    i++;
  }
  return h % hashSize;
}

/* parses a url : 
 * at the end, arg must have its initial state, 
 * http:// has allready been suppressed
 */
void url::parse (char *arg) {
  int deb = 0, fin = deb;
  // Find the end of host name (put it into lowerCase)
  while (arg[fin] != '/' && arg[fin] != ':' && arg[fin] != 0) {
    fin++;
  }
  if (fin == 0) return;

  // get host name, copied in lower case
  // NOTE(review): the published listing had a garbled loop header
  // ("for (int  i=0; i)"); restored to iterate over the host bytes.
  host = new char[fin+1];
  for (int i=0; i<fin; i++) {
    host[i] = lowerCase(arg[i]);
  }
  host[fin] = 0;

  // get port number
  if (arg[fin] == ':') {
    port = 0;
    fin++;
    while (arg[fin] >= '0' && arg[fin] <= '9') {
      port = port*10 + arg[fin]-'0';
      fin++;
    }
  }

  // get file name
  if (arg[fin] != '/') {
    // www.inria.fr => add the final /
    file = newString("/");
  } else {
    file = newString(arg + fin);
  }
}

/** parse a file with base
 * Resolves a relative url against `base`: absolute paths replace the
 * base's file, others are appended to the base's directory part. */
void url::parseWithBase (char *u, url *base) {
  // cat filebase and file
  if (u[0] == '/') {
    file = newString(u);
  } else {
    uint lenb = strlen(base->file);
    char *tmp = new char[lenb + strlen(u) + 1];
    memcpy(tmp, base->file, lenb);
    strcpy(tmp + lenb, u);
    file = tmp;
  }
  if (!normalize(file)) {
    // bad path: leave the url invalid (host stays NULL)
    delete [] file;
    file = NULL;
    return;
  }
  host = newString(base->host);
  port = base->port;
}

/** normalize file name
 * return true if it is ok, false otherwise (cgi-bin)
 */
bool url::normalize (char *file) {
  return fileNormalize(file);
}

/* Does this url starts with a protocol name
 * (a run of alphanumeric characters followed by ':') */
bool url::isProtocol (char *s) {
  uint i = 0;
  while (isalnum(s[i])) {
    i++;
  }
  return s[i] == ':';
}

#ifdef COOKIES
// Append `s` to the cookie buffer, truncating at maxCookieSize.
#define addToCookie(s) len = strlen(cookie); \
    strncpy(cookie+len, s, maxCookieSize-len); \
    cookie[maxCookieSize-1] = 0;

/* see if a header contain a new cookie
 * Only the part of "Set-Cookie:" before the first ';' is kept. */
void url::addCookie(char *header) {
  if (startWithIgnoreCase("set-cookie: ", header)) {
    char *pos = strchr(header+12, ';');
    if (pos != NULL) {
      int len;
      if (cookie == NULL) {
        cookie = new char[maxCookieSize];
        cookie[0] = 0;
      } else {
        addToCookie("; ");
      }
      // temporarily terminate the header at ';' while copying
      *pos = 0;
      addToCookie(header+12);
      *pos = ';';
    }
  }
}
#endif // COOKIES
View Code

global::namedSiteList

// Global table of sites, indexed by host hash (see siteHashCode):
NamedSite *global::namedSiteList;
namedSiteList = new NamedSite[namedSiteListSize];
// One entry per (hashed) host name; holds the dns state, the robots.txt
// rules and a small fifo of urls waiting for dns/robots resolution.
class NamedSite 
{
    private:
        /* string used for following CNAME chains (just one jump) */
        char *cname;
        /** we've got a good dns answer
        * get the robots.txt */
        void dnsOK ();
        /** Cannot get the inet addr
        * dnsState must have been set properly before the call */
        void dnsErr ();
        /** Delete the old identity of the site */
        void newId ();
        /** put this url in its IPSite */
        void transfer (url *u);
        /** forget this url for this reason */
        void forgetUrl (url *u, FetchError reason);
    public:
        /** Constructor */
        NamedSite ();
        /** Destructor : never used */
        ~NamedSite ();
        /* name of the site */
        char name[maxSiteSize];
        /* port of the site */
        uint16_t port;
        /* numbers of urls in ram for this site */
        uint16_t nburls;
        /* fifo of urls waiting to be fetched */
        url *fifo[maxUrlsBySite];
        uint8_t inFifo;
        uint8_t outFifo;
        void putInFifo(url *u);
        url *getInFifo();
        short fifoLength();
        /** Is this Site in a dnsSites */
        bool isInFifo;
        /** internet addr of this server */
        char dnsState;
        struct in_addr addr;
        uint ipHash;
        /* Date of expiration of dns call and robots.txt fetch */
        time_t dnsTimeout;
        /** test if a file can be fetched thanks to the robots.txt */
        bool testRobots(char *file);
        /* forbidden paths : given by robots.txt */
        Vector<char> forbidden;
        /** Put an url in the fifo
        * If there are too much, put it back in UrlsInternal
        * Never fill totally the fifo => call at least with 1 */
        void putGenericUrl(url *u, int limit, bool prio);
        inline void putUrl (url *u) { putGenericUrl(u, 15, false); }
        inline void putUrlWait (url *u) { putGenericUrl(u, 10, false); }
        inline void putPriorityUrl (url *u) { putGenericUrl(u, 5, true); }
        inline void putPriorityUrlWait (url *u) { putGenericUrl(u, 1, true); }
        /** Init a new dns query */
        void newQuery ();
        /** The dns query ended with success */
        void dnsAns (adns_answer *ans);
        /** we got the robots.txt, transfer what must be in IPSites */
        void robotsResult (FetchError res);
};
View Code
///////////////////////////////////////////////////////////
// class NamedSite
///////////////////////////////////////////////////////////

/** Constructor : initiate fields used by the program
 */
NamedSite::NamedSite () 
{
  name[0] = 0;           // no identity yet
  nburls = 0;
  inFifo = 0; outFifo = 0;  // empty circular fifo
  isInFifo = false;
  dnsState = waitDns;
  cname = NULL;
}

/** Destructor : This one is never used
 */
NamedSite::~NamedSite () {
  assert(false);
}

/* Management of the Fifo (circular buffer of maxUrlsBySite slots) */
void NamedSite::putInFifo(url *u) {
  fifo[inFifo] = u;
  inFifo = (inFifo + 1) % maxUrlsBySite;
  // the fifo must never become completely full (see putGenericUrl limits)
  assert(inFifo!=outFifo);
}

url *NamedSite::getInFifo() {
  assert (inFifo != outFifo);
  url *tmp = fifo[outFifo];
  outFifo = (outFifo + 1) % maxUrlsBySite;
  return tmp;
}

short NamedSite::fifoLength() {
  return (inFifo + maxUrlsBySite - outFifo) % maxUrlsBySite;
}

/* Put an url in the fifo if their are not too many
 * When the site quota (maxUrlsBySite - limit) is exceeded, the url is
 * either forgotten (known-bad dns/robots) or pushed back to a wait fifo;
 * otherwise it is queued on the site or transferred straight to its IPSite. */
void NamedSite::putGenericUrl(url *u, int limit, bool prio) {
  if (nburls > maxUrlsBySite-limit) {
    // Already enough Urls in memory for this Site
    // first check if it can already be forgotten
    if (!strcmp(name, u->getHost())) {
      if (dnsState == errorDns) {
        nburls++;
        forgetUrl(u, noDNS);
        return;
      }
      if (dnsState == noConnDns) {
        nburls++;
        forgetUrl(u, noConnection);
        return;
      }
      if (u->getPort() == port
          && dnsState == doneDns && !testRobots(u->getFile())) {
        nburls++;
        forgetUrl(u, forbiddenRobots);
        return;
      }
    }
    // else put it back in URLsDisk
    refUrl();
    global::inter->getOne();
    if (prio) {
      global::URLsPriorityWait->put(u);
    } else {
      global::URLsDiskWait->put(u);
    }
  } else {
    nburls++;
    if (dnsState == waitDns
        || strcmp(name, u->getHost())
        || port != u->getPort()
        || global::now > dnsTimeout) {
      // dns not done or other site
      putInFifo(u);
      addNamedUrl();
      // Put Site in fifo if not yet in
      if (!isInFifo) {
        isInFifo = true;
        global::dnsSites->put(this);
      }
    } else switch (dnsState) {
    case doneDns:
      transfer(u);
      break;
    case errorDns:
      forgetUrl(u, noDNS);
      break;
    default: // noConnDns
      forgetUrl(u, noConnection);
    }
  }
}

/** Init a new dns query
 * Shortcuts: with a proxy no resolution is needed; a dotted-quad name
 * is converted directly; otherwise an adns query is submitted. */
void NamedSite::newQuery () 
{
    // Update our stats
    newId();
    if (global::proxyAddr != NULL) 
    {
        // we use a proxy, no need to get the sockaddr
        // give anything for going on
        siteSeen();
        siteDNS();
        // Get the robots.txt
        dnsOK();
    } 
    else if (isdigit(name[0])) 
    {
        // the name already in numbers-and-dots notation
        siteSeen();
        if (inet_aton(name, &addr)) 
        {
              // Yes, it is in numbers-and-dots notation
              siteDNS();
              // Get the robots.txt
              dnsOK();
        } 
        else 
        {
            // No, it isn't : this site is a non sense
            dnsState = errorDns;
            dnsErr();
        }
    } 
    else 
    {
        // submit an adns query
        global::nbDnsCalls++;
        adns_query quer = NULL;
        adns_submit(global::ads, name,
                    (adns_rrtype) adns_r_addr,
                    (adns_queryflags) 0,
                    this, &quer);
    }
}

/** The dns query ended with success
 * assert there is a freeConn
 * Follows at most one CNAME jump; on success stores the address and
 * proceeds to fetch robots.txt, otherwise flags a dns error. */
void NamedSite::dnsAns (adns_answer *ans) 
{
    if (ans->status == adns_s_prohibitedcname) 
    {
        if (cname == NULL) 
        {
            // try to find ip for cname of cname
            cname = newString(ans->cname);
            global::nbDnsCalls++;
            adns_query quer = NULL;
            adns_submit(global::ads, cname,
                  (adns_rrtype) adns_r_addr,
                  (adns_queryflags) 0,
                  this, &quer);
        } 
        else 
        {
            // dns chains too long => dns error
            // cf nslookup or host for more information
            siteSeen();
            delete [] cname; cname = NULL;
            dnsState = errorDns;
            dnsErr();
        }
    } 
    else 
    {
        siteSeen();
        if (cname != NULL) 
        { 
            delete [] cname; 
            cname = NULL; 
        }
        if (ans->status != adns_s_ok) 
        {
          // No addr inet
          dnsState = errorDns;
          dnsErr();
        } 
        else 
        {
          siteDNS();
          // compute the new addr
          memcpy (&addr,
                  &ans->rrs.addr->addr.inet.sin_addr,
                  sizeof (struct in_addr));
          // Get the robots.txt
          dnsOK();
        }
    }
}

/** we've got a good dns answer
 * get the robots.txt
 * assert there is a freeConn
 * Builds the HTTP request for /robots.txt (directly or via the proxy)
 * on a free connection; on socket failure records a connection error. */
void NamedSite::dnsOK () {
  Connexion *conn = global::freeConns->get();
  char res = getFds(conn, &addr, port);
  if (res != emptyC) {
    conn->timeout = timeoutPage;
    if (global::proxyAddr != NULL) {
      // use a proxy: request the absolute url
      conn->request.addString("GET http://");
      conn->request.addString(name);
      char tmp[15];
      sprintf(tmp, ":%u", port);
      conn->request.addString(tmp);
      conn->request.addString("/robots.txt HTTP/1.0\r\nHost: ");
    } else {
      // direct connection
      conn->request.addString("GET /robots.txt HTTP/1.0\r\nHost: ");
    }
    conn->request.addString(name);
    conn->request.addString(global::headersRobots);
    conn->parser = new robots(this, conn);
    conn->pos = 0;
    conn->err = success;
    conn->state = res;
  } else {
    // Unable to get a socket
    global::freeConns->put(conn);
    dnsState = noConnDns;
    dnsErr();
  }
}

/** Cannot get the inet addr
 * dnsState must have been set properly before the call
 */
/** Cannot get the inet addr: flush every queued url of this site.
 * dnsState must have been set properly before the call.
 */
void NamedSite::dnsErr () {
  FetchError theErr;
  if (dnsState == errorDns) {
    theErr = noDNS;
  } else {
    theErr = noConnection;
  }
  int ss = fifoLength();
  // scan the queue once
  // (fixed: the loop bound and increment "i<ss; i++" had been lost)
  for (int i=0; i<ss; i++) {
    url *u = getInFifo();
    if (!strcmp(name, u->getHost())) {
      // this url belongs to the failing site: drop it
      delNamedUrl();
      forgetUrl(u, theErr);
    } else { // different name: keep it queued
      putInFifo(u);
    }
  }
  // where should now lie this site
  if (inFifo==outFifo) {
    // queue empty: the site leaves the dns fifo
    isInFifo = false;
  } else {
    // urls for another name remain: requeue for dns resolution
    global::dnsSites->put(this);
  }
}

/** test if a file can be fetched thanks to the robots.txt */
bool NamedSite::testRobots(char *file) {
  uint pos = forbidden.getLength();
  for (uint i=0; i) {
    if (robotsMatch(forbidden[i], file))
      return false;
  }
  return true;
}

/** Delete the old identity of the site */
void NamedSite::newId () {
  // ip expires or new name or just new port
  // Change the identity of this site
#ifndef NDEBUG
  if (name[0] == 0) {
    // slot was unused so far: debug/stat hook counting a new site
    addsite();
  }
#endif // NDEBUG
  // take host and port from the next queued url
  url *u = fifo[outFifo];
  strcpy(name, u->getHost());
  port = u->getPort();
  // the dns answer will be considered valid until this date
  dnsTimeout = global::now + dnsValidTime;
  dnsState = waitDns;
}

/** we got the robots.txt,
 * compute ipHashCode
 * transfer what must be in IPSites
 */
/** We got the robots.txt: compute ipHash and transfer the queued urls
 * of this site to the corresponding IPSite.
 * @param res fetch status of the robots.txt request
 */
void NamedSite::robotsResult (FetchError res) {
  bool ok = res != noConnection;
  if (ok) {
    dnsState = doneDns;
    // compute ip hashcode
    if (global::proxyAddr == NULL) {
      ipHash=0;
      char *s = (char *) &addr;
      for (uint i=0; i<sizeof(struct in_addr); i++) {
        ipHash = ipHash*31 + s[i];
      }
    } else {
      // no ip and need to avoid rapidFire => use hostHashCode
      ipHash = this - global::namedSiteList;
    }
    ipHash %= IPSiteListSize;
  } else {
    dnsState = noConnDns;
  }
  int ss = fifoLength();
  // scan the queue once
  // (fixed: the loop bound and increment "i<ss; i++" had been lost)
  for (int i=0; i<ss; i++) {
    url *u = getInFifo();
    if (!strcmp(name, u->getHost())) {
      delNamedUrl();
      if (ok) {
        if (port == u->getPort()) {
          // same identity: hand the url over to the IPSite layer
          transfer(u);
        } else {
          // same host but other port: keep for a later identity change
          putInFifo(u);
        }
      } else {
        forgetUrl(u, noConnection);
      }
    } else {
      putInFifo(u);
    }
  }
  // where should now lie this site
  if (inFifo==outFifo) {
    isInFifo = false;
  } else {
    global::dnsSites->put(this);
  }  
}

/** Hand an url over to the IPSite layer, unless robots.txt forbids it. */
void NamedSite::transfer (url *u) {
  if (!testRobots(u->getFile())) {
    // the file matches a forbidden prefix: drop the url
    forgetUrl(u, forbiddenRobots);
    return;
  }
  if (global::proxyAddr == NULL) {
    // direct connection: store the resolved address inside the url
    memcpy(&u->addr, &addr, sizeof(struct in_addr));
  }
  global::IPSiteList[ipHash].putUrl(u);
}

void NamedSite::forgetUrl (url *u, FetchError reason) {
  // Drop an url definitively: record the failure, free the url and
  // release one slot in the global inter-module counter.
  urls();                  // debug/stat hook (utils/debug.h)
  fetchFail(u, reason);    // user callback for a failed fetch
  answers(reason);         // debug/stat hook for this kind of answer
  nburls--;
  delete u;
  global::inter->getOne(); // free a slot for the sequencer
}
View Code

 

其中两个类的定义

larbin-2.6.3/src/utils/PersistentFifo.h、larbin-2.6.3/src/utils/PersistentFifo.cc

// Larbin
// Sebastien Ailleret
// 06-01-00 -> 12-06-01

/* this fifo is stored on disk */

#ifndef PERSFIFO_H
#define PERSFIFO_H

#include 
#include 
#include 
#include 
#include 
#include 
#include <string.h>

#include "types.h"
#include "utils/url.h"
#include "utils/text.h"
#include "utils/connexion.h"
#include "utils/mypthread.h"

/** A fifo of urls stored on disk as a set of numbered files sharing a
 *  common prefix: urls are appended to the newest file (wfds) and read
 *  back from the oldest one (rfds); fully consumed files are deleted.
 */
class PersistentFifo 
{
    protected:
        // logical positions: in counts puts, out counts gets (rebased per file)
        uint in, out;
        #ifdef THREAD_OUTPUT
        pthread_mutex_t lock;
        #endif
        // strlen(baseName) + 5: offset of the last digit of the file number
        uint fileNameLength;
        // numbers of the file used for writing (fin) and reading (fout)
        int fin, fout;
        // current file name: prefix followed by a 6-digit number
        char *fileName;

    protected:
        // Make fileName fit with this number
        void makeName(uint nb);
        // Parse the number suffix of a fifo file name
        int getNumber(char *file);
        // Change the file used for reading
        void updateRead ();
        // Change the file used for writing
        void updateWrite ();

    protected:
        // buffer used for writeUrl (buffered output)
        char outbuf[BUF_SIZE];
        // number of chars used in this buffer
        uint outbufPos;
        // buffer used for readLine
        char buf[BUF_SIZE];
        // read position and end of valid data in buf
        uint bufPos, bufEnd;
        // file descriptors (not sockets) for reading and writing
        int rfds, wfds;
    protected:
        // read a line on rfds
        char *readLine ();
        // write an url in the out file (buffered write)
        void writeUrl (char *s);
        // Flush the out Buffer in the outFile
        void flushOut ();

    public:
        PersistentFifo (bool reload, char *baseName);
        ~PersistentFifo ();

        /* get the first object (non totally blocking)
        * return NULL if there is none
        */
        url *tryGet ();

        /* get the first object (non totally blocking)
        * probably crash if there is none
        */
        url *get ();

        /* add an object in the fifo (the fifo takes ownership and deletes it) */
        void put (url *obj);

        /* how many items are there inside ? */
        int getLength ();
};

#endif // PERSFIFO_H
View Code
// Larbin
// Sebastien Ailleret
// 27-05-01 -> 04-01-02

#include <string.h>
#include 
#include 
#include 
#include <string.h>
#include 
#include 

#include "types.h"
#include "global.h"
#include "utils/mypthread.h"
#include "utils/PersistentFifo.h"

/** Build the fifo: either reload the files left by a previous crawl
 *  (reload == true) or start from scratch, deleting any old fifo files.
 *  @param baseName common prefix of the on-disk fifo files
 */
PersistentFifo::PersistentFifo (bool reload, char *baseName) 
{
  // room for the 6-digit number suffix written by makeName (+ terminator)
  fileNameLength = strlen(baseName)+5;
  fileName = new char[fileNameLength+2];
  strcpy(fileName, baseName);
  fileName[fileNameLength+1] = 0;
  outbufPos = 0;
  bufPos = 0;
  bufEnd = 0;
  mypthread_mutex_init(&lock, NULL);
  if (reload) 
  {
    // scan the working directory for files left by the previous crawl
    DIR *dir = opendir(".");
    struct dirent *name;

    fin = -1;
    fout = -1;
    name = readdir(dir);
    while (name != NULL) 
    {
      if (startWith(fileName, name->d_name)) 
      {
        int tmp = getNumber(name->d_name);
        if (fin == -1) 
        {
          fin = tmp;
          fout = tmp;
        } 
        else 
        {
          // fin tracks the highest number (write side),
          // fout the lowest (read side)
          if (tmp > fin)  { fin = tmp; }
          if (tmp < fout) { fout = tmp; }
        }
      }
      name = readdir(dir);
    }
    if (fin == -1) 
    {
      // no previous fifo file found: start numbering at 0
      fin = 0;
      fout = 0;
    }
    if (fin == fout && fin != 0) 
    {
      // only one non-initial file left: not enough state to reload
      cerr << "previous crawl was too little, cannot reload state\n"
           << "please restart larbin with -scratch option\n";
      exit(1);
    }
    closedir(dir);
    // each fully written intermediate file holds urlByFile urls
    in = (fin - fout) * urlByFile;
    out = 0;
    makeName(fin);
    wfds = creat (fileName, S_IRUSR | S_IWUSR);
    makeName(fout);
    rfds = open (fileName, O_RDONLY);
  } 
  else 
  {
    // Delete old fifos
    DIR *dir = opendir(".");
    struct dirent *name;
    name = readdir(dir);
    while (name != NULL) 
    {
      if (startWith(fileName, name->d_name)) 
      {
        unlink(name->d_name);
      }
      name = readdir(dir);
    }
    closedir(dir);

    fin = 0;
    fout = 0;
    in = 0;
    out = 0;
    makeName(0);
    wfds = creat (fileName, S_IRUSR | S_IWUSR);
    rfds = open (fileName, O_RDONLY);
  }
}

/** Release the fifo's resources: both file descriptors and the mutex. */
PersistentFifo::~PersistentFifo () 
{
  close(rfds);
  close(wfds);
  mypthread_mutex_destroy (&lock);
}

/** Pop the oldest url, or return NULL when the fifo is empty. */
url *PersistentFifo::tryGet () 
{
  mypthread_mutex_lock(&lock);
  url *res = NULL;
  if (in != out) 
  {
    // at least one url is stored: read and deserialize it
    res = new url(readLine());
    out++;
    updateRead();
  }
  mypthread_mutex_unlock(&lock);
  return res;
}

/** Pop the oldest url. Assumes the fifo is non-empty
 *  (readLine keeps retrying on an empty file — see header comment). */
url *PersistentFifo::get () 
{
  mypthread_mutex_lock(&lock);
  url *res = new url(readLine());
  ++out;
  updateRead();
  mypthread_mutex_unlock(&lock);
  return res;
}

/** Put something in the fifo
 * The objet is then deleted
 */
/** Append an url to the fifo.
 *  The fifo takes ownership: the object is deleted after serialization. */
void PersistentFifo::put (url *obj) 
{
  mypthread_mutex_lock(&lock);
  // serialize() returns a statically allocated string
  writeUrl(obj->serialize());
  ++in;
  updateWrite();
  mypthread_mutex_unlock(&lock);
  delete obj;
}
/** Number of urls currently stored: puts minus gets. */
int PersistentFifo::getLength () 
{
  int stored = in - out;
  return stored;
}

/** Write the zero-padded 6-digit file number into the last six
 *  characters of fileName (positions fileNameLength-5 .. fileNameLength).
 *  Fixed: the index is now signed — with a uint counter, i-- past 0
 *  would wrap around and loop forever whenever fileNameLength < 5
 *  (impossible with a non-empty baseName, but a latent trap).
 */
void PersistentFifo::makeName (uint nb) 
{
  for (int i=(int)fileNameLength; i>=(int)fileNameLength-5; i--) 
  {
    fileName[i] = (nb % 10) + '0';
    nb /= 10;
  }
}

/** Parse the 6-digit numeric suffix of a fifo file name.
 *  Fixed: len is now signed and guarded — with uint arithmetic,
 *  len-6 would wrap for a name shorter than six characters and the
 *  loop would read far out of bounds.
 */
int PersistentFifo::getNumber (char *file) 
{
  int len = strlen(file);
  assert(len >= 6);
  int res = 0;
  for (int i=len-6; i<=len-1; i++) 
  {
    res = (res * 10) + file[i] - '0';
  }
  return res;
}

/** Called after each get: once a whole file (urlByFile urls) has been
 *  consumed, delete it and switch reading to the next numbered file.
 */
void PersistentFifo::updateRead () 
{
  if ((out % urlByFile) == 0) 
  {
    close(rfds);
    makeName(fout);
    unlink(fileName);          // the consumed file is no longer needed
    makeName(++fout);
    rfds = open(fileName, O_RDONLY);
    in -= out;                 // rebase counters so out restarts at 0
    out = 0;
    assert(bufPos == bufEnd);  // read buffer must be empty at a file boundary
  }
}

/** Called after each put: once the current output file holds urlByFile
 *  urls, flush it and switch writing to a fresh numbered file.
 */
void PersistentFifo::updateWrite () 
{
  if ((in % urlByFile) == 0) 
  {
    flushOut();
    close(wfds);
    makeName(++fin);
    wfds = creat(fileName, S_IRUSR | S_IWUSR);
#ifdef RELOAD
    // checkpoint the hashtables so a later crawl can reload this state
    global::seen->save();
#ifdef NO_DUP
    global::hDuplicate->save();
#endif
#endif
  }
}

/* read a line from the file
 * uses a buffer
 */
/* Read one '\n'-terminated line from rfds through an internal buffer.
 * Returns a pointer into buf, NUL-terminated at the newline; the data
 * is only valid until the next call.
 */
char *PersistentFifo::readLine () {
  if (bufPos == bufEnd) {
    // buffer fully consumed: restart from the beginning
    bufPos = 0; bufEnd = 0; buf[0] = 0;
  }
  char *posn = strchr(buf + bufPos, '\n');
  while (posn == NULL) {
    if (!(bufEnd - bufPos < maxUrlSize + 40 + maxCookieSize)) {
      // line longer than any legal serialized url: report it.
      // Fixed: the data was previously passed as the printf FORMAT
      // string — a format-string bug when a url contains '%'.
      printf("%s", fileName);
      printf("%s", buf+bufPos);
    }
    if (bufPos*2 > BUF_SIZE) {
      // reclaim the consumed front half of the buffer
      bufEnd -= bufPos;
      memmove(buf, buf+bufPos, bufEnd);
      bufPos = 0;
    }
    int postmp = bufEnd;
    bool noRead = true;
    while (noRead) {
      int rd = read(rfds, buf+bufEnd, BUF_SIZE-1-bufEnd);
      switch (rd) {
      case 0 :
        // We need to flush the output in order to read it
        flushOut();
        break;
      case -1 :
        // We have a trouble here
        if (errno != EINTR) {
          cerr << "Big Problem while reading (persistentFifo.h)\n";
          perror("reason");
          assert(false);
        } else {
          perror("Warning in PersistentFifo: ");
        }
        break;
      default:
        noRead = false;
        bufEnd += rd;
        buf[bufEnd] = 0;   // keep buf NUL-terminated for strchr
        break;
      }
    }
    posn = strchr(buf + postmp, '\n');
  }
  *posn = 0;               // cut the line at the newline
  char *res = buf + bufPos;
  bufPos = posn + 1 - buf;
  return res;
}

// write an url in the out file (buffered write)
// write an url in the out file (buffered write)
void PersistentFifo::writeUrl (char *s) {
  size_t len = strlen(s);
  assert(len < maxUrlSize + 40 + maxCookieSize);
  if (outbufPos + len >= BUF_SIZE) {
    // not enough room left: drain the buffer first (resets outbufPos to 0)
    flushOut ();
  }
  memcpy(outbuf + outbufPos, s, len);
  outbufPos += len;
}

// Flush the out Buffer in the outFile
// Flush the out Buffer in the outFile
void PersistentFifo::flushOut () {
  // ecrireBuff (project utils) writes outbufPos bytes of outbuf to wfds
  ecrireBuff (wfds, outbuf, outbufPos);
  outbufPos = 0;
}
View Code

Larbin-2.6.3/src/utils/syncFifo.h

// Larbin
// Sebastien Ailleret
// 09-11-99 -> 07-12-01

/* fifo in RAM with synchronisations */

#ifndef SYNCFIFO_H
#define SYNCFIFO_H

#define std_size 100

#include "utils/mypthread.h"

template <class T>
class SyncFifo {
 protected:
  // ring-buffer positions: in = next write slot, out = next read slot
  uint in, out;
  // current capacity of tab (doubled by put() when the fifo fills up)
  uint size;
  // circular array of (non-owned) pointers
  T **tab;
#ifdef THREAD_OUTPUT
  pthread_mutex_t lock;
  pthread_cond_t nonEmpty;
#endif

 public:
  /* Specific constructor */
  SyncFifo (uint size = std_size);

  /* Destructor (does not delete the stored objects) */
  ~SyncFifo ();

  /* get the first object (blocks while the fifo is empty) */
  T *get ();

  /* get the first object (non totally blocking)
   * return NULL if there is none
   */
  T *tryGet ();

  /* add an object in the Fifo */
  void put (T *obj);

  /* how many items are there inside ? */
  int getLength ();
};

/** Build an empty fifo with room for `size` pointers.
 *  Fixed: out-of-class template definitions require the SyncFifo<T>::
 *  qualifier, which had been lost (the bare SyncFifo:: does not compile).
 */
template <class T>
SyncFifo<T>::SyncFifo (uint size) {
  tab = new T*[size];
  this->size = size;
  in = 0;
  out = 0;
  mypthread_mutex_init (&lock, NULL);
  mypthread_cond_init (&nonEmpty, NULL);
}

/** Free the pointer array (the stored objects are NOT deleted).
 *  Fixed: restored the SyncFifo<T>:: qualifier lost in extraction.
 */
template <class T>
SyncFifo<T>::~SyncFifo () {
  delete [] tab;
  mypthread_mutex_destroy (&lock);
  mypthread_cond_destroy (&nonEmpty);
}

template <class T>
T *SyncFifo::get () {
  T *tmp;
  mypthread_mutex_lock(&lock);
  mypthread_cond_wait(in == out, &nonEmpty, &lock);
  tmp = tab[out];
  out = (out + 1) % size;
  mypthread_mutex_unlock(&lock);
  return tmp;
}

template <class T>
T *SyncFifo::tryGet () {
  T *tmp = NULL;
  mypthread_mutex_lock(&lock);
  if (in != out) {
    // The stack is not empty
    tmp = tab[out];
    out = (out + 1) % size;
  }
  mypthread_mutex_unlock(&lock);
  return tmp;
}

template <class T>
void SyncFifo::put (T *obj) {
  mypthread_mutex_lock(&lock);
  tab[in] = obj;
  if (in == out) {
    mypthread_cond_broadcast(&nonEmpty);
  }
  in = (in + 1) % size;
  if (in == out) {
    T **tmp;
    tmp = new T*[2*size];
    for (uint i=out; i) {
      tmp[i] = tab[i];
    }
    for (uint i=0; i<in; i++) {
      tmp[i+size] = tab[i];
    }
    in += size;
    size *= 2;
    delete [] tab;
    tab = tmp;
  }
  mypthread_mutex_unlock(&lock);
}

template <class T>
int SyncFifo::getLength () {
  int tmp;
  mypthread_mutex_lock(&lock);
  tmp = (in + size - out) % size;
  mypthread_mutex_unlock(&lock);
  return tmp;
}

#endif // SYNCFIFO_H
View Code

 

你可能感兴趣的:(爬虫Larbin解析(二)——sequencer())