
protlib - Easily implement binary network protocols

protlib builds on the struct and SocketServer modules in the standard library to make it easy to implement binary network protocols. It provides support for default and constant struct fields, nested structs, arrays of structs, better handling for strings and arrays, struct inheritance, and convenient syntax for instantiating and using your custom structs.

Here’s an example of defining, instantiating, writing, and reading a struct using file i/o:

 
from protlib import *
class Point(CStruct):
    x = CInt()
    y = CInt()

p1 = Point(5, 6)
p2 = Point(x=5, y=6)
p3 = Point(y=6, x=5)
assert p1 == p2 == p3

with open("point.dat", "wb") as f:
    f.write( p1.serialize() )

with open("point.dat", "rb") as f:
    p4 = Point.parse(f)

assert p1 == p4
 
    

The socket.makefile method lets you use this same file i/o approach with sockets.
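As a rough illustration of the socket.makefile approach, here is a minimal sketch that uses only the standard library rather than protlib itself. The POINT_FORMAT layout (two big-endian 32-bit ints) and the roundtrip_point helper are assumptions made for this example:

```python
import socket
import struct

POINT_FORMAT = ">ii"  # assumed layout: two big-endian 32-bit ints for x and y
POINT_SIZE = struct.calcsize(POINT_FORMAT)

def roundtrip_point(x, y):
    """Send one packed point across a local socket pair and read it back."""
    left, right = socket.socketpair()
    try:
        # makefile() wraps each socket in a file-like object with read()/write()
        with left.makefile("wb") as writer, right.makefile("rb") as reader:
            writer.write(struct.pack(POINT_FORMAT, x, y))
            writer.flush()  # buffered writers hold data until flushed
            return struct.unpack(POINT_FORMAT, reader.read(POINT_SIZE))
    finally:
        left.close()
        right.close()

print(roundtrip_point(5, 6))  # (5, 6)
```

Anything that accepts a file-like object with a read method, such as the parse methods described in this documentation, should be usable with the reader object in the same way.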

Installation

protlib is free under the BSD license. It requires Python 2.6 or later and has no other dependencies. Because protlib supports Python 3, the code snippets in this documentation are copied from a Python 3 interpreter.

You may click here to download protlib. You may also run easy_install protlib if you have EasyInstall on your system. The project page for protlib in the Cheese Shop (aka the Python Package Index or PyPI) may be found here.

You may also check out the development version of protlib with this command:

svn checkout http://courtwright.org/svn/protlib

You may download older versions of protlib and view older versions of the protlib documentation here.

Data Types

class CType ( **kwargs )

This is the root class of all classes representing C data types in the protlib library. It may not be directly instantiated; you must always use one of its subtypes instead. A CType accepts the following keyword arguments:

length: Only valid for the CString, CUnicode, and CArray data types, for which it is required. This may be one of three things: an integer which represents the length of the string; the special value protlib.AUTOSIZED, which indicates that the string is null-terminated and can be any size; or a string denoting the field where the actual length value may be found. For example:

 
>>> from protlib import *
>>> class Person(CStruct):
...     state    = CString(length = 2)
...     name_len = CShort()
...     name     = CString(length = "name_len")
...
>>> Person(state="VA", name_len=3, name="Eli")
Person(state=b'VA', name_len=3, name=b'Eli')
>>>
>>> class Person(CStruct):
...     state = CString(length = 2)
...     name  = CString(length = AUTOSIZED)
...
>>> Person.parse(b"VAEli\0")
Person(state=b'VA', name=b'Eli')
>>> Person(state="VA", name="Eli").serialize()
b'VAEli\x00'
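To make the two variable-length layouts concrete, the following sketch parses the same kind of data by hand with the standard struct module. It is not protlib's implementation; the helper names and the big-endian byte order are assumptions for illustration:

```python
import struct

def parse_length_prefixed(buf):
    """Parse a 2-byte state, a big-endian short length, then that many name bytes."""
    state, name_len = struct.unpack_from(">2sh", buf, 0)
    offset = struct.calcsize(">2sh")
    name = buf[offset:offset + name_len]
    return state, name_len, name

def parse_null_terminated(buf):
    """Parse a 2-byte state followed by a null-terminated name."""
    state = buf[:2]
    end = buf.index(b"\0", 2)  # raises ValueError if no terminator is present
    return state, buf[2:end]

print(parse_length_prefixed(b"VA\x00\x03Eli"))  # (b'VA', 3, b'Eli')
print(parse_null_terminated(b"VAEli\0"))        # (b'VA', b'Eli')
```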
 
        

always: Use this to set a constant value for a field. You won’t need to specify this value, and a CWarning will be triggered if this field is ever assigned a different value. For example:

 
>>> import warnings
>>> warnings.simplefilter("always")
>>>
>>> from protlib import *
>>> class OriginPoint(CStruct):
...     x = CInt(always = 0)
...     y = CInt(always = 0)
...
>>>
>>> op1 = OriginPoint()
>>> op1
OriginPoint(x=0, y=0)
>>> op1.x = 5
/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py:733: CWarning: OriginPoint.x should always be 0 but was given a value of 5
  warn("{0}.{1} should always be {2!r} but was given a value of {3!r}".format(self.__class__.__name__, name, field.always, value), CWarning)
>>>
>>> buf = op1.serialize()
>>> op2 = OriginPoint.parse(buf)
/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py:733: CWarning: OriginPoint.x should always be 0 but was given a value of 5
  warn("{0}.{1} should always be {2!r} but was given a value of {3!r}".format(self.__class__.__name__, name, field.always, value), CWarning)
>>>
>>> assert op1 == op2
 
        

default: Like the always parameter, except that no warnings are raised when a different value is parsed or serialized. Also, a default parameter may be either a value or a callable object. For example:

 
>>> from protlib import *
>>> class Point(CStruct):
...     x = CInt(default = 0)
...     y = CInt(default = lambda: 5)
...
>>> p = Point()
>>> p
Point(x=0, y=5)
 
        

full_string: Unlike the struct module, protlib right-strips strings when they’re parsed, starting with the first null byte. This default behavior can be overridden by setting this parameter to True. For example:

 
>>> raw = b"foo\0\0"
>>> import struct
>>> s = struct.unpack(b"5s", raw)[0]
>>> assert s == b"foo\0\0"
>>>
>>> from protlib import *
>>> s = CString(length = 5).parse(raw)
>>> assert s == b"foo"
>>>
>>> raw = b"foo\0!"
>>> s = CString(length = 5).parse(raw)
>>> assert s == b"foo"
>>>
>>> raw = b"foo\0!"
>>> s = CString(length = 5, full_string = True).parse(raw)
>>> assert s == b"foo\0!"
 
        

encoding: This is required for CUnicode objects but invalid for all other types. It specifies the encoding to use when translating to and from unicode and raw bytes. For example:

 
>>> from protlib import *
>>> CUnicode(length=6, encoding="utf8").serialize("andré")
b'andr\xc3\xa9'
>>> assert "andré" == CUnicode(length=6, encoding="utf8").parse(b"andr\xc3\xa9")

enc_errors: This optional parameter is only valid for CUnicode objects. It defines how encoding and decoding errors are handled, e.g. by being passed as the errors argument to the unicode builtin. If omitted, it defaults to “strict”. For example:

 
>>> CUnicode(length=3, encoding="utf8", enc_errors="ignore").serialize(b"\x80")
b'\x00\x00\x00'
>>> CUnicode(length=3, encoding="utf8", enc_errors="replace").serialize(b"\x80")
b'\xef\xbf\xbd'
>>> CUnicode(length=3, encoding="utf8", enc_errors="strict").serialize(b"\x80")
Traceback (most recent call last):
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 374, in serialize
    encoded = self.convert(val).encode(self.encoding, self.enc_errors)
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 274, in convert
    return x if isinstance(x, str) else str(_to_bytes(x), self.encoding, self.enc_errors)
  File "/home/eli/protlib/examples/../env3/lib/python3.1/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: unexpected code byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "", line 1, in 
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 378, in serialize
    raise CError(cerror).with_traceback(tb)
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 374, in serialize
    encoded = self.convert(val).encode(self.encoding, self.enc_errors)
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 274, in convert
    return x if isinstance(x, str) else str(_to_bytes(x), self.encoding, self.enc_errors)
  File "/home/eli/protlib/examples/../env3/lib/python3.1/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
protlib.CError: unicode error serializing b'\x80': 'utf8' codec can't decode byte 0x80 in position 0: unexpected code byte
 
        

Warning

The length parameter of the CUnicode class indicates the max length of the raw serialized bytes of the CUnicode field. It does not indicate the number of unicode characters. For example, a 5-character unicode string might serialize to more than 5 bytes:

 
>>> from __future__ import unicode_literals
>>> from protlib import *
>>> CUnicode(length=5, encoding="utf8").serialize("andré")
/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py:384: CWarning: CUnicode value has length 5 and was told to serialize an encoded string of length 6 b'andr\xc3\xa9'
  warn("CUnicode value has length {0} and was told to serialize an encoded string of length {1} {2!r}".format(self.real_length(cstruct), len(encoded), encoded), CWarning)
b'andr\xc3'
 
       

Warning

Some unicode character encodings commonly contain null bytes, which makes it inadvisable to use those encodings with an AUTOSIZED string. For example:

 
>>> from protlib import *
>>> s = "Hello World".encode("utf-32")
>>> s.count(b"\0")
35
>>> CUnicode(length=AUTOSIZED, encoding="utf-32").parse(s)
Traceback (most recent call last):
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 366, in parse
    return s.decode(self.encoding, self.enc_errors)
UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-1: truncated data
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "", line 1, in 
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 370, in parse
    raise CError(cerror).with_traceback(tb)
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 366, in parse
    return s.decode(self.encoding, self.enc_errors)
protlib.CError: unicode error parsing b'\xff\xfe': 'utf32' codec can't decode bytes in position 0-1: truncated data
 
       

sizeof

The size of the packed binary data representing this CType. Note that this is a classmethod for subclasses of CStruct.

struct_format

The format string used by the underlying struct module to represent the packed binary data format. Note that this is a classmethod for subclasses of CStruct.

parse ( f )

Accepts either a string or a file-like object (anything with a read method) and returns a Python object with the appropriate value.

 
>>> raw = b"\x00\x00\x00\x05"
>>> i = CInt().parse(raw)
>>> assert i == 5

Note that this is a classmethod on subclasses of CStruct.

serialize ( x )

Serializes the value according to the specific CType class. Note that this takes no argument when called on a CStruct instance.

Basic Data Types

Because protlib is built on top of the struct module, each basic data type in protlib uses a struct format string. The full list of struct format strings is in the struct module documentation, and the protlib types which use them are listed below. These sizes are constant on all processor architectures by default, but this will change if you change the value of protlib.BYTE_ORDER.

C data type	protlib class	struct format string	size in bytes
char	CChar	b	1
unsigned char	CUChar	B	1
short	CShort	h	2
unsigned short	CUShort	H	2
int	CInt	i	4
unsigned int	CUInt	I	4
long	CLong	q	8
unsigned long	CULong	Q	8
float	CFloat	f	4
double	CDouble	d	8
char[]	CString	Xs (e.g. 5s for char[5])	1 * length
char[]	CUnicode	Xs (e.g. 5s for char[5])	1 * length
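The sizes in this table can be checked directly against the struct module. The sketch below assumes that protlib.BYTE_ORDER holds a struct byte-order prefix such as ">" (the tracebacks in this documentation show it being prepended to the format string); the EXPECTED_SIZES dictionary and the standard_sizes helper are names invented for this example:

```python
import struct

# Format characters taken from the table above, with the size each should have.
EXPECTED_SIZES = {"b": 1, "B": 1, "h": 2, "H": 2, "i": 4, "I": 4,
                  "q": 8, "Q": 8, "f": 4, "d": 8}

def standard_sizes(byte_order=">"):
    """Return the packed size of each format character under a byte-order prefix."""
    return {fmt: struct.calcsize(byte_order + fmt) for fmt in EXPECTED_SIZES}

# The explicit byte-order prefixes all use standard sizes without padding.
assert standard_sizes(">") == EXPECTED_SIZES
assert standard_sizes("<") == EXPECTED_SIZES
assert standard_sizes("!") == EXPECTED_SIZES

print(struct.calcsize(">bi"))  # 5: standard sizes, no padding between fields
# With native alignment ("@") padding may be inserted, so this value
# depends on the platform and is deliberately not asserted here.
print(struct.calcsize("@bi"))
```

This is why switching to a native byte order can change the size of a struct even though the individual field types stay the same.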

Creating Custom CTypes

Some projects might require you to write custom parsing and serializing code; protlib makes this easy by allowing you to subclass CType classes. Here’s an example, which you can find in examples/ctype_subclassing/testing.py:

 
import json
from protlib import *

class JsonCString(CString):
    def parse(self, f, cstruct=None):
        return json.loads(CString.parse(self, f, cstruct).decode("utf8"))
    
    def serialize(self, s, cstruct=None):
        return CString.serialize(self, json.dumps(s).encode("utf8"), cstruct)
    
    def convert(self, x):
        return x

class Person(CStruct):
    name = CUnicode(encoding = "utf8", length = 6)
    data = JsonCString(length = AUTOSIZED)

eli = Person("Eli", {"age": 28})
assert eli.data == {"age": 28}
assert eli.serialize() == b'Eli\0\0\0{"age": 28}\0'
 
     

This code works in both Python 2 and Python 3 and demonstrates the three methods you can override to define your custom parsing and serialization:

The parse method calls json.loads, which requires a unicode string in Python 3 but can take either a unicode or a regular byte string in Python 2. Since CString.parse returns a byte string, we make sure to decode it before passing it to json.loads.
The serialize method calls json.dumps, which returns a unicode string in Python 3 and a byte string in Python 2. Because the serialize method must return a byte string, we always encode our result; on Python 3 this will encode as we expect, and when called on a regular byte string in Python 2 the original string is returned.
The convert method defines what happens when we assign a value to our struct field. As mentioned in the CStruct.__setattr__ documentation below, protlib automatically does type coercion, so if you assign 5 to a CString field it will be converted to "5", etc. This behavior is defined by the convert method, and in our case if someone assigns a value to a JsonCString field we don’t want that value to be converted to a string as it would for a regular CString, so we simply return the object unchanged.

Arrays

class CArray ( length, ctype )

You can make an array of any CType. Arrays pack and unpack to and from Python lists. For example:

 
>>> ca = CArray(5, CInt)
>>> raw = ca.serialize( [5,6,7,8,9] )
>>> xs = ca.parse(raw)
>>> assert xs == [5,6,7,8,9]
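For comparison, here is roughly what the same round trip looks like when written by hand with the struct module. This is only a sketch: the helper names are invented here, and big-endian 32-bit ints are assumed to match the CInt examples in this documentation:

```python
import struct

def pack_int_array(values):
    """Pack a list of ints as consecutive big-endian 32-bit integers."""
    return struct.pack(">{0}i".format(len(values)), *values)

def unpack_int_array(raw, count):
    """Unpack `count` big-endian 32-bit integers back into a list."""
    return list(struct.unpack(">{0}i".format(count), raw))

raw = pack_int_array([5, 6, 7, 8, 9])
assert len(raw) == 20  # 5 elements * 4 bytes each
assert unpack_int_array(raw, 5) == [5, 6, 7, 8, 9]
```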

Arrays may either be given default/always values themselves or use the default/always values of the CType they are given. For example:

 
>>> class Triangle(CStruct):
...     xcoords = CArray(3, CInt(default=0))
...     ycoords = CArray(3, CInt, default=[0,0,0])
...
>>> tri = Triangle()
>>> assert tri.xcoords == tri.ycoords == [0,0,0]
 
       

Nested arrays work as you might expect:

 
>>> class Matrix(CStruct):
...     xs = CArray(3, CArray(2, CInt(default=0)))
...
>>> assert Matrix().xs == [[0,0], [0,0], [0,0]]

Custom Structs

class CStruct

This should never be instantiated directly. Instead, you should subclass this when defining a custom struct. Your subclass will be given a constructor which takes the fields of your struct as positional and/or keyword arguments. However, you don’t have to provide values for your fields at this time. For example:

 
>>> class Point(CStruct):
...     x = CInt()
...     y = CInt()
...
>>> p1 = Point(5, 6)
>>> p2 = Point()
>>> p2.x = 5
>>> p2.y = 6
>>> assert p1 == p2
 
       

classmethod sizeof ( cstruct = None )

Returns the size of the packed binary data needed to hold this CStruct. This method takes no arguments on a fixed-size struct, but if any of this struct’s fields has a variable length, this method will throw an exception if called with no arguments. You can pass an instance of this CStruct to get the size of that particular instance, for example:

 
>>> from protlib import *
>>>
>>> class Person(CStruct):
...     name = CString(length = 5)
...
>>> Person.sizeof()
5
>>>
>>> class Person(CStruct):
...     name = CString(length = AUTOSIZED)
...
>>> Person.sizeof()
Traceback (most recent call last):
  File "", line 1, in 
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 654, in sizeof
    return cls.get_type(cached=True).sizeof(cstruct)
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 254, in sizeof
    return struct.calcsize(BYTE_ORDER + self.struct_format(cstruct))
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 486, in struct_format
    return b"".join(ctype.struct_format(cstruct) for name,ctype in self.subclass.get_fields())
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 486, in 
    return b"".join(ctype.struct_format(cstruct) for name,ctype in self.subclass.get_fields())
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 245, in struct_format
    CString:  _to_bytes("{0}s".format(self.real_length(cstruct))),
  File "/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py", line 210, in real_length
    raise CError("cstruct not provided to resolve variable-length field with length attribute {0!r}".format(self.length))
protlib.CError: cstruct not provided to resolve variable-length field with length attribute 'AUTOSIZED'
>>>
>>> eli = Person(name = "Eli")
>>> Person.sizeof(eli)
4
 
         

classmethod parse ( f )

Accepts a string or file-like object and returns an instance of this CStruct drawn from that data source.

serialize ( )

Returns the packed binary data representing this CStruct. This is what should be written to files and sockets.

__str__ ( )

Alias for __repr__

__repr__ ( )

Returns a literal representation of the CStruct. For example:

 
>>> class Point(CStruct):
...     x = CInt()
...     y = CInt()
...
>>> p = Point(x=5, y=6)
>>> p
Point(x=5, y=6)
 
         

__setattr__ ( self, name, val )

When you assign a value to one of a struct’s fields, protlib converts the value to the data type of that field. For example:

 
>>> class Point(CStruct):
...     code = CChar()
...     x = CInt()
...     y = CInt()
...
>>> p = Point(code="A", x="5")
>>> assert p.code == ord("A") == 65
>>> assert p.x == 5
>>>
>>> p.y = 6.25
/home/eli/protlib/env3/lib/python3.1/site-packages/protlib-1.4-py3.1.egg/protlib.py:746: CWarning: Loss of precision when converting a float (6.25) to an integer field
  warn("Loss of precision when converting a float ({0}) to an integer field".format(x), CWarning)
>>> assert p.y == 6
 
         

classmethod get_type ( **kwargs )

Returns an object which may be used to declare a CStruct as a field in another CStruct. This accepts the same default and always parameters as the CType constructor. For example:

 
>>> class Point(CStruct):
...     x = CInt()
...     y = CInt()
...
>>> class Vector(CStruct):
...     p1 = Point.get_type()
...     p2 = Point.get_type(default = Point(0,0))
...
>>> v = Vector(p1 = Point(5,6))
 
         

classmethod get_fields ( )

Returns a list of the CType objects which define the fields of this struct in the order in which they were declared.

Warning

The order of struct fields is defined by the order in which the CType subclasses for those fields were instantiated. In other words, if you say

 
from protlib import *

y_field = CInt()
x_field = CInt()

class Point(CStruct):
    x = x_field
    y = y_field

then when you serialize your struct, the y field will come before the x field because its CInt value was instantiated first. Similarly, if you say

 
from protlib import *

class Point(CStruct):
    x = y = CInt()

then the order of the x and y fields is undefined since they share the same CInt instance. In this second case, a CWarning will be triggered, but the first case is not automatically detected by the protlib library.
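The behavior described in this warning is what you would expect from a common technique in which each field object records a creation counter. The sketch below illustrates that technique with hypothetical Field and ordered_field_names names; it is consistent with the behavior described here but is not taken from protlib's source:

```python
import itertools

class Field(object):
    """Hypothetical field type that remembers when it was instantiated."""
    _counter = itertools.count()

    def __init__(self):
        self.creation_index = next(Field._counter)

def ordered_field_names(cls):
    """Return field names sorted by the instantiation order of their Field objects."""
    fields = [(value.creation_index, name)
              for name, value in vars(cls).items()
              if isinstance(value, Field)]
    return [name for _, name in sorted(fields)]

y_field = Field()  # instantiated first
x_field = Field()  # instantiated second

class Point(object):
    x = x_field
    y = y_field

print(ordered_field_names(Point))  # ['y', 'x']
```

With this scheme, two names bound to the same Field instance share one creation index, so nothing in the counter distinguishes them; that matches the undefined ordering in the second case above.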

Protocol Handlers

protlib also provides a convenient framework for implementing servers which receive and send CStruct objects. This makes it easy to implement custom binary protocols in which structs are passed back and forth over socket connections. This is based on the SocketServer module in the Python standard library.

To use these handlers, you need to do only two things.

First, make sure that each struct which represents a message begins with a constant value which uniquely identifies that struct.
Second, define a subclass of the appropriate handler class, either TCPHandler or UDPHandler, and define a handler method for each message type you wish to respond to.

An example client/server

Let’s walk through a simple example. We’ll define several structs to represent geometric concepts: a Point, a Vector, and a Rectangle. Each of these structs is a message which can be sent between the client and server. We’ll also define a variable-length message called PointGroup, which demonstrates using variable-length arrays.

Note that the first field in each of these messages is a constant value that uniquely identifies the message.
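The reason for the leading constant is that a receiver can read it first and use it to decide how to interpret the rest of the bytes. A minimal standard-library sketch of that idea follows; the MESSAGES registry and the identify function are hypothetical, and the formats assume a big-endian 2-byte code followed by 4-byte floats:

```python
import struct

# Hypothetical registry: leading code -> (message name, struct format of the rest)
MESSAGES = {
    1: ("Point", ">ff"),        # x, y
    2: ("Vector", ">hffhff"),   # two nested points, each with its own code
}

def identify(buf):
    """Read the 2-byte code, look up the message, and unpack the remaining fields."""
    (code,) = struct.unpack_from(">h", buf, 0)
    if code not in MESSAGES:
        raise ValueError("no message registered for code {0}".format(code))
    name, fmt = MESSAGES[code]
    return name, struct.unpack_from(fmt, buf, 2)

packed = struct.pack(">hff", 1, 5.0, 6.0)
print(identify(packed))  # ('Point', (5.0, 6.0))
```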

This entire example can be found in the examples/geocalc directory. Here’s the common.py file, which is imported by both the server.py and client.py programs:

 
import logging
logging.basicConfig(level = logging.INFO)

from protlib import *

SERVER_ADDR = ("127.0.0.1", 32123)

class Point(CStruct):
    code = CShort(always = 1)
    x    = CFloat()
    y    = CFloat()

class Vector(CStruct):
    code = CShort(always = 2)
    p1   = Point.get_type()
    p2   = Point.get_type()

class Rectangle(CStruct):
    code   = CShort(always = 4)
    points = CArray(4, Point)

class PointGroup(CStruct):
    code   = CShort(always = 3)
    count  = CInt()
    points = CArray("count", Point)
 
     

For our server, we define a handler class with a handler method for each message we wish to accept. The name of each handler method should be the name of the message class in lower case with the words separated by underscores. For example, the Vector class is handled by the vector method, and the PointGroup class is handled by the point_group method. Each of these handler methods takes a single parameter other than self, which is the actual message read and parsed from the socket.
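The naming rule can be expressed as a small conversion function. The sketch below is one way to derive a method name from a class name and look it up; handler_name and the stand-in Handler class are assumptions for illustration, and how protlib itself treats names with consecutive capitals is not covered here:

```python
import re

def handler_name(class_name):
    """Convert a CamelCase class name to lower case with underscores."""
    # Insert an underscore before every capital letter except the first character.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()

class Handler(object):
    """Stand-in handler with one method named after a message class."""
    def point_group(self, message):
        return "handled " + message

method = getattr(Handler(), handler_name("PointGroup"))
print(handler_name("Vector"))   # vector
print(method("PointGroup"))     # handled PointGroup
```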

Here’s the server.py file, which uses our subclasses of the SocketServer module classes to accept and handle incoming messages:

 
from math import sqrt

from common import *

class Handler(TCPHandler):
    LOG_TO_SCREEN = True
    
    def vector(self, v):
        """returns the mid-point of the line segment"""
        return Point(x = (v.p1.x + v.p2.x) / 2,
                     y = (v.p1.y + v.p2.y) / 2)
    
    def rectangle(self, r):
        """returns the endpoint closest to the origin"""
        dists = [(sqrt(p.x**2 + p.y**2), p) for p in r.points]
        return min(dists)[1]
    
    def point_group(self, pg):
        """returns a rectangle which encompasses all points in the group"""
        xmin = min(p.x for p in pg.points)
        xmax = max(p.x for p in pg.points)
        ymin = min(p.y for p in pg.points)
        ymax = max(p.y for p in pg.points)
        return Rectangle(points=[
            Point(x=xmin, y=ymin), Point(x=xmin, y=ymax),
            Point(x=xmax, y=ymin), Point(x=xmax, y=ymax)
        ])

if __name__ == "__main__":
    LoggingTCPServer(SERVER_ADDR, Handler).serve_forever()
 
     

To test this server, we have a simple client which sends a series of messages to the server and then reads back the responses, logging everything with our protlib.Logger class. Here’s our client.py script:

 
import socket
from random import randrange

from common import *

def rand_point():
    return Point(x=randrange(100), y=randrange(100))

logger = Logger(also_print = True)
parser = Parser(logger)
sock = socket.create_connection(SERVER_ADDR)
f = sock.makefile("rwb", 0)

vec = Vector(p1=rand_point(), p2=rand_point())
logger.log_and_write(f, vec)
pt = parser.parse(f)
assert vec.p1.x < pt.x < vec.p2.x or vec.p1.x > pt.x > vec.p2.x
assert vec.p1.y < pt.y < vec.p2.y or vec.p1.y > pt.y > vec.p2.y

rect = Rectangle(points=[Point(x=1, y=1),
                         Point(x=1, y=5),
                         Point(x=5, y=1),
                         Point(x=5, y=5)])
logger.log_and_write(f, rect)
pt = parser.parse(f)
assert pt.x == pt.y == 1

points = [rand_point() for i in range(10)]
logger.log_and_write(f, PointGroup(count=10, points=points))
rect = parser.parse(f)
assert rect.code == Rectangle.code.always

sock.close()
 
     

Our server does all of our logging automatically, but we need to manually invoke the logger on the client. The logs created and their format are explained below.

Logging

protlib uses the logging module to provide 5 different logs, each with its own suffix: hex, raw, struct, error, and stack. By default, the prefix of these logs will be the name of the current script. A RotatingFileHandler is created for each of these logs if no handlers already exist when the logs are first accessed by protlib.

For example, if you’re running the script server.py then these will be the log names, log file names, logging level used for the log messages, and type of messages written to each log:

log name	default filename	level	messages
server.hex	server.hex_log	`DEBUG`	nicely formatted hex dumps of the binary data sent and received
server.raw	server.raw_log	`INFO`	Python string literals of the binary data sent and received
server.struct	server.struct_log	`WARNING`	literal representations of each struct sent and received
server.error	server.error_log	`ERROR`	error messages
server.stack	server.stack_log	`CRITICAL`	stack traces of uncaught exceptions thrown by handler methods

Each log message generated by one of our protocol handlers contains a unique identifier which indicates the binary protocol message received. This makes it easy to match the log messages in the different files to one another, since this unique message identifier will be present in each of the 5 logs.

Log examples

Here’s a description of each log:

struct

This contains the literal representation of each request and response, for example:

 
2010-03-15 18:54:07,664: (1268693647_0) received Vector(code=2, p1=Point(code=1, x=39.0, y=41.0), p2=Point(code=1, x=93.0, y=13.0))
2010-03-15 18:54:07,664: (1268693647_0) sending Point(code=1, x=66.0, y=27.0)

This is convenient because the structs are logged with the Python code which represents them. Therefore we can paste them directly into a Python command prompt to inspect and play around with them:

 
>>> from common import *
>>> p = Point(code=1, x=66.0, y=27.0)
>>> p
Point(code=1, x=66.0, y=27.0)

raw

This contains the raw data in the form of a Python string of each request and response, for example:

 
2010-03-15 18:54:07,664: (1268693647_0) sending b'\x00\x01B\x84\x00\x00A\xd8\x00\x00'
2010-03-15 18:54:07,667: (1268693647_1) received b'\x00\x04\x00\x01?\x80\x00\x00?\x80\x00\x00\x00\x01?\x80\x00\x00@\xa0\x00\x00\x00\x01@\xa0\x00\x00?\x80\x00\x00\x00\x01@\xa0\x00\x00@\xa0\x00\x00'

This is convenient because we can paste these strings into a Python command prompt and play around with them. If they are valid then we can parse them into structs, and if they aren’t then we can examine exactly why; this log will always contain what we receive even in the case of unparsable binary data:

 
>>> from common import *
>>> s = b'\x00\x01B\x84\x00\x00A\xd8\x00\x00'
>>> p = Point.parse(s)
>>> p
Point(code=1, x=66.0, y=27.0)
>>>
>>> s = b"bad"
>>> p = Point.parse(s)
Traceback (most recent call last):
  File "", line 1, in 
  File "protlib.py", line 230, in parse
    return cls.get_type(cached=True).parse(f)
  File "protlib.py", line 141, in parse
    raise CError("{0} requires {1} bytes and was only given {2} ({3!r})".format(self.subclass.__name__, self.sizeof, len(buf), buf))
protlib.CError: Point requires 10 bytes and was only given 3 ('bad')
>>>
>>> s = b"invalid but with enough data"
>>> p = Point.parse(s)
../../protlib.py:526: CWarning: Point.code should always be 1 but was given a value of 26990
  warn("{0}.{1} should always be {2!r} but was given a value of {3!r}".format(self.__class__.__name__, name, field.always, value), CWarning)
>>> p
Point(code=26990, x=1.1430328245747994e+33, y=1.1834294514326081e+22)
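If protlib is not at hand, the same bytes from the raw log can be inspected with the struct module alone. This sketch assumes the Point layout is a big-endian short followed by two 4-byte floats, which is what the logged bytes above suggest; decode_point is a hypothetical helper:

```python
import struct

POINT_FORMAT = ">hff"  # assumed layout: short code, float x, float y (big-endian)

def decode_point(raw):
    """Unpack the first 10 bytes of `raw`, or raise ValueError if it is too short."""
    size = struct.calcsize(POINT_FORMAT)
    if len(raw) < size:
        raise ValueError("Point requires {0} bytes and was only given {1}".format(size, len(raw)))
    return struct.unpack(POINT_FORMAT, raw[:size])

print(decode_point(b"\x00\x01B\x84\x00\x00A\xd8\x00\x00"))  # (1, 66.0, 27.0)
print(decode_point(b"invalid but with enough data")[0])      # 26990
```

The second call shows why the unexpected code of 26990 appears above: it is simply the first two bytes, b"in", read as a big-endian short.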
 
        

hex

This contains nicely-formatted tables of the binary data sent and received in hexadecimal notation. For example:

 
2010-03-15 18:38:50,978: (1268692730_0) received
1  2  3  4  5  6  7
00 02 00 01 42 30 00 00
42 74 00 00 00 01 42 aa
00 00 42 18 00 00
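A table like this is straightforward to produce by hand. The following sketch formats bytes as rows of eight two-digit hex values; hex_rows is a hypothetical helper, and it omits the column header row and timestamp that the real log includes:

```python
def hex_rows(data, width=8):
    """Format bytes as rows of two-digit hex values, `width` bytes per row."""
    rows = []
    for start in range(0, len(data), width):
        chunk = bytearray(data[start:start + width])
        rows.append(" ".join("{0:02x}".format(b) for b in chunk))
    return "\n".join(rows)

# The same 22 bytes shown in the log entry above.
raw = bytes.fromhex("000200014230000042740000000142aa000042180000")
print(hex_rows(raw))
# 00 02 00 01 42 30 00 00
# 42 74 00 00 00 01 42 aa
# 00 00 42 18 00 00
```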
 
        

error

This contains messages for common errors, such as when a message is too short, or when we have no handler to match a message we’ve received, etc. These messages contain as much information as possible to help reconstruct the problem, which usually includes the raw data involved (also present in the raw log).

stack

This contains stack traces from exceptions thrown in your handler methods.

Logger objects

Although logging is performed automatically when using SocketServer classes, you may find it useful to instantiate your own logger objects, then manually make use of the 5 logs listed above. Use this object to do that; note that this class uses but does not inherit from the logging.Logger class.

class Logger ( [ prefix [, also_print=False ] ] )

A logging object which uses the 5 logs listed above.

Parameters:	prefix – Pass a string as this parameter to replace the default prefix (which is the name of the script being executed). For example, if you pass the string `"foo"` as this parameter, then your logs will be named `foo.hex`, `foo.raw`, etc.
	also_print – whether to also print log messages to the screen

log_struct ( inst [, trans_type="received" ] )

Logs the repr of an instance of a CStruct subclass to the struct log.

Parameters:	inst – the instance of the struct to be logged
	trans_type – a prefix to the log message, generally this should be either `"sending"` or `"received"`

log_binary ( data [, trans_type="received" ] )

Logs the repr of the packed binary data to the raw log, then logs a nicely formatted table of the data to the hex log.

Parameters:	data – the packed binary data, such as what’s produced by calling `s.serialize()` on an instance of a `CStruct` subclass
	trans_type – a prefix to the log message, generally this should be either `"sending"` or `"received"`

log_error ( message, *args, **kwargs )

Logs the message to the error log. The message parameter should be a string, and the *args and **kwargs to this method are used as the parameters to str.format.

log_stacktrace ( )

Logs the value of traceback.format_exc() to the stack log.

log_and_write ( f, data )

Logs a string or CStruct instance to the appropriate logs, then writes it to a file.

Parameters:	f – a file object to which data will be written
	data – a string or CStruct instance

Advanced logging

As mentioned above, when you instantiate protlib.Logger, protlib automatically sets up a RotatingFileHandler on each of the 5 logs for which no other logging handlers are defined. Because protlib uses the logging module from the standard library, you can use your own configuration, handlers, formatters, etc. This is demonstrated by the following example, which is included as the file examples/custom_logging/testing.py, although you’ll need to replace the string "smtp.example.com" with a valid outgoing mail server for the code to run properly.

 
import sys
import time
import logging
from logging.handlers import SMTPHandler, TimedRotatingFileHandler

from protlib import *

class Point(CStruct):
    code = CShort(always = 0x1234)
    x = CInt()
    y = CInt()

logging.basicConfig(level = logging.DEBUG)

trfh = TimedRotatingFileHandler("testing.rotating_log", "s", 1)
logging.getLogger("testing.hex").addHandler(trfh)

logger = Logger()
parser = Parser(logger)

smtp = SMTPHandler("smtp.example.com", "[email protected]", ["[email protected]"], "Stack Trace")
logging.getLogger("testing.stack").addHandler(smtp)

if __name__ == "__main__":
    # binary mode, since the serialized struct is a byte string
    with open("point.dat","wb") as f:
        p1 = Point(x=5, y=6)
        logger.log_and_write(f, p1)
    
    time.sleep(2)
    
    with open("point.dat","rb") as f:
        p2 = parser.parse(f)
    
    try:
        Point(x = "not an integer")
    except CError:
        logger.log_stacktrace()
 
      

Here’s an explanation of the customizations made to our logging:

The logging level is set to logging.DEBUG, which differs from the default value of logging.WARNING.
We use a TimedRotatingFileHandler for our hex log. Because we add this handler before instantiating protlib.Logger, this handler is used instead of the default RotatingFileHandler.
We use an SMTPHandler for our stack log. Because we add this handler after instantiating protlib.Logger, this is used in addition to the default RotatingFileHandler.
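The difference between adding a handler before and after instantiating the logger follows from the rule that a default handler is only created for a log that has none. Here is a minimal sketch of that rule using only the standard library; setup_logs, the "demo" prefix, and the maxBytes and backupCount values are assumptions for illustration, not protlib's actual code:

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

SUFFIXES = ("hex", "raw", "struct", "error", "stack")

def setup_logs(prefix, directory):
    """Attach a RotatingFileHandler to each log that has no handler yet."""
    loggers = {}
    for suffix in SUFFIXES:
        log = logging.getLogger("{0}.{1}".format(prefix, suffix))
        if not log.handlers:
            path = os.path.join(directory, "{0}.{1}_log".format(prefix, suffix))
            # maxBytes and backupCount are illustrative values only;
            # delay=True postpones opening the file until the first message.
            log.addHandler(RotatingFileHandler(path, maxBytes=1000000,
                                               backupCount=3, delay=True))
        loggers[suffix] = log
    return loggers

# A handler added beforehand is used in place of the default one.
custom = logging.StreamHandler()
logging.getLogger("demo.hex").addHandler(custom)

logs = setup_logs("demo", tempfile.gettempdir())
assert logs["hex"].handlers == [custom]
assert isinstance(logs["raw"].handlers[0], RotatingFileHandler)

# A handler added afterwards is used in addition to the default one.
logs["stack"].addHandler(logging.StreamHandler())
assert len(logs["stack"].handlers) == 2
```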

Protocol Handler Classes

As mentioned above, you should always have your protocol classes extend either the TCPHandler or UDPHandler class, depending on what type of SocketServer you’re using. Each of these classes inherits from ProtHandler, and you may use these methods and fields to affect the behavior of your custom protocol handlers:

class ProtHandler

The user does not instantiate this class or any of its subclasses directly. Instead, you declare your own handler class which subclasses either TCPHandler or UDPHandler, which are themselves subclasses of ProtHandler. They also extend the StreamRequestHandler and DatagramRequestHandler classes of the SocketServer module, respectively.

This class also inherits from the protlib.Logger class, so you can call the log functions listed above from your handler methods by simply calling self.log_stack(), self.log_error("Boo!"), etc.

STRUCT_MOD

By default, your handler will detect all messages present in the same module where the handler class itself is defined. So you can either define your handler in the same module where your structs are defined, or you can import those structs into the handler’s module. This is the recommended way to integrate your handlers with your struct definitions.

However, you may instead set the STRUCT_MOD field to the module where the structs are declared. (Technically this can be anything with __dict__ and __name__ fields.) You may also set this to a string which is the name of the module where they are declared. For example:

 
import module_with_structs

class SomeHandler(TCPHandler):
    STRUCT_MOD = module_with_structs

    # handler methods would go here

class AnotherHandler(UDPHandler):
    STRUCT_MOD = "module_with_structs"

    # handler methods would go here
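
To make the "module or module name" rule concrete, here is a minimal sketch of how such a value could be resolved and scanned for classes. The helper names resolve_struct_module and classes_in are hypothetical and are not part of protlib; the standard json module stands in for a module of struct definitions.

```python
import importlib
import json

def resolve_struct_module(struct_mod):
    """Return a module object for either a module or a dotted module name."""
    if isinstance(struct_mod, str):
        return importlib.import_module(struct_mod)
    return struct_mod

def classes_in(module):
    """List the names of all classes found in the module's __dict__."""
    return sorted(name for name, value in vars(module).items()
                  if isinstance(value, type))

by_object = resolve_struct_module(json)
by_name = resolve_struct_module("json")
found = classes_in(by_name)
```

Both calls yield the same module object, which is why the two handler declarations above are equivalent.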

LOG_TO_SCREEN

This is False by default, but if set to True, every log message will be printed to the screen in addition to being written to the appropriate log.

LOG_PREFIX

Changes the prefix of each log from the name of the current script to whatever is specified. For example, if you set LOG_PREFIX to "foo", then your logs will be foo.hex, foo.raw, etc.

These attributes are best set where your custom handler class is defined, for example:

 
class Handler(TCPHandler):
    LOG_TO_SCREEN = True
    LOG_PREFIX = "unified"

    # handler methods would go here

raw_data ( data )

This is the default handler for any message for which no struct has been defined. By default this logs an error message and sends no reply. Override this if you wish to have your own handler for unclassified binary messages; the data parameter is a string containing the binary data of the message.

reply ( data )

Anything you return from a handler method is sent back to the client, whether it’s a struct or just binary data in a string. However, sometimes you may need to send multiple messages back to the client. You can manually concatenate the binary data strings, or you can use the reply method, for example:

 
class RepeatRequest(CStruct):
    code = CShort(always = 1)
    name = CString(length = 25)
    repititions = CInt()

class Handler(TCPHandler):
    def repeat_request(self, rr):
        for i in range(rr.repititions):
            self.reply(b"Hello " + rr.name + b"!\n")
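
The equivalence between calling reply several times and concatenating the data yourself can be shown with a stand-in object. This is a sketch under assumptions: ReplySketch is a hypothetical class that writes to an in-memory buffer, and it does not reproduce how protlib's handlers actually send data.

```python
import io

class ReplySketch:
    """Stand-in for a handler whose reply() writes straight to the client."""

    def __init__(self, wfile):
        self.wfile = wfile

    def reply(self, data):
        # Each call appends one more message to the outgoing stream.
        self.wfile.write(data)

    def repeat_request(self, name, repetitions):
        for _ in range(repetitions):
            self.reply(b"Hello " + name + b"!\n")

client = io.BytesIO()  # plays the role of the client connection
ReplySketch(client).repeat_request(b"World", 3)
```

The buffer ends up holding the three greetings back to back, exactly as if one concatenated string had been returned.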
 
         

class LoggingTCPServer ( addr, handler_class )

class LoggingUDPServer ( addr, handler_class )

These classes extend the TCPServer and the UDPServer classes from the SocketServer module, respectively. There are only two differences between these and their parent classes:

The allow_reuse_address field is set to True for these classes.
When your protocol handler is used with one of these classes, the logging level of the default RotatingFileHandler objects is set to INFO. When it’s used with other classes, it’s set to CRITICAL + 1. Note that this is the level of the handlers, which is independent of the level of the loggers themselves, as explained here.

So basically, using these classes simply provides sensible default settings for your logs and sockets.
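
Both defaults can be illustrated with the standard library only. The sketch below is not protlib's code: ReusableTCPServer and the logger name "sketch.levels" are assumptions, and the module is spelled socketserver as in Python 3. It shows that a handler at CRITICAL + 1 drops everything even when its logger is at DEBUG, while a handler at INFO lets INFO messages through.

```python
import io
import logging
import socketserver

class ReusableTCPServer(socketserver.TCPServer):
    # Same class-level switch that the logging servers are described as setting.
    allow_reuse_address = True

stream = io.StringIO()
handler = logging.StreamHandler(stream)
log = logging.getLogger("sketch.levels")
log.setLevel(logging.DEBUG)   # the logger itself accepts everything
log.propagate = False         # keep the output confined to our handler
log.addHandler(handler)

handler.setLevel(logging.CRITICAL + 1)  # handler level above every message
log.critical("dropped by the handler")
silenced = stream.getvalue()

handler.setLevel(logging.INFO)          # handler now accepts INFO and above
log.info("written")
log.debug("dropped: below the handler's INFO level")
```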

class Parser ( [ logger [, module ] ] )

If you know what struct you want, then you can use the CStruct.parse classmethod to read an instance of that struct from a file, e.g. p = Point.parse(f). However, in some cases you want to read some data from a file or socket but aren’t sure what message is coming across. This class’s parse method figures out which message is being read and returns an instance of the correct struct.

Parameters:

module – This is exactly the same as the ProtHandler.STRUCT_MOD field; if present then it indicates which module contains the struct classes you want to use. If omitted, then the module where this class is instantiated is used.
logger – The instance of the Logger class to use to perform logging. If omitted, the logging level of each default RotatingFileHandler will be CRITICAL + 1.

parse ( f )

This method accepts a string or file and returns an instance of the struct it reads from that string/file. If the data it finds cannot be parsed into a struct, then it just returns all of the data it is able to read. This may be an empty string if no data is available. Any data returned will be written to the appropriate logs.

None will be returned in the case of an incomplete message. In this case a message will be written to the error log.
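
The three outcomes described above (a parsed message, raw data, or None) can be sketched with the struct module. This is a simplified illustration and not protlib's implementation: the MESSAGES table, the parse_sketch function and the assumption that every message starts with a 2-byte constant code are all made up for the example.

```python
import io
import struct

# Hypothetical table: leading 2-byte code -> (name, format of the whole message)
MESSAGES = {
    0x1234: ("Point", "!hii"),
}

def parse_sketch(f):
    header = f.read(2)
    if len(header) < 2:
        return header              # too little data to identify; may be b""
    (code,) = struct.unpack("!h", header)
    if code not in MESSAGES:
        return header + f.read()   # unrecognized: hand back the raw bytes
    name, fmt = MESSAGES[code]
    remaining = struct.calcsize(fmt) - len(header)
    body = f.read(remaining)
    if len(body) < remaining:
        return None                # recognized but incomplete message
    # Drop the constant code and return the name plus the remaining fields.
    return (name,) + struct.unpack(fmt, header + body)[1:]

data = struct.pack("!hii", 0x1234, 5, 6)
complete = parse_sketch(io.BytesIO(data))
incomplete = parse_sketch(io.BytesIO(data[:6]))
unknown = parse_sketch(io.BytesIO(b"\x00\x01abc"))
empty = parse_sketch(io.BytesIO(b""))
```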

Struct Inheritance

Many binary protocols have many message types, but every message has exactly the same fields, even if those fields have different constant values. It would be annoying if you had to write a bunch of mostly-identical struct definitions, so protlib lets you subclass your custom structs and only override the fields which are different in some way, such as having a default value in some subclasses but not others, etc.

Let’s walk through a simple example, which is available in the examples/struct_inheritance directory. First, we define our messages in common.py:

 
from random import randrange
from datetime import datetime

from protlib import *

SERVER_ADDR = ("127.0.0.1", 5665)

class Message(CStruct):
    code      = CInt()
    timestamp = CString(length=20, default=lambda: datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
    comment   = CString(length=100, default="")
    params    = CArray(20, CInt(default=0))

class ErrorMessage(Message): code = CInt(always = 0)
class CCRequest(Message):    code = CInt(always = 1)
class CCResponse(Message):   code = CInt(always = 2)
class ZipRequest(Message):   code = CInt(always = 3)
class ZipResponse(Message):  code = CInt(always = 4)
 
     

In this case we have a standard message format, and the only thing that varies is the value of the code field, so we need only specify that field in our subclasses. If we needed to override other fields, we could do so in any order; the order of fields would remain as however they were declared in the parent class.
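
The ordering rule can be pictured with plain dictionaries, which keep insertion order in Python 3.7 and later. This is only an analogy and says nothing about how protlib stores fields internally; merge_fields and the string placeholders standing in for field objects are assumptions for the example.

```python
def merge_fields(parent_fields, overrides):
    """Apply overrides while keeping the parent's field order."""
    merged = dict(parent_fields)   # copy keeps the parent's declaration order
    merged.update(overrides)       # existing keys are replaced in place
    return merged

message = {
    "code": "CInt()",
    "timestamp": "CString(20)",
    "comment": "CString(100)",
    "params": "CArray(20)",
}

# Overrides listed in a different order than the parent declared them.
child = merge_fields(message, {
    "comment": "CString(100, default='n/a')",
    "code": "CInt(always=1)",
})
```

Even though comment is overridden before code, the child's fields still run code, timestamp, comment, params.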

Since these messages all have different constant values in their first field, we can write a normal handler class in our server.py:

 
from common import *

def credit_card_lookup(ssn):
    if ssn != [0] * 9:
        return [randrange(10) for i in range(12)]

def zip_lookup(ssn):
    if ssn != [0] * 9:
        return [randrange(10) for i in range(5)]

class Handler(TCPHandler):
    LOG_TO_SCREEN = True
    
    def cc_request(self, ccr):
        """return the credit card number of the person with the given SSN"""
        ssn = ccr.params[:9]
        cc_num = credit_card_lookup(ssn)
        if cc_num:
            return CCResponse(params = cc_num)
        else:
            return ErrorMessage(params=ssn, comment="No matching SSN")
    
    def zip_request(self, zr):
        """return the zip code of the person with the given SSN"""
        ssn = zr.params[:9]
        zip_code = zip_lookup(ssn)
        if zip_code:
            return ZipResponse(params = zip_code)
        else:
            return ErrorMessage(params=ssn, comment="No matching SSN")

if __name__ == "__main__":
    LoggingTCPServer(SERVER_ADDR, Handler).serve_forever()
 
     

Since our handler can return different types of messages depending on whether our lookup was successful, our client.py uses the Parser class to parse all incoming messages:

 
import socket

from common import *

logger = Logger(also_print = True)
parser = Parser(logger)

def rand_ssn():
    return [randrange(10) for i in range(9)]

sock = socket.create_connection(SERVER_ADDR)
f = sock.makefile("rwb", 0)

logger.log_and_write(f, CCRequest(params=rand_ssn()))
ccresp = parser.parse(f)
assert ccresp.code == CCResponse.code.always

logger.log_and_write(f, ZipRequest(params=rand_ssn()))
zresp = parser.parse(f)
assert zresp.code == ZipResponse.code.always

logger.log_and_write(f, ZipRequest())
err = parser.parse(f)
assert err.code == ErrorMessage.code.always

sock.close()
 
     

Miscellaneous classes, methods, and constants

class CError

All exceptions raised by the protlib module will be instances of this class, which extends BaseException.

class CWarning

All warnings triggered by the protlib module will be instances of this class, which extends UserWarning.

underscorize ( name )

This is the function used to convert between camelCased and separated_with_underscores names. Pass it a string and it returns an all-lower-case string with underscores inserted in the appropriate places. You never have to call this method yourself, but you can use it as a test if you’re unsure of the correct handler method name for one of your CStruct classes. If your struct names are already lower case then this function will just return the original string, whether or not you are already using underscores. To make things even clearer, here are some examples:

 
SomeStruct    -> some_struct
SSNLookup     -> ssn_lookup
RS485Adaptor  -> rs485_adaptor
Rot13Encoded  -> rot13_encoded
RequestQ      -> request_q
John316       -> john316
rs485adaptor  -> rs485adaptor
rot13_encoded -> rot13_encoded
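
For readers who want to see how such a conversion can work, the following regular-expression sketch reproduces every example in the table above. It is an independent illustration, not protlib's own implementation, and the name underscorize_sketch is an assumption.

```python
import re

def underscorize_sketch(name):
    # Split an acronym from the capitalized word after it: SSNLookup -> SSN_Lookup
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name)
    # Split a lowercase letter or digit from a following capital: RS485Adaptor -> RS485_Adaptor
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    return s.lower()

examples = {
    "SomeStruct": "some_struct",
    "SSNLookup": "ssn_lookup",
    "RS485Adaptor": "rs485_adaptor",
    "Rot13Encoded": "rot13_encoded",
    "RequestQ": "request_q",
    "John316": "john316",
    "rs485adaptor": "rs485adaptor",
    "rot13_encoded": "rot13_encoded",
}
results = {name: underscorize_sketch(name) for name in examples}
```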
 
      

hexdump ( data )

Takes a string and returns a string containing a nicely formatted table of the hexadecimal values of the data in that string. For example:

 
>>> from protlib import *
>>> print(hexdump(b"Hello World!"))
     0  1  2  3  4  5  6  7
  0  48 65 6c 6c 6f 20 57 6f
  8  72 6c 64 21
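
A table like the one above can be produced in a few lines. The sketch below is not protlib's hexdump; the function name hexdump_sketch, the width parameter and the exact spacing are assumptions inferred from the sample output, so the real function may differ in detail.

```python
def hexdump_sketch(data, width=8):
    """Format bytes as rows of hex values, `width` bytes per row."""
    # Column header: one left-aligned column index per byte position.
    header = " " * 5 + " ".join("%-2d" % i for i in range(width)).rstrip()
    lines = [header]
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Row label is the offset of the row's first byte.
        lines.append("%3d  " % offset + " ".join("%02x" % b for b in chunk))
    return "\n".join(lines)

dump_lines = hexdump_sketch(b"Hello World!").split("\n")
```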
 
      

BYTE_ORDER

The first character of the format string passed to the struct module, which determines the byte order used to parse and serialize our structs. By default this is set to "!", which indicates network byte order. You may change it to any of the options available in the struct module.
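
The effect of that first character is easy to see with the struct module directly. This short example uses only the standard library and does not touch protlib; it packs the same 4-byte integer in network (big-endian) order and in little-endian order.

```python
import struct

network = struct.pack("!i", 5)   # "!" selects network (big-endian) byte order
little = struct.pack("<i", 5)    # "<" selects little-endian byte order

# Reading network-order bytes with the wrong byte order gives a different number.
misread = struct.unpack("<i", network)[0]
```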

AUTOSIZED

Special constant value which can be passed to the length attribute of a CString or CUnicode object to indicate that the string is null-terminated and may have any length.
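
Reading a null-terminated string of unknown length amounts to consuming bytes until a zero byte appears. The sketch below shows that idea with an in-memory stream; read_null_terminated is a hypothetical helper and is not how protlib itself is implemented.

```python
import io

def read_null_terminated(f):
    """Read bytes from f up to, but not including, the next null byte."""
    chunks = []
    while True:
        byte = f.read(1)
        if byte in (b"", b"\x00"):   # end of data or terminator reached
            break
        chunks.append(byte)
    return b"".join(chunks)

stream = io.BytesIO(b"hello\x00world\x00")
first = read_null_terminated(stream)
second = read_null_terminated(stream)
```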
