Kang_TJU

Python network analysis: a first-version wrapper around networkx

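This post presents a first-pass wrapper around networkx, the library I used while writing my experiments. It is not very polished, and I am now writing a second version, but I am posting the earlier code first. The main reference was the official networkx documentation.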
[networkx-reference]

One note: the code below targets undirected graphs.

Code

The following code is the first-pass wrapper around networkx.

GraphOperation.py

#-*- coding:utf-8 -*-
import networkx as nx
import matplotlib.pyplot as plt
import traceback

'''
My wrapper around networkx:
a utility class for graph operations
'''

class GraphOperation:

    #-----------------graph operation-----------------

    # construct a graph - undirected graph if default
    def __init__(self):

        self.graph = nx.Graph()

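    # convert the current graph to a directed graph, keeping its nodes and edges
    def convert_to_directed_graph(self):
        self.graph = nx.DiGraph(self.graph)

    # convert the current graph to a multigraph, keeping its nodes and edges
    def convert_to_multi_graph(self):
        self.graph = nx.MultiGraph(self.graph)

    # convert the current graph back to an undirected graph;
    # parallel and opposite-direction edges collapse into one
    def convert_to_undirected_graph(self):
        self.graph = nx.Graph(self.graph)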

    # clear the graph
    def clear_graph(self):
        try:
            self.graph.clear()
        except Exception, e:
            print traceback.print_exc()

    #------------------node operation----------------------------

    # add a node
    def add_node(self, node):
        try:
            self.graph.add_node(node)
        except Exception,e:
            print traceback.print_exc()

    # add a list of nodes
    def add_nodes_by_list(self, node_list):
        try:
            self.graph.add_nodes_from(node_list)
        except Exception,e:
            print traceback.print_exc()


    # remove a node
    def remove_node(self, node):
        try:
            self.graph.remove_node(node)
        except Exception,e:
            print traceback.print_exc()

    # remove a list of nodes
    def remove_nodes_by_list(self, node_list):
        try:
            self.graph.remove_nodes_from(node_list)
        except Exception,e:
            print traceback.print_exc()


    # get number of nodes
    def get_number_of_nodes(self):
        try:
            return self.graph.number_of_nodes()
        except Exception, e:
            print traceback.print_exc()

    # get nodes, return a list of nodes
    def get_nodes(self):
        try:
            return self.graph.nodes()
        except Exception, e:
            print traceback.print_exc()


    # get neighbors of v, return a list of the nodes adjacent to v
    def get_neighbors(self, v):
        try:
            return list(self.graph.neighbors(v))
        except Exception, e:
            print traceback.print_exc()

    #---------------edge operation------------------------------

    # add an edge
    def add_edge(self,u,v):
        try:
            self.graph.add_edge(u,v)
        except Exception,e:
            print traceback.print_exc()

    # add an edge by a tuple
    def add_edge_by_tuple(self, edge):
        try:
            self.add_edge(*edge) # unpack edge tuple
        except Exception,e:
            print traceback.print_exc()

    # add edges from a list of tuples, where every tuple is an edge
    def add_edges_by_list(self, edge_list):
        try:
            self.graph.add_edges_from(edge_list)
        except Exception,e:
            print traceback.print_exc()


    # remove an edge
    def remove_edge(self,u ,v ):
        try:
            self.graph.remove_edge(u, v)
        except Exception,e:
            print traceback.print_exc()

    # remove an edge by tuple
    def remove_edge_by_tuple(self, edge):
        try:
            self.remove_edge(*edge)
        except Exception,e:
            print traceback.print_exc()

    # remove edges given as a list of tuples
    def remove_edges_by_list(self, edge_list):
        try:
            self.graph.remove_edges_from(edge_list)
        except Exception, e:
            print traceback.print_exc()


    # get number of edges
    def get_number_of_edges(self):
        try:
            return self.graph.number_of_edges()
        except Exception, e:
            print traceback.print_exc()

    # get edges, each represented as a (u, v) tuple
    def get_edges(self):
        try:
            return self.graph.edges()
        except Exception, e:
            print traceback.print_exc()


    # add weighted edges from a list of (u, v, weight) tuples
    def add_weighted_edge(self, weighted_edge_list):
        try:
            self.graph.add_weighted_edges_from(weighted_edge_list)
        except Exception, e:
            print traceback.print_exc()

    # get weighted edge
    def get_weighted_edge(self):
        try:
            return self.graph.edges(data='weight')
        except Exception, e:
            print traceback.print_exc()

    #---------------degree analysis-------------------------------------------------------------

    # get the degree of all nodes (a dict in networkx 1.x, a DegreeView in 2.x and later)
    # tested on directed graphs; not yet tested on undirected graphs
    def get_degree(self):
        try:
            return self.graph.degree()
        except Exception, e:
            print traceback.print_exc()

    # get the degree of a node, return an integer
    def get_degree_by_node(self, node_id):
        try:
            return self.graph.degree(node_id)
        except Exception, e:
            print traceback.print_exc()

    # get the degree of a node, but the degree is not viewed as sum of edges
    # instead the degree is viewed as sum of the weight of edges
    # eg: (1,2,0.5),(3,1,0.75) the degree based on weight of node 1 is 0.5+0.75 = 1.25(not 2)
    def get_degree_based_on_weight_by_node(self, node_id):
        try:
            return self.graph.degree(node_id, weight="weight")
        except Exception, e:
            print traceback.print_exc()

    # get sorted degrees, return a list of the nodes' degree values in descending order
    def get_sorted_degrees(self):
        try:
            # dict(...) accepts both the dict of networkx 1.x and the DegreeView of 2.x
            return sorted(dict(self.graph.degree()).values(), reverse=True)
        except Exception, e:
            print traceback.print_exc()



    # get the indegree of all nodes.
    def get_in_degree(self):
        try:
            return self.graph.in_degree()
        except Exception, e:
            print traceback.print_exc()

    # get the indegree of a node
    def get_in_degree_by_node(self, node_id):
        try:
            return self.graph.in_degree(node_id)
        except Exception, e:
            print traceback.print_exc()

    def get_in_degree_based_on_weight_by_node(self, node_id):
        try:
            return self.graph.in_degree(node_id, weight = "weight")
        except Exception, e:
            print traceback.print_exc()

    # get the outdegree of all nodes
    def get_out_degree(self):
        try:
            return self.graph.out_degree()
        except Exception, e:
            print traceback.print_exc()

    # get the outdegree of a node
    def get_out_degree_by_node(self, node_id):
        try:
            return self.graph.out_degree(node_id)
        except Exception, e:
            print traceback.print_exc()

    def get_out_degree_based_on_weight_by_node(self, node_id):
        try:
            return self.graph.out_degree(node_id, weight="weight")
        except Exception, e:
            print traceback.print_exc()



    # ----------component analysis-----------------

    # get connected components - return a list of sets, one set of nodes per component
    def get_connected_components(self):
        try:
            return list(nx.connected_components(self.graph))
        except Exception, e:
            print traceback.print_exc()

    # ----------drawing graph-----------------------
    def draw_graph(self,title):
        try:

            plt.title(title)
            nx.draw(self.graph)

            plt.show()
        except Exception, e:
            print traceback.print_exc()

    def draw_network(self):
        try:
            nx.draw_networkx(self.graph, pos=nx.spring_layout(self.graph))
            plt.show()
        except Exception,e:
            print traceback.print_exc()


    def draw_graph_random_layout(self):
        try:
            nx.draw_random(self.graph)
            plt.show()
        except Exception,e:
            print traceback.print_exc()


    def draw_graph_spring_layout(self):
        try:
            nx.draw_spring(self.graph)
            plt.show()
        except Exception,e:
            print traceback.print_exc()


    # ---------- Graph methods--------------------------

    # return a list of the frequency of each degree value
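    # A note on this function: the degree functions above return the degree of each node, whereas the
    # degree distribution counts how many nodes have each degree value. The function below does exactly that:
    # entry i of the list is the number of nodes of degree i, and the list ends at the largest degree present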
    def get_degree_distribution(self):
        try:
            return nx.degree_histogram(self.graph)
        except Exception,e:
            print traceback.print_exc()

    def get_density(self):
        try:
            return nx.density(self.graph)
        except Exception,e:
            print traceback.print_exc()

    # get the transitivity - global clustering coefficient
    def get_transitivity(self):
        try:
            return nx.transitivity(self.graph)
        except Exception,e:
            print traceback.print_exc()

    def get_averate_clustering(self):
        try:
            return nx.average_clustering(self.graph)
        except Exception,e:
            print traceback.print_exc()

    def get_average_shortest_path_length(self):
        try:
            return nx.average_shortest_path_length(self.graph)
        except Exception,e:
            print traceback.print_exc()


    def write_to_pajek(self, pajek_net_path):
        try:
            nx.write_pajek(self.graph, pajek_net_path)
        except Exception,e:
            print traceback.print_exc()

    #--------------------------------------------------------
    #--------------centrality--------------------------------
    #--------------------------------------------------------

    # The degree centrality for a node v is the fraction of nodes it is connected to.
    def get_degree_centrality(self):
        try:
            return nx.degree_centrality(self.graph)
        except Exception,e:
            print traceback.print_exc()

    # Betweenness centrality of a node v is the sum of the fraction of all-pairs shortest paths that pass through v
    def get_betweenness_centrality(self):
        try:
            return nx.betweenness_centrality(self.graph)
        except Exception,e:
            print traceback.print_exc()

    # The load centrality of a node is the fraction of all shortest paths that pass through that node.
    def get_load_centrality(self):
        try:
            return nx.load_centrality(self.graph)
        except Exception,e:
            print traceback.print_exc()

    # Eigenvector centrality computes the centrality for a node based on the centrality of its neighbors.
    def get_eigenvector_centrality(self):
        try:
            return nx.eigenvector_centrality(self.graph)
        except Exception,e:
            print traceback.print_exc()
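
The degree helpers above are thin wrappers, so it may help to see what the underlying quantities are. The following is a minimal plain-Python sketch (Python 3, standard library only, not networkx itself) of the weighted degree from the `(1,2,0.5),(3,1,0.75)` comment and of the kind of list that `nx.degree_histogram` returns. The function names `weighted_degree` and `degree_histogram` are illustrative and not part of the wrapper, and the sketch assumes a simple undirected graph whose nodes all appear in at least one edge.

```python
from collections import Counter, defaultdict


def weighted_degree(weighted_edges):
    """Sum the weights of the edges incident to each node (undirected graph)."""
    totals = defaultdict(float)
    for u, v, w in weighted_edges:
        totals[u] += w
        totals[v] += w
    return dict(totals)


def degree_histogram(edges):
    """Return a list whose i-th entry counts the nodes of degree i."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not degree:
        return []
    counts = Counter(degree.values())
    # The list runs from degree 0 up to the largest degree present.
    return [counts.get(d, 0) for d in range(max(degree.values()) + 1)]


# Node 1 touches both edges, so its weighted degree is 0.5 + 0.75.
print(weighted_degree([(1, 2, 0.5), (3, 1, 0.75)]))
# Two nodes of degree 1 and one node of degree 2.
print(degree_histogram([(1, 2), (3, 1)]))
```

Counting edges gives node 1 a degree of 2, while summing weights gives it 1.25, which is the distinction the comment on `get_degree_based_on_weight_by_node` draws.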

MyGraph.py

#-*- coding:utf-8 -*-
import traceback

import matplotlib.pyplot as plt
import networkx as nx

from GraphOperation import GraphOperation

'''
A graph class built on top of my own utility class GraphOperation,
implementing the various graph operations
'''

class MyGraph:

    # constructor - mainly defines the member variables
    def __init__(self):
        self.my_graph = GraphOperation()
        self.map_name_to_number = dict()
        self.map_number_to_name = dict()
        self.output_path = ""

        self.clique_list = [] # for draw_community

        self.max_connected_component_subgraph = None

    # build the graph - initialize the two mappers and construct the graph
    def construct_graph(self, clique_list):
        try:

            # convert the name to number and store the relation in map_name_to_number
            number = 1
            new_clique_list = []

            for clique in clique_list:
                new_clique = []
                for u in clique:
                    if u in self.map_name_to_number:
                        new_clique.append(self.map_name_to_number[u])
                    else:
                        self.map_name_to_number[u] = number
                        number += 1
                        new_clique.append(self.map_name_to_number[u])
                new_clique_list.append(new_clique)

            # convert the number to name and store the relation in map_number_to_name
            self.map_number_to_name = dict()
            for name, number in self.map_name_to_number.items():
                self.map_number_to_name[number] = name

            self.clique_list = new_clique_list
            # construct graph based on the new_clique_list
            for clique in new_clique_list:
                # add all edges

                for u in clique:
                    # add the node explicitly so that a single-member clique still appears in the graph
                    self.my_graph.add_node(u)

                    for v in clique:
                        if (u == v):
                            continue
                        e = (u, v)
                        self.my_graph.add_edge_by_tuple(e)

            print "[INFO]: construct_graph is finished!"
        except Exception,e:
            print traceback.print_exc()

    # add an edge
    def add_edge(self, u, v):
        try:

            self.my_graph.add_edge(u, v)

        except Exception,e:
            print traceback.print_exc()

    # get all edges
    def get_all_edges(self):
        try:

            return self.my_graph.get_edges()

        except Exception,e:
            print traceback.print_exc()

    # set the output path for the network statistics
    def set_output_path(self, output_path):
        try:
            self.output_path = output_path
            print "[INFO]: set_output_path is finished!"
        except Exception,e:
            print traceback.print_exc()

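    # get the largest connected component
    # the largest component can only be computed once the whole graph has been built,
    # so this method has to live in the second wrapper layer; the first-layer class
    # is not designed well enough to wrap it directly
    def set_max_connected_component_subgraph(self):
        try:
            # nx.connected_component_subgraphs was removed in networkx 2.4;
            # taking the largest node set and its induced subgraph works across versions
            graph = self.my_graph.graph
            largest_nodes = max(nx.connected_components(graph), key=len)
            self.max_connected_component_subgraph = graph.subgraph(largest_nodes).copy()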
            print "[INFO]: set_max_connected_component_subgraph is finished!"
        except Exception,e:
            print traceback.print_exc()

    # returns a native nx.Graph()
    def get_max_connected_component_subgraph(self):
        try:
            return self.max_connected_component_subgraph
        except Exception,e:
            print traceback.print_exc()
    #-----------------------------------------------------------------------
    #-----------------------draw the network--------------------------------
    #-----------------------------------------------------------------------



    # draw the graph by community - each community gets its own color
    # logic: add the communities one by one, assign a color, then draw
    # because one wrapper layer is missing, the calls go straight to the lowest-level networkx API, which is not ideal
    # for the same reason, a member variable was added to store clique_list
    def draw_community(self):
        try:
            # initial settings
            pos = nx.spring_layout(self.my_graph.graph)
            node_size_ = 100
            color_list = ["red", "yellow", "blue", "green", "pink", "orange", "purple"]
            color_list_len = len(color_list)

            # add node and edges
            for i, node_list in enumerate(self.clique_list):
                edge_list = self.get_edges_for_community(node_list)

                # these two functions take too many parameters, so they are not wrapped for now
                # cycle through the colors so that more than len(color_list) communities do not raise IndexError
                nx.draw_networkx_nodes(self.my_graph.graph, pos, nodelist=node_list, node_size=node_size_, node_color=color_list[i % color_list_len], label="hello")
                nx.draw_networkx_edges(self.my_graph.graph, pos, edgelist=edge_list)

            #title = "Collaboration Network"
            title = "people relation by train"
            plt.title(title)
            plt.show()

            print "[INFO]: draw_community is finished!"
        except Exception,e:
            print traceback.print_exc()

    def get_edges_for_community(self, node_list):
        try:
            edge_list = []
            for u in node_list:
                for v in node_list:
                    if u == v:
                        continue
                    else:
                        edge_list.append((u,v))
            return edge_list
        except Exception,e:
            print traceback.print_exc()

    # basic drawing
    def draw_graph(self,title):
        try:
            self.my_graph.draw_graph(title)
            print "[INFO]: draw_graph is finished!"
        except Exception,e:
            print traceback.print_exc()

    def draw_network(self):
        try:
            self.my_graph.draw_network()
        except Exception,e:
            print traceback.print_exc()

    def draw_graph_random_layout(self):
        try:
            self.my_graph.draw_graph_random_layout()
        except Exception,e:
            print traceback.print_exc()

    def draw_graph_spring_layout(self):
        try:
            self.my_graph.draw_graph_spring_layout()
            print "[INFO]: draw_graph_spring_layout is finished!"
        except Exception,e:
            print traceback.print_exc()

    #-----------------------------------------------------------------------
    #-----------------------network analysis--------------------------------
    #-----------------------------------------------------------------------


    # compute the number of nodes
    def cal_num_of_nodes(self):
        try:
            num_nodes = self.my_graph.get_number_of_nodes()
            file_path = self.output_path+"number_of_nodes.txt"

            outfile = open(file_path, "w")
            outfile.write(str(num_nodes) + '\n')
            outfile.close()
            print "[INFO]: cal_num_of_nodes is finished!"
        except Exception,e:
            print traceback.print_exc()

    # compute the number of edges
    def cal_num_of_edges(self):
        try:
            num_edges = self.my_graph.get_number_of_edges()
            file_path = self.output_path + "number_of_edges.txt"

            outfile = open(file_path, "w")
            outfile.write(str(num_edges) + '\n')
            outfile.close()
            print "[INFO]: cal_num_of_edges is finished!"
        except Exception, e:
            print traceback.print_exc()

    # compute the degree distribution
    def cal_degree_distribution(self):
        try:

            degree_distribution_list = self.my_graph.get_degree_distribution()
            file_path = self.output_path + "degree_distribution.txt"

            outfile = open(file_path, "w")
            for item in degree_distribution_list:
                line = str(item) + '\n'
                outfile.write(line)
            outfile.close()
            print "[INFO]: cal_degree_distribution is finished!"
        except Exception, e:
            print traceback.print_exc()

    # compute the network density
    def cal_density(self):
        try:
            density = self.my_graph.get_density()
            file_path = self.output_path + "graph_density.txt"

            outfile = open(file_path, "w")
            outfile.write(str(density) + '\n')
            outfile.close()
            print "[INFO]: cal_density is finished!"
        except Exception, e:
            print traceback.print_exc()

    # compute the clustering coefficient (transitivity)
    def cal_transitivity(self):
        try:
            transitivity = self.my_graph.get_transitivity()
            file_path = self.output_path + "transitivity.txt"

            outfile = open(file_path, "w")
            outfile.write(str(transitivity) + '\n')
            outfile.close()
            print "[INFO]: cal_transitivity is finished!"
        except Exception, e:
            print traceback.print_exc()

    def cal_average_clustering(self):
        try:
            average_clustering = self.my_graph.get_averate_clustering()
            file_path = self.output_path + "average_clustering.txt"

            outfile = open(file_path, "w")
            outfile.write(str(average_clustering) + '\n')
            outfile.close()
            print "[INFO]: cal_average_clustering is finished!"
        except Exception,e:
            print traceback.print_exc()

    # compute the average shortest path length
    def cal_average_shortest_path_length(self):
        try:
            aver_shortest_path = self.my_graph.get_average_shortest_path_length()
            file_path = self.output_path + "average_shortest_path_length.txt"

            outfile = open(file_path, "w")
            outfile.write(str(aver_shortest_path) + '\n')
            outfile.close()
            print "[INFO]: cal_average_shortest_path_length is finished!"
        except Exception, e:
            print traceback.print_exc()

    # write the graph to a Pajek-format file
    def write_to_pajek_net(self):
        try:

            output_path = self.output_path + "graph_of_author_relation.net"

            # write to net file
            outfile = open(output_path, "w")

            nodes_num = self.my_graph.get_number_of_nodes()
            edges_num = self.my_graph.get_number_of_edges()
            first_line_of_node = "*Vertices " + str(nodes_num) + '\n'
            first_line_of_edge = "*Edges " + str(edges_num) + '\n'

            outfile.write(first_line_of_node)
            nodes_list = self.my_graph.get_nodes()
            for node in nodes_list:
                line = ""
                line += str(node) + ' ' + "\"" + str(self.map_number_to_name[node]) + "\"" + '\n'
                outfile.write(line)

            outfile.write(first_line_of_edge)
            edges_list = self.my_graph.get_edges()
            for edge in edges_list:
                line = ""
                line += str(edge[0]) + ' ' + str(edge[1]) + '\n'
                outfile.write(line)

            outfile.close()
            print "[INFO]: write_to_pajek_net is finished!"
        except Exception, e:
            print traceback.print_exc()

    def write_to_pajek_net1(self):
        try:
            pajek_net_path = self.output_path + "graph_of_author_relation.net"
            self.my_graph.write_to_pajek(pajek_net_path)

            print "[INFO]: write_to_pajek_net1 is finished!"
        except Exception, e:
            print traceback.print_exc()

    #--------------------------------------------------------
    #--------------centrality--------------------------------
    #--------------------------------------------------------
    def get_degree_centrality(self):
        try:
            res = self.my_graph.get_degree_centrality()
            print "[INFO]: get_degree_centrality is finished!"
            return res
        except Exception,e:
            print traceback.print_exc()

    def get_betweenness_centrality(self):
        try:
            res = self.my_graph.get_betweenness_centrality()
            print "[INFO]: get_betweenness_centrality is finished!"
            return res
        except Exception, e:
            print traceback.print_exc()

    def get_load_centrality(self):
        try:
            res = self.my_graph.get_load_centrality()
            print "[INFO]: get_load_centrality is finished!"
            return res
        except Exception, e:
            print traceback.print_exc()

    def get_eigenvector_centrality(self):
        try:
            res = self.my_graph.get_eigenvector_centrality()
            print "[INFO]: get_eigenvector_centrality is finished!"
            return res
        except Exception, e:
            print traceback.print_exc()

    # --------------------------------------------------------
    # --------------component--------------------------------
    # --------------------------------------------------------
    def draw_max_connected_component_subgraph(self):
        try:
            nx.draw_networkx(self.get_max_connected_component_subgraph(),with_labels = False)
            title = "Max connected subgraph of Collaboration Network"
            plt.title(title)
            plt.show()

            print "[INFO]: draw_max_connected_component_subgraph is finished!"
        except Exception, e:
            print traceback.print_exc()

    def get_average_shortest_path_length_in_max_connected_component_subgraph(self):
        try:

            res = nx.average_shortest_path_length(self.get_max_connected_component_subgraph())
            print "[INFO]: get_average_shortest_path_length_in_max_connected_component_subgraph is finished!"
            return res
        except Exception, e:
            print traceback.print_exc()

    def cal_average_shortest_path_length_in_max_connected_component_subgraph(self):
        try:
            aver_shortest_path = self.get_average_shortest_path_length_in_max_connected_component_subgraph()
            file_path = self.output_path + "average_shortest_path_length_in_max_connected_subgraph.txt"

            outfile = open(file_path, "w")
            outfile.write(str(aver_shortest_path) + '\n')
            outfile.close()
            print "[INFO]: cal_average_shortest_path_length_in_max_connected_component_subgraph is finished!"
        except Exception, e:
            print traceback.print_exc()
#----------------------------------------------------------------------------
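
`construct_graph` does two things: it maps each name to an integer id, and it links every pair of members of a clique. One way to picture that step is the standard-library sketch below (Python 3). The helper name `cliques_to_edges` and the sample names are made up for illustration, and `itertools.combinations` emits each undirected pair once rather than in both orders as the nested loop above does.

```python
from itertools import combinations


def cliques_to_edges(clique_list):
    """Map names to integer ids (starting at 1) and expand each clique into edges."""
    name_to_number = {}
    edges = set()
    for clique in clique_list:
        ids = []
        for name in clique:
            if name not in name_to_number:
                # Ids are assigned in order of first appearance.
                name_to_number[name] = len(name_to_number) + 1
            ids.append(name_to_number[name])
        # Every pair of distinct members of the same clique becomes one edge.
        for u, v in combinations(sorted(set(ids)), 2):
            edges.add((u, v))
    number_to_name = {number: name for name, number in name_to_number.items()}
    return name_to_number, number_to_name, sorted(edges)


cliques = [["Li W", "Wang H", "Zhang Y"], ["Wang H", "Chen X"], ["Liu Q"]]
name_to_number, number_to_name, edges = cliques_to_edges(cliques)
print(name_to_number)
print(edges)
```

A single-member clique such as `["Liu Q"]` produces an id but no edge, which is why the listing also calls `add_node` for every member.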

The code below is no longer about networkx; it is mainly the XML wrapper class, plus the test code.
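
The parser that follows walks PubMed-style XML with `xml.etree.ElementTree`. As a minimal sketch of that access pattern (Python 3, standard library only), the snippet below pulls author names out of a tiny hand-written document. The sample XML and the function name `extract_authors` are made up for illustration; only the element names mirror those used in the listing, and `find`/`findtext` stand in for its `findall(...)[0]` calls.

```python
import xml.etree.ElementTree as et

SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <Article>
        <AuthorList>
          <Author><LastName>Smith</LastName><Initials>J</Initials></Author>
          <Author><LastName>Lee</LastName><Initials>KH</Initials></Author>
          <Author><CollectiveName>Example Consortium</CollectiveName></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation><Article/></MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""


def extract_authors(root):
    """Return one list of "LastName Initials" strings per article that has authors."""
    result = []
    for article in root:
        author_list = article.find("MedlineCitation/Article/AuthorList")
        if author_list is None:
            continue  # an article without an author list is skipped
        names = []
        for author in author_list.findall("Author"):
            last_name = author.findtext("LastName")
            initials = author.findtext("Initials")
            if last_name is None or initials is None:
                continue  # e.g. a collective author without personal name parts
            names.append(last_name + " " + initials)
        result.append(names)
    return result


authors = extract_authors(et.fromstring(SAMPLE))
print(authors)
```

Checking for `None` explicitly plays the role of the bare `except: continue` blocks in the listing, but it only skips the cases it names instead of hiding every error.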
- XmlParser

#-*- coding:utf-8 -*-
import xml.etree.ElementTree as et
import traceback

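'''
XML-based data extraction and analysis.
Strictly speaking this class should only handle data extraction,
but since everything works on the same XML, I think it is reasonable to keep the analysis here as well.

'''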

class XmlParser:
    def __init__(self, xml_path, stop_words_path):
        self.stop_words_path = stop_words_path

        tree = et.parse(xml_path)
        self.root = tree.getroot()

    # 1-pubmed get article authors
    def get_article_author(self):
        try:

            res_list = []
            for pubmed_article in self.root:
                try:
                    #print "---------------------------------------------------"
                    medline_citation = pubmed_article.findall("MedlineCitation")[0]
                    article = medline_citation.findall("Article")[0]
                    author_list = article.findall("AuthorList")[0]
                    author_list = author_list.findall("Author")

                    current_authour_list = []
                    for author in author_list:
                        try:
                            last_name = author.findall("LastName")[0]
                            initials = author.findall("Initials")[0]
                            name = str(last_name.text) + ' ' + str(initials.text)
                            current_authour_list.append(name)
                            #print name
                        except:
                            continue

                    res_list.append(current_authour_list)
                except:
                    continue
            return res_list
        except Exception, e:
            print traceback.print_exc()

    # 1-1 PMC get article authors
    def get_article_author1(self):
        try:

            res_list = []
            for article in self.root:
                try:
                    author_list = []
                    #print pubmed_article
                    #print "---------------------------------------------------"
                    front = article.findall("front")[0]
                    article_meta = front.findall("article-meta")[0]
                    contrib_group = article_meta.findall("contrib-group")[0]

                    contrib_list = contrib_group.findall("contrib")

                    for contrib in contrib_list:
                        name = contrib.findall("name")[0]

                        surname = name.findall("surname")[0]
                        given_name = name.findall("given-names")[0]

                        final_name = ""
                        final_name += str(given_name.text) + " " + str(surname.text)

                        author_list.append(final_name)
                        #print final_name

                    res_list.append(author_list)

                except:
                    continue
            return res_list
        except Exception, e:
            print traceback.print_exc()


    # 2_get article titles
    def get_article_title(self, root):
        try:
            article_title_list = []
            for pubmed_article in root:
                try:
                    medline_citation = pubmed_article.findall("MedlineCitation")[0]
                    article = medline_citation.findall("Article")[0]
                    article_title = article.findall("ArticleTitle")[0]

                    article_title = str(article_title.text)
                    #print article_title
                    article_title_list.append(article_title)

                except:
                    continue
            return article_title_list
        except Exception,e:
            print traceback.print_exc()

    # 3_get publication years
    def get_article_year(self, root):
        try:
            article_year_list = []
            cnt = 0
            for pubmed_article in root:
                try:
                    medline_citation = pubmed_article.findall("MedlineCitation")[0]
                    article = medline_citation.findall("Article")[0]
                    article_journal = article.findall("Journal")[0]
                    article_journal_issue = article_journal.findall("JournalIssue")[0]
                    pub_date = article_journal_issue.findall("PubDate")[0]
                    year = pub_date.findall("Year")[0]

                    year = str(year.text)
                    article_year_list.append(year)

                except:
                    continue
            return article_year_list
        except Exception, e:
            print traceback.print_exc()

    # 4_get journal titles
    def get_article_journal_title(self, root):
        try:
            journal_title_list = []
            for pubmed_article in root:
                try:
                    medline_citation = pubmed_article.findall("MedlineCitation")[0]
                    article = medline_citation.findall("Article")[0]
                    article_journal = article.findall("Journal")[0]
                    article_journal_title = article_journal.findall("Title")[0]
                    journal_title = str(article_journal_title.text)

                    journal_title_list.append(journal_title)

                except:
                    continue
            return journal_title_list
        except Exception, e:
            print traceback.print_exc()

    # 5_pubmed get article abstracts
    def get_article_abstract(self, root):
        try:
            article_abstract_list = []
            cnt = 0
            for pubmed_article in root:
                try:
                    medline_citation = pubmed_article.findall("MedlineCitation")[0]
                    article = medline_citation.findall("Article")[0]
                    article_abstract = article.findall("Abstract")[0]
                    article_abstract_text = article_abstract.findall("AbstractText")[0]

                    # handle articles that have no abstract
                    if article_abstract_text is not None :
                        cnt += 1
                        abstract = str(article_abstract_text.text)
                        #print cnt, " ", abstract

                        article_abstract_list.append(abstract)

                except:
                    continue
            return article_abstract_list
        except Exception, e:
            print traceback.print_exc()

    # 5-1_pmc get article abstracts
    def get_article_abstract1(self):
        try:

            res_list = []
            for article in self.root:
                try:
                    author_list = []
                    # print pubmed_article
                    # print "---------------------------------------------------"
                    front = article.findall("front")[0]
                    article_meta = front.findall("article-meta")[0]
                    abstract = article_meta.findall("abstract")[0]

                    abstract_p = abstract.findall("p")[0]
                    res_list.append(abstract_p.text)

                except:
                    continue
            return res_list
        except Exception, e:
            print traceback.print_exc()

    # 6_get journal info - (name, country)
    def get_article_journal_info(self, root):
        try:

            # journal_country_list = []
            # journal_name_list = []

            journal_info_list = []
            for pubmed_article in root:
                try:
                    medline_citation = pubmed_article.findall("MedlineCitation")[0]
                    journal_info = medline_citation.findall("MedlineJournalInfo")[0]

                    journal_country = str(journal_info.findall("Country")[0].text)
                    journal_name = str(journal_info.findall("MedlineTA")[0].text)

                    journal_info_list.append(journal_name + ',' + journal_country)

                except:
                    continue
            return journal_info_list

        except Exception:
            traceback.print_exc()

#---------------------------------------------------------#
#                   Compute statistics                    #
#---------------------------------------------------------#

    # 7: count the number of articles published in each year
    def cal_num_of_article_in_each_year(self, write_path):

        try:
            year_list = self.get_article_year(self.root)

            counter = dict()

            #total = len(year_list)
            #print "TOTAL articles: ", total
            for y in year_list:
                if y in counter :
                    counter[y] += 1
                else:
                    counter[y] = 1

            pairs = list(counter.items())
            pairs.sort(reverse=True)


            outfile = open(write_path, "w")
            for pair in pairs:

                line = str(pair[0]) + "\t" + str(pair[1])
                outfile.write(line +'\n')

            outfile.close()

        except Exception:
            traceback.print_exc()

    # 8_pubmed: compute word frequencies in article titles
    def cal_word_occurence_in_article_title(self,output_path):
        try:
            article_list = self.get_article_title(self.root)

            stop_words_list = self.get_stop_words(self.stop_words_path)
            stop_words_list.append(' ')
            stop_words_list.append('')  # empty tokens would otherwise dominate the counts

            word_counter = dict()

            for article in article_list:

                try:
                    # Preprocessing: replace every non-letter character with a space
                    line = ""
                    for ch in article:
                        if ch.isalpha():
                            line += ch
                        else:
                            line += ' '

                    article = line
                    article = article.split(' ')

                    for word in article:
                        word = word.lower()
                        if word in stop_words_list:
                            continue

                        if word in word_counter:
                            word_counter[word] += 1
                        else:
                            word_counter[word] = 1

                except:
                    continue

            pairs = list(word_counter.items())
            items = [(count,word) for (word,count) in pairs]
            items.sort(reverse=True)

            write_path = output_path + "word_occurence_in_article_title.txt"
            outfile = open(write_path,"w")

            final_str = ""
            final_freq = ""
            cnt = 0

            for item in items:
                line =  str(item[1]) + "\t" + str(item[0])
                outfile.write(line +'\n')

                if cnt < 10:
                    if cnt == 0:
                        final_str = "'" + item[1] + "'" + final_str
                        final_freq = "'" + str(item[0]) + "'" + final_freq
                    else:
                        final_str = "'" + item[1] + "'" + ',' + final_str
                        final_freq = "'" + str(item[0]) + "'" + ',' + final_freq

                cnt += 1

            final_str = '[' + final_str + ']'
            final_freq = '[' + final_freq + ']'
            outfile.write(final_str + '\n')
            outfile.write(final_freq + '\n')

            outfile.close()

        except Exception:
            traceback.print_exc()

    # 9_pubmed: compute word frequencies in article abstracts
    def cal_word_occurence_in_article_abstract(self, output_path):
        try:
            abstract_list = self.get_article_abstract(self.root)

            stop_words_list = self.get_stop_words(self.stop_words_path)
            stop_words_list.append(' ')
            stop_words_list.append('')  # empty tokens would otherwise dominate the counts

            word_counter = dict()

            for abstract in abstract_list:

                try:

                    # Preprocessing: replace every non-letter character with a space
                    line = ""
                    for ch in abstract:
                        if ch.isalpha():
                            line += ch
                        else:
                            line += ' '
                    abstract = line
                    abstract = abstract.split(' ')


                    for word in abstract:
                        word = word.lower()
                        if word in stop_words_list:
                            continue

                        if word in word_counter:
                            word_counter[word] += 1
                        else:
                            word_counter[word] = 1

                except:
                    continue

            pairs = list(word_counter.items())
            items = [(count, word) for (word, count) in pairs]
            items.sort(reverse=True)

            write_path = output_path + "word_occurence_in_article_abstract.txt"
            outfile = open(write_path, "w")

            final_str = ""
            final_freq = ""
            cnt = 0

            for item in items:
                line = str(item[1]) + "\t" + str(item[0])
                outfile.write(line + '\n')

                if cnt < 10:
                    if cnt == 0:
                        final_str = "'" + item[1] + "'" + final_str
                        final_freq = "'" + str(item[0]) + "'"+ final_freq
                    else:
                        final_str = "'"+item[1]+"'" + ',' + final_str
                        final_freq = "'" + str(item[0]) + "'" + ',' + final_freq

                cnt += 1

            final_str = '[' + final_str + ']'
            final_freq = '[' + final_freq + ']'
            outfile.write(final_str + '\n')
            outfile.write(final_freq + '\n')

            outfile.close()

        except Exception:
            traceback.print_exc()

    # 9_1_pmc: compute word frequencies in article abstracts
    def cal_word_occurence_in_article_abstract1(self, write_path):
        try:
            abstract_list = self.get_article_abstract1()

            stop_words_list = self.get_stop_words(self.stop_words_path)
            stop_words_list.append(' ')
            stop_words_list.append('') # empty tokens would otherwise dominate the counts

            word_counter = dict()

            for abstract in abstract_list:

                try:
                    # Preprocessing: replace every non-letter character with a space
                    line = ""
                    for ch in abstract:
                        if ch.isalpha():
                            line += ch
                        else:
                            line += ' '
                    abstract = line
                    abstract = abstract.split(' ')

                    for word in abstract:
                        word = word.lower()
                        if word in stop_words_list:
                            continue


                        if word in word_counter:
                            word_counter[word] += 1
                        else:
                            word_counter[word] = 1
                except:
                    continue

            pairs = list(word_counter.items())
            items = [(count, word) for (word, count) in pairs]
            items.sort(reverse=True)

            #for item in items:
            #    print item[0], '\t', item[1]


            outfile = open(write_path, "w")
            for item in items:
                try:
                    line = ""
                    line = str(item[1]) + '\t' + str(item[0])
                    outfile.write(line+'\n')
                except Exception as ex:
                    print ex
            outfile.close()

        except Exception:
            traceback.print_exc()

    # 10: count occurrences of each journal name and each journal country
    def cal_journal_name_and_country_ouucrence(self, country_path, name_path):
        try:

            name_counter = dict()
            country_counter = dict()

            journal_info_list = self.get_article_journal_info(self.root)
            for item in journal_info_list:

                # split on the last comma only, in case the journal name itself contains one
                item = item.rsplit(',', 1)
                journal_name = item[0]
                journal_country = item[1]

                if journal_name in name_counter:
                    name_counter[journal_name] += 1
                else:
                    name_counter[journal_name] = 1

                if journal_country in country_counter:
                    country_counter[journal_country] += 1
                else:
                    country_counter[journal_country] = 1

            pairs = list(name_counter.items())
            reverse_pairs = [ (count,name) for (name,count) in pairs ]
            reverse_pairs.sort(reverse=True)

            outfile = open(name_path, "w")
            for item in reverse_pairs:

                name = str(item[1])
                count = str(item[0])

                line = ""
                line += name
                line += '\t'
                line += count

                outfile.write(line + '\n')

            outfile.close()

            pairs = list(country_counter.items())
            reverse_pairs = [(count, country) for (country, count) in pairs]
            reverse_pairs.sort(reverse=True)

            outfile = open(country_path, "w")
            for item in reverse_pairs:
                name = str(item[1])
                count = str(item[0])

                line = ""
                line += name
                line += '\t'
                line += count

                outfile.write(line + '\n')

            outfile.close()


        except Exception:
            traceback.print_exc()

    # 11: for the top 10 by publication count, sum the article counts per zone
    def cal_num_in_diff_area(self, input_path, out_path):
        try:

            area_counter = {}

            cnt = 0
            infile = open(input_path, "r")
            for line in infile:
                cnt += 1
                if cnt == 1:
                    continue

                line = line.rstrip('\n').split(' ')

                num = int(line[1])
                area = line[3]

                if area in area_counter:
                    area_counter[area] += num
                else:
                    area_counter[area] = num
            infile.close()

            outfile = open(out_path, "w")
            for area in area_counter:
                line = ""
                line += str(area)
                line += " "
                line += str(area_counter[area])
                outfile.write(line + '\n')
            outfile.close()

        except Exception:
            traceback.print_exc()

    # 12: compute the average impact factor, weighted by article count
    def cal_aver_if_factor(self, input_path):
        try:

            cnt = 0
            infile = open(input_path, "r")

            total_num = 0
            total_factor = 0.0

            for line in infile:
                cnt += 1
                if cnt == 1:
                    continue

                line = line.rstrip('\n').split(' ')
                num = int(line[1])
                factor = float(line[2])

                total_num += num
                total_factor += factor * num


            infile.close()

            print total_factor / total_num

        except Exception:
            traceback.print_exc()

    # 13: load the stop words, one per line
    def get_stop_words(self, stop_words_path):
        result_list = []

        infile = open(stop_words_path, "r")
        for line in infile:
            line = line.rstrip('\n')
            result_list.append(line)
        infile.close()

        return result_list

    # 14: test function
    def test(self):
        journal_info_list = self.get_article_journal_info(self.root)
        print len(journal_info_list)
        for aa in journal_info_list:
            print aa
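
The three word-frequency methods above (8, 9 and 9_1) repeat the same tokenize, filter and count loop. As a minimal sketch, and not the author's code, the shared part could be factored into a single helper. The names `count_words` and `top_n` are hypothetical, and the snippet is written for Python 3 while the listing above is Python 2. It keeps the original rules: non-letter characters act as separators, words are lowercased, and stop words are dropped.

```python
from collections import Counter


def count_words(texts, stop_words):
    """Count lowercased alphabetic words across texts, skipping stop words.

    Hypothetical helper: mirrors the loops in methods 8, 9 and 9_1, where
    every non-letter character is treated as a separator.
    """
    stop = set(stop_words)  # set lookup is O(1); the original searches a list
    counter = Counter()
    for text in texts:
        if not text:  # skip None or empty titles/abstracts
            continue
        # replace non-letters with spaces, as the original preprocessing does
        cleaned = "".join(ch if ch.isalpha() else " " for ch in text)
        # split() with no argument never yields empty strings
        for word in cleaned.lower().split():
            if word not in stop:
                counter[word] += 1
    return counter


def top_n(counter, n=10):
    """Return the n most frequent (word, count) pairs, highest count first."""
    return counter.most_common(n)


titles = ["Malaria vaccine trial in Africa", "A new malaria vaccine?", None]
counts = count_words(titles, ["in", "a", "new"])
```

Because `split()` without arguments discards empty tokens, the `' '` and `''` entries that the original appends to the stop-word list are not needed here. Note that `most_common` orders ties by first appearance, whereas the original `items.sort(reverse=True)` orders ties by word in reverse alphabetical order.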

main.py

#-*- coding:utf-8 -*-
import traceback  # used by every except block below; do not rely on the star imports for it

from XmlParser import *
from MyGraph import *

STOP_WORDS_PATH = "../file/stop_words.txt"

XML_PATH1 = "../data/PUBMED/LANCET/2006/lancet_2006_1570.xml"
#XML_PATH2 = "../data/PUBMED/LANCET/2009/lancet_2009_1516.xml"
#OUTPUT_PATH1 = "../output/network_analysis/PUBMED/LANCET/2006/"
#OUTPUT_PATH2 = "../output/network_analysis/PUBMED/LANCET/2009/"
OUTPUT_PATH3 = "../output/src_output/edge.txt"

INPUT_PATH = "../data/src_input/citation.csv"
OUTPUT_PATH = "../output/src_output/"

# @xml_parser_obj: the object holding the parsed XML
# @OUTPUT_PATH: output path for the statistical analysis results
def statical_analysis( xml_parser_obj, OUTPUT_PATH ):
    try:
        xml_parser_obj.cal_word_occurence_in_article_abstract(OUTPUT_PATH)
        xml_parser_obj.cal_word_occurence_in_article_title(OUTPUT_PATH)

        print "[INFO]: statical_analysis is finished!"
    except Exception:
        traceback.print_exc()

# @xml_parser_obj: the object holding the parsed XML
# @OUTPUT_PATH: output path for the static network analysis results
def author_collaboration_network_analysis( xml_parser_obj, OUTPUT_PATH ):
    try:

        # get the author clique list
        author_clique_list = xml_parser_obj.get_article_author()

        # construct the graph based on the author clique list
        graph = MyGraph()
        graph.construct_graph(author_clique_list)
        graph.set_output_path(OUTPUT_PATH)


        # calculate the statistics
        graph.cal_num_of_nodes()
        graph.cal_num_of_edges()

        graph.cal_degree_distribution()
        graph.cal_density()

        # the collaboration network is usually not connected
        #graph.cal_average_shortest_path_length()
        graph.cal_average_clustering()

        graph.write_to_pajek_net1()

        # This function does not really draw communities; it only draws the individual cliques, i.e. the whole graph
        graph.draw_community()

        graph.set_max_connected_component_subgraph()
        graph.draw_max_connected_component_subgraph()
        graph.cal_average_shortest_path_length_in_max_connected_component_subgraph()

        #graph.draw_graph()
        #graph.draw_graph_spring_layout()
        #graph.draw_graph_random()


        print "[INFO]: author_collaboration_network_analysis is finished!"
    except Exception:
        traceback.print_exc()

def author_collaboration_network_analysis1( xml_parser_obj1, xml_parser_obj2, OUTPUT_PATH ):
    try:

        # get the author clique list
        author_clique_list = xml_parser_obj1.get_article_author()
        author_clique_list.extend(xml_parser_obj2.get_article_author())

        # construct the graph based on the author clique list
        graph = MyGraph()
        graph.construct_graph(author_clique_list)
        graph.set_output_path(OUTPUT_PATH)

        # calculate the statistics
        graph.cal_num_of_nodes()
        graph.cal_num_of_edges()

        graph.cal_degree_distribution()
        graph.cal_density()

        graph.cal_average_shortest_path_length()
        graph.cal_average_clustering()

        graph.write_to_pajek_net1()

        graph.draw_community()
        #graph.draw_graph()
        #graph.draw_graph_spring_layout()
        #graph.draw_graph_random()

        print "[INFO]: author_collaboration_network_analysis1 is finished!"
    except Exception:
        traceback.print_exc()

def test_for_srx():
    try:

        graph = MyGraph()
        graph.set_output_path(OUTPUT_PATH)

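        # "with open" closes the file; strip() removes the trailing newline
        # that would otherwise stay attached to the second node label
        with open(INPUT_PATH, "r") as infile:
            for line in infile:
                fields = line.strip().split(',')
                u = fields[0]
                v = fields[1]

                graph.add_edge(u, v)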

        print "[INFO]: graph is finished!"


        graph.cal_average_clustering()
        graph.cal_average_shortest_path_length_in_max_connected_component_subgraph()
        graph.cal_degree_distribution()
        graph.cal_density()
        graph.cal_transitivity()


    except Exception:
        traceback.print_exc()

def test_for_jcx():
    try:
        graph = MyGraph()
        graph.set_output_path(OUTPUT_PATH)
        cnt = 0
        with open(INPUT_PATH, "r") as infile:
            for line in infile:
                fields = line.split()
                u = fields[0]
                v = fields[1]

                graph.add_edge(u, v)
                cnt += 1

                # only load the first 10000 edges
                if cnt == 10000:
                    break
        print "[INFO]: graph is finished!"

        '''
        graph.cal_average_clustering()
        graph.cal_average_shortest_path_length_in_max_connected_component_subgraph()
        graph.cal_degree_distribution()
        graph.cal_density()
        graph.cal_transitivity()
        '''

        title = "Social Network - Live Journal"
        graph.draw_graph(title)

    except Exception:
        traceback.print_exc()

def main():
    try:

        print "[INFO]: Programme is running......"

        # parse the xml and get the result
        #a_obj1 = XmlParser(XML_PATH1, STOP_WORDS_PATH)
        #a_obj2 = XmlParser(XML_PATH2, STOP_WORDS_PATH)

        #statical_analysis(a_obj1, OUTPUT_PATH1)
        #statical_analysis(a_obj2, OUTPUT_PATH2)

        #author_collaboration_network_analysis(a_obj1, OUTPUT_PATH1)


        test_for_srx()

        print "[INFO]: Programme terminated successfully!"

    except Exception:
        traceback.print_exc()


if __name__ == "__main__":
    main()
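
`author_collaboration_network_analysis` passes a list of author cliques (one author list per article) to `MyGraph.construct_graph`, whose definition is not repeated in this part of the post. The following is only a sketch of one plausible way to expand such cliques into weighted co-authorship edges, assuming every pair of co-authors on an article should be linked. The function name `cliques_to_edges` and the sample data are hypothetical; it uses only the standard library and is written for Python 3.

```python
from collections import Counter
from itertools import combinations


def cliques_to_edges(author_cliques):
    """Expand per-article author lists into weighted undirected edges.

    Returns a Counter mapping (author_a, author_b), with author_a < author_b,
    to the number of articles the two authors share. Hypothetical helper.
    """
    edges = Counter()
    for authors in author_cliques:
        # de-duplicate and sort so each pair has one canonical orientation
        unique = sorted(set(authors))
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


cliques = [["Li", "Wang", "Zhao"], ["Wang", "Li"], ["Chen"]]
edges = cliques_to_edges(cliques)
```

Each `(a, b)` key could then be fed to `graph.add_edge(a, b)` in the same way `test_for_srx` does. Single-author articles such as `["Chen"]` produce no edges, so those authors would have to be added as isolated nodes separately if they should appear in the node count.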
