Processing data with the ensembled functions according to the data configuration

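The ensembled functions that process the protein features are: sample_msa, make_masked_msa, nearest_neighbor_clusters, summarize_clusters, crop_extra_msa, delete_extra_msa, make_msa_feat, select_feat, random_crop_to_size, make_fixed_size, crop_templates. Note that delete_extra_msa and crop_templates are referenced by ensembled_map_fns but are not defined in the listing below; with the configuration used here (max_extra_msa = 1024, fixed_size = True) those branches are never reached.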

import copy
import tensorflow.compat.v1 as tf
import pickle
import numpy as np
import ml_collections

NUM_RES = 'num residues placeholder'
NUM_MSA_SEQ = 'msa placeholder'
NUM_EXTRA_SEQ = 'extra msa placeholder'
NUM_TEMPLATES = 'num templates placeholder'

CONFIG = ml_collections.ConfigDict({
    'data': {
        'common': {
            'masked_msa': {
                'profile_prob': 0.1,
                'same_prob': 0.1,
                'uniform_prob': 0.1
            },
            'max_extra_msa': 1024,
            'msa_cluster_features': True,
            'num_recycle': 3,
            'reduce_msa_clusters_by_max_templates': False,
            'resample_msa_in_recycling': True,
            'template_features': [
                'template_all_atom_positions', 'template_sum_probs',
                'template_aatype', 'template_all_atom_masks',
                'template_domain_names'
            ],
            'unsupervised_features': [
                'aatype', 'residue_index', 'sequence', 'msa', 'domain_name',
                'num_alignments', 'seq_length', 'between_segment_residues',
                'deletion_matrix'
            ],
            'use_templates': False,
        },
        'eval': {
            'feat': {
                'aatype': [NUM_RES],
                'all_atom_mask': [NUM_RES, None],
                'all_atom_positions': [NUM_RES, None, None],
                'alt_chi_angles': [NUM_RES, None],
                'atom14_alt_gt_exists': [NUM_RES, None],
                'atom14_alt_gt_positions': [NUM_RES, None, None],
                'atom14_atom_exists': [NUM_RES, None],
                'atom14_atom_is_ambiguous': [NUM_RES, None],
                'atom14_gt_exists': [NUM_RES, None],
                'atom14_gt_positions': [NUM_RES, None, None],
                'atom37_atom_exists': [NUM_RES, None],
                'backbone_affine_mask': [NUM_RES],
                'backbone_affine_tensor': [NUM_RES, None],
                'bert_mask': [NUM_MSA_SEQ, NUM_RES],
                'chi_angles': [NUM_RES, None],
                'chi_mask': [NUM_RES, None],
                'extra_deletion_value': [NUM_EXTRA_SEQ, NUM_RES],
                'extra_has_deletion': [NUM_EXTRA_SEQ, NUM_RES],
                'extra_msa': [NUM_EXTRA_SEQ, NUM_RES],
                'extra_msa_mask': [NUM_EXTRA_SEQ, NUM_RES],
                'extra_msa_row_mask': [NUM_EXTRA_SEQ],
                'is_distillation': [],
                'msa_feat': [NUM_MSA_SEQ, NUM_RES, None],
                'msa_mask': [NUM_MSA_SEQ, NUM_RES],
                'msa_row_mask': [NUM_MSA_SEQ],
                'pseudo_beta': [NUM_RES, None],
                'pseudo_beta_mask': [NUM_RES],
                'random_crop_to_size_seed': [None],
                'residue_index': [NUM_RES],
                'residx_atom14_to_atom37': [NUM_RES, None],
                'residx_atom37_to_atom14': [NUM_RES, None],
                'resolution': [],
                'rigidgroups_alt_gt_frames': [NUM_RES, None, None],
                'rigidgroups_group_exists': [NUM_RES, None],
                'rigidgroups_group_is_ambiguous': [NUM_RES, None],
                'rigidgroups_gt_exists': [NUM_RES, None],
                'rigidgroups_gt_frames': [NUM_RES, None, None],
                'seq_length': [],
                'seq_mask': [NUM_RES],
                'target_feat': [NUM_RES, None],
                'template_aatype': [NUM_TEMPLATES, NUM_RES],
                'template_all_atom_masks': [NUM_TEMPLATES, NUM_RES, None],
                'template_all_atom_positions': [
                    NUM_TEMPLATES, NUM_RES, None, None],
                'template_backbone_affine_mask': [NUM_TEMPLATES, NUM_RES],
                'template_backbone_affine_tensor': [
                    NUM_TEMPLATES, NUM_RES, None],
                'template_mask': [NUM_TEMPLATES],
                'template_pseudo_beta': [NUM_TEMPLATES, NUM_RES, None],
                'template_pseudo_beta_mask': [NUM_TEMPLATES, NUM_RES],
                'template_sum_probs': [NUM_TEMPLATES, None],
                'true_msa': [NUM_MSA_SEQ, NUM_RES]
            },
            'fixed_size': True,
            'subsample_templates': True,  # We want top templates.
            'masked_msa_replace_fraction': 0.15,
            'max_msa_clusters': 512,
            'max_templates': 4,
            'num_ensemble': 1,
            'crop_size': 100,
        },
    },
    'model': {
        'embeddings_and_evoformer': {
            'evoformer_num_block': 48,
            'evoformer': {
                'msa_row_attention_with_pair_bias': {
                    'dropout_rate': 0.15,
                    'gating': True,
                    'num_head': 8,
                    'orientation': 'per_row',
                    'shared_dropout': True
                },
                'msa_column_attention': {
                    'dropout_rate': 0.0,
                    'gating': True,
                    'num_head': 8,
                    'orientation': 'per_column',
                    'shared_dropout': True
                },
                'msa_transition': {
                    'dropout_rate': 0.0,
                    'num_intermediate_factor': 4,
                    'orientation': 'per_row',
                    'shared_dropout': True
                },
                'outer_product_mean': {
                    'first': False,
                    'chunk_size': 128,
                    'dropout_rate': 0.0,
                    'num_outer_channel': 32,
                    'orientation': 'per_row',
                    'shared_dropout': True
                },
                'triangle_attention_starting_node': {
                    'dropout_rate': 0.25,
                    'gating': True,
                    'num_head': 4,
                    'orientation': 'per_row',
                    'shared_dropout': True
                },
                'triangle_attention_ending_node': {
                    'dropout_rate': 0.25,
                    'gating': True,
                    'num_head': 4,
                    'orientation': 'per_column',
                    'shared_dropout': True
                },
                'triangle_multiplication_outgoing': {
                    'dropout_rate': 0.25,
                    'equation': 'ikc,jkc->ijc',
                    'num_intermediate_channel': 128,
                    'orientation': 'per_row',
                    'shared_dropout': True,
                    'fuse_projection_weights': False,
                },
                'triangle_multiplication_incoming': {
                    'dropout_rate': 0.25,
                    'equation': 'kjc,kic->ijc',
                    'num_intermediate_channel': 128,
                    'orientation': 'per_row',
                    'shared_dropout': True,
                    'fuse_projection_weights': False,
                },
                'pair_transition': {
                    'dropout_rate': 0.0,
                    'num_intermediate_factor': 4,
                    'orientation': 'per_row',
                    'shared_dropout': True
                }
            },
            'extra_msa_channel': 64,
            'extra_msa_stack_num_block': 4,
            'max_relative_feature': 32,
            'msa_channel': 256,
            'pair_channel': 128,
            'prev_pos': {
                'min_bin': 3.25,
                'max_bin': 20.75,
                'num_bins': 15
            },
            'recycle_features': True,
            'recycle_pos': True,
            'seq_channel': 384,
            'template': {
                'attention': {
                    'gating': False,
                    'key_dim': 64,
                    'num_head': 4,
                    'value_dim': 64
                },
                'dgram_features': {
                    'min_bin': 3.25,
                    'max_bin': 50.75,
                    'num_bins': 39
                },
                'embed_torsion_angles': False,
                'enabled': False,
                'template_pair_stack': {
                    'num_block': 2,
                    'triangle_attention_starting_node': {
                        'dropout_rate': 0.25,
                        'gating': True,
                        'key_dim': 64,
                        'num_head': 4,
                        'orientation': 'per_row',
                        'shared_dropout': True,
                        'value_dim': 64
                    },
                    'triangle_attention_ending_node': {
                        'dropout_rate': 0.25,
                        'gating': True,
                        'key_dim': 64,
                        'num_head': 4,
                        'orientation': 'per_column',
                        'shared_dropout': True,
                        'value_dim': 64
                    },
                    'triangle_multiplication_outgoing': {
                        'dropout_rate': 0.25,
                        'equation': 'ikc,jkc->ijc',
                        'num_intermediate_channel': 64,
                        'orientation': 'per_row',
                        'shared_dropout': True,
                        'fuse_projection_weights': False,
                    },
                    'triangle_multiplication_incoming': {
                        'dropout_rate': 0.25,
                        'equation': 'kjc,kic->ijc',
                        'num_intermediate_channel': 64,
                        'orientation': 'per_row',
                        'shared_dropout': True,
                        'fuse_projection_weights': False,
                    },
                    'pair_transition': {
                        'dropout_rate': 0.0,
                        'num_intermediate_factor': 2,
                        'orientation': 'per_row',
                        'shared_dropout': True
                    }
                },
                'max_templates': 4,
                'subbatch_size': 128,
                'use_template_unit_vector': False,
            }
        },
        'global_config': {
            'deterministic': False,
            'multimer_mode': False,
            'subbatch_size': 4,
            'use_remat': False,
            'zero_init': True,
            'eval_dropout': False,
        },
        'heads': {
            'distogram': {
                'first_break': 2.3125,
                'last_break': 21.6875,
                'num_bins': 64,
                'weight': 0.3
            },
            'predicted_aligned_error': {
                # `num_bins - 1` bins uniformly space the
                # [0, max_error_bin A] range.
                # The final bin covers [max_error_bin A, +infty]
                # 31A gives bins with 0.5A width.
                'max_error_bin': 31.,
                'num_bins': 64,
                'num_channels': 128,
                'filter_by_resolution': True,
                'min_resolution': 0.1,
                'max_resolution': 3.0,
                'weight': 0.0,
            },
            'experimentally_resolved': {
                'filter_by_resolution': True,
                'max_resolution': 3.0,
                'min_resolution': 0.1,
                'weight': 0.01
            },
            'structure_module': {
                'num_layer': 8,
                'fape': {
                    'clamp_distance': 10.0,
                    'clamp_type': 'relu',
                    'loss_unit_distance': 10.0
                },
                'angle_norm_weight': 0.01,
                'chi_weight': 0.5,
                'clash_overlap_tolerance': 1.5,
                'compute_in_graph_metrics': True,
                'dropout': 0.1,
                'num_channel': 384,
                'num_head': 12,
                'num_layer_in_transition': 3,
                'num_point_qk': 4,
                'num_point_v': 8,
                'num_scalar_qk': 16,
                'num_scalar_v': 16,
                'position_scale': 10.0,
                'sidechain': {
                    'atom_clamp_distance': 10.0,
                    'num_channel': 128,
                    'num_residual_block': 2,
                    'weight_frac': 0.5,
                    'length_scale': 10.,
                },
                'structural_violation_loss_weight': 1.0,
                'violation_tolerance_factor': 12.0,
                'weight': 1.0
            },
            'predicted_lddt': {
                'filter_by_resolution': True,
                'max_resolution': 3.0,
                'min_resolution': 0.1,
                'num_bins': 50,
                'num_channels': 128,
                'weight': 0.01
            },
            'masked_msa': {
                'num_output': 23,
                'weight': 2.0
            },
        },
        'num_recycle': 3,
        'resample_msa_in_recycling': True
    },
})


_MSA_FEATURE_NAMES = [
    'msa', 'deletion_matrix', 'msa_mask', 'msa_row_mask', 'bert_mask',
    'true_msa'
]


class SeedMaker(object):
  """Return unique seeds."""
 
  def __init__(self, initial_seed=0):
    self.next_seed = initial_seed
 
  def __call__(self):
    i = self.next_seed
    self.next_seed += 1
    return i


def shape_list(x):
  """Return list of dimensions of a tensor, statically where possible.

  Like `x.shape.as_list()` but with tensors instead of `None`s.

  Args:
    x: A tensor.
  Returns:
    A list with length equal to the rank of the tensor. The n-th element of the
    list is an integer when that dimension is statically known otherwise it is
    the n-th element of `tf.shape(x)`.
  """
  x = tf.convert_to_tensor(x)

  # If unknown rank, return dynamic shape
  if x.get_shape().dims is None:
    return tf.shape(x)

  static = x.get_shape().as_list()
  shape = tf.shape(x)

  ret = []
  for i in range(len(static)):
    dim = static[i]
    if dim is None:
      dim = shape[i]
    ret.append(dim)
  return ret


def shaped_categorical(probs, epsilon=1e-10):
  ds = shape_list(probs)
  num_classes = ds[-1]
  counts = tf.random.categorical(
      tf.reshape(tf.log(probs + epsilon), [-1, num_classes]),
      1,
      dtype=tf.int32)
  return tf.reshape(counts, ds[:-1])


def data_transforms_curry1(f):
  """Supply all arguments but the first."""

  def fc(*args, **kwargs):
    return lambda x: f(x, *args, **kwargs)

  return fc



@data_transforms_curry1
def sample_msa(protein, max_seq, keep_extra):
  """Sample MSA randomly, remaining sequences are stored as `extra_*`.
  Args:
    protein: batch to sample msa from.
    max_seq: number of sequences to sample.
    keep_extra: When True sequences not sampled are put into fields starting
      with 'extra_*'.
  Returns:
    Protein with sampled msa.
  """
  num_seq = tf.shape(protein['msa'])[0]
  # The sequence at index 0 is the query sequence; it always stays first.
  shuffled = tf.random_shuffle(tf.range(1, num_seq))
  index_order = tf.concat([[0], shuffled], axis=0)
  num_sel = tf.minimum(max_seq, num_seq)
  # tf.split cuts a tensor along the given axis: the first piece has
  # num_sel entries, the second has num_seq - num_sel entries.
  sel_seq, not_sel_seq = tf.split(index_order, [num_sel, num_seq - num_sel])
 
  for k in _MSA_FEATURE_NAMES:
    if k in protein:
      if keep_extra:
        # tf.gather collects elements from the input tensor by index.
        protein['extra_' + k] = tf.gather(protein[k], not_sel_seq)
      protein[k] = tf.gather(protein[k], sel_seq)
 
  return protein
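
The row selection in sample_msa can be illustrated without TensorFlow. The following is a minimal NumPy sketch, not the pipeline's own code: sample_rows and the toy MSA are hypothetical names introduced for illustration, and it only mirrors the "keep the query first, shuffle the rest, split" logic.

```python
import numpy as np

def sample_rows(msa, max_seq, rng):
    """Keep row 0 (the query), shuffle the rest, then split into kept/extra."""
    num_seq = msa.shape[0]
    shuffled = rng.permutation(np.arange(1, num_seq))
    index_order = np.concatenate([[0], shuffled])
    num_sel = min(max_seq, num_seq)
    sel, not_sel = index_order[:num_sel], index_order[num_sel:]
    return msa[sel], msa[not_sel]

rng = np.random.default_rng(0)
# Toy MSA: 6 sequences of 4 residues; row i is filled with the value i.
msa = np.repeat(np.arange(6)[:, None], 4, axis=1)
kept, extra = sample_rows(msa, max_seq=3, rng=rng)
print(kept[:, 0], extra[:, 0])  # the first kept row is always row 0
```

Every input row ends up in exactly one of the two outputs, which is why the remaining rows can later serve as the extra MSA.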


@data_transforms_curry1
def make_masked_msa(protein, config, replace_fraction):
  """Create data for BERT on raw MSA."""
  # Add a random amino acid uniformly
  random_aa = tf.constant([0.05] * 20 + [0., 0.], dtype=tf.float32)
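  # Build the per-position replacement distribution over amino acids; the
  # profile term ties it to how conserved each residue is in the MSA.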
  categorical_probs = (
      config.uniform_prob * random_aa +
      config.profile_prob * protein['hhblits_profile'] +
      config.same_prob * tf.one_hot(protein['msa'], 22))
 
  # print(tf.reduce_sum(categorical_probs, axis=-1))  # 0.3 everywhere (0.1 + 0.1 + 0.1)
 
  # Put all remaining probability on [MASK] which is a new column
 
  pad_shapes = [[0, 0] for _ in range(len(categorical_probs.shape))]
  pad_shapes[-1][1] = 1
  # mask_prob is 0.7; the other three probabilities sum to 0.3.
  mask_prob = 1. - config.profile_prob - config.same_prob - config.uniform_prob
  assert mask_prob >= 0.
  # Append mask_prob as a new last column, so every MSA position gets a
  # distribution over 23 classes (20 amino acids + X + gap + mask).
  categorical_probs = tf.pad(
      categorical_probs, pad_shapes, constant_values=mask_prob)
 
  # print(tf.reduce_sum(categorical_probs, axis=-1))  # 1.0 everywhere after padding
 
  sh = shape_list(protein['msa'])
  # Draw from U(0, 1) with shape sh and compare with replace_fraction (0.15)
  # to pick the random positions that will be corrupted.
  mask_position = tf.random.uniform(sh) < replace_fraction
  
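  # Sample a replacement token for every position. [MASK] is the most likely
  # outcome; the chance of drawing a given amino acid depends on how conserved
  # it is at that position in the MSA.
  bert_msa = shaped_categorical(categorical_probs)
  # Roughly 15% of positions are replaced. Of those, about 70% become [MASK]
  # and about 30% become an amino acid; the more conserved a residue is at a
  # position, the more likely it is to be drawn. bert_msa therefore holds about
  # 0.7 * 0.15 mask tokens, plus a mix of wrong and unchanged residues.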
  bert_msa = tf.where(mask_position, bert_msa, protein['msa'])
 
  # Mix real and masked MSA
  protein['bert_mask'] = tf.cast(mask_position, tf.float32)
  protein['true_msa'] = protein['msa']
  protein['msa'] = bert_msa
 
  return protein
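
To check the arithmetic behind the replacement distribution, here is a small NumPy sketch. It is an illustration under stated assumptions rather than the TensorFlow code above: the toy MSA is random, and a flat profile stands in for hhblits_profile.

```python
import numpy as np

rng = np.random.default_rng(0)
num_seq, num_res = 4, 6
uniform_prob = profile_prob = same_prob = 0.1  # values from CONFIG

msa = rng.integers(0, 22, size=(num_seq, num_res))  # toy MSA, classes 0..21
profile = np.full((num_res, 22), 1.0 / 22)          # assumed flat stand-in profile

random_aa = np.array([0.05] * 20 + [0.0, 0.0])      # uniform over the 20 amino acids
one_hot = np.eye(22)[msa]                           # [num_seq, num_res, 22]
probs = (uniform_prob * random_aa
         + profile_prob * profile
         + same_prob * one_hot)

mask_prob = 1.0 - uniform_prob - profile_prob - same_prob
# Append the [MASK] class as column 22.
probs = np.concatenate(
    [probs, np.full(probs.shape[:-1] + (1,), mask_prob)], axis=-1)

row_sums = probs.sum(axis=-1)
print(probs.shape)                 # (4, 6, 23)
print(np.allclose(row_sums, 1.0))  # True
```

The three non-mask terms each contribute 0.1, so the mask column has to carry the remaining 0.7 for each position to be a valid distribution.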


@data_transforms_curry1
def nearest_neighbor_clusters(protein, gap_agreement_weight=0.):
  """Assign each extra MSA sequence to its nearest neighbor in sampled MSA."""
 
  # Determine how much weight we assign to each agreement.  In theory, we could
  # use a full blosum matrix here, but right now let's just down-weight gap
  # agreement because it could be spurious.
  # Never put weight on agreeing on BERT mask
  # Weight 1 for the 20 amino acids and X, gap_agreement_weight (0 by default)
  # for gap, and 0 for the mask token.
  weights = tf.concat([
      tf.ones(21),
      gap_agreement_weight * tf.ones(1),
      np.zeros(1, dtype=np.float32)], 0)
 
  # Make agreement score as weighted Hamming distance
  # Add a trailing axis to the mask so it broadcasts over the one-hot classes.
  sample_one_hot = (protein['msa_mask'][:, :, None] *
                    tf.one_hot(protein['msa'], 23))
  extra_one_hot = (protein['extra_msa_mask'][:, :, None] *
                   tf.one_hot(protein['extra_msa'], 23))
 
  num_seq, num_res, _ = shape_list(sample_one_hot)
  extra_num_seq, _, _ = shape_list(extra_one_hot)
 
  # Compute tf.einsum('mrc,nrc,c->mn', sample_one_hot, extra_one_hot, weights)
  # in an optimized fashion to avoid possible memory or computation blowup.
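  # Score how similar each extra MSA sequence is to each sampled MSA sequence:
  # the more positions carry the same residue, the higher the score.
  # Residue properties are not taken into account, which leaves room for
  # improvement. Note the per-class weights defined above.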
  agreement = tf.matmul(
      tf.reshape(extra_one_hot, [extra_num_seq, num_res * 23]),
      tf.reshape(sample_one_hot * weights, [num_seq, num_res * 23]),
      transpose_b=True)
 
  # Assign each sequence in the extra sequences to the closest MSA sample
  # For each extra MSA sequence, pick the most similar sampled MSA sequence.
  protein['extra_cluster_assignment'] = tf.argmax(
      agreement, axis=1, output_type=tf.int32)
 
  return protein
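
The weighted agreement score can be traced by hand on a tiny example. The sketch below uses NumPy and toy data chosen for illustration; it assumes all mask entries are 1 and therefore omits msa_mask and extra_msa_mask.

```python
import numpy as np

NUM_CLASSES = 23  # 20 amino acids, X, gap, mask
# Weight 1 for amino acids and X, 0 for gap and for the mask token.
weights = np.concatenate([np.ones(21), np.zeros(1), np.zeros(1)])

# Toy data: 2 sampled (cluster centre) sequences, 3 extra sequences, 4 residues.
msa = np.array([[0, 1, 2, 3],
                [4, 5, 6, 7]])
extra_msa = np.array([[0, 1, 2, 9],     # matches centre 0 at 3 positions
                      [4, 5, 9, 9],     # matches centre 1 at 2 positions
                      [21, 21, 2, 3]])  # gaps score nothing; matches centre 0 at 2

sample_one_hot = np.eye(NUM_CLASSES)[msa]
extra_one_hot = np.eye(NUM_CLASSES)[extra_msa]

num_seq, num_res, _ = sample_one_hot.shape
agreement = (extra_one_hot.reshape(-1, num_res * NUM_CLASSES)
             @ (sample_one_hot * weights).reshape(num_seq, -1).T)
assignment = agreement.argmax(axis=1)
print(agreement)   # rows: extra sequences, columns: cluster centres
print(assignment)  # [0 1 0]
```

Flattening the residue and class axes turns the three-way contraction into a single matrix product, which is the same trick the TensorFlow code uses to avoid materialising a large intermediate tensor.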


@data_transforms_curry1
def summarize_clusters(protein):
  """Produce profile and deletion_matrix_mean within each cluster."""
  num_seq = shape_list(protein['msa'])[0]
  def csum(x):
    return tf.math.unsorted_segment_sum(
        x, protein['extra_cluster_assignment'], num_seq)
 
  mask = protein['extra_msa_mask']
  mask_counts = 1e-6 + protein['msa_mask'] + csum(mask)  # Include center
  
  # The result has shape [num_seq, num_res, 23]; row 0 sums the one-hot extra
  # MSA sequences whose nearest neighbor is sampled sequence 0, and so on.
  msa_sum = csum(mask[:, :, None] * tf.one_hot(protein['extra_msa'], 23))
  msa_sum += tf.one_hot(protein['msa'], 23)  # Original sequence
  protein['cluster_profile'] = msa_sum / mask_counts[:, :, None]
 
  del msa_sum
 
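  # Per-position deletion counts, summed over the extra MSA sequences that were
  # assigned to each sampled sequence.
  # del_sum has shape [num_seq, num_res]; row 0 holds the deletion counts of
  # the extra sequences whose nearest neighbor is sampled sequence 0, and so on.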
  del_sum = csum(mask * protein['extra_deletion_matrix'])
  del_sum += protein['deletion_matrix']  # Original sequence
  protein['cluster_deletion_mean'] = del_sum / mask_counts
  del del_sum
 
  return protein
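
The per-cluster averaging relies on a segment sum. As a rough NumPy sketch (segment_sum is a hypothetical stand-in for tf.math.unsorted_segment_sum, and the matrices are toy values), the deletion mean works out as follows:

```python
import numpy as np

def segment_sum(data, segment_ids, num_segments):
    """NumPy stand-in for tf.math.unsorted_segment_sum along axis 0."""
    out = np.zeros((num_segments,) + data.shape[1:], dtype=data.dtype)
    np.add.at(out, segment_ids, data)  # accumulates repeated indices correctly
    return out

# 2 cluster centres, 3 extra sequences, 2 residues.
deletion_matrix = np.array([[0., 2.],
                            [1., 0.]])
extra_deletion_matrix = np.array([[4., 0.],
                                  [0., 3.],
                                  [2., 2.]])
assignment = np.array([0, 1, 0])  # extra sequences 0 and 2 belong to centre 0
msa_mask = np.ones((2, 2))
extra_msa_mask = np.ones((3, 2))

mask_counts = 1e-6 + msa_mask + segment_sum(extra_msa_mask, assignment, 2)
del_sum = segment_sum(extra_msa_mask * extra_deletion_matrix, assignment, 2)
del_sum += deletion_matrix  # include the cluster centre itself
cluster_deletion_mean = del_sum / mask_counts
print(np.round(cluster_deletion_mean, 3))
```

Centre 0 averages over three sequences (itself plus two extras) and centre 1 over two, so the means are approximately [[2, 1.333], [0.5, 1.5]].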


@data_transforms_curry1
def crop_extra_msa(protein, max_extra_msa):
  """MSA features are cropped so only `max_extra_msa` sequences are kept."""
  num_seq = tf.shape(protein['extra_msa'])[0]
  num_sel = tf.minimum(max_extra_msa, num_seq)
  select_indices = tf.random_shuffle(tf.range(0, num_seq))[:num_sel]
  for k in _MSA_FEATURE_NAMES:
    if 'extra_' + k in protein:
      protein['extra_' + k] = tf.gather(protein['extra_' + k], select_indices)

  return protein


@data_transforms_curry1
def make_msa_feat(protein):
  """Create and concatenate MSA features."""
  # Whether there is a domain break. Always zero for chains, but keeping
  # for compatibility with domain datasets.
  has_break = tf.clip_by_value(
      tf.cast(protein['between_segment_residues'], tf.float32),
      0, 1)
  aatype_1hot = tf.one_hot(protein['aatype'], 21, axis=-1)

  target_feat = [
      tf.expand_dims(has_break, axis=-1),
      aatype_1hot,  # Everyone gets the original sequence.
  ]

  msa_1hot = tf.one_hot(protein['msa'], 23, axis=-1)
  has_deletion = tf.clip_by_value(protein['deletion_matrix'], 0., 1.)
  deletion_value = tf.atan(protein['deletion_matrix'] / 3.) * (2. / np.pi)

  msa_feat = [
      msa_1hot,
      tf.expand_dims(has_deletion, axis=-1),
      tf.expand_dims(deletion_value, axis=-1),
  ]

  if 'cluster_profile' in protein:
    deletion_mean_value = (
        tf.atan(protein['cluster_deletion_mean'] / 3.) * (2. / np.pi))
    msa_feat.extend([
        protein['cluster_profile'],
        tf.expand_dims(deletion_mean_value, axis=-1),
    ])

  if 'extra_deletion_matrix' in protein:
    protein['extra_has_deletion'] = tf.clip_by_value(
        protein['extra_deletion_matrix'], 0., 1.)
    protein['extra_deletion_value'] = tf.atan(
        protein['extra_deletion_matrix'] / 3.) * (2. / np.pi)

  protein['msa_feat'] = tf.concat(msa_feat, axis=-1)
  protein['target_feat'] = tf.concat(target_feat, axis=-1)
  return protein
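
With cluster features enabled, msa_feat concatenates 23 + 1 + 1 + 23 + 1 = 49 channels and target_feat 1 + 21 = 22 channels. The deletion channels use the transform atan(d / 3) * 2 / pi, which squashes raw counts into [0, 1). A short NumPy sketch of that transform (deletion_value is a hypothetical helper name):

```python
import numpy as np

def deletion_value(deletion_counts):
    """Squash non-negative deletion counts into [0, 1), as in make_msa_feat."""
    counts = np.asarray(deletion_counts, dtype=np.float64)
    return np.arctan(counts / 3.0) * (2.0 / np.pi)

counts = np.array([0, 1, 3, 10, 100])
values = deletion_value(counts)
print(np.round(values, 3))
```

A count of 0 maps to 0 and a count of 3 maps to exactly 0.5; larger counts approach 1 without reaching it, so a few very long deletions cannot dominate the feature scale.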


@data_transforms_curry1
def select_feat(protein, feature_list):
  return {k: v for k, v in protein.items() if k in feature_list}


@data_transforms_curry1
def random_crop_to_size(protein, crop_size, max_templates, shape_schema,
                        subsample_templates=False):
  """Crop randomly to `crop_size`, or keep as is if shorter than that."""
  seq_length = protein['seq_length']
  if 'template_mask' in protein:
    num_templates = tf.cast(
        shape_list(protein['template_mask'])[0], tf.int32)
  else:
    num_templates = tf.constant(0, dtype=tf.int32)
  num_res_crop_size = tf.math.minimum(seq_length, crop_size)
 
  # Ensures that the cropping of residues and templates happens in the same way
  # across ensembling iterations.
  # Do not use for randomness that should vary in ensembling.
  seed_maker = SeedMaker(initial_seed=protein['random_crop_to_size_seed'])
 
  if subsample_templates:
    templates_crop_start = tf.random.stateless_uniform(
        shape=(), minval=0, maxval=num_templates + 1, dtype=tf.int32,
        seed=seed_maker())
  else:
    templates_crop_start = 0
 
  num_templates_crop_size = tf.math.minimum(
      num_templates - templates_crop_start, max_templates)
 
  num_res_crop_start = tf.random.stateless_uniform(
      shape=(), minval=0, maxval=seq_length - num_res_crop_size + 1,
      dtype=tf.int32, seed=seed_maker())
 
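  # Build a random permutation of template indices that is shared by every
  # template feature being cropped.

  # tf.argsort returns the indices that would sort a tensor.
  # tf.random.stateless_uniform draws uniform random values of a given shape.
  # Here it draws num_templates values, i.e. shape (num_templates,); sorting
  # them yields a random permutation. num_templates is a scalar, so it is
  # wrapped in a list to form the shape.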
  templates_select_indices = tf.argsort(tf.random.stateless_uniform(
      [num_templates], seed=seed_maker()))
 
  for k, v in protein.items():
    if k not in shape_schema or (
        'template' not in k and NUM_RES not in shape_schema[k]):
      continue
 
    # randomly permute the templates before cropping them.
    if k.startswith('template') and subsample_templates:
      v = tf.gather(v, templates_select_indices)
 
    crop_sizes = []
    crop_starts = []
    
    # zip pairs each dimension's placeholder with its actual size:
    # shape_schema[k] is the list of placeholders, shape_list(v) the sizes.
    for i, (dim_size, dim) in enumerate(zip(shape_schema[k], shape_list(v))):
      is_num_res = (dim_size == NUM_RES)
      if i == 0 and k.startswith('template'):
        crop_size = num_templates_crop_size
        crop_start = templates_crop_start
      else:
        crop_start = num_res_crop_start if is_num_res else 0
        crop_size = (num_res_crop_size if is_num_res else
                     (-1 if dim is None else dim))
      crop_sizes.append(crop_size)
      crop_starts.append(crop_start)
    protein[k] = tf.slice(v, crop_starts, crop_sizes)
 
  protein['seq_length'] = num_res_crop_size
  return protein
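
The schema-driven cropping can be sketched in NumPy for the residue dimension alone. This is a simplified illustration, not the function above: crop_residues and the toy features are hypothetical, templates are ignored, and a NumPy generator replaces the stateless seeds.

```python
import numpy as np

NUM_RES = 'num residues placeholder'

def crop_residues(features, schema, crop_size, rng):
    """Crop every NUM_RES dimension of every feature to the same random window."""
    seq_length = features['seq_length']
    size = min(seq_length, crop_size)
    start = int(rng.integers(0, seq_length - size + 1))
    out = dict(features)
    for name, value in features.items():
        if name not in schema or NUM_RES not in schema[name]:
            continue  # features without a residue axis are left untouched
        slices = tuple(
            slice(start, start + size) if dim == NUM_RES else slice(None)
            for dim in schema[name])
        out[name] = value[slices]
    out['seq_length'] = size
    return out, start

schema = {'aatype': [NUM_RES], 'msa': [None, NUM_RES], 'seq_length': []}
features = {'aatype': np.arange(10),
            'msa': np.tile(np.arange(10), (3, 1)),
            'seq_length': 10}
cropped, start = crop_residues(features, schema, crop_size=4,
                               rng=np.random.default_rng(0))
print(start, cropped['aatype'], cropped['msa'].shape)
```

Because one start offset is drawn and reused for all features, the cropped aatype and the cropped MSA columns still refer to the same residues.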


@data_transforms_curry1
def make_fixed_size(protein, shape_schema, msa_cluster_size, extra_msa_size,
                    num_res, num_templates=0):
  """Guess at the MSA and sequence dimensions to make fixed size."""
 
  pad_size_map = {
      NUM_RES: num_res,
      NUM_MSA_SEQ: msa_cluster_size,
      NUM_EXTRA_SEQ: extra_msa_size,
      NUM_TEMPLATES: num_templates,
  }
 
  for k, v in protein.items():
    # Don't transfer this to the accelerator.
    if k == 'extra_cluster_assignment':
      continue
    shape = v.shape.as_list()
    # Placeholder names for this feature's dimensions.
    schema = shape_schema[k]
    assert len(shape) == len(schema), (
        f'Rank mismatch between shape and shape schema for {k}: '
        f'{shape} vs {schema}')
    
    # Target size of each dimension. Dimensions whose placeholder appears in
    # pad_size_map are padded to that size; for the others dict.get returns
    # None, so `or s1` keeps the current size s1.
    pad_size = [
        pad_size_map.get(s2, None) or s1 for (s1, s2) in zip(shape, schema)
    ]
    # Pad at the end of each dimension; the number of zeros to add is the
    # target size minus the current size (p - tf.shape(v)[i]).
    padding = [(0, p - tf.shape(v)[i]) for i, p in enumerate(pad_size)]
    if padding:
      protein[k] = tf.pad(
          v, padding, name=f'pad_to_fixed_{k}')
      protein[k].set_shape(pad_size)
  return protein
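
The padding rule is easy to verify on a single array. The following NumPy sketch assumes a hypothetical helper pad_to_fixed_size and toy sizes; it reproduces only the "look up the target size by placeholder, otherwise keep the current size" logic.

```python
import numpy as np

NUM_RES = 'num residues placeholder'
NUM_MSA_SEQ = 'msa placeholder'

def pad_to_fixed_size(array, schema, pad_size_map):
    """Zero-pad `array` at the end of each dimension named in `pad_size_map`."""
    assert array.ndim == len(schema), 'rank mismatch between array and schema'
    target = [pad_size_map.get(name) or size
              for size, name in zip(array.shape, schema)]
    padding = [(0, t - size) for t, size in zip(target, array.shape)]
    return np.pad(array, padding)  # default mode pads with zeros

msa_feat = np.ones((3, 5, 49))
padded = pad_to_fixed_size(
    msa_feat, [NUM_MSA_SEQ, NUM_RES, None], {NUM_RES: 8, NUM_MSA_SEQ: 4})
print(padded.shape)  # (4, 8, 49)
```

The channel axis has placeholder None, so it is not in the map and keeps its size of 49; only the sequence and residue axes grow.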


def ensembled_map_fns(data_config):
  """Input pipeline functions that can be ensembled and averaged."""
  common_cfg = data_config.common
  eval_cfg = data_config.eval

  map_fns = []

  if common_cfg.reduce_msa_clusters_by_max_templates:
    pad_msa_clusters = eval_cfg.max_msa_clusters - eval_cfg.max_templates
  else:
    pad_msa_clusters = eval_cfg.max_msa_clusters

  max_msa_clusters = pad_msa_clusters
  max_extra_msa = common_cfg.max_extra_msa

  map_fns.append(sample_msa(max_msa_clusters, keep_extra=True))

  if 'masked_msa' in common_cfg:
    # Masked MSA should come *before* MSA clustering so that
    # the clustering and full MSA profile do not leak information about
    # the masked locations and secret corrupted locations.
    map_fns.append(make_masked_msa(common_cfg.masked_msa,
                                   eval_cfg.masked_msa_replace_fraction))

  if common_cfg.msa_cluster_features:
    map_fns.append(nearest_neighbor_clusters())
    
    map_fns.append(summarize_clusters())
    
  # Crop after creating the cluster profiles.
  if max_extra_msa:
    map_fns.append(crop_extra_msa(max_extra_msa))
  else:
    map_fns.append(delete_extra_msa)

  map_fns.append(make_msa_feat())

  crop_feats = dict(eval_cfg.feat)

  if eval_cfg.fixed_size:
    map_fns.append(select_feat(list(crop_feats)))
    map_fns.append(random_crop_to_size(
        eval_cfg.crop_size,
        eval_cfg.max_templates,
        crop_feats,
        eval_cfg.subsample_templates))
    map_fns.append(make_fixed_size(
        crop_feats,
        pad_msa_clusters,
        common_cfg.max_extra_msa,
        eval_cfg.crop_size,
        eval_cfg.max_templates))
  else:
    map_fns.append(crop_templates(eval_cfg.max_templates))

  return map_fns


@data_transforms_curry1
def compose(x, fs):
  for f in fs:
    x = f(x)
  return x
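
The currying decorator and compose together turn a list of configured transforms into one callable. A minimal plain-Python sketch of the pattern, with add_offset and scale as hypothetical toy transforms:

```python
def curry1(f):
    """Supply all arguments but the first (mirrors data_transforms_curry1)."""
    def fc(*args, **kwargs):
        return lambda x: f(x, *args, **kwargs)
    return fc

@curry1
def add_offset(features, offset):
    # Hypothetical transform: shift every value by `offset`.
    return {k: v + offset for k, v in features.items()}

@curry1
def scale(features, factor):
    # Hypothetical transform: multiply every value by `factor`.
    return {k: v * factor for k, v in features.items()}

@curry1
def compose(x, fs):
    for f in fs:
        x = f(x)
    return x

pipeline = compose([add_offset(1), scale(10)])
result = pipeline({'a': 1, 'b': 2})
print(result)  # {'a': 20, 'b': 30}
```

Calling a curried transform with its configuration returns a function of the feature dictionary only, which is why ensembled_map_fns can build the list once and apply it to any protein afterwards.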


with open("Human_HBB_tensor_dict_nonensembled.pkl",'rb') as f:
   Human_HBB_tensor = pickle.load(f)

protein = copy.deepcopy(Human_HBB_tensor)

# Add the float-valued protein['deletion_matrix'] feature; without it the
# transforms below raise an error.
protein['deletion_matrix'] = tf.cast(protein['deletion_matrix_int'], dtype=tf.float32)

data_config = CONFIG.data

eval_cfg = data_config.eval
common_cfg = data_config.common

crop_feats = dict(eval_cfg.feat)
#pad_msa_clusters = eval_cfg.max_msa_clusters

shape_schema = crop_feats

protein = compose(ensembled_map_fns(data_config))(protein)

with open("Human_HBB_tensor_dict_ensembled.pkl",'wb') as f:
    pickle.dump(protein, f)

print("Before the ensembled functions:")
print(f"Number of features: {len(Human_HBB_tensor)}")
print(f"Features: {Human_HBB_tensor.keys()}")
print(Human_HBB_tensor['aatype'].shape)
#print(Human_HBB_tensor['aatype'])
      
print("After the ensembled functions:")
print(f"Number of features: {len(protein)}")
print(f"Features: {protein.keys()}")
print(protein['extra_msa'].shape)
print(protein['aatype'].shape)
print(protein['msa_feat'].shape)

python的while双重循环九九乘法表 Jinm_R python 开发语言
a=1whilea<=9:b=1#乘数每次需要从1开始whileb<=a:print(f"{a}*{b}={a*b}\t",end='')#\t为制表符使乘法表整齐end=''代表用空格代替换行b+=1a+=1print()#乘数每加一换行
DCGAN中的生成器和识别器代码详解 YYLin-AI DCGAN 深度学习 celeba tensorflow
#DCGAN中的生成器我自己写的有一个封装好的用于生成器和识别器的卷积操作但是在这个代码中我没有使用我自己的代码#原因想绍一下tensorflow自带的函数所以找了一个以前在书上的代码申明一下这个不是原创但是原来代码中有几处不符合DCGAN的要求所以就做了一些修改转载链接没有就直接写成原创建议看代码之前先看看DCGAN的特点，然后再看代码中如何实这些特点的这样会更有帮助DCGAN（深度卷积的对抗生
【Python】成功解决ModuleNotFoundError: No module named ‘torchinfo‘ 高斯小哥 BUG解决方案合集 python pytorch 新手入门学习 debug
【Python】成功解决ModuleNotFoundError:Nomodulenamed‘torchinfo’个人主页：高斯小哥高质量专栏：Matplotlib之旅：零基础精通数据可视化、Python基础【高质量合集】、PyTorch零基础入门教程希望得到您的订阅和支持~创作高质量博文(平均质量分92+)，分享更多关于深度学习、PyTorch、Python领域的优质内容！（希望得到您的关注~）文
Python自动化测试web常见框架汇总自动化测试薰儿软件测试技术分享 python 前端开发语言
1、前言目前，有非常多的Python框架，用来帮助你更轻松的创建web应用。这些框架把相应的模块组织起来，使得构建应用的时候可以更快捷，也不用去关注一些细节（例如socket和协议），所以需要的都在框架里了。接下来我们会介绍不同的选项。经过初期的不起眼，Python已经成为互联网最流行的服务端编程语言之一。根据W3Techs的统计，它被用于很多的大流量的站点很多的大流量的站点很多的大流量的站点，超
python安装jupter在线ide 晚风拂柳颜生活小经验 python3 ide jupter
我在虚拟3.6.8的环境里面安装的，具体用了以下命令；pipinstallipython-ihttps://mirrors.aliyun.com/pypi/simple/pipinstalljupyter-ihttps://mirrors.aliyun.com/pypi/simple/jupyternotebook当然，jupter可以直接通过python环境里script目录下的jupyter-
opencv 十八 python下实现0缓存掉线重连的rtsp直播流播放器摸鱼的机器猫 opencv实战 opencv python 缓存
使用opencv打开rtsp视频流时，会因为网络问题导致VideoCapture掉线；也会因为图像的后处理阶段耗时过长导致opencv缓冲区数据堆积，从而使程序无法及时处理最新的数据。为此对cv2.VideoCapture进行封装，实现0缓存掉线重连的rtsp直播流播放器，让程序能一直处理最新的数据。代码实现fromcollectionsimportdequeimportthreadingimpo
Windows如何安装poppler库，python的PDF转PPTX项目跨不过 pdf
资源库在这里下载https://github.com/oschwartz10612/poppler-windows/releases/tag/v21.03.0其他的参考这篇博客，里面提到的资源链接失效了https://blog.csdn.net/wy01415/article/details/110257130
用Python批量更改图片大小马达马达达 AI python
#提取目录下所有图片,更改尺寸后保存到另一目录fromPILimportImageimportos.pathimportglobdefconvertjpg(jpgfile,outdir,width=128,height=128):img=Image.open(jpgfile)try:new_img=img.resize((width,height),Image.BILINEAR)new_img.s
3.Python数据分析—数据分析入门知识图谱&索引(知识体系中篇) 以山河作礼。 Python数据分析项目数据分析知识图谱数据挖掘 python 开发语言
3.Python数据分析—数据分析入门知识图谱&索引-知识体系中篇一·个人简介二·数据获取和处理2.1数据来源：2.2数据清洗：2.2.1缺失值处理：2.2.2异常值处理：2.3数据转换：2.3.1数据类型转换：2.3.2数据编码：2.4数据合并与重塑：2.4.1数据合并：2.4.2数据拼接：2.4.3数据重塑：三·数据探索与分析3.1描述性统计分析3.2数据可视化原则和技巧3.3探索性数据分析（
PHP如何实现二维数组排序？ IT独行者二维数组 PHP 排序　
二维数组在PHP开发中经常遇到，但是他的排序就不如一维数组那样用内置函数来的方便了，（一维数组排序可以参考本站另一篇文章【PHP中数组排序函数详解汇总】）。二维数组的排序需要我们自己写函数处理了，这里UncleToo给大家分享一个PHP二维数组排序的函数：代码： functionarray_sort($arr,$keys,$type='asc'){ $keysvalue= $new_arr
【Hadoop十七】HDFS HA配置 bit1129 hadoop
基于Zookeeper的HDFS HA配置主要涉及两个文件,core-site和hdfs-site.xml。测试环境有三台 hadoop.master hadoop.slave1 hadoop.slave2 hadoop.master包含的组件NameNode, JournalNode, Zookeeper，DFSZKFailoverController
由wsdl生成的java vo类不适合做普通java vo darrenzhu VO wsdl webservice rpc
开发java webservice项目时，如果我们通过SOAP协议来输入输出，我们会利用工具从wsdl文件生成webservice的client端类，但是这里面生成的java data model类却不适合做为项目中的普通java vo类来使用，当然有一中情况例外，如果这个自动生成的类里面的properties都是基本数据类型，就没问题，但是如果有集合类，就不行。原因如下： 1)使用了集合如Li
JAVA海量数据处理之二（BitMap）周凡杨 java 算法 bitmap bitset 数据
路漫漫其修远兮，吾将上下而求索。想要更快，就要深入挖掘 JAVA 基础的数据结构，从来分析出所编写的 JAVA 代码为什么把内存耗尽，思考有什么办法可以节省内存呢？啊哈！算法。这里采用了 BitMap 思想。首先来看一个实验：指定 VM 参数大小： -Xms256m -Xmx540m
java类型与数据库类型 g21121 java
很多时候我们用hibernate的时候往往并不是十分关心数据库类型和java类型的对应关心，因为大多数hbm文件是自动生成的，但有些时候诸如：数据库设计、没有生成工具、使用原始JDBC、使用mybatis(ibatIS)等等情况，就会手动的去对应数据库与java的数据类型关心，当然比较简单的数据类型即使配置错了也会很快发现问题，但有些数据类型却并不是十分常见，这就给程序员带来了很多麻烦。 &nb
Linux命令 510888780 linux命令
系统信息 arch 显示机器的处理器架构(1) uname -m 显示机器的处理器架构(2) uname -r 显示正在使用的内核版本 dmidecode -q 显示硬件系统部件 - (SMBIOS / DMI) hdparm -i /dev/hda 罗列一个磁盘的架构特性 hdparm -tT /dev/sda 在磁盘上执行测试性读取操作 cat /proc/cpuinfo 显示C
java常用JVM参数墙头上一根草 java jvm参数
-Xms：初始堆大小，默认为物理内存的1/64(<1GB)；默认(MinHeapFreeRatio参数可以调整)空余堆内存小于40%时，JVM就会增大堆直到-Xmx的最大限制 -Xmx：最大堆大小，默认(MaxHeapFreeRatio参数可以调整)空余堆内存大于70%时，JVM会减少堆直到 -Xms的最小限制 -Xmn：新生代的内存空间大小，注意：此处的大小是（eden+ 2
我的spring学习笔记9-Spring使用工厂方法实例化Bean的注意点 aijuans Spring 3
方法一： <bean id="musicBox" class="onlyfun.caterpillar.factory.MusicBoxFactory" factory-method="createMusicBoxStatic"></bean> 方法二：
mysql查询性能优化之二 annan211 UNION mysql 查询优化索引优化
1 union的限制有时mysql无法将限制条件从外层下推到内层，这使得原本能够限制部分返回结果的条件无法应用到内层查询的优化上。如果希望union的各个子句能够根据limit只取部分结果集，或者希望能够先排好序在合并结果集的话，就需要在union的各个子句中分别使用这些子句。例如想将两个子查询结果联合起来，然后再取前20条记录，那么mys
数据的备份与恢复百合不是茶 oracle sql 数据恢复数据备份
数据的备份与恢复的方式有: 表,方案 ,数据库; 数据的备份: 导出到的常见命令; 参数说明 USERID 确定执行导出实用程序的用户名和口令 BUFFER 确定导出数据时所使用的缓冲区大小，其大小用字节表示 FILE 指定导出的二进制文
线程组 bijian1013 java 多线程 thread java多线程线程组
有些程序包含了相当数量的线程。这时，如果按照线程的功能将他们分成不同的类别将很有用。线程组可以用来同时对一组线程进行操作。创建线程组：ThreadGroup g = new ThreadGroup(groupName); &nbs
top命令找到占用CPU最高的java线程 bijian1013 java linux top
上次分析系统中占用CPU高的问题，得到一些使用Java自身调试工具的经验，与大家分享。 (1)使用top命令找出占用cpu最高的JAVA进程PID:28174 (2)如下命令找出占用cpu最高的线程 top -Hp 28174 -d 1 -n 1 32694 root 20 0 3249m 2.0g 11m S 2 6.4 3:31.12 java
【持久化框架MyBatis3四】MyBatis3一对一关联查询 bit1129 Mybatis3
当两个实体具有1对1的对应关系时，可以使用One-To-One的进行映射关联查询 One-To-One示例数据以学生表Student和地址信息表为例，每个学生都有都有1个唯一的地址(现实中，这种对应关系是不合适的，因为人和地址是多对一的关系)，这里只是演示目的学生表 CREATE TABLE STUDENTS (
C/C++图片或文件的读写 bitcarter 写图片
先看代码： /*strTmpResult是文件或图片字符串 * filePath文件需要写入的地址或路径 */ int writeFile(std::string &strTmpResult,std::string &filePath) { int i,len = strTmpResult.length(); unsigned cha
nginx自定义指定加载配置 ronin47
进入 /usr/local/nginx/conf/include 目录，创建 nginx.node.conf 文件，在里面输入如下代码： upstream nodejs { server 127.0.0.1:3000; #server 127.0.0.1:3001; keepalive 64; } server { liste
java-71-数值的整数次方.实现函数double Power(double base, int exponent)，求base的exponent次方 bylijinnan double
public class Power { /** *Q71-数值的整数次方 *实现函数double Power(double base, int exponent)，求base的exponent次方。不需要考虑溢出。 */ private static boolean InvalidInput=false; public static void main(
Android四大组件的理解 Cb123456 android 四大组件的理解
分享一下，今天在Android开发文档-开发者指南中看到的: App components are the essential building blocks of an Android
[宇宙与计算]涡旋场计算与拓扑分析 comsci 计算
怎么阐述我这个理论呢？。。。。。。。。。首先：宇宙是一个非线性的拓扑结构与涡旋轨道时空的统一体。。。。我们要在宇宙中寻找到一个适合人类居住的行星，时间非常重要，早一个刻度和晚一个刻度，这颗行星的
同一个Tomcat不同Web应用之间共享会话Session cwqcwqmax9 session
实现两个WEB之间通过session 共享数据查看tomcat 关于 HTTP Connector 中有个emptySessionPath 其解释如下： If set to true, all paths for session cookies will be set to /. This can be useful for portlet specification impleme
springmvc Spring3 MVC，ajax，乱码 dashuaifu spring jquery mvc Ajax
springmvc Spring3 MVC @ResponseBody返回，jquery ajax调用中文乱码问题解决 Spring3.0 MVC @ResponseBody 的作用是把返回值直接写到HTTP response body里。具体实现AnnotationMethodHandlerAdapter类handleResponseBody方法，具体实
搭建WAMP环境 dcj3sjt126com wamp
这里先解释一下WAMP是什么意思。W:windows，A：Apache，M：MYSQL，P：PHP。也就是说本文说明的是在windows系统下搭建以apache做服务器、MYSQL为数据库的PHP开发环境。工欲善其事，必须先利其器。因为笔者的系统是WinXP，所以下文指的系统均为此系统。笔者所使用的Apache版本为apache_2.2.11-
yii2 使用raw http request dcj3sjt126com http
Parses a raw HTTP request using yii\helpers\Json::decode() To enable parsing for JSON requests you can configure yii\web\Request::$parsers using this class: 'request' =&g
Quartz-1.8.6 理论部分 eksliang quartz
转载请出自出处：http://eksliang.iteye.com/blog/2207691 一.概述基于Quartz-1.8.6进行学习，因为Quartz2.0以后的API发生的非常大的变化，统一采用了build模式进行构建；什么是quartz? 答：简单的说他是一个开源的java作业调度框架，为在 Java 应用程序中进行作业调度提供了简单却强大的机制。并且还能和Sp
什么是POJO？ gupeng_ie java POJO 框架 Hibernate
POJO--Plain Old Java Objects(简单的java对象) POJO是一个简单的、正规Java对象，它不包含业务逻辑处理或持久化逻辑等，也不是JavaBean、EntityBean等，不具有任何特殊角色和不继承或不实现任何其它Java框架的类或接口。 POJO对象有时也被称为Data对象，大量应用于表现现实中的对象。如果项目中使用了Hiber
jQuery网站顶部定时折叠广告 ini JavaScript html jquery Web css
效果体验：http://hovertree.com/texiao/jquery/4.htmHTML文件代码： <!DOCTYPE html> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <title>网页顶部定时收起广告jQuery特效 - HoverTree<
Spring boot内嵌的tomcat启动失败 kane_xie spring boot
根据这篇guide创建了一个简单的spring boot应用，能运行且成功的访问。但移植到现有项目（基于hbase）中的时候，却报出以下错误： SEVERE: A child container failed during start java.util.concurrent.ExecutionException: org.apache.catalina.Lif
leetcode: sort list michelle_0916 Algorithm linked list sort
Sort a linked list in O(n log n) time using constant space complexity. ====analysis======= mergeSort for singly-linked list ====code======= /** * Definition for sin
nginx的安装与配置,中途遇到问题的解决 qifeifei nginx
我使用的是ubuntu13.04系统，在安装nginx的时候遇到如下几个问题，然后找思路解决的，nginx 的下载与安装 wget http://nginx.org/download/nginx-1.0.11.tar.gz tar zxvf nginx-1.0.11.tar.gz ./configure make make install 安装的时候出现
用枚举来处理java自定义异常 tcrct java enum exception
在系统开发过程中，总少不免要自己处理一些异常信息，然后将异常信息变成友好的提示返回到客户端的这样一个过程，之前都是new一个自定义的异常，当然这个所谓的自定义异常也是继承RuntimeException的，但这样往往会造成异常信息说明不一致的情况，所以就想到了用枚举来解决的办法。 1，先创建一个接口，里面有两个方法，一个是getCode, 一个是getMessage public
erlang supervisor分析 wudixiaotie erlang
当我们给supervisor指定需要创建的子进程的时候，会指定M,F,A,如果是simple_one_for_one的策略的话，启动子进程的方式是supervisor:start_child(SupName, OtherArgs),这种方式可以根据调用者的需求传不同的参数给需要启动的子进程的方法。和最初的参数合并成一个数组，A ++ OtherArgs。那么这个时候就有个问题了，既然参数不一致，那
