ciky奇

[He Zhiyuan, 21 Projects for Mastering Deep Learning] Chapter 3, Section 3.2 Data Preparation: Converting Image Data to TFRecord Format

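Before training your own model, you need to prepare a dataset. TFRecord is one of TensorFlow's more popular data formats, so we build TFRecord files from the image samples we already have. If you arrange your files in the directory layout below, you can use the two .py files in this post to build your own TFRecord files.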

The data layout provided by the author, He Zhiyuan, is as follows:

data_prepare/
    pic/
        train/
            wood/
            water/
            rock/
            wetland/
            glacier/
            urban/
        validation/
            wood/
            water/
            rock/
            wetland/
            glacier/
            urban/
    src/
        tfrecord.py
    data_convert.py

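The data_prepare folder contains a pic folder, which in turn contains a train folder and a validation folder. The train folder contains six folders (wood, water, rock, wetland, glacier and urban), each holding 800 images of its class, roughly 256x256 pixels in size.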

The validation folder contains the same six folders, each holding 200 images.
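Before converting, it can help to confirm that the folders really match this layout. The following is a minimal sketch, not part of the book's code: `count_images` is a hypothetical helper, and it assumes every regular file inside a class folder is an image.

```python
import os


def count_images(split_dir):
    """Return {class_name: file_count} for each sub-folder of split_dir."""
    counts = {}
    for entry in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files such as label.txt
        # Assumption: every regular file in a class folder is an image.
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if os.path.isfile(os.path.join(class_dir, name)))
    return counts


if __name__ == '__main__':
    for split in ('train', 'validation'):
        split_dir = os.path.join('pic', split)
        if os.path.isdir(split_dir):
            print(split, count_images(split_dir))
```

With the dataset described above, one would expect 800 files per class under train and 200 per class under validation.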

Run data_convert.py from the data_prepare/ directory with this command:

python data_convert.py -t pic/ \
--train-shards 2 \
--validation-shards 2 \
--num-threads 2 \
--dataset-name satellite

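The arguments mean the following:

-t pic/ means the images to convert are stored under the pic folder;

--train-shards 2 splits the TFRecord output for the training images into 2 files (this makes the data easier to store; search online for how many shards is reasonable for your case. The default is 2.)

--validation-shards 2 splits the TFRecord output for the validation images into 2 files (default 2)

--num-threads 2 is the number of threads (default 2). Note that the thread count must evenly divide both train-shards and validation-shards, so that every thread handles the same number of shards.

--dataset-name satellite is the dataset name, default satellite (change it to suit your own dataset. He Zhiyuan uses satellite imagery, and this option gives the generated dataset a name. Here the dataset is named "satellite", so the generated file names start with satellite_train and satellite_validation.)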
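To see how the shard and thread counts interact, here is a small sketch of the planning step. It is an approximation written for this post, not the script itself: `split_ranges`, `plan` and `shard_names` are hypothetical helpers, and the integer arithmetic stands in for the `np.linspace(...).astype(int)` calls that tfrecord.py uses. Only the file-name pattern is taken directly from tfrecord.py.

```python
def split_ranges(num_items, num_parts):
    """Split [0, num_items) into num_parts contiguous (start, end) ranges."""
    bounds = [i * num_items // num_parts for i in range(num_parts + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_parts)]


def plan(num_files, num_shards, num_threads):
    """Return (shard_index, first_file, end_file) for every shard."""
    if num_shards % num_threads:
        raise ValueError('num_threads must evenly divide the shard count')
    shards_per_thread = num_shards // num_threads
    result = []
    # Each thread gets one contiguous slice of the files ...
    for t, (start, end) in enumerate(split_ranges(num_files, num_threads)):
        # ... and cuts that slice into its own shards.
        for s, (a, b) in enumerate(split_ranges(end - start, shards_per_thread)):
            result.append((t * shards_per_thread + s, start + a, start + b))
    return result


def shard_names(dataset_name, split, num_shards):
    """Build the output file names using the pattern found in tfrecord.py."""
    return ['%s_%s_%.5d-of-%.5d.tfrecord' % (dataset_name, split, s, num_shards)
            for s in range(num_shards)]


if __name__ == '__main__':
    # 6 classes x 800 training images = 4800 files, 2 shards, 2 threads.
    print(plan(4800, 2, 2))
    print(shard_names('satellite', 'train', 2))
```

The divisibility check mirrors the assertion in tfrecord.py: if the thread count does not divide the shard count, the threads cannot all write the same number of shards.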

The code of data_convert.py is as follows:

# coding:utf-8
from __future__ import absolute_import
import argparse
import os
import logging
from src.tfrecord import main

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-t', '--tensorflow-data-dir', default='pic/')
    parser.add_argument('--train-shards', default=2, type=int)
    parser.add_argument('--validation-shards', default=2, type=int)
    parser.add_argument('--num-threads', default=2, type=int)
    parser.add_argument('--dataset-name', default='satellite', type=str)
    return parser.parse_args()

if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    args = parse_args()
    args.tensorflow_dir = args.tensorflow_data_dir
    args.train_directory = os.path.join(args.tensorflow_dir, 'train')
    args.validation_directory = os.path.join(args.tensorflow_dir, 'validation')
    args.output_directory = args.tensorflow_dir
    args.labels_file = os.path.join(args.tensorflow_dir, 'label.txt')
    if not os.path.exists(args.labels_file):
        logging.warning('Can\'t find label.txt. Now create it.')
        all_entries = os.listdir(args.train_directory)
        dirnames = []
        for entry in all_entries:
            if os.path.isdir(os.path.join(args.train_directory, entry)):
                dirnames.append(entry)
        with open(args.labels_file, 'w') as f:
            for dirname in dirnames:
                f.write(dirname + '\n')
    main(args)

Store your data in the author's directory layout, then change the dataset name to suit your own data. The .py file above calls tfrecord.py in the src folder (its source follows):

# coding:utf-8
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Converts image data to TFRecords file format with Example protos.
The image data set is expected to reside in JPEG files located in the
following directory structure.
  data_dir/label_0/image0.jpeg
  data_dir/label_0/image1.jpg
  ...
  data_dir/label_1/weird-image.jpeg
  data_dir/label_1/my-image.jpeg
  ...
where the sub-directory is the unique label associated with these images.
This TensorFlow script converts the training and evaluation data into
a sharded data set consisting of TFRecord files
  train_directory/train-00000-of-01024
  train_directory/train-00001-of-01024
  ...
  train_directory/train-00127-of-01024
and
  validation_directory/validation-00000-of-00128
  validation_directory/validation-00001-of-00128
  ...
  validation_directory/validation-00127-of-00128
where we have selected 1024 and 128 shards for each data set. Each record
within the TFRecord file is a serialized Example proto. The Example proto
contains the following fields:
  image/encoded: string containing JPEG encoded image in RGB colorspace
  image/height: integer, image height in pixels
  image/width: integer, image width in pixels
  image/colorspace: string, specifying the colorspace, always 'RGB'
  image/channels: integer, specifying the number of channels, always 3
  image/format: string, specifying the format, always'JPEG'
  image/filename: string containing the basename of the image file
            e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
  image/class/label: integer specifying the index in a classification layer. start from "class_label_base"
  image/class/text: string specifying the human-readable version of the label
    e.g. 'dog'
If you data set involves bounding boxes, please look at build_imagenet_data.py.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import os
import random
import sys
import threading

import numpy as np
import tensorflow as tf
import logging


def _int64_feature(value):
    """Wrapper for inserting int64 features into Example proto."""
    if not isinstance(value, list):
        value = [value]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))


def _bytes_feature(value):
    """Wrapper for inserting bytes features into Example proto."""
    value = tf.compat.as_bytes(value)
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _convert_to_example(filename, image_buffer, label, text, height, width):
    """Build an Example proto for an example.
    Args:
      filename: string, path to an image file, e.g., '/path/to/example.JPG'
      image_buffer: string, JPEG encoding of RGB image
      label: integer, identifier for the ground truth for the network
      text: string, unique human-readable, e.g. 'dog'
      height: integer, image height in pixels
      width: integer, image width in pixels
    Returns:
      Example proto
    """

    colorspace = 'RGB'
    channels = 3
    image_format = 'JPEG'

    example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': _int64_feature(height),
        'image/width': _int64_feature(width),
        'image/colorspace': _bytes_feature(colorspace),
        'image/channels': _int64_feature(channels),
        'image/class/label': _int64_feature(label),
        'image/class/text': _bytes_feature(text),
        'image/format': _bytes_feature(image_format),
        'image/filename': _bytes_feature(os.path.basename(filename)),
        'image/encoded': _bytes_feature(image_buffer)}))
    return example


class ImageCoder(object):
    """Helper class that provides TensorFlow image coding utilities."""

    def __init__(self):
        # Create a single Session to run all image coding calls.
        self._sess = tf.Session()

        # Initializes function that converts PNG to JPEG data.
        self._png_data = tf.placeholder(dtype=tf.string)
        image = tf.image.decode_png(self._png_data, channels=3)
        self._png_to_jpeg = tf.image.encode_jpeg(image, format='rgb', quality=100)

        # Initializes function that decodes RGB JPEG data.
        self._decode_jpeg_data = tf.placeholder(dtype=tf.string)
        self._decode_jpeg = tf.image.decode_jpeg(self._decode_jpeg_data, channels=3)

    def png_to_jpeg(self, image_data):
        return self._sess.run(self._png_to_jpeg,
                              feed_dict={self._png_data: image_data})

    def decode_jpeg(self, image_data):
        image = self._sess.run(self._decode_jpeg,
                               feed_dict={self._decode_jpeg_data: image_data})
        assert len(image.shape) == 3
        assert image.shape[2] == 3
        return image


def _is_png(filename):
    """Determine if a file contains a PNG format image.
    Args:
      filename: string, path of the image file.
    Returns:
      boolean indicating if the image is a PNG.
    """
    return '.png' in filename


def _process_image(filename, coder):
    """Process a single image file.
    Args:
      filename: string, path to an image file e.g., '/path/to/example.JPG'.
      coder: instance of ImageCoder to provide TensorFlow image coding utils.
    Returns:
      image_buffer: string, JPEG encoding of RGB image.
      height: integer, image height in pixels.
      width: integer, image width in pixels.
    """
    # Read the image file.
    with open(filename, 'rb') as f:    # must be 'rb' rather than 'r' under Python 3
        image_data = f.read()

    # Convert any PNG to JPEG's for consistency.
    if _is_png(filename):
        logging.info('Converting PNG to JPEG for %s' % filename)
        image_data = coder.png_to_jpeg(image_data)

    # Decode the RGB JPEG.
    image = coder.decode_jpeg(image_data)

    # Check that image converted to RGB
    assert len(image.shape) == 3
    height = image.shape[0]
    width = image.shape[1]
    assert image.shape[2] == 3

    return image_data, height, width


def _process_image_files_batch(coder, thread_index, ranges, name, filenames,
                               texts, labels, num_shards, command_args):
    """Processes and saves list of images as TFRecord in 1 thread.
    Args:
      coder: instance of ImageCoder to provide TensorFlow image coding utils.
      thread_index: integer, unique batch to run index is within [0, len(ranges)).
      ranges: list of pairs of integers specifying ranges of each batches to
        analyze in parallel.
      name: string, unique identifier specifying the data set
      filenames: list of strings; each string is a path to an image file
      texts: list of strings; each string is human readable, e.g. 'dog'
      labels: list of integer; each integer identifies the ground truth
      num_shards: integer number of shards for this data set.
    """
    # Each thread produces N shards where N = int(num_shards / num_threads).
    # For instance, if num_shards = 128, and the num_threads = 2, then the first
    # thread would produce shards [0, 64).
    num_threads = len(ranges)
    assert not num_shards % num_threads
    num_shards_per_batch = int(num_shards / num_threads)

    shard_ranges = np.linspace(ranges[thread_index][0],
                               ranges[thread_index][1],
                               num_shards_per_batch + 1).astype(int)
    num_files_in_thread = ranges[thread_index][1] - ranges[thread_index][0]

    counter = 0
    for s in range(num_shards_per_batch):  # xrange exists only in Python 2.x, so use range instead (csq)
        # Generate a sharded version of the file name, e.g. 'train-00002-of-00010'
        shard = thread_index * num_shards_per_batch + s
        output_filename = '%s_%s_%.5d-of-%.5d.tfrecord' % (command_args.dataset_name, name, shard, num_shards)
        output_file = os.path.join(command_args.output_directory, output_filename)
        writer = tf.python_io.TFRecordWriter(output_file)

        shard_counter = 0
        files_in_shard = np.arange(shard_ranges[s], shard_ranges[s + 1], dtype=int)
        for i in files_in_shard:
            filename = filenames[i]
            label = labels[i]
            text = texts[i]

            image_buffer, height, width = _process_image(filename, coder)

            example = _convert_to_example(filename, image_buffer, label,
                                          text, height, width)
            writer.write(example.SerializeToString())
            shard_counter += 1
            counter += 1

            if not counter % 1000:
                logging.info('%s [thread %d]: Processed %d of %d images in thread batch.' %
                             (datetime.now(), thread_index, counter, num_files_in_thread))
                sys.stdout.flush()

        writer.close()
        logging.info('%s [thread %d]: Wrote %d images to %s' %
                     (datetime.now(), thread_index, shard_counter, output_file))
        sys.stdout.flush()
        shard_counter = 0
    logging.info('%s [thread %d]: Wrote %d images to %d shards.' %
                 (datetime.now(), thread_index, counter, num_shards_per_batch))
    sys.stdout.flush()


def _process_image_files(name, filenames, texts, labels, num_shards, command_args):
    """Process and save list of images as TFRecord of Example protos.
    Args:
      name: string, unique identifier specifying the data set
      filenames: list of strings; each string is a path to an image file
      texts: list of strings; each string is human readable, e.g. 'dog'
      labels: list of integer; each integer identifies the ground truth
      num_shards: integer number of shards for this data set.
    """
    assert len(filenames) == len(texts)
    assert len(filenames) == len(labels)

    # Break all images into batches with a [ranges[i][0], ranges[i][1]].
    spacing = np.linspace(0, len(filenames), command_args.num_threads + 1).astype(int)
    ranges = []
    for i in range(len(spacing) - 1):  # xrange exists only in Python 2.x, so use range instead (csq)
        ranges.append([spacing[i], spacing[i + 1]])

    # Launch a thread for each batch.
    logging.info('Launching %d threads for spacings: %s' % (command_args.num_threads, ranges))
    sys.stdout.flush()

    # Create a mechanism for monitoring when all threads are finished.
    coord = tf.train.Coordinator()

    # Create a generic TensorFlow-based utility for converting all image codings.
    coder = ImageCoder()

    threads = []
    for thread_index in range(len(ranges)):  # xrange exists only in Python 2.x, so use range instead (csq)
        args = (coder, thread_index, ranges, name, filenames,
                texts, labels, num_shards, command_args)
        t = threading.Thread(target=_process_image_files_batch, args=args)
        t.start()
        threads.append(t)

    # Wait for all the threads to terminate.
    coord.join(threads)
    logging.info('%s: Finished writing all %d images in data set.' %
                 (datetime.now(), len(filenames)))
    sys.stdout.flush()


def _find_image_files(data_dir, labels_file, command_args):
    """Build a list of all images files and labels in the data set.
    Args:
      data_dir: string, path to the root directory of images.
        Assumes that the image data set resides in JPEG files located in
        the following directory structure.
          data_dir/dog/another-image.JPEG
          data_dir/dog/my-image.jpg
        where 'dog' is the label associated with these images.
      labels_file: string, path to the labels file.
        The list of valid labels are held in this file. Assumes that the file
        contains entries as such:
          dog
          cat
          flower
        where each line corresponds to a label. We map each label contained in
        the file to an integer starting with the integer 0 corresponding to the
        label contained in the first line.
    Returns:
      filenames: list of strings; each string is a path to an image file.
      texts: list of strings; each string is the class, e.g. 'dog'
      labels: list of integer; each integer identifies the ground truth.
    """
    logging.info('Determining list of input files and labels from %s.' % data_dir)
    unique_labels = [l.strip() for l in tf.gfile.FastGFile(
        labels_file, 'r').readlines()]

    labels = []
    filenames = []
    texts = []

    # Very important: labels start at class_label_base (0 by default) to match
    # the definition, instead of leaving index 0 empty as a background class.
    label_index = command_args.class_label_base

    # Construct the list of JPEG files and labels.
    for text in unique_labels:
        jpeg_file_path = '%s/%s/*' % (data_dir, text)
        matching_files = tf.gfile.Glob(jpeg_file_path)

        labels.extend([label_index] * len(matching_files))
        texts.extend([text] * len(matching_files))
        filenames.extend(matching_files)

        if not label_index % 100:
            logging.info('Finished finding files in %d of %d classes.' % (
                label_index, len(labels)))
        label_index += 1

    # Shuffle the ordering of all image files in order to guarantee
    # random ordering of the images with respect to label in the
    # saved TFRecord files. Make the randomization repeatable.
    shuffled_index = list(range(len(filenames)))    # list() added by ciky for Python 3
    random.seed(12345)
    random.shuffle(shuffled_index)

    filenames = [filenames[i] for i in shuffled_index]
    texts = [texts[i] for i in shuffled_index]
    labels = [labels[i] for i in shuffled_index]

    logging.info('Found %d JPEG files across %d labels inside %s.' %
                 (len(filenames), len(unique_labels), data_dir))
    # print(labels)
    return filenames, texts, labels


def _process_dataset(name, directory, num_shards, labels_file, command_args):
    """Process a complete data set and save it as a TFRecord.
    Args:
      name: string, unique identifier specifying the data set.
      directory: string, root path to the data set.
      num_shards: integer number of shards for this data set.
      labels_file: string, path to the labels file.
    """
    filenames, texts, labels = _find_image_files(directory, labels_file, command_args)
    _process_image_files(name, filenames, texts, labels, num_shards, command_args)


def check_and_set_default_args(command_args):
    if not(hasattr(command_args, 'train_shards')) or command_args.train_shards is None:
        command_args.train_shards = 5
    if not(hasattr(command_args, 'validation_shards')) or command_args.validation_shards is None:
        command_args.validation_shards = 5
    if not(hasattr(command_args, 'num_threads')) or command_args.num_threads is None:
        command_args.num_threads = 5
    if not(hasattr(command_args, 'class_label_base')) or command_args.class_label_base is None:
        command_args.class_label_base = 0
    if not(hasattr(command_args, 'dataset_name')) or command_args.dataset_name is None:
        command_args.dataset_name = ''
    assert not command_args.train_shards % command_args.num_threads, (
        'Please make the command_args.num_threads commensurate with command_args.train_shards')
    assert not command_args.validation_shards % command_args.num_threads, (
        'Please make the command_args.num_threads commensurate with '
        'command_args.validation_shards')
    assert command_args.train_directory is not None
    assert command_args.validation_directory is not None
    assert command_args.labels_file is not None
    assert command_args.output_directory is not None


def main(command_args):
    """
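    command_args must provide the following attributes:
    command_args.train_directory  Folder holding the training set. Each sub-folder name is a label name, and the images sit inside it.
    command_args.validation_directory  Folder holding the validation set. Each sub-folder name is a label name, and the images sit inside it.
    command_args.labels_file  A file in which each line is one label name.
    command_args.output_directory  A folder where the output is written.

    command_args.train_shards  Number of shards for the training set.
    command_args.validation_shards  Number of shards for the validation set.
    command_args.num_threads  Number of threads. Must be a divisor of both shard counts above.

    command_args.class_label_base  Important! The index at which class labels start in the generated tfrecord. Defaults to 0 (models/slim also starts from 0).
    command_args.dataset_name  String used as the prefix of the output file names.

    Images must not be corrupted; a corrupted image makes its thread exit early.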
    """
    check_and_set_default_args(command_args)
    logging.info('Saving results to %s' % command_args.output_directory)

    # Run it!
    _process_dataset('validation', command_args.validation_directory,
                     command_args.validation_shards, command_args.labels_file, command_args)
    _process_dataset('train', command_args.train_directory,
                     command_args.train_shards, command_args.labels_file, command_args)

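This source differs from the one He Zhiyuan provides. I use Python 3 (He Zhiyuan presumably used Python 2), so the original code raises some errors unless it is changed.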

Running the original code directly with

python data_convert.py -t pic/ \
--train-shards 2 \
--validation-shards 2 \
--num-threads 2 \
--dataset-name satellite

may report the following errors:

\data_prepare\src\tfrecord.py", line 341, in _find_image_files
random.shuffle(shuffled_index)
File "F:\Python36\lib\random.py", line 275, in shuffle
x[i], x[j] = x[j], x[i]
TypeError: 'range' object does not support item assignment

UnicodeDecodeError: 'gbk' codec can't decode byte 0xff in position 0: illegal multibyte sequence

The fix is to make the following changes (the tfrecord.py code given above already includes them):

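// Fix 1
def _bytes_feature(value):
    """Wrapper for inserting bytes features into Example proto."""
    value = tf.compat.as_bytes(value)  # add this line (it is missing from the author's code)
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

// Fix 2
def _process_image(filename, coder):
    with open(filename, 'rb') as f:  # add the b here (the author's source uses 'r')
        image_data = f.read()

// Fix 3
Change every xrange to range.

// Fix 4
_find_image_files:
    shuffled_index = list(range(len(filenames)))  # list() added here (a quick search says that in Python 3, range returns a range object rather than a list)

// Fix 5
Preferably keep Chinese characters out of your project path. Chinese paths cause plenty of problems, as you probably know; even pinyin is better than Chinese characters.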
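The two errors above can be reproduced without TensorFlow. The sketch below is only an illustration under assumptions: the helper names are made up for this post, and a four-byte stand-in file replaces a real JPEG (JPEG data begins with the byte 0xFF, which is what trips the text decoder).

```python
import os
import random
import tempfile


def range_shuffle_fails():
    """True if shuffling a bare range object raises TypeError (Python 3)."""
    try:
        random.shuffle(range(5))
    except TypeError:
        return True
    return False


def shuffle_indices(n, seed=12345):
    """Fix 4: materialize the range as a list before shuffling in place."""
    indices = list(range(n))
    random.seed(seed)  # same fixed seed idea as tfrecord.py, for repeatability
    random.shuffle(indices)
    return indices


def text_mode_fails(path, encoding='gbk'):
    """True if reading the file as text raises UnicodeDecodeError."""
    try:
        with open(path, 'r', encoding=encoding) as f:
            f.read()
    except UnicodeDecodeError:
        return True
    return False


if __name__ == '__main__':
    fd, path = tempfile.mkstemp(suffix='.jpg')
    with os.fdopen(fd, 'wb') as f:
        f.write(b'\xff\xd8\xff\xe0')  # stand-in for the start of a JPEG file
    print('shuffling range() fails:', range_shuffle_fails())
    print('shuffled list:', shuffle_indices(5))
    print('text-mode read fails:', text_mode_fails(path))
    with open(path, 'rb') as f:  # Fix 2: binary mode returns the raw bytes
        print('binary read:', f.read())
    os.remove(path)
```

The gbk encoding is passed explicitly only to mirror the Windows default seen in the traceback; on other systems the default text encoding differs, but the byte 0xFF is still not valid text in UTF-8 either.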

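At this point, running the command generates five files under data_prepare/pic/: label.txt plus four .tfrecord files (satellite_train_00000-of-00002.tfrecord, satellite_train_00001-of-00002.tfrecord, satellite_validation_00000-of-00002.tfrecord and satellite_validation_00001-of-00002.tfrecord).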

The content of label.txt is

glacier
rock
urban
water
wetland
wood
that is, the names of the six class labels.
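The line order in label.txt matters, because tfrecord.py assigns integer labels in that order, starting from class_label_base. A minimal sketch of that mapping follows; `load_label_map` is a hypothetical helper, and unlike the script it skips blank lines.

```python
def load_label_map(lines, class_label_base=0):
    """Map each label name to its integer index, in file order."""
    # Deviation from tfrecord.py: blank lines are skipped here.
    names = [line.strip() for line in lines if line.strip()]
    return {name: class_label_base + i for i, name in enumerate(names)}


if __name__ == '__main__':
    sample = ['glacier\n', 'rock\n', 'urban\n', 'water\n', 'wetland\n', 'wood\n']
    for name, index in load_label_map(sample).items():
        print(index, name)
```

Note that data_convert.py writes label.txt in whatever order os.listdir returns, which is not guaranteed to be alphabetical. Since the file is only created when it does not exist, keeping the same label.txt for training and evaluation keeps the indices consistent.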

The .tfrecord files are binary files that store the image data and the labels together.

(For how to use TFRecord files, see: https://blog.csdn.net/c20081052/article/details/81315774)
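As a quick sanity check on the generated files, the records in a shard can be counted without TensorFlow. The sketch below relies on the documented TFRecord framing (an 8-byte little-endian length, a 4-byte checksum of the length, the payload, and a 4-byte checksum of the payload). It skips the checksums rather than verifying them and does not parse the Example proto, so treat it as a rough check only; the function names are made up for this post.

```python
import io
import struct


def iter_tfrecord_payloads(stream):
    """Yield the raw payload bytes of each record in a TFRecord stream."""
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of file
        if len(header) < 8:
            raise ValueError('truncated record header')
        (length,) = struct.unpack('<Q', header)  # little-endian uint64
        stream.read(4)  # checksum of the length: skipped, not verified
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError('truncated record payload')
        stream.read(4)  # checksum of the payload: skipped, not verified
        yield payload


def count_records(path):
    """Count the records in one .tfrecord file."""
    with open(path, 'rb') as f:
        return sum(1 for _ in iter_tfrecord_payloads(f))


if __name__ == '__main__':
    # A fake two-record stream with zeroed checksums, for demonstration only.
    fake = b''.join(
        struct.pack('<Q', len(p)) + b'\x00' * 4 + p + b'\x00' * 4
        for p in (b'abc', b'hello'))
    print(list(iter_tfrecord_payloads(io.BytesIO(fake))))
```

With 6 classes of 800 training images split into 2 shards, each training shard should hold 2400 records, and each validation shard 600.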

References:

https://blog.csdn.net/u010412719/article/details/47088095

https://blog.csdn.net/shijing_0214/article/details/51971734

https://blog.csdn.net/dillon2015/article/details/52987792

https://github.com/hzy46/Deep-Learning-21-Examples/issues/28
