sxj731533730

52. Training a PaddleSeg model and deploying your own model to an OAK camera

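Basic idea: a brief record of the training process. The dataset is derived from COCO: the cup images are filtered out of COCO, the resulting labelme-format dataset is converted to a PaddleSeg dataset, and training is run on that. The conversion code writes the generated labels in the required format; the output is then pasted into a txt file.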

Experiment model link: https://pan.baidu.com/s/1w50vkX1kLfEhj2labK1xuQ?pwd=79qk (extraction code: 79qk)

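1. Dataset preparation: see post 45, "Converting an instance-segmentation labelme dataset to a COCO dataset, a COCO dataset to a labelme dataset, and to a PaddleSeg dataset" (sxj731533730's blog on CSDN)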

Resulting cup dataset: link: https://pan.baidu.com/s/1DWf7d1xWAscAKmIvNYJ9Rw?pwd=n2vs (extraction code: n2vs)
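
PaddleSeg's generic `Dataset` type reads a file list (`train.txt`, `val.txt`) in which each line holds an image path and its label path, separated by a space and relative to `dataset_root`. The sketch below shows one way such lines could be produced. The helper name `build_file_list` and the label directory name `Annotations` are assumptions for illustration only; they are not taken from the conversion script referenced above.

```python
from pathlib import PurePosixPath


def build_file_list(image_names, image_dir="JPEGImages", label_dir="Annotations"):
    """Return PaddleSeg-style file-list lines: '<image path> <label path>'."""
    lines = []
    for name in sorted(image_names):
        stem = PurePosixPath(name).stem
        image_path = PurePosixPath(image_dir) / name
        # Assumption: each label is a single-channel PNG mask with the same stem.
        label_path = PurePosixPath(label_dir) / (stem + ".png")
        lines.append(f"{image_path} {label_path}")
    return lines


if __name__ == "__main__":
    for line in build_file_list(["000000002157.jpg"]):
        print(line)
```

The printed lines can be pasted into `train.txt` or `val.txt` as they are.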

2. Configuration file: a modified copy of pp_liteseg_stdc1_camvid_960x720_10k.yml

ubuntu@ubuntu:~/PaddleSeg/configs/pp_liteseg$ cp pp_liteseg_stdc1_camvid_960x720_10k.yml pp_liteseg_stdc1_camvid_300x300_10k.yml

File contents

batch_size: 6  # single GPU here, so the total batch size is 6
iters: 10000
 
train_dataset:
  type: Dataset
  dataset_root: /home/ubuntu/PaddleSeg/paddleSegCup/datasets/train
  num_classes: 2  # background + cup
  mode: train
  train_path: /home/ubuntu/PaddleSeg/paddleSegCup/datasets/train/train.txt
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.5
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [300, 300]
    - type: RandomHorizontalFlip
    - type: RandomDistort
      brightness_range: 0.5
      contrast_range: 0.5
      saturation_range: 0.5
    - type: Normalize
 
val_dataset:
  type: Dataset
  dataset_root: /home/ubuntu/PaddleSeg/paddleSegCup/datasets/val
  num_classes: 2
  mode: val
  val_path: /home/ubuntu/PaddleSeg/paddleSegCup/datasets/val/val.txt
  transforms:
    - type: Normalize
 
optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 5.0e-4
 
lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  end_lr: 0
  power: 0.9
  warmup_iters: 200
  warmup_start_lr: 1.0e-5
 
loss:
  types:
    - type: OhemCrossEntropyLoss
      min_kept: 250000   # value inherited from the 960x720 config; batch_size * 300 * 300 // 16 would be 33750
    - type: OhemCrossEntropyLoss
      min_kept: 250000
    - type: OhemCrossEntropyLoss
      min_kept: 250000
  coef: [1, 1, 1]
 
model:
  type: PPLiteSeg
  backbone:
    type: STDC1
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet1.tar.gz
  arm_out_chs: [32, 64, 128]
  seg_head_inter_chs: [32, 64, 64]
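
The comment next to `min_kept` gives the rule of thumb `batch_size * H * W // 16`. The short sketch below only evaluates that arithmetic, so the configured value can be compared with the crop size. The function name `ohem_min_kept` is made up for this example, and whether a smaller `min_kept` would have improved this training run was not tested in this post.

```python
def ohem_min_kept(batch_size, height, width, divisor=16):
    """Rule of thumb from the config comment: batch_size * H * W // 16."""
    return batch_size * height * width // divisor


if __name__ == "__main__":
    print(ohem_min_kept(6, 300, 300))  # 300x300 crop used in this post
    print(ohem_min_kept(6, 720, 960))  # 960x720 crop of the original CamVid config
```

With a batch size of 6, the 960x720 crop gives 259200, which is close to the 250000 in the config; the 300x300 crop gives 33750.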

Training starts:

ubuntu@ubuntu:~/PaddleSeg$ python3 train.py --config configs/pp_liteseg/pp_liteseg_stdc1_camvid_300x300_10k.yml --do_eval
2022-11-25 16:46:23 [INFO]	
------------Environment Information-------------
platform: Linux-5.15.0-52-generic-x86_64-with-glibc2.29
Python: 3.8.10 (default, Jun 22 2022, 20:18:18) [GCC 9.4.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.1.TC455_06.29069683_0
cudnn: 8.2
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: NVIDIA GeForce']
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PaddleSeg: 2.6.0
PaddlePaddle: 2.3.2
OpenCV: 4.6.0
------------------------------------------------
2022-11-25 16:46:23 [INFO]	
---------------Config Information---------------
batch_size: 6
iters: 10000
loss:
  coef:
  - 1
  - 1
  - 1
  types:
2022-11-25 16:54:19 [INFO]	[TRAIN] epoch: 1, iter: 10/10000, loss: 2.7239, lr: 0.000460, batch_cost: 0.2893, reader_cost: 0.01094, ips: 20.7363 samples/sec | ETA 00:48:10
2022-11-25 16:54:19 [INFO]	[TRAIN] epoch: 1, iter: 20/10000, loss: 2.3742, lr: 0.000959, batch_cost: 0.0511, reader_cost: 0.00009, ips: 117.4557 samples/sec | ETA 00:08:29
2022-11-25 16:54:20 [INFO]	[TRAIN] epoch: 1, iter: 30/10000, loss: 1.9726, lr: 0.001459, batch_cost: 0.0536, reader_cost: 0.00026, ips: 111.8903 samples/sec | ETA 00:08:54
2022-11-25 16:54:20 [INFO]	[TRAIN] epoch: 2, iter: 40/10000, loss: 1.7898, lr: 0.001958, batch_cost: 0.0576, reader_cost: 0.00709, ips: 104.1587 samples/sec | ETA 00:09:33
2022-11-25 16:54:21 [INFO]	[TRAIN] epoch: 2, iter: 50/10000, loss: 2.6318, lr: 0.002458, batch_cost: 0.0550, reader_cost: 0.00426, ips: 109.1434 samples/sec | ETA 00:09:06
2022-11-25 16:54:21 [INFO]	[TRAIN] epoch: 2, iter: 60/10000, loss: 2.1906, lr: 0.002957, batch_cost: 0.0566, reader_cost: 0.00435, ips: 106.0024 samples/sec | ETA 00:09:22
2022-11-25 16:54:22 [INFO]	[TRAIN] epoch: 2, iter: 70/10000, loss: 1.9887, lr: 0.003457, batch_cost: 0.0567, reader_cost: 0.00542, ips: 105.8548 samples/sec | ETA 00:09:22
2022-11-25 16:54:23 [INFO]	[TRAIN] epoch: 3, iter: 80/10000, loss: 2.3479, lr: 0.003956, batch_cost: 0.0611, reader_cost: 0.01129, ips: 98.2484 samples/sec | ETA 00:10:05
2022-11-25 16:54:23 [INFO]	[TRAIN] epoch: 3, iter: 90/10000, loss: 2.0537, lr: 0.004456, batch_cost: 0.0551, reader_cost: 0.00373, ips: 108.8724 samples/sec | ETA 00:09:06
2022-11-25 16:54:24 [INFO]	[TRAIN] epoch: 3, iter: 100/10000, loss: 2.0187, lr: 0.004955, batch_cost: 0.0539, reader_cost: 0.00411, ips: 111.2684 samples/sec | ETA 00:08:53
2022-11-25 16:54:24 [INFO]	[TRAIN] epoch: 3, iter: 110/10000, loss: 2.1657, lr: 0.005455, batch_cost: 0.0508, reader_cost: 0.00069, ips: 118.2217 samples/sec | ETA 00:08:21

Training finished; the saved checkpoints:

ubuntu@ubuntu:~/PaddleSeg/output$ ls
iter_10000  iter_6000  iter_7000  iter_8000  iter_9000

3. Testing

ubuntu@ubuntu:~/PaddleSeg$ python3 predict.py --config /home/ubuntu/PaddleSeg/configs/pp_liteseg/pp_liteseg_stdc1_camvid_300x300_10k.yml --model_path /home/ubuntu/PaddleSeg/output/best_model/model.pdparams --image_path /home/ubuntu/PaddleSeg/paddleSegCup/datasets/val/JPEGImages/000000002157.jpg

Test result

4. Model conversion: from the Paddle parameters (model.pdparams) to ONNX, then to OpenVINO, and finally to a blob

1) ONNX conversion

model = SavedSegmentationNet(model)  # add argmax to the last layer

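Errors printed after the export can be ignored. The test here uses a 427x640 (height x width) image as the example. It is still better to unify the image size before training; because the images in the VOC dataset vary in size, I generated a dataset that contains only 427x640 images.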

ubuntu@ubuntu:~/PaddleSeg$ python3 deploy/python/infer_onnx_trt.py --config /home/ubuntu/PaddleSeg/configs/pp_liteseg/pp_liteseg_stdc1_camvid_300x300_10k.yml --model_path /home/ubuntu/PaddleSeg/output/best_model/model.pdparams --save_dir ./saved --width 640 --height 427
W1126 10:34:49.439234 19118 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.1
W1126 10:34:49.441439 19118 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
2022-11-26 10:34:50 [INFO]	Loading pretrained model from https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet1.tar.gz
2022-11-26 10:34:50 [INFO]	There are 145/145 variables loaded into STDCNet.
2022-11-26 10:34:50 [INFO]	Loading pretrained model from /home/ubuntu/PaddleSeg/output/best_model/model.pdparams
2022-11-26 10:34:50 [INFO]	There are 247/247 variables loaded into PPLiteSeg.
2022-11-26 10:34:50 [INFO]	Loaded trained params of model successfully
input shape: [1, 3, 427, 640]
out shape: (1, 1, 427, 640)


2022-11-26 09:15:33 [INFO]	Static PaddlePaddle model saved in ./saved/paddle_model_static_onnx_temp_dir.
[Paddle2ONNX] Start to parse PaddlePaddle model...
[Paddle2ONNX] Model file path: ./saved/paddle_model_static_onnx_temp_dir/model.pdmodel
[Paddle2ONNX] Paramters file path: ./saved/paddle_model_static_onnx_temp_dir/model.pdiparams
[Paddle2ONNX] Start to parsing Paddle model...
[Paddle2ONNX] Use opset_version = 11 for ONNX export.
[Paddle2ONNX] PaddlePaddle model is exported as ONNX format now.
2022-11-26 09:15:33 [INFO]	ONNX model saved in ./saved/pp_liteseg_stdc1_camvid_300x300_10k_model.onnx.
Completed export onnx model.

2) Conversion to OpenVINO

ubuntu@ubuntu:~/PaddleSeg$ python3 /opt/intel/openvino_2021/deployment_tools/model_optimizer/mo.py --input_model /home/ubuntu/PaddleSeg/saved/pp_liteseg_stdc1_camvid_300x300_10k_model.onnx --output_dir /home/ubuntu/PaddleSeg/saved/FP16 --input_shape [1,3,427,640] --data_type FP16 --scale_values [127.5,127.5,127.5] --mean_values [127.5,127.5,127.5]
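
The `--mean_values` and `--scale_values` options bake the input normalization into the IR, which is why the C++ code further down can feed raw 0-255 pixel values. The sketch below checks that `(x - 127.5) / 127.5` matches PaddleSeg's `Normalize` transform, assuming that transform runs with its default mean and std of 0.5 (the config above does not override them). Both function names are made up for this example.

```python
def normalize_paddleseg(pixel, mean=0.5, std=0.5):
    """PaddleSeg Normalize: scale to [0, 1], then (x - mean) / std."""
    return (pixel / 255.0 - mean) / std


def normalize_mo(pixel, mean_value=127.5, scale_value=127.5):
    """Model Optimizer preprocessing: (x - mean_values) / scale_values."""
    return (pixel - mean_value) / scale_value


if __name__ == "__main__":
    for pixel in (0, 64, 255):
        print(pixel, normalize_paddleseg(pixel), normalize_mo(pixel))
```

Both forms map the 0-255 range onto [-1, 1], so the training-time and deployment-time inputs agree under that assumption.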

CMakeLists.txt

cmake_minimum_required(VERSION 3.4.1)
set(CMAKE_CXX_STANDARD 14)


project(nanodet_demo)

find_package(OpenCV REQUIRED)
find_package(ngraph REQUIRED)
find_package(InferenceEngine REQUIRED)

include_directories(
        ${OpenCV_INCLUDE_DIRS}
        ${CMAKE_CURRENT_SOURCE_DIR}
        ${CMAKE_CURRENT_BINARY_DIR}
)

add_executable(nanodet_demo main.cpp )

target_link_libraries(
        nanodet_demo
        ${InferenceEngine_LIBRARIES}
        ${NGRAPH_LIBRARIES}
        ${OpenCV_LIBS}
)

main.cpp

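#include <chrono>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
#include <inference_engine.hpp>
#include <iomanip>  // Header file needed to use setprecision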

using namespace std;
using namespace cv;

void preprocess(cv::Mat image, InferenceEngine::Blob::Ptr &blob) {
    int img_w = image.cols;
    int img_h = image.rows;
    int channels = 3;

    InferenceEngine::MemoryBlob::Ptr mblob = InferenceEngine::as<InferenceEngine::MemoryBlob>(blob);
    if (!mblob) {
        THROW_IE_EXCEPTION << "We expect blob to be inherited from MemoryBlob in matU8ToBlob, "
                           << "but by fact we were not able to cast inputBlob to MemoryBlob";
    }
    // locked memory holder should be alive all time while access to its buffer happens
    auto mblobHolder = mblob->wmap();

    float *blob_data = mblobHolder.as<float *>();

    // Copy the interleaved HWC pixels into the planar CHW input blob.
    for (size_t c = 0; c < channels; c++) {
        for (size_t h = 0; h < img_h; h++) {
            for (size_t w = 0; w < img_w; w++) {
                blob_data[c * img_w * img_h + h * img_w + w] =
                        (float) image.at<cv::Vec3b>(h, w)[c];
            }
        }
    }
}

int main(int argc, char **argv) {
    cv::Mat bgr = cv::imread("/home/ubuntu/PaddleSeg/paddleSegCup/datasets/val/JPEGImages/000000002157.jpg");
    int orignal_width = bgr.cols;
    int orignal_height = bgr.rows;
    int target_width = 640;
    int target_height = 427;
    cv::Mat resize_img;
    cv::resize(bgr, resize_img, cv::Size(target_width, target_height));

    cv::Mat rgb;
    cv::cvtColor(resize_img, rgb, cv::COLOR_BGR2RGB);


    // resize_img.convertTo(resize_img, CV_32FC1, 1.0 / 255, 0);
    //resize_img = (resize_img - 0.5) / 0.5;
    auto start = chrono::high_resolution_clock::now();    // start time

    std::string input_name_ = "x";
    std::string output_name_ = "argmax_0.tmp_0";
    std::string model_path = "/home/ubuntu/PaddleSeg/saved/FP16/pp_liteseg_stdc1_camvid_300x300_10k_model.xml";
    InferenceEngine::Core ie;
    InferenceEngine::CNNNetwork model = ie.ReadNetwork(model_path);
    // prepare input settings
    InferenceEngine::InputsDataMap inputs_map(model.getInputsInfo());
    input_name_ = inputs_map.begin()->first;
    InferenceEngine::InputInfo::Ptr input_info = inputs_map.begin()->second;
    //input_info->setPrecision(InferenceEngine::Precision::FP32);
    //input_info->setLayout(InferenceEngine::Layout::NCHW);



    //prepare output settings
    InferenceEngine::OutputsDataMap outputs_map(model.getOutputsInfo());
    for (auto &output_info : outputs_map) {
        std::cout << "Output:" << output_info.first << std::endl;
        output_info.second->setPrecision(InferenceEngine::Precision::FP32);
    }

    //get network
    InferenceEngine::ExecutableNetwork network_ = ie.LoadNetwork(model, "CPU");
    InferenceEngine::InferRequest infer_request_ = network_.CreateInferRequest();
    InferenceEngine::Blob::Ptr input_blob = infer_request_.GetBlob(input_name_);
    preprocess(rgb, input_blob);

    // do inference
    infer_request_.Infer();
    const InferenceEngine::Blob::Ptr pred_blob = infer_request_.GetBlob(output_name_);

    auto m_pred = InferenceEngine::as<InferenceEngine::MemoryBlob>(pred_blob);
    auto m_pred_holder = m_pred->rmap();
    const float *pred = m_pred_holder.as<const float *>();
    auto end = chrono::high_resolution_clock::now();    // end time
    auto duration = (end - start).count();
    cout << "Elapsed time: " << std::setprecision(10) << duration / 1000000000.0 << "s"
         << "; " << duration / 1000000.0 << "ms"
         << "; " << duration / 1000.0 << "us"
         << endl;

    // Note: w holds the number of rows (height) and h the number of columns (width).
    int w = target_height;
    int h = target_width;
    std::vector<float> vec_host_scores;
    for (int i = 0; i < w * h; i++) {
        vec_host_scores.emplace_back(pred[i]);
    }

    int num_class = 2;  // background + cup; label 1 needs its own color_map entry
    vector<uint8_t> color_map(num_class * 3);
    for (int i = 0; i < num_class; i++) {
        int j = 0;
        int lab = i;
        while (lab) {
            color_map[i * 3] |= ((lab >> 0 & 1) << (7 - j));
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
            j += 1;
            lab >>= 3;
        }
    }

    cv::Mat pseudo_img(w, h, CV_8UC3, cv::Scalar(0, 0, 0));
    for (int r = 0; r < w; r++) {
        for (int c = 0; c < h; c++) {
            int idx = vec_host_scores[r * h + c];
            pseudo_img.at<cv::Vec3b>(r, c)[0] = color_map[idx * 3];
            pseudo_img.at<cv::Vec3b>(r, c)[1] = color_map[idx * 3 + 1];
            pseudo_img.at<cv::Vec3b>(r, c)[2] = color_map[idx * 3 + 2];
        }
    }
    cv::Mat result;
    cv::addWeighted(resize_img, 0.4, pseudo_img, 0.6, 0, result, 0);

    cv::imshow("pseudo_img", pseudo_img);
    cv::imwrite("pseudo_img.jpg", pseudo_img);
    cv::imshow("bgr", bgr);
    cv::imwrite("resize_img.jpg", resize_img);
    cv::imshow("result", result);
    cv::imwrite("result.jpg", result);
    cv::waitKey(0);
    return 0;

}
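
The bit-shuffling loop in `main` builds a PASCAL VOC style color map: the bits of each label index are spread across the three color channels. The Python sketch below mirrors that loop so the resulting colors can be inspected; the function name `build_color_map` is made up for this example. It also shows why the map must cover every label the network can emit: each class owns three consecutive entries, so a label beyond the last class would index past the end of the table.

```python
def build_color_map(num_classes):
    """Return a flat [c0_0, c0_1, c0_2, c1_0, ...] list, PASCAL VOC style."""
    color_map = [0] * (num_classes * 3)
    for i in range(num_classes):
        j = 0
        lab = i
        while lab:
            color_map[i * 3] |= ((lab >> 0) & 1) << (7 - j)
            color_map[i * 3 + 1] |= ((lab >> 1) & 1) << (7 - j)
            color_map[i * 3 + 2] |= ((lab >> 2) & 1) << (7 - j)
            j += 1
            lab >>= 3
    return color_map


if __name__ == "__main__":
    cmap = build_color_map(2)  # background + cup
    for label in range(2):
        print(label, cmap[label * 3:label * 3 + 3])
```

Because the C++ code writes entry 0 of each triple into channel 0 of a BGR `cv::Mat`, label 1 (the cup) is drawn in dark blue.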

Test result

3) Conversion to an OAK model (blob)

ubuntu@ubuntu:/opt/intel/openvino_2021/deployment_tools/tools$ sudo chmod 777 compile_tool/
ubuntu@ubuntu:/opt/intel/openvino_2021/deployment_tools/tools$ cd compile_tool/
ubuntu@ubuntu:/opt/intel/openvino_2021/deployment_tools/tools/compile_tool$ ./compile_tool -m /home/ubuntu/PaddleSeg/saved/FP16/pp_liteseg_stdc1_camvid_300x300_10k_model.xml -ip U8 -d MYRIAD -VPU_NUMBER_OF_SHAVES 4 -VPU_NUMBER_OF_CMX_SLICES 4
Inference Engine: 
	IE version ......... 2021.4.1
	Build ........... 2021.4.1-3926-14e67d86634-releases/2021/4

Network inputs:
    x : U8 / NCHW
Network outputs:
    bilinear_interp_v2_13.tmp_0 : FP16 / NCHW
[Warning][VPU][Config] Deprecated option was used : VPU_MYRIAD_PLATFORM
Done. LoadNetwork time elapsed: 5132 ms
ubuntu@ubuntu:/opt/intel/openvino_2021/deployment_tools/tools/compile_tool$ cp pp_liteseg_stdc1_camvid_300x300_10k_model.blob /home/ubuntu/PaddleSeg/saved/FP16

CMakeLists.txt (for the single-image test)

cmake_minimum_required(VERSION 3.16)
project(untitled15)
set(CMAKE_CXX_STANDARD 11)
find_package(OpenCV REQUIRED)
#message(STATUS ${OpenCV_INCLUDE_DIRS})
# add include directories
include_directories(${OpenCV_INCLUDE_DIRS})
include_directories(${CMAKE_SOURCE_DIR}/include)
include_directories(${CMAKE_SOURCE_DIR}/include/utility)
# find depthai and link against OpenCV
find_package(depthai CONFIG REQUIRED)
add_executable(untitled15 main.cpp include/utility/utility.cpp)
target_link_libraries(untitled15 ${OpenCV_LIBS}  depthai::opencv )

main.cpp


#include <chrono>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <vector>

#include "utility.hpp"
#include <opencv2/opencv.hpp>
#include "depthai/depthai.hpp"
using namespace std;
using namespace std::chrono;
using namespace cv;
int post_process(std::vector<int> vec_host_scores, cv::Mat resize_img, cv::Mat &result, vector<uint8_t> color_map, int w, int h) {



    // w is the number of rows and h the number of columns of the prediction.
    cv::Mat pseudo_img(w, h, CV_8UC3, cv::Scalar(0, 0, 0));
    for (int r = 0; r < w; r++) {
        for (int c = 0; c < h; c++) {
            int idx = vec_host_scores[r*h  + c];
            pseudo_img.at<cv::Vec3b>(r, c)[0] = color_map[idx * 3];
            pseudo_img.at<cv::Vec3b>(r, c)[1] = color_map[idx * 3 + 1];
            pseudo_img.at<cv::Vec3b>(r, c)[2] = color_map[idx * 3 + 2];
        }
    }

    cv::addWeighted(resize_img, 0.4, pseudo_img, 0.6, 0, result, 0);
    //cv::imshow("pseudo_img", pseudo_img);
    cv::imwrite("pseudo_img.jpg", pseudo_img);
    // cv::imshow("bgr", resize_img);
    cv::imwrite("resize_img.jpg", resize_img);
    //cv::imshow("result", result);
    cv::imwrite("result.jpg", result);
    //cv::waitKey(0);
    return 0;
}


int main(int argc, char **argv) {
    int num_class = 256;
    vector color_map(num_class * 3);
    for (int i = 0; i < num_class; i++) {
        int j = 0;
        int lab = i;
        while (lab) {
            color_map[i * 3] |= ((lab >> 0 & 1) << (7 - j));
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
            j += 1;
            lab >>= 3;
        }
    }
    // Note: the names are swapped; target_width holds the row count and target_height the column count.
    int target_width=427;
    int target_height=640;
    dai::Pipeline pipeline;
    // define the nodes
    auto cam = pipeline.create<dai::node::XLinkIn>();
    cam->setStreamName("inFrame");
    auto net = pipeline.create<dai::node::NeuralNetwork>();
    dai::OpenVINO::Blob blob("/opt/intel/openvino_2021.4.689/deployment_tools/tools/compile_tool/pp_liteseg_stdc1_camvid_300x300_10k_model.blob");
    net->setBlob(blob);
    net->input.setBlocking(false);

    // by now the OAK API usage is reasonably familiar
    cam->out.link(net->input);



    // define the outputs
    auto xlinkParserOut = pipeline.create<dai::node::XLinkOut>();
    xlinkParserOut->setStreamName("parseOut");
    auto xlinkoutOut = pipeline.create<dai::node::XLinkOut>();
    xlinkoutOut->setStreamName("out");

    auto xlinkoutpassthroughOut = pipeline.create<dai::node::XLinkOut>();

    xlinkoutpassthroughOut->setStreamName("passthrough");


    net->out.link(xlinkParserOut->input);

    net->passthrough.link(xlinkoutpassthroughOut->input);

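    // upload the pipeline to the camera
    dai::Device device(pipeline);
    // fetch frames and display them
    auto inqueue = device.getInputQueue("inFrame"); // input queue for frames sent from the host
    auto detqueue = device.getOutputQueue("parseOut", 8, false); // maxSize is the queue buffer size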

    bool printOutputLayersOnce=true;

    cv::Mat frame=cv::imread("/home/ubuntu/PaddleSeg/paddleSegCup/datasets/val/JPEGImages/000000002157.jpg");
    while(true) {

        if(frame.empty()) break;

        auto img = std::make_shared<dai::ImgFrame>();
        frame = resizeKeepAspectRatio(frame, cv::Size(target_height, target_width), cv::Scalar(0));
        toPlanar(frame, img->getData());
        img->setTimestamp(steady_clock::now());
        img->setWidth(target_height);
        img->setHeight(target_width);
        inqueue->send(img);

        auto inNN = detqueue->get<dai::NNData>();
        if( printOutputLayersOnce&&inNN) {
            std::cout << "Output layer names: ";
            for(const auto& ten : inNN->getAllLayerNames()) {
                std::cout << ten << ", ";
            }
            std::cout << std::endl;
            printOutputLayersOnce = false;
        }
        cv::Mat result;
        auto pred=inNN->getLayerInt32(inNN->getAllLayerNames()[0]);

        post_process(pred,frame,result,color_map,target_width,target_height);
        cv::imshow("demo", frame);
        cv::imshow("result", result);
        cv::imwrite("result.jpg",result);
        int key = cv::waitKey(1);
        if(key == 'q' || key == 'Q') return 0;
    }

//    while (true) {
//
//
//        auto ImgFrame = outqueue->get();
//        auto frame = ImgFrame->getCvFrame();
//
//        auto inNN = detqueue->get();
//        if( printOutputLayersOnce&&inNN) {
//            std::cout << "Output layer names: ";
//            for(const auto& ten : inNN->getAllLayerNames()) {
//                std::cout << ten << ", ";
//            }
//            std::cout << std::endl;
//            printOutputLayersOnce = false;
//        }
//        cv::Mat result;
//        auto pred=inNN->getLayerInt32(inNN->getAllLayerNames()[0]);
//
//        post_process(pred,frame,result,color_map,target_width,target_height);
//        cv::imshow("demo", frame);
//        cv::imshow("result", result);
//        cv::imwrite("result.jpg",result);
//        cv::waitKey(1);
//
//
//    }


    return 0;
}
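
Both the OpenVINO `preprocess` function earlier and the `toPlanar` helper used here (from the depthai example utilities, as far as I can tell) do the same job: OpenCV stores pixels interleaved (HWC), while the network expects planar data (CHW). The Python sketch below shows only the index mapping on a flat list; the function name `to_planar` is chosen for this example and is not the depthai implementation.

```python
def to_planar(interleaved, height, width, channels=3):
    """Convert flat HWC (interleaved) pixel data to flat CHW (planar)."""
    planar = [0] * (channels * height * width)
    for h in range(height):
        for w in range(width):
            for c in range(channels):
                src = (h * width + w) * channels + c
                dst = c * height * width + h * width + w
                planar[dst] = interleaved[src]
    return planar


if __name__ == "__main__":
    # A 1x2 image with pixels (1, 2, 3) and (4, 5, 6).
    print(to_planar([1, 2, 3, 4, 5, 6], height=1, width=2))
```

The destination index `c * H * W + h * W + w` is the same expression that the C++ `preprocess` function uses when it fills the input blob.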

Test result

In the live video test at 427x640, the frame rate is about 16 fps.


#include <chrono>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <vector>

#include "utility.hpp"
#include <opencv2/opencv.hpp>
#include "depthai/depthai.hpp"

using namespace std;
using namespace std::chrono;
using namespace cv;

int post_process(std::vector<int> vec_host_scores, cv::Mat resize_img, cv::Mat &result, vector<uint8_t> color_map, int w,
                 int h) {

    // w is the number of rows and h the number of columns of the prediction.
    cv::Mat pseudo_img(w, h, CV_8UC3, cv::Scalar(0, 0, 0));
    for (int r = 0; r < w; r++) {
        for (int c = 0; c < h; c++) {
            int idx = vec_host_scores[r * h + c];
            pseudo_img.at<cv::Vec3b>(r, c)[0] = color_map[idx * 3];
            pseudo_img.at<cv::Vec3b>(r, c)[1] = color_map[idx * 3 + 1];
            pseudo_img.at<cv::Vec3b>(r, c)[2] = color_map[idx * 3 + 2];
        }
    }

    cv::addWeighted(resize_img, 0.4, pseudo_img, 0.6, 0, result, 0);
    //cv::imshow("pseudo_img", pseudo_img);
    cv::imwrite("pseudo_img.jpg", pseudo_img);
    // cv::imshow("bgr", resize_img);
    cv::imwrite("resize_img.jpg", resize_img);
    //cv::imshow("result", result);
    cv::imwrite("result.jpg", result);
    //cv::waitKey(0);
    return 0;
}


int main(int argc, char **argv) {
    int num_class = 256;
    vector color_map(num_class * 3);
    for (int i = 0; i < num_class; i++) {
        int j = 0;
        int lab = i;
        while (lab) {
            color_map[i * 3] |= ((lab >> 0 & 1) << (7 - j));
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
            j += 1;
            lab >>= 3;
        }
    }
    // Note: the names are swapped; target_width holds the row count and target_height the column count.
    int target_width = 427;
    int target_height = 640;
    dai::Pipeline pipeline;
    // define the nodes
    auto cam = pipeline.create<dai::node::ColorCamera>();
    cam->setBoardSocket(dai::CameraBoardSocket::RGB);
    cam->setResolution(dai::ColorCameraProperties::SensorResolution::THE_1080_P);
    cam->setPreviewSize(target_height, target_width);  // NN input
    cam->setInterleaved(false);

    auto net = pipeline.create<dai::node::NeuralNetwork>();
    dai::OpenVINO::Blob blob("/home/ubuntu/PaddleSeg/saved/FP16/pp_liteseg_stdc1_camvid_300x300_10k_model.blob");
    net->setBlob(blob);
    net->input.setBlocking(false);

    // by now the OAK API usage is reasonably familiar
    cam->preview.link(net->input);



    // define the outputs
    auto xlinkParserOut = pipeline.create<dai::node::XLinkOut>();
    xlinkParserOut->setStreamName("parseOut");
    auto xlinkoutOut = pipeline.create<dai::node::XLinkOut>();
    xlinkoutOut->setStreamName("out");

    auto xlinkoutpassthroughOut = pipeline.create<dai::node::XLinkOut>();

    xlinkoutpassthroughOut->setStreamName("passthrough");


    net->out.link(xlinkParserOut->input);

    net->passthrough.link(xlinkoutpassthroughOut->input);

    // upload the pipeline to the camera
    dai::Device device(pipeline);
    // fetch frames and display them
    auto outqueue = device.getOutputQueue("passthrough", 8, false); // maxSize is the queue buffer size
    auto detqueue = device.getOutputQueue("parseOut", 8, false); // maxSize is the queue buffer size

    bool printOutputLayersOnce = true;
    auto startTime = steady_clock::now();
    int counter = 0;
    float fps = 0;
    while (true) {




        auto ImgFrame = outqueue->get<dai::ImgFrame>();
        auto frame = ImgFrame->getCvFrame();

        auto inNN = detqueue->get<dai::NNData>();
        if (printOutputLayersOnce && inNN) {
            std::cout << "Output layer names: ";
            for (const auto &ten : inNN->getAllLayerNames()) {
                std::cout << ten << ", ";
            }
            std::cout << std::endl;
            printOutputLayersOnce = false;
        }
        cv::Mat result;
        auto pred = inNN->getLayerInt32(inNN->getAllLayerNames()[0]);

        post_process(pred, frame, result, color_map, target_width, target_height);

        // Recompute the FPS about once per second from the frames counted in that window.
        counter++;
        auto currentTime = steady_clock::now();
        auto elapsed = duration_cast<duration<float>>(currentTime - startTime);
        if (elapsed > seconds(1)) {
            fps = counter / elapsed.count();
            counter = 0;
            startTime = currentTime;
        }
        std::stringstream fpsStr;
        fpsStr << "NN fps: " << std::fixed << std::setprecision(2) << fps;
        cv::putText(result, fpsStr.str(), cv::Point(2, result.rows - 4), cv::FONT_HERSHEY_TRIPLEX, 0.4, cv::Scalar(0,255,0));
        //cv::imshow("demo", frame);
        cv::imshow("result", result);
        //cv::imwrite("result.jpg", result);
        cv::waitKey(1);


    }


    return 0;
}

Test output

/home/ubuntu/CLionProjects/untitled5/cmake-build-debug/untitled15
[19443010C130FF1200] [1.5] [1.155] [NeuralNetwork(1)] [warning] Network compiled for 4 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
Output layer names: argmax_0.tmp_0, 
NN fps: 0.00
NN fps: 0.00
NN fps: 0.00
NN fps: 0.00
NN fps: 0.00
NN fps: 0.00
NN fps: 0.00
NN fps: 0.00
NN fps: 0.00
NN fps: 0.00
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 10.58
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.26
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 15.65
NN fps: 14.99
NN fps: 14.99
NN fps: 14.99
NN fps: 14.99
NN fps: 14.99
NN fps: 14.99
NN fps: 14.99
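
The log prints `NN fps: 0.00` for the first frames and afterwards a value that changes only about once per second. That follows from how the loop computes it: frames are counted, and the rate is recomputed only once more than one second has passed since the last update. The Python sketch below replays that logic on a list of timestamps; the function name `windowed_fps` and the sample timestamps are made up for this example.

```python
def windowed_fps(frame_times, window=1.0):
    """Return the FPS value displayed after each frame (timestamps in seconds)."""
    shown = []
    fps = 0.0
    counter = 0
    start = 0.0
    for now in frame_times:
        counter += 1
        elapsed = now - start
        if elapsed > window:
            # Update the rate once per window, then start counting again.
            fps = counter / elapsed
            counter = 0
            start = now
        shown.append(round(fps, 2))
    return shown


if __name__ == "__main__":
    print(windowed_fps([0.25, 0.5, 0.75, 1.0, 1.25, 1.5]))
```

Until the first window closes the displayed value stays at its initial 0, which matches the run of `0.00` lines at the top of the log.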

Test image

Recognition is still poor at times; the dataset is probably too small, since it contains only 150 images.

Supplement: a version of the code that also measures distance. Link: https://pan.baidu.com/s/1Top8jspCXskIyfi-9QLvcg?pwd=x4fj (extraction code: x4fj)


#include <atomic>
#include <chrono>
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <vector>

#include "utility.hpp"
#include <opencv2/opencv.hpp>
#include "depthai/depthai.hpp"

using namespace std;
using namespace std::chrono;
using namespace cv;
static std::atomic<bool> newConfig{false};

int find_bound(cv::Mat gray_img, cv::Mat resize_img, vector<cv::Rect> &ploy_rects_) {
    cvtColor(gray_img, gray_img, cv::COLOR_BGR2GRAY);
    std::vector<std::vector<cv::Point>> contours;
    findContours(gray_img, contours, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE);

    vector<vector<Point>> contours_ploy(contours.size()); // approximated polygon points
    vector<Rect> ploy_rects(contours.size());             // bounding boxes of the polygons

    for (size_t i = 0; i < contours.size(); i++) {
        approxPolyDP(Mat(contours[i]), contours_ploy[i], 3, true);
        ploy_rects[i] = boundingRect(contours_ploy[i]);
    }

    RNG rng(1234);
    Point2f pts[4];
    for (size_t t = 0; t < contours.size(); t++) {
        Scalar color = Scalar(rng.uniform(0, 255), rng.uniform(0, 255), rng.uniform(0, 255));
        rectangle(resize_img, ploy_rects[t], color, 2, 8);
        if (contours_ploy[t].size() > 5) {
            // Assumed fix: pts was never filled in the original listing, so take the
            // corners of the minimum-area rotated rectangle before drawing its edges.
            minAreaRect(contours_ploy[t]).points(pts);
            for (int r = 0; r < 4; r++) {
                line(resize_img, pts[r], pts[(r + 1) % 4], color, 2, 8);
            }
        }
    }
    // (255, 0, 0) is a comma expression in C++ and evaluates to 0; pass a Scalar instead.
    cv::drawContours(resize_img, contours, -1, cv::Scalar(255, 0, 0), 2);

    ploy_rects_ = ploy_rects;
    imshow("drawImg", resize_img);
    cv::imwrite("dram.jpg",resize_img);
    return 0;
}

int post_process(std::vector<int> vec_host_scores, cv::Mat resize_img, cv::Mat &result, vector<uint8_t> color_map, int w,
                 int h, std::vector<cv::Rect> &ploy_rects_) {


    cv::Mat pseudo_img(w, h, CV_8UC3, cv::Scalar(0, 0, 0));
    for (int r = 0; r < w; r++) {
        for (int c = 0; c < h; c++) {
            int idx = vec_host_scores[r * h + c];
            pseudo_img.at<cv::Vec3b>(r, c)[0] = color_map[idx * 3];
            pseudo_img.at<cv::Vec3b>(r, c)[1] = color_map[idx * 3 + 1];
            pseudo_img.at<cv::Vec3b>(r, c)[2] = color_map[idx * 3 + 2];
        }
    }

    cv::addWeighted(resize_img, 0.4, pseudo_img, 0.6, 0, result, 0);
    find_bound(pseudo_img, resize_img, ploy_rects_);
    //cv::imshow("pseudo_img", pseudo_img);
    //cv::imwrite(".pseudo_img.jpg", pseudo_img);
    // cv::imshow("bgr", resize_img);
    //cv::imwrite("resize_img.jpg", resize_img);
    // cv::imshow("result", result);
    // cv::imwrite("result.jpg", result);
    //cv::waitKey(0);
    return 0;
}


int main(int argc, char **argv) {
    int num_class = 2;  // background + cup; label 1 needs its own color_map entry
    vector<uint8_t> color_map(num_class * 3);
    for (int i = 0; i < num_class; i++) {
        int j = 0;
        int lab = i;
        while (lab) {
            color_map[i * 3] |= ((lab >> 0 & 1) << (7 - j));
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j));
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j));
            j += 1;
            lab >>= 3;
        }
    }
    float target_width = 300;
    float target_height = 300;
    dai::Pipeline pipeline;
    dai::Device device;

    // define the nodes
    auto cam = pipeline.create<dai::node::ColorCamera>();
    cam->setBoardSocket(dai::CameraBoardSocket::RGB);
    cam->setResolution(dai::ColorCameraProperties::SensorResolution::THE_1080_P);
    cam->setPreviewSize((int) target_height, (int) target_width);  // NN input
    cam->setInterleaved(false);
    cam->setPreviewKeepAspectRatio(false);

    try {
        auto calibData = device.readCalibration2();
        auto lensPosition = calibData.getLensPosition(dai::CameraBoardSocket::RGB);
        if (lensPosition) {
            cam->initialControl.setManualFocus(lensPosition);
        }
    } catch (const std::exception &ex) {
        std::cout << ex.what() << std::endl;
        return 1;
    }
    auto net = pipeline.create<dai::node::NeuralNetwork>();
    dai::OpenVINO::Blob blob(
            "../model_300_300/pp_liteseg_stdc1_camvid_300x300_10k_model.blob");
    net->setBlob(blob);
    net->input.setBlocking(false);

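    // by now the OAK API usage is reasonably familiar
    cam->preview.link(net->input);

    auto monoLeft = pipeline.create<dai::node::MonoCamera>();
    auto monoRight = pipeline.create<dai::node::MonoCamera>();

    auto stereo = pipeline.create<dai::node::StereoDepth>();
    auto spatialDataCalculator = pipeline.create<dai::node::SpatialLocationCalculator>();
    auto xoutDepth = pipeline.create<dai::node::XLinkOut>();
    auto xoutSpatialData = pipeline.create<dai::node::XLinkOut>();
    auto xinSpatialCalcConfig = pipeline.create<dai::node::XLinkIn>();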
    xoutDepth->setStreamName("depth");
    xoutSpatialData->setStreamName("spatialData");
    xinSpatialCalcConfig->setStreamName("spatialCalcConfig");

    monoLeft->setResolution(dai::MonoCameraProperties::SensorResolution::THE_720_P);
    monoLeft->setBoardSocket(dai::CameraBoardSocket::LEFT);
    monoRight->setResolution(dai::MonoCameraProperties::SensorResolution::THE_720_P);
    monoRight->setBoardSocket(dai::CameraBoardSocket::RIGHT);
    try {
        auto calibData = device.readCalibration2();
        auto lensPosition = calibData.getLensPosition(dai::CameraBoardSocket::RGB);
        if (lensPosition) {
            cam->initialControl.setManualFocus(lensPosition);
        }
    } catch (const std::exception &ex) {
        std::cout << ex.what() << std::endl;
        return 1;
    }
   // stereo->setDefaultProfilePreset(dai::node::StereoDepth::PresetMode::HIGH_DENSITY);
    //stereo->setSubpixel(subpixel);
    stereo->setLeftRightCheck(true);
    stereo->setExtendedDisparity(true);
    stereo->setDepthAlign(dai::CameraBoardSocket::RGB);
   stereo->setDefaultProfilePreset(dai::node::StereoDepth::PresetMode::HIGH_ACCURACY);

    dai::Point2f topLeft(0.4f, 0.4f);
    dai::Point2f bottomRight(0.6f, 0.6f);

    dai::SpatialLocationCalculatorConfigData config;
    config.depthThresholds.lowerThreshold = 100;
    config.depthThresholds.upperThreshold = 10000;
    config.roi = dai::Rect(topLeft, bottomRight);

    spatialDataCalculator->inputConfig.setWaitForMessage(false);
    spatialDataCalculator->initialConfig.addROI(config);

    // Linking
    monoLeft->out.link(stereo->left);
    monoRight->out.link(stereo->right);

    // define the outputs
    auto xlinkParserOut = pipeline.create<dai::node::XLinkOut>();
    xlinkParserOut->setStreamName("parseOut");
    auto xlinkoutOut = pipeline.create<dai::node::XLinkOut>();
    xlinkoutOut->setStreamName("out");

    auto xlinkoutpassthroughOut = pipeline.create<dai::node::XLinkOut>();

    xlinkoutpassthroughOut->setStreamName("passthrough");

    spatialDataCalculator->passthroughDepth.link(xoutDepth->input);
    stereo->depth.link(spatialDataCalculator->inputDepth);

    spatialDataCalculator->out.link(xoutSpatialData->input);
    xinSpatialCalcConfig->out.link(spatialDataCalculator->inputConfig);

    net->out.link(xlinkParserOut->input);

    net->passthrough.link(xlinkoutpassthroughOut->input);

    device.startPipeline(pipeline);
    device.setIrLaserDotProjectorBrightness(1000);
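    // the pipeline has been uploaded to the camera above
    // fetch frames and display them
    auto outqueue = device.getOutputQueue("passthrough", 4, false); // maxSize is the queue buffer size
    auto detqueue = device.getOutputQueue("parseOut", 4, false); // maxSize is the queue buffer size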
    auto depthQueue = device.getOutputQueue("depth", 4, false);
    auto spatialCalcQueue = device.getOutputQueue("spatialData", 4, false);
    auto spatialCalcConfigInQueue = device.getInputQueue("spatialCalcConfig");

    bool printOutputLayersOnce = true;
    auto startTime = steady_clock::now();
    int counter = 0;
    float fps = 0;
    auto color = cv::Scalar(255, 255, 255);

    while (true) {


        auto inDepth = depthQueue->get<dai::ImgFrame>();
        auto ImgFrame = outqueue->get<dai::ImgFrame>();

        auto frame = ImgFrame->getCvFrame();
        target_width=frame.cols*1.0;
        target_height=frame.rows*1.0;

        auto inNN = detqueue->get<dai::NNData>();
        if (printOutputLayersOnce && inNN) {
            std::cout << "Output layer names: ";
            for (const auto &ten : inNN->getAllLayerNames()) {
                std::cout << ten << ", ";
            }
            std::cout << std::endl;
            printOutputLayersOnce = false;
        }
        cv::Mat result;
        auto pred = inNN->getLayerInt32(inNN->getAllLayerNames()[0]);
        std::vector<cv::Rect> ploy_rects_;  // element type assumed to be cv::Rect; it must match post_process
        post_process(pred, frame, result, color_map, target_width, target_height, ploy_rects_);

        for (auto &item:ploy_rects_) {

            newConfig = true;
            cv::Mat depthFrame = inDepth->getFrame();  // depthFrame values are in millimeters
            std::cout << depthFrame.rows << " " << depthFrame.cols << " " << std::endl;
            cv::Mat depthFrameColor;

            cv::normalize(depthFrame, depthFrameColor, 255, 0, cv::NORM_INF, CV_8UC1);
            cv::equalizeHist(depthFrameColor, depthFrameColor);
            cv::applyColorMap(depthFrameColor, depthFrameColor, cv::COLORMAP_HOT);

            topLeft.x = item.x * depthFrame.cols / target_width / depthFrame.cols;
            topLeft.y = item.y * depthFrame.rows / target_height / depthFrame.rows;
            bottomRight.x = (item.x * depthFrame.cols / target_width + item.width * depthFrame.cols / target_width) /
                            depthFrame.cols;
            bottomRight.y = (item.y * depthFrame.rows / target_height + item.height * depthFrame.rows / target_height) /
                            depthFrame.rows;

            auto spatialData = spatialCalcQueue->get()->getSpatialLocations();
            for (auto depthData : spatialData) {
                auto roi = depthData.config.roi;
                roi = roi.denormalize(depthFrameColor.cols, depthFrameColor.rows);
                auto xmin = (int) roi.topLeft().x;
                auto ymin = (int) roi.topLeft().y;
                auto xmax = (int) roi.bottomRight().x;
                auto ymax = (int) roi.bottomRight().y;

                auto depthMin = depthData.depthMin;
                auto depthMax = depthData.depthMax;

                cv::rectangle(result, cv::Rect(cv::Point((int) item.x, (int) item.y),
                                               cv::Point((int) item.x + (int) item.width,
                                                         (int) item.y + (int) item.height)), color,
                              cv::FONT_HERSHEY_SIMPLEX);
                std::stringstream depthX;
                depthX << "X: " << (int) depthData.spatialCoordinates.x << " mm";
                cv::putText(result, depthX.str(), cv::Point((int) item.x + 10, (int) item.y + 20),
                            cv::FONT_HERSHEY_TRIPLEX, 0.5, color);
                std::stringstream depthY;
                depthY << "Y: " << (int) depthData.spatialCoordinates.y << " mm";
                cv::putText(result, depthY.str(), cv::Point((int) item.x + 10, (int) item.y + 35),
                            cv::FONT_HERSHEY_TRIPLEX, 0.5, color);
                std::stringstream depthZ;
                depthZ << "Z: " << (int) depthData.spatialCoordinates.z << " mm";
                cv::putText(result, depthZ.str(), cv::Point((int) item.x + 10, (int) item.y + 50),
                            cv::FONT_HERSHEY_TRIPLEX, 0.5, color);

                auto coords = depthData.spatialCoordinates;
                auto distance = std::sqrt(coords.x * coords.x + coords.y * coords.y + coords.z * coords.z);
                std::stringstream depthDistance;
                depthDistance.precision(2);
                depthDistance << std::fixed << static_cast<float>(distance / 1000.0f) << "m";
                auto fontType = cv::FONT_HERSHEY_TRIPLEX;
                // Place the distance under the X/Y/Z labels, in the coordinates of the result image
                cv::putText(result, depthDistance.str(), cv::Point((int) item.x + 10, (int) item.y + 70), fontType,
                            0.5, color);

                // Draw the ROI and the same labels on the colourised depth frame.
                // depthX, depthY and depthZ already hold the formatted text, so they are reused without appending.
                cv::rectangle(depthFrameColor, cv::Rect(cv::Point(xmin, ymin), cv::Point(xmax, ymax)), color,
                              cv::FONT_HERSHEY_SIMPLEX);
                cv::putText(depthFrameColor, depthX.str(), cv::Point(xmin + 10, ymin + 20), cv::FONT_HERSHEY_TRIPLEX,
                            0.5, color);

                cv::putText(depthFrameColor, depthY.str(), cv::Point(xmin + 10, ymin + 35), cv::FONT_HERSHEY_TRIPLEX,
                            0.5, color);

                cv::putText(depthFrameColor, depthZ.str(), cv::Point(xmin + 10, ymin + 50), cv::FONT_HERSHEY_TRIPLEX,
                            0.5, color);
                cv::imshow("depthFrameColor", depthFrameColor);

            }
            if (newConfig) {
                config.roi = dai::Rect(topLeft, bottomRight);
                dai::SpatialLocationCalculatorConfig cfg;
                cfg.addROI(config);
                spatialCalcConfigInQueue->send(cfg);
                newConfig = false;
            }
        }
        counter++;
        auto currentTime = steady_clock::now();
        auto elapsed = duration_cast<duration<float>>(currentTime - startTime);
        if (elapsed > seconds(1)) {
            fps = counter / elapsed.count();
            counter = 0;
            startTime = currentTime;
        }
        std::stringstream fpsStr;
        fpsStr << "NN fps: " << std::fixed << std::setprecision(2) << fps;
        cv::putText(result, fpsStr.str(), cv::Point(2, result.rows - 4), cv::FONT_HERSHEY_TRIPLEX, 0.4,
                    cv::Scalar(0, 255, 0));


        cv::imshow("result", result);
        cv::waitKey(1);


    }


    return 0;
}
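
The ROI arithmetic in the loop above multiplies by the depth-frame size and then divides by it again, so it reduces to dividing the bounding box by the network input size. A minimal Python sketch of that reduction follows. The helper name `normalized_roi` and the clamping to [0, 1] are assumptions added for illustration; they are not part of the original code.

```python
def normalized_roi(x, y, w, h, target_width, target_height):
    """Map a pixel box (x, y, w, h) in network-input space to a
    normalized (xmin, ymin, xmax, ymax) tuple.

    Hypothetical helper: clamping to [0, 1] is an added assumption.
    """
    def clamp(v):
        return max(0.0, min(1.0, v))

    xmin = clamp(x / target_width)
    ymin = clamp(y / target_height)
    xmax = clamp((x + w) / target_width)
    ymax = clamp((y + h) / target_height)
    return xmin, ymin, xmax, ymax


# A 60x90 box at (120, 150) inside a 300x300 network input.
roi = normalized_roi(120, 150, 60, 90, 300.0, 300.0)
```

The first two values would go into the top-left point and the last two into the bottom-right point of the `dai.Rect` that the code sends as the new ROI.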

Test results

A supplementary Python version of the code:

#!/usr/bin/env python3

import cv2
import depthai as dai
import numpy as np
import math
import time
import os


def get_color_map_list(num_classes, custom_color=None):
    """
    Returns the color map for visualizing the segmentation mask,
    which can support arbitrary number of classes.
    Args:
        num_classes (int): Number of classes.
        custom_color (list, optional): Save images with a custom color map. Default: None, use paddleseg's default color map.
    Returns:
        (list). The color map.
    """

    num_classes += 1
    color_map = num_classes * [0, 0, 0]
    for i in range(0, num_classes):
        j = 0
        lab = i
        while lab:
            color_map[i * 3] |= (((lab >> 0) & 1) << (7 - j))
            color_map[i * 3 + 1] |= (((lab >> 1) & 1) << (7 - j))
            color_map[i * 3 + 2] |= (((lab >> 2) & 1) << (7 - j))
            j += 1
            lab >>= 3
    color_map = color_map[3:]

    if custom_color:
        color_map[:len(custom_color)] = custom_color
    return color_map

def find_bound(gray_img,resize_img):
    cv2.imshow("gray_img", gray_img)
    cv2.imwrite("gray_img.jpg",gray_img)
    ret, gray_img = cv2.threshold(
        cv2.cvtColor(gray_img, cv2.COLOR_BGR2GRAY),  # convert to grayscale
        60, 205,  # pixels above 60 are set to 205, all others to 0
        cv2.THRESH_BINARY)  # binarize
    cv2.imshow("gray_img2", gray_img)
    cv2.imwrite("gray_img2.jpg", gray_img)
    contours, hierarchy  = cv2.findContours(gray_img, cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
    rect_list=[]
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        rect_list.append([x, y, w, h])

        """
        cv2.boundingRect takes a contour and returns x, y (the top-left corner) and w, h (the width and height of the box)
        """
        cv2.rectangle(resize_img, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imwrite("resize_img.jpg", resize_img)
        """
        Draw the rectangle:
            resize_img is the image to draw on
            (x, y) is the top-left corner
            (x+w, y+h) is the bottom-right corner
            (0, 255, 0) is the line colour (green)
            2 is the line thickness
        """

        # Minimum-area bounding rectangle, which may be rotated
        rect = cv2.minAreaRect(c)
        # Corner coordinates of that rectangle
        box = cv2.boxPoints(rect)
        # Convert the coordinates to integers (np.int0 was removed in NumPy 2.0)
        box = box.astype(np.int32)
        # Draw the rotated rectangle
        cv2.drawContours(resize_img, [box], 0, (0, 0, 255), 3)
        cv2.imwrite("drawContours.jpg", resize_img)
        # Centre and radius of the minimum enclosing circle
        (x, y), radius = cv2.minEnclosingCircle(c)
        # Convert to integers
        center = (int(x), int(y))
        radius = int(radius)
        # Draw the circle
        resize_img = cv2.circle(resize_img, center, radius, (0, 255, 0), 2)
        cv2.imwrite("circle.jpg", resize_img)
    # Draw all contours
    cv2.drawContours(resize_img, contours, -1, (255, 0, 0), 1)
    cv2.imshow("contours", resize_img)
    cv2.imwrite("contours.jpg",resize_img)
    return rect_list






def visualize(image, result, color_map, save_dir=None, weight=0.6):
    """
    Convert predict result to color image, and save added image.
    Args:
        image (str): The path of origin image.
        result (np.ndarray): The predict result of image.
        color_map (list): The color used to save the prediction results.
        save_dir (str): The directory for saving visual image. Default: None.
        weight (float): The image weight of visual image, and the result weight is (1 - weight). Default: 0.6
    Returns:
        vis_result (np.ndarray): If `save_dir` is None, return the visualized result.
    """

    color_map = [color_map[i:i + 3] for i in range(0, len(color_map), 3)]
    color_map = np.array(color_map).astype("uint8")
    # Use OpenCV LUT for color mapping
    c1 = cv2.LUT(result, color_map[:, 0])
    c2 = cv2.LUT(result, color_map[:, 1])
    c3 = cv2.LUT(result, color_map[:, 2])
    pseudo_img = np.dstack((c3, c2, c1))

    im = image
    vis_result = cv2.addWeighted(im, weight, pseudo_img, 1 - weight, 0)
    rect_list=find_bound(pseudo_img, image)

    if save_dir is not None:
        if not os.path.exists(save_dir):
            os.makedirs(save_dir)
        image_name = os.path.split(image)[-1]
        out_path = os.path.join(save_dir, image_name)
        cv2.imwrite(out_path, vis_result)
    else:
        return vis_result,rect_list


nn_shape = [300, 300]  # width height
target_width=nn_shape[0]*1.0
target_height=nn_shape[1]*1.0
class_num = 256
color_map = get_color_map_list(class_num)

# Start defining a pipeline
pipeline = dai.Pipeline()

pipeline.setOpenVINOVersion(version=dai.OpenVINO.VERSION_2021_4)

# Define a neural network that will make predictions based on the source frames
detection_nn = pipeline.create(dai.node.NeuralNetwork)
detection_nn.setBlobPath("/home/ubuntu/nanodet/oak_detect_head/model_300_300/pp_liteseg_stdc1_camvid_300x300_10k_model.blob")

detection_nn.setNumPoolFrames(4)
detection_nn.input.setBlocking(False)
detection_nn.setNumInferenceThreads(2)

# Define a source - color camera

cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(nn_shape[1], nn_shape[0])
cam.setInterleaved(False)
cam.setPreviewKeepAspectRatio(False)

cam.preview.link(detection_nn.input)

cam.setFps(60)

monoLeft=pipeline.create(dai.node.MonoCamera)
monoRight=pipeline.create(dai.node.MonoCamera)
stereo=pipeline.create(dai.node.StereoDepth)
spatialLocationCalculator=pipeline.create(dai.node.SpatialLocationCalculator)

xoutDepth = pipeline.create (dai.node.XLinkOut)
xoutSpatialData = pipeline.create (dai.node.XLinkOut)
xinSpatialCalcConfig = pipeline.create (dai.node.XLinkIn)
xoutDepth.setStreamName("depth")
xoutSpatialData.setStreamName("spatialData")
xinSpatialCalcConfig.setStreamName("spatialCalcConfig")

monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_720_P)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoRight.setResolution(dai.MonoCameraProperties.SensorResolution.THE_720_P)
monoRight.setBoardSocket(dai.CameraBoardSocket.RIGHT)
stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
stereo.setLeftRightCheck(True)
stereo.setExtendedDisparity(True)
stereo.setDepthAlign(dai.CameraBoardSocket.RGB)


topLeft = dai.Point2f(0.4, 0.4)
bottomRight = dai.Point2f(0.6, 0.6)

spatialLocationCalculator.setWaitForConfigInput(False)
config = dai.SpatialLocationCalculatorConfigData()
config.depthThresholds.lowerThreshold = 100
config.depthThresholds.upperThreshold = 10000
config.roi = dai.Rect(topLeft, bottomRight)
spatialLocationCalculator.initialConfig.addROI(config)




monoLeft.out.link(stereo.left)
monoRight.out.link(stereo.right)

xlinkParserOut = pipeline.create ( dai.node.XLinkOut )
xlinkParserOut.setStreamName("parseOut")

xlinkoutOut = pipeline.create ( dai.node.XLinkOut )
xlinkoutOut.setStreamName("out")

xlinkoutpassthroughOut = pipeline.create (dai.node.XLinkOut )

xlinkoutpassthroughOut.setStreamName("passthrough")

spatialLocationCalculator.passthroughDepth.link(xoutDepth.input)
stereo.depth.link(spatialLocationCalculator.inputDepth)

spatialLocationCalculator.out.link(xoutSpatialData.input)
xinSpatialCalcConfig.out.link(spatialLocationCalculator.inputConfig)

detection_nn.out.link(xlinkParserOut.input)

detection_nn.passthrough.link(xlinkoutpassthroughOut.input)



# # Create outputs
# xout_rgb = pipeline.create(dai.node.XLinkOut)
# xout_rgb.setStreamName("nn_input")
# xout_rgb.input.setBlocking(False)
#
# detection_nn.passthrough.link(xout_rgb.input)
#
# xout_nn = pipeline.create(dai.node.XLinkOut)
# xout_nn.setStreamName("nn")
# xout_nn.input.setBlocking(False)
#
# detection_nn.out.link(xout_nn.input)

# Pipeline defined, now the device is assigned and pipeline is started
with dai.Device() as device:
    cams = device.getConnectedCameras()
    device.startPipeline(pipeline)
    device.setIrLaserDotProjectorBrightness(1000)
    outqueue = device.getOutputQueue("passthrough", 4, False)

    detqueue = device.getOutputQueue("parseOut", 4, False)
    depthQueue = device.getOutputQueue("depth", 4, False)
    spatialCalcQueue = device.getOutputQueue("spatialData", 4, False)
    spatialCalcConfigInQueue = device.getInputQueue("spatialCalcConfig")

    start_time = time.time()
    counter = 0
    fps = 0
    layer_info_printed = True
    while True:
        # get() blocks until a message is available on each queue
        inDepth=depthQueue.get()
        in_nn_input = outqueue.get()
        in_nn = detqueue.get()
        frame = in_nn_input.getCvFrame()
        layers = in_nn.getAllLayers()
        if layer_info_printed:
            for item in layers:
                print(item.name)
            layer_info_printed = False
        # get layer1 data
        pred = in_nn.getFirstLayerInt32()
        pred = np.array(pred).astype('uint8').reshape(nn_shape[0], nn_shape[1])
        frame_,rect_list = visualize(frame, pred, color_map, None, weight=0.6)
        for item in rect_list:
            x,y,w,h=int(item[0]),int(item[1]),int(item[2]),int(item[3])
            depthFrame=inDepth.getFrame()

            depthFrameColor = cv2.normalize(depthFrame, None, 255, 0, cv2.NORM_INF, cv2.CV_8UC1)
            depthFrameColor = cv2.equalizeHist(depthFrameColor)
            depthFrameColor = cv2.applyColorMap(depthFrameColor, cv2.COLORMAP_HOT)

            topLeft.x = x * depthFrameColor.shape[1]/ target_width / depthFrameColor.shape[1]
            topLeft.y =y * depthFrameColor.shape[0] / target_height / depthFrameColor.shape[0]
            bottomRight.x = (x * depthFrameColor.shape[1] / target_width + w * depthFrameColor.shape[1] / target_width) /depthFrameColor.shape[1]
            bottomRight.y = (y * depthFrameColor.shape[0] / target_height + h * depthFrameColor.shape[0] / target_height) /depthFrameColor.shape[0]

            config.roi = dai.Rect(topLeft, bottomRight)
            cfg = dai.SpatialLocationCalculatorConfig()
            cfg.addROI(config)
            spatialCalcConfigInQueue.send(cfg)

            spatialData = spatialCalcQueue.get().getSpatialLocations()
            for depthData in spatialData:
                roi = depthData.config.roi
                roi = roi.denormalize(depthFrameColor.shape[1], depthFrameColor.shape[0])
                xmin = int(roi.topLeft().x)
                ymin =  int(roi.topLeft().y)
                xmax =  int(roi.bottomRight().x)
                ymax =  int(roi.bottomRight().y)

                coords = depthData.spatialCoordinates
                fontType = cv2.FONT_HERSHEY_TRIPLEX

                # spatialCoordinates are in millimetres; dividing by 1000 gives the distance in metres
                distance = math.sqrt(coords.x * coords.x + coords.y * coords.y + coords.z * coords.z)/1000.0
                cv2.putText(depthFrameColor, f"d: {round(distance,2)} m", (xmin + 10, ymin + 70), fontType, 0.5,(255, 0, 0))

                # The last argument of cv2.rectangle is the line thickness
                cv2.rectangle(depthFrameColor, (xmin,ymin),(xmax,ymax), (255, 0, 0), 1)
                cv2.putText(depthFrameColor, f"X: {int(depthData.spatialCoordinates.x)} mm", (xmin + 10, ymin + 20),
                            fontType, 0.5, (255, 0, 0))
                cv2.putText(depthFrameColor, f"Y: {int(depthData.spatialCoordinates.y)} mm", (xmin + 10, ymin + 35),
                            fontType, 0.5, (255, 0, 0))
                cv2.putText(depthFrameColor, f"Z: {int(depthData.spatialCoordinates.z)} mm", (xmin + 10, ymin + 50),
                            fontType, 0.5, (255, 0, 0))
                cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 1)
                cv2.putText(frame, f"X: {int(depthData.spatialCoordinates.x)} mm", (x + 10, y + 20),
                            fontType, 0.5, (255, 0, 0))
                cv2.putText(frame, f"Y: {int(depthData.spatialCoordinates.y)} mm", (x + 10, y + 35),
                            fontType, 0.5, (255, 0, 0))
                cv2.putText(frame, f"Z: {int(depthData.spatialCoordinates.z)} mm", (x + 10, y + 50),
                            fontType, 0.5, (255, 0, 0))
                cv2.putText(frame, f"d: {round(distance,2)} m", (x + 10, y + 70), fontType, 0.5,  (255, 0, 0))

                cv2.imshow("depthFrameColor",depthFrameColor)

        cv2.putText(frame, "NN fps: {:.2f}".format(fps), (2, frame.shape[0] - 4), cv2.FONT_HERSHEY_TRIPLEX, 0.4,
                    (255, 0, 0))
        cv2.imshow("nn_input", frame)
        cv2.imwrite("nn.jpg",frame)

        counter += 1
        if (time.time() - start_time) > 1:
            fps = counter / (time.time() - start_time)

            counter = 0
            start_time = time.time()

        if cv2.waitKey(1) == ord('q'):
            break
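
Why a threshold of 60 in `find_bound` separates the cup from the background can be checked by hand. The sketch below re-implements the colour-map loop of `get_color_map_list` in plain Python and applies the 0.299/0.587/0.114 weights that OpenCV documents for `COLOR_BGR2GRAY`. The names `first_colors` and `luma` are made up for this illustration, the rounding is approximate, and the sketch assumes the model outputs 0 for background and 1 for cup.

```python
def first_colors(n):
    """Return the first n (R, G, B) entries of the colour map, using the
    same bit-interleaving loop as get_color_map_list."""
    colors = []
    # get_color_map_list drops the all-zero entry via color_map[3:],
    # so class id k ends up with the colour generated for label k + 1.
    for label in range(1, n + 1):
        r = g = b = 0
        j, lab = 0, label
        while lab:
            r |= ((lab >> 0) & 1) << (7 - j)
            g |= ((lab >> 1) & 1) << (7 - j)
            b |= ((lab >> 2) & 1) << (7 - j)
            j += 1
            lab >>= 3
        colors.append((r, g, b))
    return colors


def luma(rgb):
    """Approximate grayscale value: 0.299 R + 0.587 G + 0.114 B."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


background, cup = first_colors(2)  # colours looked up for class ids 0 and 1
```

Under these assumptions the background maps to (128, 0, 0) with a gray value of about 38, and the cup maps to (0, 128, 0) with a gray value of about 75, so thresholding at 60 keeps only the cup pixels.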

Distance measurement
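
The distance drawn on the frames is the Euclidean norm of the X, Y and Z coordinates, which the code above labels in millimetres. A small sketch of that conversion, assuming a hypothetical helper named `distance_m` and made-up sample coordinates:

```python
import math


def distance_m(x_mm, y_mm, z_mm):
    """Euclidean distance in metres from coordinates given in millimetres."""
    return math.sqrt(x_mm * x_mm + y_mm * y_mm + z_mm * z_mm) / 1000.0


# Sample point 300 mm right, 400 mm up and 1200 mm in front of the camera.
label = f"d: {distance_m(300.0, 400.0, 1200.0):.2f} m"
```

Formatting with `:.2f` keeps two decimals, so the on-screen text has a stable width.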
