Filbert的榛子

[Machine Learning - Andrew Ng] Week 3 Classification: Logistic Regression & Regularization

Table of Contents

Terminology
Logistic Regression
- Classification and Representation
  - Classification
    - linear regression to a classification problem
    - Logistic Regression - a classification algorithm
  - Hypothesis Representation
    - Logistic Regression Model
    - Sigmoid(Logistic) function - g(z)
    - Interpretation of Hypothesis Output
  - Decision Boundary
    - Non-linear decision boundaries
- Logistic Regression Model
  - Cost Function
  - Simplified Cost Function and Gradient Descent
    - Compact logistic regression cost function
    - Gradient Descent
  - Advanced Optimization
    - Optimization algorithm
    - Example
    - Code
- Multiclass Classification
  - Multiclass Classification: One-vs-all(rest)
    - One-vs-all(rest)
Regularization
- Solving the Problem of Overfitting
  - The Problem of Overfitting
    - Underfitting - High Bias
    - Overfitting - High Variance
    - Address overfitting
  - Cost Function
    - Regularization
  - Regularized Linear Regression
    - Regularized linear regression
    - Gradient descent
    - Normal equation
    - Non-invertibility
  - Regularized Logistic Regression
    - Regularized logistic regression
    - Gradient descent
    - Advanced optimization
References
- Test

Terminology

fraudulent: 欺诈的

arbitrary: 随意,独断

asymptote: 渐近线

defer: 延缓,迁延;服从;顺延

convex: 中凸的;鼓

square roots of numbers: 数字的平方根

inverses of matrices: 矩阵的逆

opaque: 不透明

regularization: 正则化

overfitting: 过度拟合

underfitting: 欠拟合

generalize: 泛化

plateau:

n. an area of relatively level high ground.
n. a state of little or no change following a period of activity or progress.

“the peace process had reached a plateau”
v. reach a state of little or no change after a time of activity or progress.

“the industry’s problems have plateaued out”

wiggly: 蠕动

preconception: n.先入之见;成见;预想;事先形成的观念;

high order polynomial: 高阶多项式

contort: 扭曲

magnitude: n.size

penalize: 惩罚,处罚

by convention: 按照惯例

summation: 总和

akin: adj. 类似的

Logistic Regression

Now we are switching from regression problems to classification problems. Don’t be confused by the name “Logistic Regression”; it is named that way for historical reasons and is actually an approach to classification problems, not regression problems.

Classification and Representation

Classification

Instead of our output vector $y$ being a continuous range of values, it will only be 0 or 1.

$y\in\{0,1\}$

Where 0 is usually taken as the “negative class” and 1 as the “positive class”, but you are free to assign any representation to it.

We’re only doing two classes for now, called a “Binary Classification Problem.”

linear regression to a classification problem

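Applying linear regression to a classification problem usually works poorly: a single outlying example can shift the fitted line and move the point where it crosses 0.5, and $h_\theta(x)$ can be larger than 1 or smaller than 0 even though $y$ is only ever 0 or 1.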

Threshold classifier output $h_\theta(x)$ at 0.5:

if $h_\theta(x)\geq0.5$ , predict “y=1”

if $h_\theta(x)\lt0.5$ , predict “y=0”

Logistic Regression - a classification algorithm

The predictions of logistic regression are always between zero and one: the output never becomes bigger than one or less than zero.

$0\leq h_\theta(x)\leq1$

Given $x^{(i)}$ , the corresponding $y^{(i)}$ is also called the label for the training example.

Hypothesis Representation

Logistic Regression Model

Want $0\leq h_\theta(x)\leq1$
$h_\theta(x)=g(θ^Tx) \\ z=θ^Tx \\ g(z)=\frac{1}{1+e^{−z}}$
also as:
$h_\theta(x)=\frac{1}{1+e^{−\theta^Tx}}$

Sigmoid(Logistic) function – g(z)

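Given this hypothesis representation, what we need to do is fit the parameters $\theta$ to the data. That is, given a training set, we need to choose a value for the parameters $\theta$, and the hypothesis then lets us make predictions.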
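As a minimal sketch of the hypothesis above, written in Python rather than the course's Octave, the following computes $g(z)$ and $h_\theta(x)$. The function names `sigmoid` and `hypothesis` are illustrative choices, and the branch on the sign of `z` is only there to avoid floating-point overflow for very negative inputs.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0 here, so exp(z) cannot overflow
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x must already include x_0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


print(sigmoid(0))  # 0.5
```

The output stays inside $[0, 1]$ for any real input, which is the property the hypothesis needs.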

Interpretation of Hypothesis Output

$h_\theta(x)$ = estimated probability that y=1 on input x

$h_θ(x)=P(y=1|x;θ)=1−P(y=0|x;θ)\\ P(y=0|x;θ)+P(y=1|x;θ)=1$
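Written as a probability, $h_θ(x)$ is the probability that y=1 given x. Suppose a patient has features x, for example a particular tumor size (represented by the feature x). This probability is parameterized by $\theta$, so we rely on the hypothesis to estimate the probability that y=1.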

Decision Boundary

In order to get our discrete 0 or 1 classification, we can translate the output of the hypothesis function as follows:
$h_θ(x)≥0.5→y=1\\ h_θ(x)<0.5→y=0$
The way our logistic function g behaves is that when its input is greater than or equal to zero, its output is greater than or equal to 0.5:
$g(z)≥0.5\\ when\ z≥0$
So if our input to g is $\theta^Tx$ , then that means:
$h_θ(x)=g(θ^Tx)≥0.5\\ when\ θ^Tx≥0$
From these statements we can now say:
$θ^Tx≥0⇒y=1\\ θ^Tx<0⇒y=0$

The decision boundary, and the regions where we predict y = 1 versus y = 0, are a property of the hypothesis and its parameters, not a property of the data set.
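The rule $θ^Tx≥0⇒y=1$ can be sketched in a few lines of Python. The parameter values below are illustrative assumptions, chosen so that the boundary is the line $x_1+x_2=3$; they are not fitted to any data.

```python
def predict(theta, x):
    """Return 1 if theta^T x >= 0 (equivalently h_theta(x) >= 0.5), else 0."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0


theta = [-3.0, 1.0, 1.0]  # illustrative values: boundary is x1 + x2 = 3

print(predict(theta, [1.0, 2.0, 2.0]))  # x1 + x2 = 4, prints 1
print(predict(theta, [1.0, 1.0, 1.0]))  # x1 + x2 = 2, prints 0
```

Note that the prediction never needs the sigmoid itself: the sign of $θ^Tx$ is enough.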

Non-linear decision boundaries

The training set is used to fit the parameters θ.

Once you have the parameters θ, they are what define the decision boundary.

Again, the input to the sigmoid function g(z) (e.g. $\theta^T X$ ) doesn’t need to be linear, and could be a function that describes a circle (e.g. $\theta_0 + \theta_1 x_1^2 +\theta_2 x_2^2$ ) or any shape to fit our data.
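A circular boundary of this kind can be sketched by feeding squared features into the same linear rule. The helper name `circle_features` and the parameter values are assumptions for illustration; with $\theta=[-1,1,1]$ the boundary is the unit circle $x_1^2+x_2^2=1$.

```python
def circle_features(x1, x2):
    """Map (x1, x2) to the feature vector [1, x1^2, x2^2]."""
    return [1.0, x1 ** 2, x2 ** 2]


def predict(theta, features):
    """Return 1 if theta^T features >= 0, else 0."""
    z = sum(t * f for t, f in zip(theta, features))
    return 1 if z >= 0 else 0


theta = [-1.0, 1.0, 1.0]  # illustrative: predict y = 1 outside the unit circle

print(predict(theta, circle_features(0.5, 0.5)))  # inside the circle, prints 0
print(predict(theta, circle_features(1.0, 1.0)))  # outside the circle, prints 1
```

The classifier is still linear in the parameters; only the features are non-linear in the inputs.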

Logistic Regression Model

Cost Function

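The supervised learning problem of fitting a logistic regression model: given a training set of $m$ examples $\{(x^{(1)},y^{(1)}),\dots,(x^{(m)},y^{(m)})\}$ with $y\in\{0,1\}$, how do we choose the parameters $\theta$?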

We cannot use the same cost function that we use for linear regression because the Logistic Function will cause the output to be wavy, causing many local optima. In other words, it will not be a convex function.

Instead, our cost function for logistic regression looks like:
$J(θ)=\frac{1}{m}\sum_{i=1}^{m}Cost(h_θ(x^{(i)}),y^{(i)})\\ Cost(h_θ(x),y)=\begin{cases}−log(h_θ(x)),&\text{if y=1}\\ −log(1−h_θ(x)),&\text{if y=0} \end{cases}$
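if y=1: (figure 1: $-\log(h_θ(x))$ plotted against $h_θ(x)$)

if y=0: (figure 2: $-\log(1-h_θ(x))$ plotted against $h_θ(x)$)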

The more our hypothesis is off from y, the larger the cost function output. If our hypothesis is equal to y, then our cost is 0:
$Cost(h_θ(x),y)=0 \ \ if \ h_θ(x)=y \\ Cost(h_θ(x),y)→∞ \ \ if \ y=0 \ and \ h_θ(x)→1 \\ Cost(h_θ(x),y)→∞ \ \ if \ y=1 \ and \ h_θ(x)→0$
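The first line of the formula above can be read as two separate cases:

If $h_\theta(x)=y=1$ (see figure 1), the corresponding $Cost(h_θ(x),y)=0$

If $h_\theta(x)=y=0$ (see figure 2), the corresponding $Cost(h_θ(x),y)=0$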

Note that writing the cost function in this way guarantees that the overall cost function $J(\theta)$ is convex for logistic regression and therefore free of local optima.
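To see how the per-example cost behaves, here is a small Python sketch of the two cases. The function name `cost` is an illustrative choice, and `h` must lie strictly between 0 and 1 on the side where the logarithm is taken.

```python
import math


def cost(h, y):
    """Per-example cost: -log(h) if y == 1, -log(1 - h) if y == 0."""
    if y == 1:
        return -math.log(h)
    return -math.log(1.0 - h)


# A confident correct prediction is cheap; a confident wrong one is expensive.
for h in (0.99, 0.5, 0.01):
    print(h, round(cost(h, 1), 3), round(cost(h, 0), 3))
```

As `h` moves toward the wrong label the cost grows without bound, which matches the limits listed above.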

Simplified Cost Function and Gradient Descent

Compact logistic regression cost function

$J(θ)=\frac{1}{m}\sum_{i=1}^{m}Cost(h_θ(x^{(i)}),y^{(i)})\\ Cost(h_θ(x),y)=\begin{cases}−log(h_θ(x)),&\text{if y=1}\\ −log(1−h_θ(x)),&\text{if y=0} \end{cases}$

Note: y=0 or 1 always

We can compress our cost function’s two conditional cases into one case:
$\textcolor{#FF0000}{Cost(h_θ(x),y)=−y\,\log(h_θ(x))−(1−y)\log(1−h_θ(x))}$
We can fully write out our entire cost function as follows:
$J(θ)=\frac{1}{m}\sum_{i=1}^{m}Cost(h_θ(x^{(i)}),y^{(i)})\\ =-\frac{1}{m}\sum_{i=1}^{m}[y^{(i)}\log{(h_θ(x^{(i)}))}+(1−y^{(i)})\log{(1−h_θ(x^{(i)}))}]$
This cost function can be derived from statistics using the principle of maximum likelihood estimation, which is an idea in statistics for how to efficiently find parameters for different models.

A vectorized implementation is:
$h=g(X\theta)\\ J(\theta)=\frac{1}{m}\cdot[-y^T\log(h)-(1−y)^T\log(1−h)]$
To fit parameters θ:
$\min_{\theta}J(\theta)$
To make a prediction given a new x, output:
$h_\theta(x)=\frac{1}{1+e^{−\theta^Tx}}$
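The compact cost function can be sketched in plain Python as a loop over the training examples. The tiny data set below is made up for illustration, and the simple `sigmoid` here assumes $\theta^Tx$ stays moderate in size (it would overflow for very large negative values).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost_function(theta, X, y):
    """J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
    m = len(y)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        total += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
    return total / m


# Made-up data: each row is [x_0 = 1, x_1].
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]

print(cost_function([0.0, 0.0], X, y))   # about 0.693, i.e. log(2)
print(cost_function([-4.5, 2.0], X, y))  # smaller: these parameters fit better
```

With $\theta=0$ every prediction is 0.5, so each example costs $\log 2$ regardless of its label.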

Gradient Descent

Want $\min_{\theta}J(\theta)$ :
$Repeat\{\\ \ θ_j:=θ_j−\frac{\alpha}{m}\sum_{i=1}^{m}(h_θ(x^{(i)})−y^{(i)})x^{(i)}_j\\ \}$
Notice that this algorithm is identical to the one we used in linear regression. We still have to simultaneously update all values in theta.

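The update rule for logistic regression looks the same in form as the one for linear regression (the derivatives have the same form), but the definition of $h_\theta(x)$ is different.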

A vectorized implementation is:
$θ:=θ−\frac{\alpha}{m}X^T(g(Xθ)−\vec{y})$
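A minimal sketch of this update in plain Python follows. The data set, the learning rate `alpha=0.1` and the iteration count are assumptions chosen for illustration, not values from the course.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient(theta, X, y):
    """(1/m) * X^T (g(X theta) - y), computed with plain lists."""
    m = len(y)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        err = sigmoid(sum(t * v for t, v in zip(theta, xi))) - yi
        for j, v in enumerate(xi):
            grad[j] += err * v / m
    return grad


def gradient_descent(X, y, alpha=0.1, iterations=5000):
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        grad = gradient(theta, X, y)
        # Simultaneous update: every theta_j is computed from the same old theta.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Made-up, non-separable data: each row is [x_0 = 1, x_1].
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0]]
y = [0, 0, 1, 0, 1, 1]

theta = gradient_descent(X, y)
print(theta)
```

Building the whole new `theta` list before assigning it is what makes the update simultaneous.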

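The gradient in vectorized form:
$\nabla J(\theta)=\frac{1}{m}X^T(g(X\cdotθ)−\vec{y})$
where $\nabla$ denotes the gradient operator.

Vectorized implementation: (figure not included)

Derivation: (figure not included)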
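For reference, one way to derive the partial derivative used in the update rule is sketched below. It relies on the identity $g'(z)=g(z)(1-g(z))$ for the sigmoid and writes $h$ as shorthand for $h_\theta(x^{(i)})$; this is a standard derivation, not necessarily the one in the original figure.

```latex
\begin{aligned}
g'(z) &= g(z)\,(1-g(z)), \qquad h = h_\theta(x^{(i)}) = g(\theta^T x^{(i)}) \\
\frac{\partial}{\partial\theta_j}\log h
  &= \frac{1}{h}\,h(1-h)\,x^{(i)}_j = (1-h)\,x^{(i)}_j \\
\frac{\partial}{\partial\theta_j}\log(1-h)
  &= -\frac{1}{1-h}\,h(1-h)\,x^{(i)}_j = -h\,x^{(i)}_j \\
\frac{\partial J}{\partial\theta_j}
  &= -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}(1-h)\,x^{(i)}_j-(1-y^{(i)})\,h\,x^{(i)}_j\right] \\
  &= \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x^{(i)}_j
\end{aligned}
```

The last step uses $y(1-h)-(1-y)h=y-h$, which is why the result has the same form as in linear regression.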

Advanced Optimization

Optimization algorithm

Cost function $J(\theta)$ . Want $\min_{\theta}J(\theta)$

Given $\theta$ , we have code that can compute:

$J(\theta)$
$\frac{\partial}{\partial\theta_j}J(\theta)$ (for j=0,1,…,n)

Given code that computes these two quantities, several algorithms can minimize the cost function for us:

Optimization algorithms

Gradient descent:
$Repeat\{\\ \ θ_j:=θ_j−\alpha\frac{\partial}{\partial\theta_j}J(\theta)\\ \}$
Conjugate gradient

BFGS

L-BFGS

Compared with gradient descent, the last three do not require picking the learning rate $\alpha$ by hand and often converge faster, but they are more complex.

Example

Erratum: The notation for specifying MaxIter is incorrect. The value provided should be an integer, not a character string. So (…‘MaxIter’, ‘100’) is incorrect. It should be (…‘MaxIter’, 100). This error only exists in the video - the exercise script files are correct.

Code

We can write a single function that returns both of these:

function [jVal, gradient] = costFunction(theta)
  jVal = [...code to compute J(theta)...];
  gradient = [...code to compute derivative of J(theta)...];
end

Then we can use Octave’s “fminunc()” optimization algorithm along with the “optimset()” function, which creates an object containing the options we want to send to “fminunc()”. (Note: the value for MaxIter should be an integer, not a character string.)

options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(2,1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options);

We give to the function “fminunc()” our cost function, our initial vector of theta values, and the “options” object that we created beforehand.
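The same pattern, one function returning both the cost and its gradient and an optimizer that only calls that function, can be sketched in Python. Two things here are assumptions made for illustration: the toy cost $J(\theta)=(\theta_1-5)^2+(\theta_2-5)^2$, and the hypothetical `minimize` helper, which is plain gradient descent standing in for a library routine such as “fminunc()”.

```python
def cost_function(theta):
    """Return (j_val, gradient) for the toy cost (theta_1 - 5)^2 + (theta_2 - 5)^2."""
    j_val = (theta[0] - 5.0) ** 2 + (theta[1] - 5.0) ** 2
    gradient = [2.0 * (theta[0] - 5.0), 2.0 * (theta[1] - 5.0)]
    return j_val, gradient


def minimize(cost_fn, initial_theta, alpha=0.1, max_iter=100):
    """Hypothetical stand-in optimizer: fixed-step gradient descent."""
    theta = list(initial_theta)
    for _ in range(max_iter):
        _, grad = cost_fn(theta)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    j_val, _ = cost_fn(theta)
    return theta, j_val


opt_theta, function_val = minimize(cost_function, [0.0, 0.0])
print(opt_theta, function_val)  # theta close to [5, 5], cost close to 0
```

Keeping the cost and gradient in one function lets the optimizer be swapped without touching the model code.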

Multiclass Classification

Multiclass Classification: One-vs-all(rest)

One-vs-all(rest)

Train a logistic regression classifier $h^{(i)}_\theta(x)$ for each class $i$ to predict the probability that $y = i$ .

To make a prediction on a new input $x$ , pick the class $i$ that maximizes $h^{(i)}_\theta (x)$ , i.e. $\arg\max_{i}h^{(i)}_\theta (x)$
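The prediction step of one-vs-all can be sketched as follows. The three parameter vectors in `all_theta` are made-up values standing in for classifiers that would normally be trained separately, one per class.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_one_vs_all(all_theta, x):
    """Return the class index i whose classifier h^(i)(x) is largest."""
    scores = [sigmoid(sum(t * v for t, v in zip(theta, x))) for theta in all_theta]
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up parameters, one row per class; x includes x_0 = 1.
all_theta = [
    [2.0, -1.0, 0.0],   # class 0
    [-1.0, 1.0, -1.0],  # class 1
    [-3.0, 0.0, 1.0],   # class 2
]

print(predict_one_vs_all(all_theta, [1.0, 0.0, 0.0]))  # prints 0
print(predict_one_vs_all(all_theta, [1.0, 4.0, 0.0]))  # prints 1
print(predict_one_vs_all(all_theta, [1.0, 0.0, 6.0]))  # prints 2
```

Because the sigmoid is increasing, comparing the raw values $\theta^Tx$ would select the same class.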

Regularization

Solving the Problem of Overfitting

The Problem of Overfitting

Underfitting - High Bias

It’s just not even fitting the training data very well.

The data clearly shows structure not captured by the model.

Underfitting, or high bias, is when the form of our hypothesis function $h$ maps poorly to the trend of the data. It is usually caused by a function that is too simple or uses too few features.

Overfitting - High Variance

If we’re fitting such a high order polynomial, then the hypothesis can fit almost any function, and this space of possible hypotheses is just too large and too variable. We don’t have enough data to constrain it to give us a good hypothesis.

Overfitting: If we have too many features, the learned hypothesis may fit the training set very well ( $J(\theta)\approx0$ ), but fail to generalize to new examples (predict prices on new examples).

The term “generalize” refers to how well a hypothesis applies to new examples, that is, to houses that it has not seen in the training set.

Address overfitting

Problem: a lot of features and very little training data.

Options:

Reduce the number of features:

Manually select which features to keep.
Use a model selection algorithm (studied later in the course).

Regularization:

Keep all the features, but reduce the magnitude of parameters $\theta_j$ .
Regularization works well when we have a lot of slightly useful features.

Cost Function

Regularization

Small values for parameters $\theta_0,\theta_1,...,\theta_n$

Correspond to “Simpler” hypothesis
Less prone to overfitting

Housing example:

Features: $x_1,x_2,...,x_{100}$
Parameters: $\theta_0,\theta_1,...,\theta_{100}$

Which parameters should we shrink? We do not know in advance which ones matter, so we shrink all of them.

By convention, we regularize only $\theta_1$ through $\theta_{100}$ and do not penalize $\theta_{0}$ for being large.

Regularized optimization objective:
$J(θ)=\frac{1}{2m}[\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})^2+\lambda\sum_{j=1}^{n}\theta_j^2]\\ \min_{\theta}J(\theta)$
$\lambda\sum_{j=1}^{n}\theta_j^2$ : regularization term

$\lambda$ : regularization parameter. It controls a trade off between two different goals. It determines how much the costs of our theta parameters are inflated.

The first goal, captured by the first term of the objective, is that we would like to fit the training data well.

The second goal is, we want to keep the parameters small, and that’s captured by the second term, by the regularization term.
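The two terms of the regularized objective can be sketched in Python as below. The function name `regularized_cost`, the data and the parameter values are illustrative assumptions; the point is that the penalty skips $\theta_0$ and grows with $\lambda$.

```python
def regularized_cost(theta, X, y, lam):
    """J(theta) = (1/2m) * [sum of squared errors + lam * sum_{j>=1} theta_j^2]."""
    m = len(y)
    squared_error = sum(
        (sum(t * v for t, v in zip(theta, xi)) - yi) ** 2 for xi, yi in zip(X, y)
    )
    penalty = lam * sum(t ** 2 for t in theta[1:])  # theta_0 is not penalized
    return (squared_error + penalty) / (2 * m)


# Made-up data lying exactly on y = 2 * x_1; each row is [x_0 = 1, x_1].
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2.0, 4.0, 6.0]
theta = [0.0, 2.0]

print(regularized_cost(theta, X, y, 0.0))  # 0.0: perfect fit, no penalty
print(regularized_cost(theta, X, y, 3.0))  # 2.0: same fit, but theta_1 is penalized
```

Even a perfect fit pays a price once $\lambda>0$, which is the trade-off that $\lambda$ controls.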

What if $\lambda$ is set to an extremely large value (perhaps far too large for our problem, say $\lambda=10^{10}$ )?

If $\lambda$ is chosen to be too large, it may smooth out the function too much and cause underfitting.

What will happen is we will end up penalizing the parameters $\theta_1, \theta_2,\theta_3,\theta_4$ very highly.

And if we end up penalizing $\theta_1, \theta_2,\theta_3,\theta_4$ very heavily, then we end up with all of these parameters close to zero.

It doesn’t go anywhere near most of the training examples. And another way of saying this is that this hypothesis has too strong a preconception or too high bias that housing prices are just equal to $\theta_0$ , and despite the clear data to the contrary, chooses to fit a sort of a flat horizontal line to the data.

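What would happen if $\lambda=0$ or is too small? The regularization term then has little or no effect, so the parameters are not constrained and the hypothesis can still overfit.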

Adding regularization may cause your classifier to incorrectly classify some training examples (which it had correctly classified when not using regularization, i.e. when $\lambda = 0$ ).

Regularized Linear Regression

Regularized linear regression

$J(θ)=\frac{1}{2m}[\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})^2+\lambda\sum_{j=1}^{n}\theta_j^2]\\ \min_{\theta}J(\theta)$

We will modify our gradient descent function to separate out $\theta_0$ from the rest of the parameters because we do not want to penalize $\theta_0$ .

Gradient descent

Compute:
$Repeat\{\\ \ θ_0:=θ_0−\frac{\alpha}{m}\sum_{i=1}^{m}(h_θ(x^{(i)})−y^{(i)})x^{(i)}_0\\ \ θ_j:=θ_j−\alpha[(\frac{1}{m}\sum_{i=1}^{m}(h_θ(x^{(i)})−y^{(i)})x^{(i)}_j)+\frac{\lambda}{m}\theta_j] \ j\in\{1,2,...,n\}\\ \}$

The term $\frac{\lambda}{m}\theta_j$ performs our regularization. With some manipulation our update rule can also be represented as:
$\ θ_j:=θ_j(1-\alpha\frac{\lambda}{m})−\alpha\frac{1}{m}\sum_{i=1}^{m}(h_θ(x^{(i)})−y^{(i)})x^{(i)}_j \ \ j\in\{1,2,...,n\}$
The factor in the first term, $1-\alpha\frac{\lambda}{m}$ , will always be less than 1.

If your learning rate is small and m is large, $\alpha\frac{\lambda}{m}$ is small, so this factor is usually only slightly less than 1.

Intuitively you can see it as reducing the value of $\theta_j$ by some amount on every update. Notice that the second term is now exactly the same as it was before.
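The two ways of writing the update for $\theta_j$ ( $j\geq1$ ) can be checked against each other with a short Python sketch. The function names and the numbers are illustrative assumptions; `grad_j` stands for the unregularized gradient term $\frac{1}{m}\sum_i(h_θ(x^{(i)})−y^{(i)})x^{(i)}_j$.

```python
def update_explicit(theta_j, grad_j, alpha, lam, m):
    """theta_j := theta_j - alpha * (grad_j + (lam / m) * theta_j)."""
    return theta_j - alpha * (grad_j + (lam / m) * theta_j)


def update_shrink(theta_j, grad_j, alpha, lam, m):
    """theta_j := theta_j * (1 - alpha * lam / m) - alpha * grad_j."""
    return theta_j * (1 - alpha * lam / m) - alpha * grad_j


alpha, lam, m = 0.01, 1.0, 50  # illustrative values

print(1 - alpha * lam / m)  # shrink factor, slightly below 1
print(update_explicit(2.0, 0.5, alpha, lam, m))
print(update_shrink(2.0, 0.5, alpha, lam, m))  # same value up to rounding
```

The second form makes the effect visible: each step first shrinks $\theta_j$ a little, then applies the usual gradient step.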

Normal equation

$X=\left[ \begin{matrix} (x^{(1)})^T \\ \vdots \\ (x^{(m)})^T \end{matrix} \right] \in\mathbb{R}^{m\times(n+1)}$

$y=\left[ \begin{matrix} y^{(1)} \\ \vdots \\ y^{(m)} \end{matrix} \right] \in\mathbb{R}^{m}$

$\min_{\theta}J(\theta)$

To add in regularization, the equation is the same as our original, except that we add another term inside the parentheses:
${\theta=(X^TX+\lambda\cdot L)^{-1}X^Ty}\\ where\ L=\left[ \begin{matrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \\ \end{matrix}\right]$
$L$ is a matrix with 0 at the top left and 1’s down the diagonal, with 0’s everywhere else. It should have dimension (n+1)×(n+1).

Intuitively, this is the identity matrix (though we are not including $x_0$ ), multiplied with a single real number λ.

Non-invertibility

The correct statement should be that $X^TX$ is non-invertible if m < n, and may be non-invertible if m = n.

Recall that if m < n, then $X^TX$ is non-invertible. However, when we add the term $λ \cdot L$ with $λ>0$ , $X^TX+λ⋅L$ becomes invertible.
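The effect of adding $λ⋅L$ can be seen on the smallest possible case: a design matrix with a single row and two columns ( $x_0$ and $x_1$ ), so there are fewer examples than columns. The helper names and the numbers below are illustrative assumptions, and the 2x2 determinant is used only because it is easy to compute by hand.

```python
def gram_plus_penalty(X, lam):
    """Return X^T X + lam * L for a design matrix with two columns (x_0, x_1)."""
    a = sum(row[0] * row[0] for row in X)
    b = sum(row[0] * row[1] for row in X)
    d = sum(row[1] * row[1] for row in X)
    return [[a, b], [b, d + lam]]  # L = diag(0, 1): only theta_1 is penalized


def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


X = [[1.0, 2.0]]  # one example: x_0 = 1, x_1 = 2

print(det2(gram_plus_penalty(X, 0.0)))  # 0.0: X^T X is singular
print(det2(gram_plus_penalty(X, 1.0)))  # 1.0: adding lambda * L makes it invertible
```

A non-zero determinant means the matrix can be inverted, so the regularized normal equation has a unique solution.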

Regularized Logistic Regression

Regularized logistic regression

We can regularize cost function for logistic regression by adding a term to the end:
$J(θ)=-\frac{1}{m}\sum_{i=1}^{m}[y^{(i)}\log{(h_θ(x^{(i)}))}+(1−y^{(i)})\log{(1−h_θ(x^{(i)}))}]+\frac{\lambda}{2m}\sum_{j=1}^{n}θ_j^2$
The second sum starts at $j=1$ , so it explicitly excludes the bias term $\theta_0$ .

Gradient descent

$Repeat\{\\ \ θ_0:=θ_0−\frac{\alpha}{m}\sum_{i=1}^{m}(h_θ(x^{(i)})−y^{(i)})x^{(i)}_0\\ \ θ_j:=θ_j−\alpha[(\frac{1}{m}\sum_{i=1}^{m}(h_θ(x^{(i)})−y^{(i)})x^{(i)}_j)+\frac{\lambda}{m}\theta_j] \ \ j\in\{1,2,...,n\}\\ \}$

And, once again, this looks cosmetically identical to what we had for linear regression.

But of course it is not the same algorithm, because now the hypothesis is defined as:
$h_\theta(x)=\frac{1}{1+e^{−\theta^Tx}}$
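A sketch of the regularized cost and gradient for logistic regression, returned together so they could be handed to an optimizer, might look like this in Python. The function name and the data are illustrative assumptions; the simple `sigmoid` assumes $\theta^Tx$ stays moderate in size.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def regularized_cost_and_gradient(theta, X, y, lam):
    """Return (J(theta), gradient) for regularized logistic regression."""
    m = len(y)
    cost = 0.0
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        cost += (-yi * math.log(h) - (1 - yi) * math.log(1 - h)) / m
        for j, v in enumerate(xi):
            grad[j] += (h - yi) * v / m
    # Regularization: skip j = 0 so the bias term theta_0 is not penalized.
    cost += lam / (2 * m) * sum(t ** 2 for t in theta[1:])
    for j in range(1, len(theta)):
        grad[j] += lam / m * theta[j]
    return cost, grad


# Made-up data: each row is [x_0 = 1, x_1].
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]

print(regularized_cost_and_gradient([0.0, 1.0], X, y, 0.0))
print(regularized_cost_and_gradient([0.0, 1.0], X, y, 2.0))
```

Comparing the two printed results shows that $\lambda$ changes the cost and the gradient for $\theta_1$ but leaves the gradient for $\theta_0$ untouched.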

Advanced optimization

References

Test

Machine Learning Quiz Week3_1 Logistic Regression

Machine Learning Quiz Week3_2 Regularization

P.S. CSDN's Markdown support is really poor.
