An oft-repeated rule of thumb in any sort of statistical model fitting is "you can't fit a model with more parameters than datapoints". This idea appears to be as widespread as it is incorrect. On the contrary, if you construct your models carefully, you can fit models with more parameters than datapoints, and this is much more than mere trivia with which you can impress the nerdiest of your friends: as I will show here, this fact can prove very useful in real-world scientific applications.
A model with more parameters than datapoints is known as an underdetermined system, and it's a common misconception that such a model cannot be solved under any circumstances. In this post I will consider this misconception, which I call the model complexity myth. I'll start by showing where the myth holds true, first from an intuitive point of view, and then from a more mathematically heavy point of view. I'll build from this mathematical treatment and discuss how underdetermined models may be addressed from a frequentist standpoint, and then from a Bayesian standpoint. (If you're unclear about the general differences between frequentist and Bayesian approaches, I might suggest reading my posts on the subject.) Finally, I'll discuss some practical examples of where such an underdetermined model can be useful, and demonstrate one of these examples: quantitatively accounting for measurement biases in scientific data.
While the model complexity myth is not true in general, it is true in the specific case of simple linear models, which perhaps explains why the myth is so pervasive. In this section I want to motivate the reason for the underdetermination issue in simple linear models, first from an intuitive view, and then from a more mathematical one.
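To see the failure mode in its starkest form before we dive in, here is a minimal sketch (my own illustration, with arbitrary example numbers) of what happens when you naively ask for a three-parameter polynomial fit to two points: the matrix appearing in the least-squares normal equations is singular, and the direct solve fails outright:

import numpy as np

# two data points, but three unknown polynomial coefficients
x = np.array([0.0, 1.0])
y = np.array([5.0, 7.0])

# design matrix with columns [1, x, x^2]: shape (2, 3)
X = x[:, None] ** np.arange(3)

# X^T X is 3x3 but only rank 2, so the normal equations
# have no unique solution and the direct solve fails
try:
    theta = np.linalg.solve(np.dot(X.T, X), np.dot(X.T, y))
except np.linalg.LinAlgError as err:
    print("Cannot solve:", err)  # prints: Cannot solve: Singular matrix

The rest of this section unpacks why this happens, and why it is specific to this naive approach.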
I'll start by defining some functions to create plots for the examples below; you can skip reading this code block for now:
# Code to create figures
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
plt.style.use('ggplot')

def plot_simple_line():
    rng = np.random.RandomState(42)
    x = 10 * rng.rand(20)
    y = 2 * x + 5 + rng.randn(20)
    p = np.polyfit(x, y, 1)
    xfit = np.linspace(0, 10)
    yfit = np.polyval(p, xfit)
    plt.plot(x, y, 'ok')
    plt.plot(xfit, yfit, color='gray')
    plt.text(9.8, 1, "y = {0:.2f}x + {1:.2f}".format(*p),
             ha='right', size=14);

def plot_underdetermined_fits(p, brange=(-0.5, 1.5), xlim=(-3, 3),
                              plot_conditioned=False):
    rng = np.random.RandomState(42)
    x, y = rng.rand(2, p).round(2)
    xfit = np.linspace(xlim[0], xlim[1])
    for r in rng.rand(20):
        # add a datapoint to make model specified
        b = brange[0] + r * (brange[1] - brange[0])
        xx = np.concatenate([x, [0]])
        yy = np.concatenate([y, [b]])
        # fit and plot the now fully-specified degree-p polynomial
        theta = np.polyfit(xx, yy, p)
        plt.plot(xfit, np.polyval(theta, xfit), color='#BBBBBB')
    plt.plot(x, y, 'ok')

    if plot_conditioned:
        # regularized (conditioned) solution of the normal equations,
        # using the original p points only
        X = x[:, None] ** np.arange(p + 1)
        theta = np.linalg.solve(np.dot(X.T, X) + 1E-3 * np.eye(p + 1),
                                np.dot(X.T, y))
        plt.plot(xfit, np.polyval(theta[::-1], xfit), color='black', lw=2)
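With these helpers in place, the figures below can be generated directly. For example (illustrative calls of my own, matching the signatures defined above):

plot_simple_line()            # 20 points, 2 parameters: well-determined
plt.figure()
plot_underdetermined_fits(2)  # 2 points, 3 parameters: underdetermined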