[晓理紫]每日论文分享(有中文摘要,源码或项目地址)--强化学习、模仿学习、机器人、开放词汇

专属领域论文订阅

关注{晓理紫|小李子},每日更新论文,如感兴趣,请转发给有需要的同学,谢谢支持

如果觉得对你有所帮助,请关注我,每日准时为你推送最新论文。

为了答谢各位网友的支持,从今日起免费为300名读者提供订阅主题论文服务,只需VX关注公号并回复{邮箱+论文主题}(如:[email protected] + chatgpt@large language model @LLM),主题必须是同一个领域,最多三个关键词。解释权归博主所有

分类:

  • 大语言模型LLM
  • 视觉模型VLM
  • 扩散模型
  • 视觉语言导航VLN
  • 强化学习 RL
  • 模仿学习 IL
  • 机器人
  • 开放词汇,检测分割

== RL ==

标题: Diffusion Models for Reinforcement Learning: A Survey

作者: Zhengbang Zhu, Hanye Zhao, Haoran He

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2311.01223v3

GitHub: https://github.com/apexrl/Diff4RLSurvey|

中文摘要: 扩散模型在样本质量和训练稳定性方面优于以前的生成模型。最近的工作显示了扩散模型在改进强化学习(RL)解决方案方面的优势。这篇综述旨在提供这一新兴领域的概述,并希望启发新的研究方向。首先,我们研究RL算法遇到的几个挑战。然后,我们根据扩散模型在RL中所扮演的角色对现有方法进行分类,并探讨它们如何解决前述挑战。我们进一步概述了扩散模型在各种RL相关任务中的成功应用。最后,我们总结了调查结果,并对未来的研究方向提出了见解。我们正在积极维护一个GitHub存储库,收录有关在RL中利用扩散模型的论文和其他相关资源:https://github.com/apexrl/Diff4RLSurvey。

摘要: Diffusion models surpass previous generative models in sample quality and training stability. Recent works have shown the advantages of diffusion models in improving reinforcement learning (RL) solutions. This survey aims to provide an overview of this emerging field and hopes to inspire new avenues of research. First, we examine several challenges encountered by RL algorithms. Then, we present a taxonomy of existing methods based on the roles of diffusion models in RL and explore how the preceding challenges are addressed. We further outline successful applications of diffusion models in various RL-related tasks. Finally, we conclude the survey and offer insights into future research directions. We are actively maintaining a GitHub repository for papers and other related resources in utilizing diffusion models in RL: https://github.com/apexrl/Diff4RLSurvey.
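
As a rough illustration of one role the survey covers (diffusion models acting as policies or planners), the sketch below runs a toy DDPM-style reverse process that denoises a random vector into a state-conditioned action. The noise predictor is a random placeholder standing in for a trained network; none of this code comes from the surveyed papers.

```python
import numpy as np

def predict_noise(state, action, t):
    """Placeholder for a learned noise predictor eps_theta(a_t, s, t).
    A fixed random linear map, only to make the sketch runnable."""
    rng = np.random.default_rng(0)
    W = rng.standard_normal((action.size, state.size)) * 0.1
    return W @ state + 0.01 * t * action

def sample_action(state, action_dim=2, steps=50):
    """Small DDPM-style reverse process: denoise a Gaussian sample into an
    action conditioned on the current state."""
    betas = np.linspace(1e-4, 2e-2, steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    a = np.random.standard_normal(action_dim)          # a_T ~ N(0, I)
    for t in reversed(range(steps)):
        eps = predict_noise(state, a, t)
        # posterior mean of the reverse step
        a = (a - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                                      # add noise except at the final step
            a += np.sqrt(betas[t]) * np.random.standard_normal(action_dim)
    return a

print(sample_action(np.array([0.3, -1.2, 0.7])))
```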


标题: Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models

作者: Anthony Sicilia, Hyunwoo Kim, Khyathi Raghavi Chandu

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03284v1

中文摘要: 有效的对话者会解释他人不确定的目标、信念和情绪。但是,即使是最优秀的人类健谈者也无法完美地预测对话的轨迹。语言模型能在多大程度上代表对话中固有的不确定性?我们提出了FortUne Dial,这是长期存在的“对话预测”任务的扩展:评估不仅仅是准确性,而是使用不确定性感知指标进行,有效地实现了对单个实例的弃权。我们研究了语言模型潜在表示结果不确定性的两种方式(内部使用分数和直接使用标记),并提出了微调策略来改善这两种表示的校准。在八个困难的谈判语料库上的实验表明,我们提出的微调策略(传统的监督策略和非策略强化学习策略)可以校准较小的开源模型,以与10倍于其大小的预训练模型竞争。

摘要: Effective interlocutors account for the uncertain goals, beliefs, and emotions of others. But even the best human conversationalist cannot perfectly anticipate the trajectory of a dialogue. How well can language models represent inherent uncertainty in conversations? We propose FortUne Dial, an expansion of the long-standing “conversation forecasting” task: instead of just accuracy, evaluation is conducted with uncertainty-aware metrics, effectively enabling abstention on individual instances. We study two ways in which language models potentially represent outcome uncertainty (internally, using scores and directly, using tokens) and propose fine-tuning strategies to improve calibration of both representations. Experiments on eight difficult negotiation corpora demonstrate that our proposed fine-tuning strategies (a traditional supervision strategy and an off-policy reinforcement learning strategy) can calibrate smaller open-source models to compete with pre-trained models 10x their size.
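
The idea of uncertainty-aware evaluation with abstention can be made concrete with a small, hypothetical metric: the forecaster only commits when its confidence clears a threshold, and we report coverage, selective accuracy, and a Brier score. This is an illustrative sketch, not the FortUne Dial metric suite.

```python
import numpy as np

def abstention_report(probs, labels, threshold=0.75):
    """Evaluate binary conversation-outcome forecasts with abstention:
    the model only commits when max(p, 1-p) >= threshold."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    confidence = np.maximum(probs, 1.0 - probs)
    answered = confidence >= threshold
    preds = (probs >= 0.5).astype(int)
    coverage = answered.mean()
    selective_acc = (preds[answered] == labels[answered]).mean() if answered.any() else float("nan")
    brier = np.mean((probs - labels) ** 2)   # calibration-sensitive score over all instances
    return {"coverage": coverage, "selective_accuracy": selective_acc, "brier": brier}

print(abstention_report([0.9, 0.55, 0.2, 0.48], [1, 1, 0, 1]))
```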


标题: A Framework for Partially Observed Reward-States in RLHF

作者: Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03282v1

中文摘要: 近年来,基于人类反馈的强化学习(RLHF)因其在LLMs开发中的作用而受到广泛关注。神经科学研究表明,人类对刺激的反应依赖于部分可观测的“内部状态”。遗憾的是,目前的RLHF模型没有考虑到这一点。此外,大多数RLHF模型没有考虑中间反馈,而中间反馈在实证工作中越来越重要,有助于改善样本复杂度和对齐效果。为了解决这些局限,我们将RLHF建模为具有部分可观测奖励状态的强化学习(PORRL)。我们给出了从RLHF中两种主要的人类反馈形式(基数反馈与决斗反馈)到PORRL的归约。对于基数反馈,我们开发了通用的、统计上高效的算法,并将其实例化为POR-UCRL和POR-UCBVI。对于决斗反馈,我们证明了向基数反馈的朴素归约无法获得次线性的决斗遗憾。随后,我们提出了首个显式归约,可将基数遗憾的保证转化为决斗遗憾的保证。我们表明,我们的模型和保证在这两种设定下都推广并扩展了现有结果。最后,我们在模型上识别出一种递归结构,它可以改善PORRL的统计与计算可处理性,并以以往的RLHF工作以及学习完美奖励机为例(二者均被PORRL所涵盖)。

摘要: The study of reinforcement learning from human feedback (RLHF) has gained prominence in recent years due to its role in the development of LLMs. Neuroscience research shows that human responses to stimuli depend on partially-observed “internal states.” Unfortunately, current models of RLHF do not take this into consideration. Moreover, most RLHF models do not account for intermediate feedback, which is gaining importance in empirical work and can help improve both sample complexity and alignment. To address these limitations, we model RLHF as reinforcement learning with partially observed reward-states (PORRL). We show reductions from the two dominant forms of human feedback in RLHF, cardinal and dueling feedback, to PORRL. For cardinal feedback, we develop generic statistically efficient algorithms and instantiate them to present POR-UCRL and POR-UCBVI. For dueling feedback, we show that a naive reduction to cardinal feedback fails to achieve sublinear dueling regret. We then present the first explicit reduction that converts guarantees for cardinal regret to dueling regret. We show that our models and guarantees in both settings generalize and extend existing ones. Finally, we identify a recursive structure on our model that could improve the statistical and computational tractability of PORRL, giving examples from past work on RLHF as well as learning perfect reward machines, which PORRL subsumes.
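
To make the two feedback forms concrete, here is a toy simulator that emits cardinal feedback (a noisy scalar rating of one trajectory) and dueling feedback (a Bradley-Terry preference between two trajectories). It only illustrates the feedback models themselves, not the paper's PORRL reductions or algorithms.

```python
import numpy as np

rng = np.random.default_rng(42)

def cardinal_feedback(true_return, noise=0.5):
    """Cardinal feedback: a noisy scalar rating of a single trajectory."""
    return true_return + rng.normal(0.0, noise)

def dueling_feedback(return_a, return_b):
    """Dueling feedback: a stochastic preference between two trajectories,
    drawn from a Bradley-Terry model on their (unobserved) returns."""
    p_a_wins = 1.0 / (1.0 + np.exp(-(return_a - return_b)))
    return rng.random() < p_a_wins   # True means trajectory A is preferred

print(cardinal_feedback(3.0), dueling_feedback(3.0, 2.2))
```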


标题: Mixed Traffic Control and Coordination from Pixels

作者: Michael Villarreal, Bibek Poudel, Jia Pan

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2302.09167v4

中文摘要: 交通拥堵是我们社会中一个长期存在的问题。以往的交通控制方法已被证明难以缓解当前的拥堵水平;鉴于道路上具有不同自主水平的车辆越来越多,研究人员开始探索利用机器人车辆的思路。由此产生了混合交通控制,即机器人车辆通过强化学习(RL)来调节人类驾驶的车辆。然而,大多数现有研究使用精确观测,这需要领域专业知识,并且需要为每个道路网络手工设计观测空间。此外,精确观测既使用全局信息(如环境流出量),也使用局部信息(即车辆位置和速度)。获取这些信息需要为现有道路基础设施部署大量传感器,并与可能不情愿配合的人类驾驶员进行通信。我们考虑以图像观测作为替代方案,这是一种尚未在基于RL的混合交通控制中被充分探索的模态:1)图像不需要在不同环境之间对观测空间进行彻底的重新设计;2)卫星影像、车载摄像系统和交通监控系统使图像无处不在;3)图像只需要与设备通信即可获取。在这项工作中,我们展示了使用图像观测的机器人车辆可以在环形、8字形、交叉口、汇入和瓶颈等环境中取得与使用精确信息相当的性能。在某些场景下,我们的方法甚至优于使用精确观测,例如在汇入环境中平均车速最高提升8%,尽管只使用了局部而非全局的交通信息。

摘要: Traffic congestion is a persistent problem in our society. Previous methods for traffic control have proven futile in alleviating current congestion levels, leading researchers to explore ideas with robot vehicles given the increased emergence of vehicles with different levels of autonomy on our roads. This gives rise to mixed traffic control, where robot vehicles regulate human-driven vehicles through reinforcement learning (RL). However, most existing studies use precise observations that require domain expertise and hand engineering for each road network's observation space. Additionally, precise observations use global information, such as environment outflow, and local information, i.e., vehicle positions and velocities. Obtaining this information requires updating existing road infrastructure with vast sensor environments and communication to potentially unwilling human drivers. We consider image observations, a modality that has not been extensively explored for mixed traffic control via RL, as the alternative: 1) images do not require a complete re-imagination of the observation space from environment to environment; 2) images are ubiquitous through satellite imagery, in-car camera systems, and traffic monitoring systems; and 3) images only require communication to equipment. In this work, we show robot vehicles using image observations can achieve competitive performance to using precise information on environments, including ring, figure eight, intersection, merge, and bottleneck. In certain scenarios, our approach even outperforms using precise observations, e.g., up to 8% increase in average vehicle velocity in the merge environment, despite only using local traffic information as opposed to global traffic information.
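
A minimal sketch of the image-observation idea: wrap a traffic simulator so the agent receives a rendered frame rather than hand-engineered vehicle states. The simulator interface (render_rgb, reset, step) is assumed purely for illustration and is not the authors' code.

```python
import numpy as np

class ImageObservationWrapper:
    """Wraps a (hypothetical) mixed-traffic simulator so the RL agent sees a
    rendered bird's-eye image instead of hand-engineered vehicle states."""

    def __init__(self, env):
        self.env = env

    def _observe(self):
        frame = self.env.render_rgb()            # assumed to return an (H, W, 3) uint8 array
        return frame.astype(np.float32) / 255.0  # normalize; a real pipeline would also resize

    def reset(self):
        self.env.reset()                         # discard the precise state observation
        return self._observe()

    def step(self, action):
        _, reward, done, info = self.env.step(action)
        return self._observe(), reward, done, info
```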


标题: MobilityGPT: Enhanced Human Mobility Modeling with a GPT model

作者: Ammar Haydari, Dongjie Chen, Zhengfeng Lai

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03264v1

中文摘要: 生成模型在捕捉人类移动特征和生成合成轨迹方面显示出有希望的结果。然而,确保生成的地理空间移动数据在语义上真实(包括位置序列的一致性),并反映真实世界特征(例如遵守地理空间约束),仍然具有挑战性。为了解决这些问题,我们利用生成式预训练Transformer(GPT),将人类移动建模重新表述为自回归生成任务。为了实现可控生成以缓解上述挑战,我们提出了一个地理空间感知的生成模型MobilityGPT。我们提出了一种基于重力的采样方法,用于训练关注语义序列相似性的Transformer。然后,我们通过道路连通性矩阵约束训练过程,该矩阵提供了轨迹生成中序列的连通性,从而使生成的轨迹保持在地理空间限制之内。最后,我们构建了基于轨迹反馈的强化学习(RLTF),以最小化训练轨迹与合成生成轨迹之间的行程距离。我们在真实世界数据集上的实验表明,MobilityGPT在生成高质量移动轨迹方面优于最先进的方法,其生成结果在起点-终点相似性、行程长度、出行半径、路段和重力分布方面最接近真实数据。

摘要: Generative models have shown promising results in capturing human mobility characteristics and generating synthetic trajectories. However, it remains challenging to ensure that the generated geospatial mobility data is semantically realistic, including consistent location sequences, and reflects real-world characteristics, such as conforming to geospatial limits. To address these issues, we reformat human mobility modeling as an autoregressive generation task, leveraging the Generative Pre-trained Transformer (GPT). To ensure its controllable generation to alleviate the above challenges, we propose a geospatially-aware generative model, MobilityGPT. We propose a gravity-based sampling method to train a transformer for semantic sequence similarity. Then, we constrain the training process via a road connectivity matrix that provides the connectivity of sequences in trajectory generation, thereby keeping generated trajectories within geospatial limits. Lastly, we construct a Reinforcement Learning from Trajectory Feedback (RLTF) mechanism to minimize the travel distance between training and synthetically generated trajectories. Our experiments on real-world datasets demonstrate that MobilityGPT outperforms state-of-the-art methods in generating high-quality mobility trajectories that are closest to real data in terms of origin-destination similarity, trip length, travel radius, link, and gravity distributions.
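
One plausible reading of gravity-based sampling, shown only as an assumption-laden sketch: sample the next location with probability proportional to pop_i * pop_j / distance^2. The function and data below are hypothetical and not taken from MobilityGPT.

```python
import numpy as np

def gravity_sample(current, candidates, populations, coords, rng=np.random.default_rng(0)):
    """Sample the next location with probability proportional to a gravity score:
    pop(current) * pop(candidate) / distance(current, candidate)^2."""
    d = np.linalg.norm(coords[candidates] - coords[current], axis=1)
    scores = populations[current] * populations[candidates] / np.maximum(d, 1e-6) ** 2
    probs = scores / scores.sum()
    return rng.choice(candidates, p=probs)

coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
populations = np.array([100.0, 80.0, 50.0, 200.0])
print(gravity_sample(current=0, candidates=np.array([1, 2, 3]),
                     populations=populations, coords=coords))
```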


标题: Multi-agent Reinforcement Learning for Energy Saving in Multi-Cell Massive MIMO Systems

作者: Tianzhang Cai, Qichen Wang, Shuai Zhang

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03204v1

摘要: We develop a multi-agent reinforcement learning (MARL) algorithm to minimize the total energy consumption of multiple massive MIMO (multiple-input multiple-output) base stations (BSs) in a multi-cell network while preserving the overall quality-of-service (QoS) by making decisions on the multi-level advanced sleep modes (ASMs) and antenna switching of these BSs. The problem is modeled as a decentralized partially observable Markov decision process (DEC-POMDP) to enable collaboration between individual BSs, which is necessary to tackle inter-cell interference. A multi-agent proximal policy optimization (MAPPO) algorithm is designed to learn a collaborative BS control policy. To enhance its scalability, a modified version called MAPPO-neighbor policy is further proposed. Simulation results demonstrate that the trained MAPPO agent achieves better performance compared to baseline policies. Specifically, compared to the auto sleep mode 1 (symbol-level sleeping) algorithm, the MAPPO-neighbor policy reduces power consumption by approximately 8.7% during low-traffic hours and improves energy efficiency by approximately 19% during high-traffic hours, respectively.
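
A sketch of the decentralized, neighbor-aware structure suggested by the MAPPO-neighbor policy: each base station's observation combines its own load and sleep level with aggregated neighbor load. The exact observation and action design used in the paper is not reproduced here; this is illustrative only.

```python
import numpy as np

def build_observations(traffic_load, sleep_level, neighbors):
    """One observation vector per base station: its own load and sleep level,
    plus the mean load of its neighbouring cells (the "neighbor" policy idea)."""
    n = len(traffic_load)
    obs = []
    for i in range(n):
        neigh_load = np.mean([traffic_load[j] for j in neighbors[i]]) if neighbors[i] else 0.0
        obs.append(np.array([traffic_load[i], sleep_level[i], neigh_load], dtype=np.float32))
    return obs

# toy 3-cell network; actions would be a joint choice of ASM level and antenna switching
obs = build_observations(traffic_load=[0.8, 0.2, 0.5],
                         sleep_level=[0, 2, 1],
                         neighbors={0: [1], 1: [0, 2], 2: [1]})
print(obs)
```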


== Imitation Learning ==

标题: Vision-Language Foundation Models as Effective Robot Imitators

作者: Xinghang Li, Minghuan Liu, Hanbo Zhang

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2311.01378v3

Project: https://roboflamingo.github.io|

中文摘要: 视觉语言基础模型的最新进展显示了它们理解多模态数据和解决复杂视觉语言任务(包括机器人操纵)的能力。我们寻求一种直接的方式来利用现有的视觉语言模型(VLM),只需在机器人数据上进行简单的微调。为此,我们提出了一个简单而新颖的视觉语言操纵框架,称为RoboFlamingo,它建立在开源VLM OpenFlamingo之上。与之前的工作不同,RoboFlamingo利用预训练的VLM进行单步视觉语言理解,用显式的策略头对序列历史信息进行建模,并仅通过在语言条件操纵数据集上的模仿学习进行少量微调。这种解耦使RoboFlamingo能够灵活地进行开环控制并部署在低性能平台上。通过在测试基准上大幅超越最先进的性能,我们表明RoboFlamingo可以成为将VLM适配到机器人控制的有效且有竞争力的替代方案。我们大量的实验结果也揭示了关于不同预训练VLM在操纵任务中行为的几个有趣结论。我们相信RoboFlamingo有潜力成为一种经济高效且易于使用的机器人操纵解决方案,让每个人都能够微调自己的机器人策略。

摘要: Recent progress in vision language foundation models has shown their ability to understand multimodal data and resolve complicated vision language tasks, including robotics manipulation. We seek a straightforward way of making use of existing vision-language models (VLMs) with simple fine-tuning on robotics data. To this end, we derive a simple and novel vision-language manipulation framework, dubbed RoboFlamingo, built upon the open-source VLM OpenFlamingo. Unlike prior works, RoboFlamingo utilizes pre-trained VLMs for single-step vision-language comprehension, models sequential history information with an explicit policy head, and is slightly fine-tuned by imitation learning only on language-conditioned manipulation datasets. Such a decomposition provides RoboFlamingo the flexibility for open-loop control and deployment on low-performance platforms. By exceeding state-of-the-art performance by a large margin on the tested benchmark, we show RoboFlamingo can be an effective and competitive alternative for adapting VLMs to robot control. Our extensive experimental results also reveal several interesting conclusions regarding the behavior of different pre-trained VLMs on manipulation tasks. We believe RoboFlamingo has the potential to be a cost-effective and easy-to-use solution for robotics manipulation, empowering everyone with the ability to fine-tune their own robotics policy.
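
The "explicit policy head over a frozen VLM" idea can be pictured with a tiny recurrent head that consumes per-step VLM embeddings and outputs continuous actions. This is a toy numpy sketch under assumed dimensions, not RoboFlamingo's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class PolicyHead:
    """Tiny recurrent policy head on top of frozen per-step VLM embeddings:
    h_t = tanh(W_h h_{t-1} + W_x x_t),  action_t = W_a h_t."""

    def __init__(self, feat_dim, hidden_dim, action_dim):
        self.W_x = rng.standard_normal((hidden_dim, feat_dim)) * 0.05
        self.W_h = rng.standard_normal((hidden_dim, hidden_dim)) * 0.05
        self.W_a = rng.standard_normal((action_dim, hidden_dim)) * 0.05
        self.h = np.zeros(hidden_dim)

    def step(self, vlm_feature):
        self.h = np.tanh(self.W_h @ self.h + self.W_x @ vlm_feature)
        return self.W_a @ self.h                       # e.g. a 7-DoF end-effector action

head = PolicyHead(feat_dim=512, hidden_dim=64, action_dim=7)
for _ in range(3):                                     # features would come from the frozen VLM
    print(head.step(rng.standard_normal(512))[:3])
```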


标题: ILBiT: Imitation Learning for Robot Using Position and Torque Information based on Bilateral Control with Transformer

作者: Masato Kobayashi, Thanpimon Buamanee, Yuki Uranishi

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2401.16653v2

中文摘要: 机械臂的自主操纵是机器人学中一个复杂且不断发展的研究领域。本文介绍了一种应对这一挑战的创新方法,重点是模仿学习(IL)。与传统的模仿方法不同,我们的方法使用基于双边控制的IL,从而实现更精确、适应性更强的机器人运动。传统的基于双边控制的IL依赖于长短期记忆(LSTM)网络。在本文中,我们提出了基于Transformer的双边控制、利用位置和力矩信息的机器人模仿学习方法(ILBiT)。所提方法采用Transformer模型,其以在处理多样数据集时的稳健性能以及克服LSTM局限(尤其是在需要精细力量调节的任务中)的能力而著称。ILBiT的一个突出特点是其100 Hz的高频运行,这显著提高了系统对不同环境以及不同硬度物体的适应性和响应能力。全面的真实世界实验验证了基于Transformer的ILBiT方法的有效性。

摘要: Autonomous manipulation in robot arms is a complex and evolving field of study in robotics. This paper introduces an innovative approach to this challenge by focusing on imitation learning (IL). Unlike traditional imitation methods, our approach uses IL based on bilateral control, allowing for more precise and adaptable robot movements. Conventional IL based on bilateral control has relied on Long Short-Term Memory (LSTM) networks. In this paper, we present imitation learning for robots using position and torque information based on Bilateral control with Transformer (ILBiT). The proposed method employs the Transformer model, known for its robust performance in handling diverse datasets and its capability to surpass LSTM's limitations, especially in tasks requiring detailed force adjustments. A standout feature of ILBiT is its high-frequency operation at 100 Hz, which significantly improves the system's adaptability and response to varying environments and objects of different hardness levels. The effectiveness of the Transformer-based ILBiT method is demonstrated through comprehensive real-world experiments.
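
The 100 Hz aspect is essentially a fixed-rate control loop: read position and torque, query the learned model, send the command, and sleep out the remainder of the period. The read_state, policy, and send_command hooks below are placeholders, not the ILBiT implementation.

```python
import time
import numpy as np

def control_loop(read_state, policy, send_command, hz=100.0, duration_s=1.0):
    """Fixed-rate (100 Hz) loop: read follower position/torque, query the learned
    model for the next command, send it, and sleep to hold the period."""
    period = 1.0 / hz
    t_end = time.monotonic() + duration_s
    while time.monotonic() < t_end:
        t0 = time.monotonic()
        position, torque = read_state()                # each a joint-space vector
        command = policy(np.concatenate([position, torque]))
        send_command(command)
        time.sleep(max(0.0, period - (time.monotonic() - t0)))

# placeholders so the sketch runs without hardware
control_loop(read_state=lambda: (np.zeros(6), np.zeros(6)),
             policy=lambda x: 0.1 * x[:6],
             send_command=lambda cmd: None,
             duration_s=0.05)
```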


== Robotic Agent ==

标题: Exploring the Effects of Shared Autonomy on Cognitive Load and Trust in Human-Robot Interaction

作者: Jiahe Pan, Jonathan Eden, Denny Oetomo

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.02758v1

中文摘要: 遥操作越来越被认为是在危险环境中部署机器人的可行解决方案。控制机器人执行复杂或高要求的任务可能会使操作员过载,导致性能不佳。为了设计一个机器人控制器来帮助人类执行这种具有挑战性的任务,全面了解机器人的自主行为和操作员的内部状态之间的相互作用是必不可少的。在本文中,我们研究了机器人自主性与人类用户的认知负荷和信任水平之间的关系,以及在机器人辅助执行任务中三方交互的潜在存在。我们的用户研究(N=24)结果表明,虽然自主水平影响遥控操作员的感知认知负荷和信任,但这些因素之间没有明确的相互作用。相反,这些元素似乎是独立运作的,因此强调了在共享控制设置中改变机器人自主水平时,需要将认知负荷和信任作为不同但相互关联的因素来考虑。这种洞察力对于开发更有效和适应性更强的辅助机器人系统至关重要。

摘要: Teleoperation is increasingly recognized as a viable solution for deploying robots in hazardous environments. Controlling a robot to perform a complex or demanding task may overload operators resulting in poor performance. To design a robot controller to assist the human in executing such challenging tasks, a comprehensive understanding of the interplay between the robot’s autonomous behavior and the operator’s internal state is essential. In this paper, we investigate the relationships between robot autonomy and both the human user’s cognitive load and trust levels, and the potential existence of three-way interactions in the robot-assisted execution of the task. Our user study (N=24) results indicate that while autonomy level influences the teleoperator’s perceived cognitive load and trust, there is no clear interaction between these factors. Instead, these elements appear to operate independently, thus highlighting the need to consider both cognitive load and trust as distinct but interrelated factors in varying the robot autonomy level in shared-control settings. This insight is crucial for the development of more effective and adaptable assistive robotic systems.


标题: Learning Mutual Excitation for Hand-to-Hand and Human-to-Human Interaction Recognition

作者: Mengyuan Liu, Chen Chen, Songtao Wu

PubTime: 2024-02-04

Downlink: http://arxiv.org/abs/2402.02431v1

中文摘要: 识别交互动作,包括手对手的交互和人对人的交互,在视频分析和人机交互领域的各种应用中引起了越来越多的关注。考虑到图卷积在从骨架数据建模拓扑感知特征方面的成功,最近的方法通常在单独的实体上操作图卷积,并使用后期融合进行交互式动作识别,这几乎不能建模成对实体之间的相互语义关系。为此,我们提出了一种通过堆叠互激励图卷积(me-GC)层的互激励图卷积网络(me-GCN)。具体来说,me-GC使用相互拓扑激励模块,首先从单个实体中提取邻接矩阵,然后自适应地建模它们之间的相互约束。此外,me-GC扩展了上述思想,并进一步使用相互特征激励模块从成对实体中提取和合并深度特征。与图卷积相比,我们提出的me-GC逐渐学习图卷积运算的每一层和每一阶段的互信息。在具有挑战性的手对手交互数据集(即Assembly101数据集)和两个大规模人对人交互数据集(即NTU60-Interaction和NTU120-Interaction)上的大量实验一致验证了我们提出的方法的优越性,该方法优于最先进的基于GCN和基于Transformer的方法。

摘要: Recognizing interactive actions, including hand-to-hand interaction and human-to-human interaction, has attracted increasing attention for various applications in the field of video analysis and human-robot interaction. Considering the success of graph convolution in modeling topology-aware features from skeleton data, recent methods commonly operate graph convolution on separate entities and use late fusion for interactive action recognition, which can barely model the mutual semantic relationships between pairwise entities. To this end, we propose a mutual excitation graph convolutional network (me-GCN) by stacking mutual excitation graph convolution (me-GC) layers. Specifically, me-GC uses a mutual topology excitation module to first extract adjacency matrices from individual entities and then adaptively model the mutual constraints between them. Moreover, me-GC extends the above idea and further uses a mutual feature excitation module to extract and merge deep features from pairwise entities. Compared with graph convolution, our proposed me-GC gradually learns mutual information in each layer and each stage of graph convolution operations. Extensive experiments on a challenging hand-to-hand interaction dataset, i.e., the Assembly101 dataset, and two large-scale human-to-human interaction datasets, i.e., NTU60-Interaction and NTU120-Interaction, consistently verify the superiority of our proposed method, which outperforms the state-of-the-art GCN-based and Transformer-based methods.
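
As a loose illustration of the mutual-excitation idea (each entity's topology is modulated by its partner's before graph convolution), here is a toy two-entity layer in numpy. It is not the paper's me-GC module; the modulation rule and shapes are assumptions made for the sketch.

```python
import numpy as np

def graph_conv(x, adj, weight):
    """Plain graph convolution: aggregate neighbours, then project."""
    deg = adj.sum(axis=1, keepdims=True) + 1e-6
    return np.tanh((adj / deg) @ x @ weight)

def mutual_excitation_layer(x_a, adj_a, x_b, adj_b, weight):
    """Sketch of mutual topology excitation: each entity's adjacency is re-weighted
    by the other entity's adjacency before its own graph convolution."""
    adj_a_mut = adj_a * (1.0 + adj_b)      # entity B excites entity A's topology
    adj_b_mut = adj_b * (1.0 + adj_a)      # and vice versa
    return graph_conv(x_a, adj_a_mut, weight), graph_conv(x_b, adj_b_mut, weight)

rng = np.random.default_rng(0)
J, C = 25, 8                               # joints per skeleton, feature channels
adj = (rng.random((J, J)) > 0.8).astype(float)
out_a, out_b = mutual_excitation_layer(rng.standard_normal((J, C)), adj,
                                        rng.standard_normal((J, C)), adj.T,
                                        rng.standard_normal((C, C)))
print(out_a.shape, out_b.shape)
```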


标题: Utilization of Non-verbal Behaviour and Social Gaze in Classroom Human-Robot Interaction Communications

作者: Sahand Shaghaghi, Pourya Aliasghari, Bryan Tripp

PubTime: 2024-02-04

Downlink: http://arxiv.org/abs/2312.06825v2

中文摘要: 本摘要探讨了课堂人机交互(HRI)场景,重点是在机器人认知架构中适应人类启发的社交凝视模型,以促进更无缝的社交交互。首先,我们详细介绍了我们在研究中探索的HRI场景,然后描述了我们研究中使用的社会凝视模型。我们强调了在课堂HRI场景中使用这种注意力模型的优势。我们还详细介绍了我们即将进行的研究的预期目标,涉及这个社会凝视模型。

摘要: This abstract explores classroom Human-Robot Interaction (HRI) scenarios with an emphasis on the adaptation of human-inspired social gaze models in robot cognitive architecture to facilitate a more seamless social interaction. First, we detail the HRI scenarios explored by us in our studies followed by a description of the social gaze model utilized for our research. We highlight the advantages of utilizing such an attentional model in classroom HRI scenarios. We also detail the intended goals of our upcoming study involving this social gaze model.


== Segmentation ==

标题: HASSOD: Hierarchical Adaptive Self-Supervised Object Detection

作者: Shengcao Cao, Dhiraj Joshi, Liang-Yan Gui

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03311v1

Project: https://HASSOD-NeurIPS23.github.io|

中文摘要: 人类的视觉感知系统展现出两项非凡能力:在没有明确监督的情况下学习,以及理解物体从部分到整体的组成。从这两种能力中获得灵感,我们提出了分层自适应自监督对象检测(HASSOD),这是一种在没有人类监督的情况下学习检测对象并理解其组成的新方法。HASSOD采用分层自适应聚类策略,基于自监督视觉表示将区域分组到对象掩模中,自适应地确定每个图像的对象数量。此外,HASSOD通过分析掩模之间的覆盖关系和构建树结构来识别对象在组成方面的层次。这种额外的自我监督学习任务导致检测性能的提高和可解释性的增强。最后,我们放弃了先前方法中使用的低效的多轮自我训练过程,转而采用半监督学习中的平均教师(Mean Teacher)框架,这带来了更平滑和更高效的训练过程。通过在流行图像数据集上的大量实验,我们证明了HASSOD相对于现有方法的优越性,从而推进了自监督目标检测的技术水平。值得注意的是,我们将LVIS上的掩模AR从20.2提高到22.5,将SA-1B上的掩模AR从17.0提高到26.0。项目页面:https://HASSOD-NeurIPS23.github.io。

摘要: The human visual perception system demonstrates exceptional capabilities in learning without explicit supervision and understanding the part-to-whole composition of objects. Drawing inspiration from these two abilities, we propose Hierarchical Adaptive Self-Supervised Object Detection (HASSOD), a novel approach that learns to detect objects and understand their compositions without human supervision. HASSOD employs a hierarchical adaptive clustering strategy to group regions into object masks based on self-supervised visual representations, adaptively determining the number of objects per image. Furthermore, HASSOD identifies the hierarchical levels of objects in terms of composition, by analyzing coverage relations between masks and constructing tree structures. This additional self-supervised learning task leads to improved detection performance and enhanced interpretability. Lastly, we abandon the inefficient multi-round self-training process utilized in prior methods and instead adapt the Mean Teacher framework from semi-supervised learning, which leads to a smoother and more efficient training process. Through extensive experiments on prevalent image datasets, we demonstrate the superiority of HASSOD over existing methods, thereby advancing the state of the art in self-supervised object detection. Notably, we improve Mask AR from 20.2 to 22.5 on LVIS, and from 17.0 to 26.0 on SA-1B. Project page: https://HASSOD-NeurIPS23.github.io.


标题: Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining

作者: Jiarun Liu, Hao Yang, Hong-Yu Zhou

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03302v1

GitHub: https://github.com/JiarunLiu/Swin-UMamba|

中文摘要: 精确的医学图像分割需要整合多尺度信息,从局部特征到全局依赖关系。然而,现有的方法对远程全局信息进行建模是具有挑战性的,其中卷积神经网络(CNN)受到其局部感受野的限制,而视觉Transformer(ViT)则受制于其注意力机制的高二次复杂度。最近,基于Mamba的模型因其在长序列建模方面令人印象深刻的能力而获得了极大的关注。几项研究表明,这些模型可以在各种任务中优于流行的视觉模型,提供更高的准确性、更低的内存消耗和更少的计算负担。然而,现有的基于Mamba的模型大多是从头开始训练的,没有探索预训练的力量,而预训练已被证明对于数据高效的医学图像分析相当有效。本文介绍了一种新的基于Mamba的模型Swin-UMamba,它是专门为医学图像分割任务设计的,利用了基于ImageNet的预训练的优势。我们的实验结果揭示了基于ImageNet的训练在增强基于Mamba的模型的性能中的重要作用。与CNN、ViT和最新的基于Mamba的模型相比,Swin-UMamba表现出了卓越的性能和较大的优势。值得注意的是,在腹部MRI、内窥镜和显微镜数据集上,Swin-UMamba比其最接近的对手U-Mamba平均得分高出3.58%。Swin-UMamba的代码和模型可在以下网址公开获得:https://github.com/JiarunLiu/Swin-UMamba

摘要: Accurate medical image segmentation demands the integration of multi-scale information, spanning from local features to global dependencies. However, it is challenging for existing methods to model long-range global information, where convolutional neural networks (CNNs) are constrained by their local receptive fields, and vision transformers (ViTs) suffer from the high quadratic complexity of their attention mechanism. Recently, Mamba-based models have gained great attention for their impressive ability in long sequence modeling. Several studies have demonstrated that these models can outperform popular vision models in various tasks, offering higher accuracy, lower memory consumption, and less computational burden. However, existing Mamba-based models are mostly trained from scratch and do not explore the power of pretraining, which has been proven to be quite effective for data-efficient medical image analysis. This paper introduces a novel Mamba-based model, Swin-UMamba, designed specifically for medical image segmentation tasks, leveraging the advantages of ImageNet-based pretraining. Our experimental results reveal the vital role of ImageNet-based training in enhancing the performance of Mamba-based models. Swin-UMamba demonstrates superior performance with a large margin compared to CNNs, ViTs, and the latest Mamba-based models. Notably, on the Abdomen MRI, Endoscopy, and Microscopy datasets, Swin-UMamba outperforms its closest counterpart U-Mamba by an average score of 3.58%. The code and models of Swin-UMamba are publicly available at: https://github.com/JiarunLiu/Swin-UMamba


标题: InstanceDiffusion: Instance-level Control for Image Generation

作者: Xudong Wang, Trevor Darrell, Sai Saketh Rambhatla

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03290v1

Project: https://people.eecs.berkeley.edu/|

中文摘要: 文本到图像扩散模型产生高质量的图像,但不提供对图像中单个实例的控制。我们引入了InstanceDiffusion,它为文本到图像的扩散模型添加了精确的实例级控制。InstanceDiffusion支持每个实例的自由形式语言条件,并允许以灵活的方式指定实例位置,如简单的单点、涂鸦、边界框或复杂的实例分割遮罩及其组合。我们对文本到图像模型提出了三个主要的改变,以实现精确的实例级控制。我们的UniFusion块支持文本到图像模型的实例级条件,ScaleU块提高了图像保真度,我们的多实例采样器提高了多实例的生成。对于每种位置条件,InstanceDiffusion都大大超过了专门的最先进的模型。值得注意的是,在COCO数据集上,框输入的AP$_{50}^{\text{box}}$比以前的最高水平高出20.4%,掩码输入的IoU高出25.4%。

摘要: Text-to-image diffusion models produce high quality images but do not offer control over individual instances in the image. We introduce InstanceDiffusion that adds precise instance-level control to text-to-image diffusion models. InstanceDiffusion supports free-form language conditions per instance and allows flexible ways to specify instance locations such as simple single points, scribbles, bounding boxes or intricate instance segmentation masks, and combinations thereof. We propose three major changes to text-to-image models that enable precise instance-level control. Our UniFusion block enables instance-level conditions for text-to-image models, the ScaleU block improves image fidelity, and our Multi-instance Sampler improves generations for multiple instances. InstanceDiffusion significantly surpasses specialized state-of-the-art models for each location condition. Notably, on the COCO dataset, we outperform previous state-of-the-art by 20.4% AP$_{50}^{\text{box}}$ for box inputs, and 25.4% IoU for mask inputs.
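
Instance-level control amounts to attaching a caption plus a location (point, scribble, box, or mask) to each instance. The payload below is a hypothetical schema for illustration only; the actual InstanceDiffusion interface may differ.

```python
# Hypothetical per-instance conditioning payload; the real InstanceDiffusion API may differ.
instances = [
    {"caption": "a red vintage car parked by the curb",
     "box": [0.10, 0.55, 0.45, 0.90]},                  # normalized x1, y1, x2, y2
    {"caption": "a golden retriever looking at the car",
     "point": [0.70, 0.75]},                            # a single normalized point
    {"caption": "storefront with a striped awning",
     "mask": "awning_mask.png"},                        # path to a binary instance mask
]
prompt = {"global_caption": "a quiet street on a sunny afternoon",
          "instances": instances}
```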


标题: ActiveAnno3D -- An Active Learning Framework for Multi-Modal 3D Object Detection

作者: Ahmed Ghita, Bjørk Antoniussen, Walter Zimmer

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03235v1

Project: https://active3d-framework.github.io/active3d-framework|

中文摘要: 大规模数据集的整理仍然成本高昂,需要大量时间和资源。数据通常依赖人工标注,构建高质量数据集的挑战依然存在。在这项工作中,我们利用主动学习进行多模态3D目标检测,填补了这一研究空白。我们提出了ActiveAnno3D,这是一个主动学习框架,用于选择对训练信息量最大的数据样本进行标注。我们探索了多种持续训练方法,并整合了在计算需求和检测性能方面最高效的方法。此外,我们使用BEVFusion和PV-RCNN在nuScenes和TUM交通交叉口数据集上进行了广泛的实验和消融研究。我们表明,在TUM交通交叉口数据集上,仅使用一半训练数据时,PV-RCNN结合基于熵的查询策略即可实现几乎相同的性能(77.25 mAP对83.50 mAP)。BEVFusion在使用一半训练数据时达到64.31 mAP,在使用完整的nuScenes数据集时达到75.0 mAP。我们将我们的主动学习框架集成到proAnno标注工具中,以实现人工智能辅助的数据选择和标注,并最大限度地降低标注成本。最后,我们在网站上提供代码、权重和可视化结果:https://active3d-framework.github.io/active3d-framework。

摘要: The curation of large-scale datasets is still costly and requires much time and resources. Data is often manually labeled, and the challenge of creating high-quality datasets remains. In this work, we fill the research gap using active learning for multi-modal 3D object detection. We propose ActiveAnno3D, an active learning framework to select data samples for labeling that are of maximum informativeness for training. We explore various continuous training methods and integrate the most efficient method regarding computational demand and detection performance. Furthermore, we perform extensive experiments and ablation studies with BEVFusion and PV-RCNN on the nuScenes and TUM Traffic Intersection dataset. We show that we can achieve almost the same performance with PV-RCNN and the entropy-based query strategy when using only half of the training data (77.25 mAP compared to 83.50 mAP) of the TUM Traffic Intersection dataset. BEVFusion achieved an mAP of 64.31 when using half of the training data and 75.0 mAP when using the complete nuScenes dataset. We integrate our active learning framework into the proAnno labeling tool to enable AI-assisted data selection and labeling and minimize the labeling costs. Finally, we provide code, weights, and visualization results on our website: https://active3d-framework.github.io/active3d-framework.
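
An entropy-based query strategy of the kind mentioned can be sketched as: score each unlabeled frame by the mean predictive entropy of its detections and label the highest-scoring ones. The scoring details below are assumptions, not the ActiveAnno3D implementation.

```python
import numpy as np

def entropy_query(frame_class_probs, budget):
    """Pick the `budget` unlabeled frames whose detections are most uncertain,
    scoring each frame by the mean predictive entropy of its detections."""
    scores = []
    for probs in frame_class_probs:                     # probs: (num_detections, num_classes)
        p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
        entropy = -(p * np.log(p)).sum(axis=1)          # per-detection entropy
        scores.append(entropy.mean() if len(entropy) else 0.0)
    return np.argsort(scores)[::-1][:budget]            # indices of frames to label next

frames = [np.array([[0.9, 0.05, 0.05]]),                    # confident frame
          np.array([[0.4, 0.35, 0.25], [0.5, 0.3, 0.2]])]   # uncertain frame
print(entropy_query(frames, budget=1))                  # -> [1]
```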


标题: RRWNet: Recursive Refinement Network for Effective Retinal Artery/Vein Segmentation and Classification

作者: José Morano, Guilherme Aresta, Hrvoje Bogunović

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2402.03166v1

GitHub: https://github.com/j-morano/rrwnet|

中文摘要: 视网膜血管的口径和配置是各种疾病和医疗状况的重要生物标志物。对视网膜脉管系统的彻底分析需要对血管进行分割并将其分类为动脉和静脉,这通常在通过视网膜造影(一种广泛使用的成像技术)获得的彩色眼底图像上进行。尽管如此,手动执行这些任务是劳动密集型的,并且容易出现人为错误。已经提出了各种自动化方法来解决这个问题。然而,由于影响分割图拓扑一致性的明显分类错误,动脉/静脉分割和分类的当前技术水平面临挑战。本研究提出了一个创新的端到端框架RRWNet,旨在递归细化语义分割图并纠正明显的分类错误。该框架是一个全卷积神经网络,由从输入图像生成基础分割图的基础子网络,以及对这些分割图进行迭代递归改进的递归细化子网络组成。在公共数据集上的评估证明了所提出的方法的最先进的性能,产生了比现有方法更拓扑一致的分割图和更少的明显分类错误。此外,递归细化模块在对来自其他方法的分割图进行后处理、自动纠正分类错误和提高拓扑一致性方面证明是有效的。模型代码、权重和预测结果可在https://github.com/j-morano/rrwnet公开获取。

摘要: The caliber and configuration of retinal blood vessels serve as important biomarkers for various diseases and medical conditions. A thorough analysis of the retinal vasculature requires the segmentation of blood vessels and their classification into arteries and veins, which is typically performed on color fundus images obtained by retinography, a widely used imaging technique. Nonetheless, manually performing these tasks is labor-intensive and prone to human error. Various automated methods have been proposed to address this problem. However, the current state of the art in artery/vein segmentation and classification faces challenges due to manifest classification errors that affect the topological consistency of segmentation maps. This study presents an innovative end-to-end framework, RRWNet, designed to recursively refine semantic segmentation maps and correct manifest classification errors. The framework consists of a fully convolutional neural network with a Base subnetwork that generates base segmentation maps from input images, and a Recursive Refinement subnetwork that iteratively and recursively improves these maps. Evaluation on public datasets demonstrates the state-of-the-art performance of the proposed method, yielding more topologically consistent segmentation maps with fewer manifest classification errors than existing approaches. In addition, the Recursive Refinement module proves effective in post-processing segmentation maps from other methods, automatically correcting classification errors and improving topological consistency. The model code, weights, and predictions are publicly available at https://github.com/j-morano/rrwnet.
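
The base-then-recursive-refinement structure can be pictured as repeatedly feeding a segmentation map back through the same refinement module. The refine function below is a smoothing placeholder standing in for the learned subnetwork; it is not RRWNet's code.

```python
import numpy as np

def refine(seg_map):
    """Placeholder for the refinement subnetwork; here just a smoothing step."""
    padded = np.pad(seg_map, ((0, 0), (1, 1), (1, 1)), mode="edge")
    return (padded[:, :-2, 1:-1] + padded[:, 2:, 1:-1] +
            padded[:, 1:-1, :-2] + padded[:, 1:-1, 2:] + seg_map) / 5.0

def recursive_refinement(base_map, iterations=3):
    """Apply the same refinement module recursively to its own output,
    mimicking the base-then-refine structure described in the abstract."""
    current = base_map
    for _ in range(iterations):
        current = refine(current)
    return current

base = np.random.rand(3, 64, 64)       # artery / vein / background probability maps
print(recursive_refinement(base).shape)
```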


标题: Context-self contrastive pretraining for crop type semantic segmentation

作者: Michail Tarasiou, Riza Alp Guler, Stefanos Zafeiriou

PubTime: 2024-02-05

Downlink: http://arxiv.org/abs/2104.04310v3

GitHub: https://github.com/michaeltrs/DeepSatModels|

摘要: In this paper, we propose a fully supervised pre-training scheme based on contrastive learning particularly tailored to dense classification tasks. The proposed Context-Self Contrastive Loss (CSCL) learns an embedding space that makes semantic boundaries pop-up by use of a similarity metric between every location in a training sample and its local context. For crop type semantic segmentation from Satellite Image Time Series (SITS) we find performance at parcel boundaries to be a critical bottleneck and explain how CSCL tackles the underlying cause of that problem, improving the state-of-the-art performance in this task. Additionally, using images from the Sentinel-2 (S2) satellite missions we compile the largest, to our knowledge, SITS dataset densely annotated by crop type and parcel identities, which we make publicly available together with the data generation pipeline. Using that data we find CSCL, even with minimal pre-training, to improve all respective baselines and present a process for semantic segmentation at super-resolution for obtaining crop classes at a more granular level. The code and instructions to download the data can be found in https://github.com/michaeltrs/DeepSatModels.
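
To give a feel for contrasting each location against its local context, here is a toy pairwise loss: embeddings of neighbouring pixels are pulled together when their labels match and pushed apart otherwise. This is a simplified stand-in under assumed shapes, not the paper's CSCL formulation.

```python
import numpy as np

def cscl_toy_loss(embeddings, labels, radius=1, tau=0.1):
    """Toy context-self contrastive term: each pixel's embedding should be close to
    same-class pixels in its local window and far from different-class ones."""
    H, W, C = embeddings.shape
    emb = embeddings / (np.linalg.norm(embeddings, axis=-1, keepdims=True) + 1e-8)
    loss, count = 0.0, 0
    for i in range(H):
        for j in range(W):
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ni, nj = i + di, j + dj
                    if (di == 0 and dj == 0) or not (0 <= ni < H and 0 <= nj < W):
                        continue
                    sim = float(emb[i, j] @ emb[ni, nj]) / tau
                    same = labels[i, j] == labels[ni, nj]
                    # log-sigmoid attraction for same-class pairs, repulsion otherwise
                    loss += -np.log(1 / (1 + np.exp(-sim))) if same else -np.log(1 / (1 + np.exp(sim)))
                    count += 1
    return loss / max(count, 1)

rng = np.random.default_rng(0)
print(cscl_toy_loss(rng.standard_normal((8, 8, 4)), rng.integers(0, 3, (8, 8))))
```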

