晓理紫

[晓理紫] Daily Paper Digest (with abstracts and source code or project links): Reinforcement Learning, Imitation Learning, Robotics

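Categories:

Large Language Models (LLM)

Vision-Language Models (VLM)

Diffusion Models

Vision-and-Language Navigation (VLN)

Reinforcement Learning (RL)

Imitation Learning (IL)

Robotics

Open Vocabulary, Detection and Segmentation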

== RL @ RLHF ==

Title: Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints

Authors: Yasunori Toshimitsu, Benedek Forrai, Barnabas Gavin Cangan

PubTime: 2024-01-22

Downlink: http://arxiv.org/abs/2308.02453v3

Project: https://srl-ethz.github.io/get-ball-rolling/ | https://youtu.be/YahsMhqNU8o

GitHub: https://github.com/srl-ethz/faive_gym_oss

Abstract: Biomimetic, dexterous robotic hands have the potential to replicate much of the tasks that a human can do, and to achieve status as a general manipulation platform. Recent advances in reinforcement learning (RL) frameworks have achieved remarkable performance in quadrupedal locomotion and dexterous manipulation tasks. Combined with GPU-based highly parallelized simulations capable of simulating thousands of robots in parallel, RL-based controllers have become more scalable and approachable. However, in order to bring RL-trained policies to the real world, we require training frameworks that output policies that can work with physical actuators and sensors as well as a hardware platform that can be manufactured with accessible materials yet is robust enough to run interactive policies. This work introduces the biomimetic tendon-driven Faive Hand and its system architecture, which uses tendon-driven rolling contact joints to achieve a 3D printable, robust high-DoF hand design. We model each element of the hand and integrate it into a GPU simulation environment to train a policy with RL, and achieve zero-shot transfer of a dexterous in-hand sphere rotation skill to the physical robot hand.

Title: Sample-efficient Reinforcement Learning in Robotic Table Tennis

Authors: Jonas Tebbe, Lukas Krauch, Yapeng Gao

PubTime: 2024-01-04

Downlink: http://arxiv.org/abs/2011.03275v4

Project: https://youtu.be/uRAtdoL6Wpw

Abstract: Reinforcement learning (RL) has achieved some impressive recent successes in
various computer games and simulations. Most of these successes are based on
having large numbers of episodes from which the agent can learn. In typical
robotic applications, however, the number of feasible attempts is very limited.
In this paper we present a sample-efficient RL algorithm applied to the example
of a table tennis robot. In table tennis every stroke is different, with
varying placement, speed and spin. An accurate return therefore has to be found
depending on a high-dimensional continuous state space. To make learning in few
trials possible the method is embedded into our robot system. In this way we
can use a one-step environment. The state space depends on the ball at hitting
time (position, velocity, spin) and the action is the racket state
(orientation, velocity) at hitting. An actor-critic based deterministic policy
gradient algorithm was developed for accelerated learning. Our approach
performs competitively both in a simulation and on the real robot in a number
of challenging scenarios. Accurate results are obtained without pre-training in
under 200 episodes of training. The video presenting our experiments is
available at https://youtu.be/uRAtdoL6Wpw.

Title: Bridging the Gap Between Target Networks and Functional Regularization

Authors: Alexandre Piche, Valentin Thomas, Joseph Marino

PubTime: 2024-01-03

Downlink: http://arxiv.org/abs/2210.12282v2

Project: https://openreview.net/forum?id=BFvoemrmqX

Abstract: Bootstrapping is behind much of the successes of Deep Reinforcement Learning.
However, learning the value function via bootstrapping often leads to unstable
training due to fast-changing target values. Target Networks are employed to
stabilize training by using an additional set of lagging parameters to estimate
the target values. Despite the popularity of Target Networks, their effect on
the optimization is still misunderstood. In this work, we show that they act as
an implicit regularizer. This regularizer has disadvantages such as being
inflexible and non convex. To overcome these issues, we propose an explicit
Functional Regularization that is a convex regularizer in function space and
can easily be tuned. We analyze the convergence of our method theoretically and
empirically demonstrate that replacing Target Networks with the more
theoretically grounded Functional Regularization approach leads to better
sample efficiency and performance improvements.
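
To make the contrast in this abstract concrete, here is a minimal tabular sketch of the two objectives. The abstract does not give the exact loss, so the regularized form below (a TD error with an up-to-date bootstrap target, plus a `kappa`-weighted squared distance to a lagging copy of the Q-values) is an assumption for illustration, as are the names `q`, `q_lag`, and `kappa`.

```python
import numpy as np


def target_network_loss(q, q_lag, s, a, r, s_next, gamma=0.99):
    """Squared TD error where the bootstrap target uses the lagging Q-table."""
    target = r + gamma * np.max(q_lag[s_next])
    return (q[s, a] - target) ** 2


def functional_regularization_loss(q, q_lag, s, a, r, s_next,
                                   gamma=0.99, kappa=1.0):
    """Squared TD error with an up-to-date target plus a function-space penalty.

    Assumed form, not taken from the abstract: the target is computed from the
    current Q-table (and would be treated as a constant when differentiating),
    while the penalty keeps Q(s, a) close to its lagging value.
    """
    target = r + gamma * np.max(q[s_next])
    td_term = (q[s, a] - target) ** 2
    reg_term = kappa * (q[s, a] - q_lag[s, a]) ** 2
    return td_term + reg_term
```

In this sketch, setting `kappa = 0` leaves plain bootstrapping with no lagging parameters at all, and raising `kappa` pulls the current estimate toward the lagging one, which is presumably the kind of tuning the abstract has in mind.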

Title: Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents

Authors: Marco Pleines, Matthias Pallasch, Frank Zimmer

PubTime: 2024-01-03

Downlink: http://arxiv.org/abs/2309.17207v3

GitHub: https://github.com/MarcoMeter/endless-memory-gym/

Abstract: Memory Gym presents a suite of 2D partially observable environments, namely
Mortar Mayhem, Mystery Path, and Searing Spotlights, designed to benchmark
memory capabilities in decision-making agents. These environments, originally
with finite tasks, are expanded into innovative, endless formats, mirroring the
escalating challenges of cumulative memory games such as "I packed my bag".
This progression in task design shifts the focus from merely assessing sample
efficiency to also probing the levels of memory effectiveness in dynamic,
prolonged scenarios. To address the gap in available memory-based Deep
Reinforcement Learning baselines, we introduce an implementation that
integrates Transformer-XL (TrXL) with Proximal Policy Optimization. This
approach utilizes TrXL as a form of episodic memory, employing a sliding window
technique. Our comparative study between the Gated Recurrent Unit (GRU) and
TrXL reveals varied performances across different settings. TrXL, on the finite
environments, demonstrates superior sample efficiency in Mystery Path and
outperforms in Mortar Mayhem. However, GRU is more efficient on Searing
Spotlights. Most notably, in all endless tasks, GRU makes a remarkable
resurgence, consistently outperforming TrXL by significant margins. Website and
Source Code: https://github.com/MarcoMeter/endless-memory-gym/
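
The sliding-window idea mentioned in this abstract can be pictured with a small buffer that only keeps the most recent hidden states for the model to attend over. This is a simplified sketch and not the repository's implementation; the class name `SlidingWindowMemory` and its methods are hypothetical.

```python
from collections import deque


class SlidingWindowMemory:
    """Keeps only the most recent `window_length` items of an episode."""

    def __init__(self, window_length):
        # A deque with maxlen silently drops the oldest item when full.
        self.buffer = deque(maxlen=window_length)

    def reset(self):
        """Clear the memory at the start of a new episode."""
        self.buffer.clear()

    def append(self, hidden_state):
        """Store the hidden state produced at the current step."""
        self.buffer.append(hidden_state)

    def window(self):
        """Return the stored states, oldest first."""
        return list(self.buffer)
```

In an endless task the episode can grow without bound, so anything older than the window is no longer directly visible to the agent in a scheme like this.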

Title: DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality

Authors: Ankur Handa, Arthur Allshire, Viktor Makoviychuk

PubTime: 2024-01-02

Downlink: http://arxiv.org/abs/2210.13702v2

Project: https://dextreme.org/

Abstract: Recent work has demonstrated the ability of deep reinforcement learning (RL)
algorithms to learn complex robotic behaviours in simulation, including in the
domain of multi-fingered manipulation. However, such models can be challenging
to transfer to the real world due to the gap between simulation and reality. In
this paper, we present our techniques to train a) a policy that can perform
robust dexterous manipulation on an anthropomorphic robot hand and b) a robust
pose estimator suitable for providing reliable real-time information on the
state of the object being manipulated. Our policies are trained to adapt to a
wide range of conditions in simulation. Consequently, our vision-based policies
significantly outperform the best vision policies in the literature on the same
reorientation task and are competitive with policies that are given privileged
state information via motion capture systems. Our work reaffirms the
possibilities of sim-to-real transfer for dexterous manipulation in diverse
kinds of hardware and simulator setups, and in our case, with the Allegro Hand
and Isaac Gym GPU-based simulation. Furthermore, it opens up possibilities for
researchers to achieve such results with commonly-available, affordable robot
hands and cameras. Videos of the resulting policy and supplementary
information, including experiments and demos, can be found at
https://dextreme.org/

Title: Multi-agent Reinforcement Learning for Cooperative Lane Changing of Connected and Autonomous Vehicles in Mixed Traffic

Authors: Wei Zhou, Dong Chen, Jun Yan

PubTime: 2024-01-05

Downlink: http://arxiv.org/abs/2111.06318v2

Abstract: Autonomous driving has attracted significant research interests in the past
two decades as it offers many potential benefits, including releasing drivers
from exhausting driving and mitigating traffic congestion, among others.
Despite promising progress, lane-changing remains a great challenge for
autonomous vehicles (AV), especially in mixed and dynamic traffic scenarios.
Recently, reinforcement learning (RL), a powerful data-driven control method,
has been widely explored for lane-changing decision makings in AVs with
encouraging results demonstrated. However, the majority of those studies are
focused on a single-vehicle setting, and lane-changing in the context of
multiple AVs coexisting with human-driven vehicles (HDVs) have received scarce
attention. In this paper, we formulate the lane-changing decision making of
multiple AVs in a mixed-traffic highway environment as a multi-agent
reinforcement learning (MARL) problem, where each AV makes lane-changing
decisions based on the motions of both neighboring AVs and HDVs. Specifically,
a multi-agent advantage actor-critic network (MA2C) is developed with a novel
local reward design and a parameter sharing scheme. In particular, a
multi-objective reward function is proposed to incorporate fuel efficiency,
driving comfort, and safety of autonomous driving. Comprehensive experimental
results, conducted under three different traffic densities and various levels
of human driver aggressiveness, show that our proposed MARL framework
consistently outperforms several state-of-the-art benchmarks in terms of
efficiency, safety and driver comfort.
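
As a rough picture of the reward design described in this abstract, the sketch below combines three per-vehicle terms into one scalar and then forms a local reward over an agent and its neighbors. The abstract does not specify the terms, the weights, or how the local reward is aggregated, so the weighted sum, the neighborhood average, and every name here are assumptions.

```python
def multi_objective_reward(fuel_term, comfort_term, safety_term,
                           weights=(1.0, 1.0, 1.0)):
    """Weighted sum of fuel-efficiency, comfort, and safety terms.

    The weights are hypothetical; how each term is measured is left open.
    """
    w_fuel, w_comfort, w_safety = weights
    return w_fuel * fuel_term + w_comfort * comfort_term + w_safety * safety_term


def local_reward(agent_id, rewards, neighbors):
    """Average reward over an agent and its neighboring agents.

    `rewards` maps agent id to its individual reward and `neighbors` maps
    agent id to a list of nearby agent ids. Averaging is an assumed choice.
    """
    group = [agent_id] + list(neighbors.get(agent_id, []))
    return sum(rewards[i] for i in group) / len(group)
```

With parameter sharing, every AV would run the same policy network and differ only in its own observation and local reward.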

== Imitation Learning ==

Title: LangProp: A code optimization framework using Language Models applied to driving

Authors: Shu Ishida, Gianluca Corrado, George Fedoseev

PubTime: 2024-01-18

Downlink: http://arxiv.org/abs/2401.10314v1

GitHub: https://github.com/shuishida/LangProp

Abstract: LangProp is a framework for iteratively optimizing code generated by large language models (LLMs) in a supervised/reinforcement learning setting. While LLMs can generate sensible solutions zero-shot, the solutions are often sub-optimal. Especially for code generation tasks, it is likely that the initial code will fail on certain edge cases. LangProp automatically evaluates the code performance on a dataset of input-output pairs, as well as catches any exceptions, and feeds the results back to the LLM in the training loop, so that the LLM can iteratively improve the code it generates. By adopting a metric- and data-driven training paradigm for this code optimization procedure, one could easily adapt findings from traditional machine learning techniques such as imitation learning, DAgger, and reinforcement learning. We demonstrate the first proof of concept of automated code optimization for autonomous driving in CARLA, showing that LangProp can generate interpretable and transparent driving policies that can be verified and improved in a metric- and data-driven way. Our code will be open-sourced and is available at https://github.com/shuishida/LangProp.
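
The evaluate-and-feed-back loop described in this abstract can be sketched in a few lines. This is not LangProp's API: `evaluate_candidate`, `refine`, and the `propose` callback (a stand-in for the LLM call that rewrites the code from feedback) are hypothetical names, and candidates are plain Python callables here rather than generated source code.

```python
def evaluate_candidate(func, dataset):
    """Score a candidate on (input, expected) pairs and collect feedback."""
    correct = 0
    feedback = []
    for x, expected in dataset:
        try:
            result = func(x)
        except Exception as exc:  # exceptions become feedback, not crashes
            feedback.append(f"input={x!r}: raised {type(exc).__name__}: {exc}")
            continue
        if result == expected:
            correct += 1
        else:
            feedback.append(f"input={x!r}: expected {expected!r}, got {result!r}")
    return correct / len(dataset), feedback


def refine(initial, propose, dataset, rounds=3):
    """Keep the best-scoring candidate across a few proposal rounds.

    `propose(candidate, feedback)` is a placeholder for the LLM step.
    """
    best = initial
    best_score, feedback = evaluate_candidate(best, dataset)
    for _ in range(rounds):
        if best_score == 1.0:
            break
        candidate = propose(best, feedback)
        score, candidate_feedback = evaluate_candidate(candidate, dataset)
        if score > best_score:
            best, best_score, feedback = candidate, score, candidate_feedback
    return best, best_score
```

For example, an initial candidate `lambda x: 1 / x` fails on the input `0` with a `ZeroDivisionError`; that message is what a `propose` step would receive in order to suggest a guarded version.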

Title: FMB: a Functional Manipulation Benchmark for Generalizable Robotic Learning

Authors: Jianlan Luo, Charles Xu, Fangchen Liu

PubTime: 2024-01-16

Downlink: http://arxiv.org/abs/2401.08553v1

Project: https://functional-manipulation-benchmark.github.io

Abstract: In this paper, we propose a real-world benchmark for studying robotic learning in the context of functional manipulation: a robot needs to accomplish complex long-horizon behaviors by composing individual manipulation skills in functionally relevant ways. The core design principles of our Functional Manipulation Benchmark (FMB) emphasize a harmonious balance between complexity and accessibility. Tasks are deliberately scoped to be narrow, ensuring that models and datasets of manageable scale can be utilized effectively to track progress. Simultaneously, they are diverse enough to pose a significant generalization challenge. Furthermore, the benchmark is designed to be easily replicable, encompassing all essential hardware and software components. To achieve this goal, FMB consists of a variety of 3D-printed objects designed for easy and accurate replication by other researchers. The objects are procedurally generated, providing a principled framework to study generalization in a controlled fashion. We focus on fundamental manipulation skills, including grasping, repositioning, and a range of assembly behaviors. The FMB can be used to evaluate methods for acquiring individual skills, as well as methods for combining and ordering such skills to solve complex, multi-stage manipulation tasks. We also offer an imitation learning framework that includes a suite of policies trained to solve the proposed tasks. This enables researchers to utilize our tasks as a versatile toolkit for examining various parts of the pipeline. For example, researchers could propose a better design for a grasping controller and evaluate it in combination with our baseline reorientation and assembly policies as part of a pipeline for solving multi-stage tasks. Our dataset, object CAD files, code, and evaluation videos can be found on our project website: https://functional-manipulation-benchmark.github.io

Title: Residual Q-Learning: Offline and Online Policy Customization without Value

Authors: Chenran Li, Chen Tang, Haruki Nishimura

PubTime: 2024-01-15

Downlink: http://arxiv.org/abs/2306.09526v3

Project: https://sites.google.com/view/residualq-learning

Abstract: Imitation Learning (IL) is a widely used framework for learning imitative behavior from demonstrations. It is especially appealing for solving complex real-world tasks where handcrafting reward function is difficult, or when the goal is to mimic human expert behavior. However, the learned imitative policy can only follow the behavior in the demonstration. When applying the imitative policy, we may need to customize the policy behavior to meet different requirements coming from diverse downstream tasks. Meanwhile, we still want the customized policy to maintain its imitative nature. To this end, we formulate a new problem setting called policy customization. It defines the learning task as training a policy that inherits the characteristics of the prior policy while satisfying some additional requirements imposed by a target downstream task. We propose a novel and principled approach to interpret and determine the trade-off between the two task objectives. Specifically, we formulate the customization problem as a Markov Decision Process (MDP) with a reward function that combines 1) the inherent reward of the demonstration; and 2) the add-on reward specified by the downstream task. We propose a novel framework, Residual Q-learning, which can solve the formulated MDP by leveraging the prior policy without knowing the inherent reward or value function of the prior policy. We derive a family of residual Q-learning algorithms that can realize offline and online policy customization, and show that the proposed algorithms can effectively accomplish policy customization tasks in various environments. Demo videos and code are available on our website: https://sites.google.com/view/residualq-learning.
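
Why the prior policy can stand in for the unknown inherent reward is easiest to see in a single-step (bandit) special case. The sketch below assumes a maximum-entropy prior, so that the prior's action probabilities are proportional to `exp(inherent_reward / alpha)`; that assumption, the weight `omega`, and the temperatures `alpha` and `alpha_hat` are illustrative and are not taken from the abstract. It is not the paper's residual Q-learning algorithm, only a one-step illustration of the idea.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def customized_policy(prior_probs, addon_reward,
                      omega=1.0, alpha=1.0, alpha_hat=1.0):
    """One-step customized policy built from the prior's probabilities only.

    Under the assumed max-entropy prior, alpha * log(prior_probs) equals the
    inherent reward up to a constant, and softmax ignores that constant.
    Assumes all prior probabilities are strictly positive.
    """
    logits = (omega * alpha * np.log(prior_probs) + addon_reward) / alpha_hat
    return softmax(logits)
```

In this one-step setting the result matches the policy one would get from the combined reward `omega * inherent + addon` directly, even though the inherent reward never appears in the function.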

Title: Multi-Stage Cable Routing through Hierarchical Imitation Learning

Authors: Jianlan Luo, Charles Xu, Xinyang Geng

PubTime: 2024-01-13

Downlink: http://arxiv.org/abs/2307.08927v5

Project: https://sites.google.com/view/cablerouting

Abstract: We study the problem of learning to perform multi-stage robotic manipulation tasks, with applications to cable routing, where the robot must route a cable through a series of clips. This setting presents challenges representative of complex multi-stage robotic manipulation scenarios: handling deformable objects, closing the loop on visual perception, and handling extended behaviors consisting of multiple steps that must be executed successfully to complete the entire task. In such settings, learning individual primitives for each stage that succeed with a high enough rate to perform a complete temporally extended task is impractical: if each stage must be completed successfully and has a non-negligible probability of failure, the likelihood of successful completion of the entire task becomes negligible. Therefore, successful controllers for such multi-stage tasks must be able to recover from failure and compensate for imperfections in low-level controllers by smartly choosing which controllers to trigger at any given time, retrying, or taking corrective action as needed. To this end, we describe an imitation learning system that uses vision-based policies trained from demonstrations at both the lower (motor control) and the upper (sequencing) level, present a system for instantiating this method to learn the cable routing task, and perform evaluations showing great performance in generalizing to very challenging clip placement variations. Supplementary videos, datasets, and code can be found at https://sites.google.com/view/cablerouting.
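
The compounding-failure argument in this abstract is simple arithmetic. The sketch below assumes that stages, and repeated attempts at a stage, succeed independently with the same probability, which is an idealization; the function names and the numbers are illustrative and not from the paper.

```python
def chain_success(p_stage, n_stages):
    """Probability that every stage succeeds on its single attempt."""
    return p_stage ** n_stages


def chain_success_with_retries(p_stage, n_stages, max_attempts):
    """Probability of finishing when each stage may be retried.

    A stage fails only if all `max_attempts` independent attempts fail.
    """
    p_with_retry = 1.0 - (1.0 - p_stage) ** max_attempts
    return p_with_retry ** n_stages
```

With a 90% per-stage success rate and 10 stages, the single-attempt chain succeeds only about 35% of the time, while allowing up to three attempts per stage lifts it above 99% under these assumptions. That gap is the motivation for a higher-level policy that can retry or take corrective action.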

Title: Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Authors: Zipeng Fu, Tony Z. Zhao, Chelsea Finn

PubTime: 2024-01-04

Downlink: http://arxiv.org/abs/2401.02117v1

Project: https://mobile-aloha.github.io

Abstract: Imitation learning from human demonstrations has shown impressive performance
in robotics. However, most results focus on table-top manipulation, lacking the
mobility and dexterity necessary for generally useful tasks. In this work, we
develop a system for imitating mobile manipulation tasks that are bimanual and
require whole-body control. We first present Mobile ALOHA, a low-cost and
whole-body teleoperation system for data collection. It augments the ALOHA
system with a mobile base, and a whole-body teleoperation interface. Using data
collected with Mobile ALOHA, we then perform supervised behavior cloning and
find that co-training with existing static ALOHA datasets boosts performance on
mobile manipulation tasks. With 50 demonstrations for each task, co-training
can increase success rates by up to 90%, allowing Mobile ALOHA to autonomously
complete complex mobile manipulation tasks such as sauteing and serving a piece
of shrimp, opening a two-door wall cabinet to store heavy cooking pots, calling
and entering an elevator, and lightly rinsing a used pan using a kitchen
faucet. Project website: https://mobile-aloha.github.io
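
One simple way to picture co-training is to draw each training example from either the mobile dataset or the static dataset. The abstract does not say how the two sources are mixed, so the per-sample coin flip and the `static_fraction` parameter below are assumptions, and `cotraining_batch` is a hypothetical helper.

```python
import random


def cotraining_batch(mobile_data, static_data, batch_size,
                     static_fraction=0.5, rng=None):
    """Sample a batch that mixes mobile and static demonstrations.

    Each element comes from `static_data` with probability `static_fraction`
    and from `mobile_data` otherwise.
    """
    rng = rng or random.Random()
    batch = []
    for _ in range(batch_size):
        source = static_data if rng.random() < static_fraction else mobile_data
        batch.append(rng.choice(source))
    return batch
```

The batch would then feed an ordinary supervised behavior cloning update, with the static data acting as extra manipulation examples alongside the 50 task-specific mobile demonstrations.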

Title: Mimicking the Maestro: Exploring the Efficacy of a Virtual AI Teacher in Fine Motor Skill Acquisition

Authors: Hadar Mulian, Segev Shlomov, Lior Limonad

PubTime: 2024-01-24

Downlink: http://arxiv.org/abs/2310.10280v2

Abstract: Motor skills, especially fine motor skills like handwriting, play an essential role in academic pursuits and everyday life. Traditional methods to teach these skills, although effective, can be time-consuming and inconsistent. With the rise of advanced technologies like robotics and artificial intelligence, there is increasing interest in automating such teaching processes using these technologies, via human-robot and human-computer interactions. In this study, we examine the potential of a virtual AI teacher in emulating the techniques of human educators for motor skill acquisition. We introduce an AI teacher model that captures the distinct characteristics of human instructors. Using a Reinforcement Learning environment tailored to mimic teacher-learner interactions, we tested our AI model against four guiding hypotheses, emphasizing improved learner performance, enhanced rate of skill acquisition, and reduced variability in learning outcomes. Our findings, validated on synthetic learners, revealed significant improvements across all tested hypotheses. Notably, our model showcased robustness across different learners and settings and demonstrated adaptability to handwriting. This research underscores the potential of integrating Reinforcement Learning and Imitation Learning models with robotics in revolutionizing the teaching of critical motor skills.

== Embodied Artificial Intelligence@robotic agent@human robot interaction ==

Title: The Conversation is the Command: Interacting with Real-World Autonomous Robot Through Natural Language

Authors: Linus Nwankwo, Elmar Rueckert

PubTime: 2024-01-22

Downlink: http://arxiv.org/abs/2401.11838v1

Project: https://osf.io/wzyf6

GitHub: https://github.com/LinusNEP/TCC_IRoNL.git

Abstract: In recent years, autonomous agents have surged in real-world environments such as our homes, offices, and public spaces. However, natural human-robot interaction remains a key challenge. In this paper, we introduce an approach that synergistically exploits the capabilities of large language models (LLMs) and multimodal vision-language models (VLMs) to enable humans to interact naturally with autonomous robots through conversational dialogue. We leveraged the LLMs to decode the high-level natural language instructions from humans and abstract them into precise robot actionable commands or queries. Further, we utilised the VLMs to provide a visual and semantic understanding of the robot’s task environment. Our results with 99.13% command recognition accuracy and 97.96% commands execution success show that our approach can enhance human-robot interaction in real-world applications. The video demonstrations of this paper can be found at https://osf.io/wzyf6 and the code is available at our GitHub repository (https://github.com/LinusNEP/TCC_IRoNL.git).

Title: Augmented Reality User Interface for Command, Control, and Supervision of Large Multi-Agent Teams

Authors: Frank Regal, Chris Suarez, Fabian Parra

PubTime: 2024-01-11

Downlink: http://arxiv.org/abs/2401.05665v1

Project: https://sites.google.com/view/xr-robotics-iros2023/home?authuser=0

Abstract: Multi-agent human-robot teaming allows for the potential to gather information about various environments more efficiently by exploiting and combining the strengths of humans and robots. In industries like defense, search and rescue, first-response, and others alike, heterogeneous human-robot teams show promise to accelerate data collection and improve team safety by removing humans from unknown and potentially hazardous situations. This work builds upon AugRE, an Augmented Reality (AR) based scalable human-robot teaming framework. It enables users to localize and communicate with 50+ autonomous agents. Through our efforts, users are able to command, control, and supervise agents in large teams, both line-of-sight and non-line-of-sight, without the need to modify the environment prior and without requiring users to use typical hardware (i.e. joysticks, keyboards, laptops, tablets, etc.) in the field. The demonstrated work shows early indications that combining these AR-HMD-based user interaction modalities for command, control, and supervision will help improve human-robot team collaboration, robustness, and trust.

Title: Unified Learning from Demonstrations, Corrections, and Preferences during Physical Human-Robot Interaction

Authors: Shaunak A. Mehta, Dylan P. Losey

PubTime: 2024-01-09

Downlink: http://arxiv.org/abs/2207.03395v2

Project: https://youtu.be/FSUJsTYvEKU

Abstract: Humans can leverage physical interaction to teach robot arms. This physical interaction takes multiple forms depending on the task, the user, and what the robot has learned so far. State-of-the-art approaches focus on learning from a single modality, or combine multiple interaction types by assuming that the robot has prior information about the human’s intended task. By contrast, in this paper we introduce an algorithmic formalism that unites learning from demonstrations, corrections, and preferences. Our approach makes no assumptions about the tasks the human wants to teach the robot; instead, we learn a reward model from scratch by comparing the human’s inputs to nearby alternatives. We first derive a loss function that trains an ensemble of reward models to match the human’s demonstrations, corrections, and preferences. The type and order of feedback is up to the human teacher: we enable the robot to collect this feedback passively or actively. We then apply constrained optimization to convert our learned reward into a desired robot trajectory. Through simulations and a user study we demonstrate that our proposed approach more accurately learns manipulation tasks from physical human interaction than existing baselines, particularly when the robot is faced with new or unexpected objectives. Videos of our user study are available at: https://youtu.be/FSUJsTYvEKU
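
The idea of comparing the human's input to nearby alternatives can be written as a classification-style loss: the reward model should score the human's trajectory above the alternatives. The abstract does not give the loss, so the softmax form below is an assumption, and `comparison_loss` and its arguments are hypothetical names.

```python
import numpy as np


def comparison_loss(reward_human, rewards_alternatives):
    """Negative log-probability that the human's input is the preferred one.

    Probabilities come from a softmax over the predicted rewards of the
    human's input and of the nearby alternatives (an assumed choice).
    """
    scores = np.concatenate(([reward_human], rewards_alternatives))
    m = scores.max()
    log_norm = m + np.log(np.exp(scores - m).sum())  # stable log-sum-exp
    return log_norm - reward_human
```

Under this form, demonstrations, corrections, and preferences differ only in how the set of alternatives is generated, which is one way a single loss could cover all three feedback types.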

Title: StROL: Stabilized and Robust Online Learning from Humans

Authors: Shaunak A. Mehta, Forrest Meng, Andrea Bajcsy

PubTime: 2024-01-04

Downlink: http://arxiv.org/abs/2308.09863v2

GitHub: https://github.com/VT-Collab/StROL_RAL

Abstract: Robots often need to learn the human’s reward function online, during the
current interaction. This real-time learning requires fast but approximate
learning rules: when the human’s behavior is noisy or suboptimal, current
approximations can result in unstable robot learning. Accordingly, in this
paper we seek to enhance the robustness and convergence properties of gradient
descent learning rules when inferring the human’s reward parameters. We model
the robot’s learning algorithm as a dynamical system over the human preference
parameters, where the human’s true (but unknown) preferences are the
equilibrium point. This enables us to perform Lyapunov stability analysis to
derive the conditions under which the robot’s learning dynamics converge. Our
proposed algorithm (StROL) uses these conditions to learn robust-by-design
learning rules: given the original learning dynamics, StROL outputs a modified
learning rule that now converges to the human’s true parameters under a larger
set of human inputs. In practice, these autonomously generated learning rules
can correctly infer what the human is trying to convey, even when the human is
noisy, biased, and suboptimal. Across simulations and a user study we find that
StROL results in a more accurate estimate and less regret than state-of-the-art
approaches for online reward learning. See videos and code here:
https://github.com/VT-Collab/StROL_RAL

标题: Motion Control of Interactive Robotic Arms Based on Mixed Reality Development

作者: Hanxiao Chen

PubTime: 2024-01-03

Downlink: http://arxiv.org/abs/2401.01644v1

Project: http://www.icca.net/|

中文摘要: 混合现实（MR）正在不断发展，以激发机器人的新模式

摘要: Mixed Reality (MR) is constantly evolving to inspire new patterns of robot
manipulation for more advanced Human-Robot Interaction under the 4th
Industrial Revolution Paradigm. Considering that Mixed Reality aims to connect
physical and digital worlds to provide special immersive experiences, it is
necessary to establish the information exchange platform and robot control
systems within the developed MR scenarios. In this work, we mainly present
multiple effective motion control methods applied on different interactive
robotic arms (e.g., UR5, UR5e, myCobot) for the Unity-based development of MR
applications, including GUI control panel, text input control panel,
end-effector object dynamic tracking and ROS-Unity digital-twin connection.

标题: Chat Failures and Troubles: Reasons and Solutions

作者: Manal Helal, Patrick Holthaus, Gabriella Lakatos

PubTime: 2024-01-18

Downlink: http://arxiv.org/abs/2309.03708v2

中文摘要: 本文研究了人机交互（HRI）中导致聊天失败和麻烦的一些常见问题。给定用例的设计决策始于合适的机器人、合适的聊天模型、识别导致故障的常见问题、识别潜在的解决方案以及规划持续改进。总之，建议使用闭环控制算法来指导训练过的人工智能（AI）预训练模型的使用，并提供词汇过滤，在新数据集上重新训练批处理模型，从数据流中在线学习，和/或使用强化学习模型来自我更新训练过的模型并减少错误。

摘要: This paper examines some common problems in Human-Robot Interaction (HRI) causing failures and troubles in Chat. A given use case’s design decisions start with the suitable robot, the suitable chatting model, identifying common problems that cause failures, identifying potential solutions, and planning continuous improvement. In conclusion, it is recommended to use a closed-loop control algorithm that guides the use of trained Artificial Intelligence (AI) pre-trained models and provides vocabulary filtering, re-train batched models on new datasets, learn online from data streams, and/or use reinforcement learning models to self-update the trained models and reduce errors.

== Object Detection@ Segmentation@Open vocabulary detection@SAM ==

标题: MicroSegNet: A Deep Learning Approach for Prostate Segmentation on Micro-Ultrasound Images

作者: Hongxu Jiang, Muhammad Imran, Preethika Muralidharan

PubTime: 2024-01-25

Downlink: http://arxiv.org/abs/2305.19956v3

Project: https://zenodo.org/records/10475293|

GitHub: https://github.com/mirthAI/MicroSegNet|

中文摘要: 微超声（micro-US）是一种新颖的29 MHz超声技术，其分辨率比传统超声高3-4倍，有可能实现低成本、准确的前列腺癌诊断。准确的前列腺分割对于前列腺体积测量、癌症诊断、前列腺活检和治疗计划至关重要。然而，由于伪影以及中线处前列腺、膀胱和尿道之间的边界模糊，micro-US上的前列腺分割具有挑战性。本文介绍了MicroSegNet，这是一个多尺度注释引导的Transformer UNet模型，专门用于解决这些挑战。在训练过程中，MicroSegNet更关注难以分割的区域（困难区域），其特征是专家和非专家注释之间的差异。我们通过提出注释引导的二元交叉熵（AG-BCE）损失来实现这一点，该损失将较大的权重分配给困难区域中的预测误差，而将较低的权重分配给容易区域中的预测误差。AG-BCE损失通过利用多尺度深度监督无缝集成到训练过程中，使MicroSegNet能够捕捉各种尺度的全局上下文依赖和局部信息。我们使用55名患者的micro-US图像训练了我们的模型，随后对20名患者进行了评估。我们的MicroSegNet模型实现了0.939的Dice系数和2.02 mm的Hausdorff距离，优于几种最先进的分割方法，以及三名具有不同经验水平的人类注释者。我们的代码在https://github.com/mirthAI/MicroSegNet公开，我们的数据集在https://zenodo.org/records/10475293公开。

摘要: Micro-ultrasound (micro-US) is a novel 29-MHz ultrasound technique that provides 3-4 times higher resolution than traditional ultrasound, potentially enabling low-cost, accurate diagnosis of prostate cancer. Accurate prostate segmentation is crucial for prostate volume measurement, cancer diagnosis, prostate biopsy, and treatment planning. However, prostate segmentation on micro-US is challenging due to artifacts and indistinct borders between the prostate, bladder, and urethra in the midline. This paper presents MicroSegNet, a multi-scale annotation-guided transformer UNet model designed specifically to tackle these challenges. During the training process, MicroSegNet focuses more on regions that are hard to segment (hard regions), characterized by discrepancies between expert and non-expert annotations. We achieve this by proposing an annotation-guided binary cross entropy (AG-BCE) loss that assigns a larger weight to prediction errors in hard regions and a lower weight to prediction errors in easy regions. The AG-BCE loss was seamlessly integrated into the training process through the utilization of multi-scale deep supervision, enabling MicroSegNet to capture global contextual dependencies and local information at various scales. We trained our model using micro-US images from 55 patients, followed by evaluation on 20 patients. Our MicroSegNet model achieved a Dice coefficient of 0.939 and a Hausdorff distance of 2.02 mm, outperforming several state-of-the-art segmentation methods, as well as three human annotators with different experience levels. Our code is publicly available at https://github.com/mirthAI/MicroSegNet and our dataset is publicly available at https://zenodo.org/records/10475293.
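The weighting idea behind the AG-BCE loss described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: it assumes a pixel counts as "hard" when the expert and non-expert labels disagree, and the names `ag_bce`, `w_hard`, and `w_easy`, the weight values, and the plain mean over pixels are all hypothetical choices made for the example.

```python
import math


def ag_bce(pred, expert, non_expert, w_hard=4.0, w_easy=1.0, eps=1e-7):
    """Weighted binary cross entropy over flat lists of per-pixel values.

    pred:       predicted foreground probabilities in [0, 1]
    expert:     binary expert labels (0 or 1), used as the training target
    non_expert: binary non-expert labels (0 or 1), used only to find hard pixels
    w_hard / w_easy: hypothetical weights, not values taken from the paper
    """
    total = 0.0
    for p, y, y_ne in zip(pred, expert, non_expert):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        # Assumption: annotator disagreement marks a hard pixel.
        weight = w_hard if y != y_ne else w_easy
        total += weight * bce
    return total / len(pred)


# One easy pixel (annotators agree) and one hard pixel (they disagree).
loss = ag_bce(pred=[0.9, 0.6], expert=[1, 1], non_expert=[1, 0])
```

With equal predictions, a hard pixel contributes `w_hard / w_easy` times as much to the loss as an easy one, which is the effect the abstract describes.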

标题: OMG-Seg: Is One Model Good Enough For All Segmentation?

作者: Xiangtai Li, Haobo Yuan, Wei Li

PubTime: 2024-01-18

Downlink: http://arxiv.org/abs/2401.10229v1

Project: https://lxtgh.github.io/project/omg_seg/|

GitHub: https://github.com/lxtGH/OMG-Seg|

中文摘要: 在这项工作中，我们解决了各种分割任务，每个任务传统上都由不同的或部分统一的模型来解决。我们提出了OMG-Seg，这是一个足够好的模型，可以高效和有效地处理所有分割任务，包括图像语义、实例和全景分割，以及它们的视频对应任务、开放词汇设置、提示驱动的交互式分割（如SAM）和视频对象分割。据我们所知，这是第一个在一个模型中处理所有这些任务并实现令人满意的性能的模型。我们表明，OMG-Seg是一种基于Transformer的编码器-解码器架构，具有特定于任务的查询和输出，可以支持十多种不同的分割任务，同时显著降低各种任务和数据集的计算和参数开销。我们严格评估了联合训练中任务间的影响和相关性。代码和模型可在https://github.com/lxtGH/OMG-Seg获得。

摘要: In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models. We propose OMG-Seg, One Model that is Good enough to efficiently and effectively handle all the segmentation tasks, including image semantic, instance, and panoptic segmentation, as well as their video counterparts, open vocabulary settings, prompt-driven, interactive segmentation like SAM, and video object segmentation. To our knowledge, this is the first model to handle all these tasks in one model and achieve satisfactory performance. We show that OMG-Seg, a transformer-based encoder-decoder architecture with task-specific queries and outputs, can support over ten distinct segmentation tasks and yet significantly reduce computational and parameter overhead across various tasks and datasets. We rigorously evaluate the inter-task influences and correlations during co-training. Code and models are available at https://github.com/lxtGH/OMG-Seg.

标题: RAP-SAM: Towards Real-Time All-Purpose Segment Anything

作者: Shilin Xu, Haobo Yuan, Qingyu Shi

PubTime: 2024-01-18

Downlink: http://arxiv.org/abs/2401.10228v1

Project: https://xushilin1.github.io/rap_sam/|

GitHub: https://github.com/xushilin1/RAP-SAM/|

中文摘要: 在Transformer架构的推动下，视觉基础模型（VFMs）在性能和泛化能力方面取得了显著进步。Segment Anything模型（SAM）是一种能够实现广义分割的出色模型。然而，大多数VFM不能实时运行，这使得很难将它们应用到多种产品中。另一方面，目前的实时分割主要只有单一用途，比如对驾驶场景进行语义分割。我们认为实际应用需要多样化的输出。因此，本工作探索了一种新的实时分割设置，称为实时通用分割，以便将VFMs迁移到实时部署中。它包含三个不同的任务，包括交互式分割、全景分割和视频分割。我们的目标是使用一个模型来实时完成上述任务。我们首先对几个强基线进行基准测试。然后，我们提出了实时通用SAM（RAP-SAM）。它包含一个高效的编码器和一个高效的解耦解码器来执行提示驱动解码。此外，我们进一步探索不同的训练策略和调整方法，以进一步提高联合训练的性能。我们的代码和模型可在https://github.com/xushilin1/RAP-SAM/获得。

摘要: Advanced by transformer architecture, vision foundation models (VFMs) achieve remarkable progress in performance and generalization ability. Segment Anything Model (SAM) is one remarkable model that can achieve generalized segmentation. However, most VFMs cannot run in realtime, which makes it difficult to transfer them into several products. On the other hand, current real-time segmentation mainly has one purpose, such as semantic segmentation on the driving scene. We argue that diverse outputs are needed for real applications. Thus, this work explores a new real-time segmentation setting, named all-purpose segmentation in real-time, to transfer VFMs in real-time deployment. It contains three different tasks, including interactive segmentation, panoptic segmentation, and video segmentation. We aim to use one model to achieve the above tasks in real-time. We first benchmark several strong baselines. Then, we present Real-Time All Purpose SAM (RAP-SAM). It contains an efficient encoder and an efficient decoupled decoder to perform prompt-driven decoding. Moreover, we further explore different training strategies and tuning methods to boost co-training performance further. Our code and model are available at https://github.com/xushilin1/RAP-SAM/.

标题: Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive

作者: Yumeng Li, Margret Keuper, Dan Zhang

PubTime: 2024-01-16

Downlink: http://arxiv.org/abs/2401.08815v1

Project: https://yumengli007.github.io/ALDM/|

GitHub: https://github.com/boschresearch/ALDM|

中文摘要: 尽管大规模扩散模型最近取得了进展，但布局到图像（L2I）合成任务进展甚微。当前的L2I模型要么通过文本的可编辑性差，要么生成的图像和输入布局之间的对齐弱。这限制了它们在实践中的可用性。为了缓解这一问题，我们建议将对抗性监督整合到L2I扩散模型的传统训练管道中（ALDM）。具体来说，我们采用基于分割的判别器，该判别器向扩散生成器提供关于去噪图像和输入布局之间的像素级对齐的显式反馈。为了鼓励在采样步骤中一致地遵守输入布局，我们进一步引入了多步展开策略。我们不是只查看单个时间步，而是递归地展开几个步骤来模拟推理过程，并要求判别器在特定时间窗口内评估去噪图像与布局的对齐情况。我们的实验表明，ALDM能够实现生成图像的布局忠实性，同时允许通过文本提示进行广泛的编辑。此外，我们展示了它在实际应用中的有用性：通过文本控制合成目标分布样本，我们大幅提高了语义分割模型的领域泛化能力（约12个mIoU点）。

摘要: Despite the recent advances in large-scale diffusion models, little progress has been made on the layout-to-image (L2I) synthesis task. Current L2I models either suffer from poor editability via text or weak alignment between the generated image and the input layout. This limits their usability in practice. To mitigate this, we propose to integrate adversarial supervision into the conventional training pipeline of L2I diffusion models (ALDM). Specifically, we employ a segmentation-based discriminator which provides explicit feedback to the diffusion generator on the pixel-level alignment between the denoised image and the input layout. To encourage consistent adherence to the input layout over the sampling steps, we further introduce the multistep unrolling strategy. Instead of looking at a single timestep, we unroll a few steps recursively to imitate the inference process, and ask the discriminator to assess the alignment of denoised images with the layout over a certain time window. Our experiments show that ALDM enables layout faithfulness of the generated images, while allowing broad editability via text prompts. Moreover, we showcase its usefulness for practical applications: by synthesizing target distribution samples via text control, we improve domain generalization of semantic segmentation models by a large margin (~12 mIoU points).

标题: LESEN: Label-Efficient deep learning for Multi-parametric MRI-based Visual Pathway Segmentation

作者: Alou Diakite, Cheng Li, Lei Xie

PubTime: 2024-01-03

Downlink: http://arxiv.org/abs/2401.01654v1

GitHub: https://github.com/aldiak/Semi-Supervised-Multimodal-Visual-Pathway-|

中文摘要: 最近的研究显示了深度学习在基于多参数MRI的视觉通路（VP）分割中的潜力。然而，获取用于训练的标记数据既费力又耗时。因此，在标记样本有限的情况下开发有效的算法至关重要。在这项工作中，我们提出了一种标签高效的自集成深度学习方法（LESEN）。LESEN结合了监督和无监督损失，使学生和教师模型能够相互学习，形成一个自集成的平均教师（mean teacher）框架。此外，我们引入了可靠的无标记样本选择（RUSS）机制，以进一步提高LESEN的有效性。我们在人类连接组计划（HCP）数据集上的实验证明了我们的方法与最先进的技术相比的卓越性能，推进了临床和研究环境中用于综合分析的多模态VP分割。实现代码将在以下网址提供：https://github.com/aldiak/Semi-Supervised-Multimodal-Visual-Pathway-Delineation。

摘要: Recent research has shown the potential of deep learning in multi-parametric
MRI-based visual pathway (VP) segmentation. However, obtaining labeled data for
training is laborious and time-consuming. Therefore, it is crucial to develop
effective algorithms in situations with limited labeled samples. In this work,
we propose a label-efficient deep learning method with self-ensembling (LESEN).
LESEN incorporates supervised and unsupervised losses, enabling the student and
teacher models to mutually learn from each other, forming a self-ensembling
mean teacher framework. Additionally, we introduce a reliable unlabeled sample
selection (RUSS) mechanism to further enhance LESEN’s effectiveness. Our
experiments on the human connectome project (HCP) dataset demonstrate the
superior performance of our method when compared to state-of-the-art
techniques, advancing multimodal VP segmentation for comprehensive analysis in
clinical and research settings. The implementation code will be available at:
https://github.com/aldiak/Semi-Supervised-Multimodal-Visual-Pathway-Delineation.
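For readers unfamiliar with the self-ensembling mean teacher setup the abstract refers to, here is a minimal sketch of its two usual ingredients: the teacher's weights track an exponential moving average (EMA) of the student's weights, and an unsupervised consistency term penalizes disagreement between the two models' outputs on unlabeled data. This is a generic illustration, not the LESEN code: the function names, the flat-list representation of weights, and the `alpha` value are assumptions for the example, and the RUSS sample-selection step is not modelled.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights as an EMA of the student weights.

    teacher, student: flat lists of floats standing in for model parameters.
    alpha: hypothetical smoothing factor; closer to 1 means a slower teacher.
    """
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def consistency_loss(student_out, teacher_out):
    """Mean squared difference between student and teacher predictions."""
    diffs = [(s - t) ** 2 for s, t in zip(student_out, teacher_out)]
    return sum(diffs) / len(diffs)


# One illustrative step: the teacher drifts slightly toward the student.
teacher = ema_update(teacher=[1.0, 0.0], student=[0.0, 1.0], alpha=0.9)
unsup = consistency_loss(student_out=[0.8, 0.2], teacher_out=[0.6, 0.4])
```

In a full training loop the total loss would typically be the supervised loss on labeled samples plus a weighted consistency term on unlabeled ones; how LESEN weights and filters those terms is not specified in the abstract.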

标题: S3Net: Innovating Stereo Matching and Semantic Segmentation with a Single-Branch Semantic Stereo Network in Satellite Epipolar Imagery

作者: Qingyuan Yang, Guanzhou Chen, Xiaoliang Tan

PubTime: 2024-01-03

Downlink: http://arxiv.org/abs/2401.01643v1

GitHub: https://github.com/CVEO/S3Net|

中文摘要: 立体匹配和语义分割是双目卫星三维重建中的重要任务。然而，以前的研究主要将这些任务视为独立的并行任务，缺乏一个完整的多任务学习框架。本文介绍了一种解决方案，单分支语义立体网络（S3Net），它使用自融合（Self-Fuse）和互融合（Mutual-Fuse）模块，创新性地将语义分割和立体匹配结合起来。与以前独立利用语义或视差信息的方法不同，我们的方法识别并利用这两个任务之间的内在联系，从而更准确地理解语义信息并估计视差。在US3D数据集上的对比测试证明了我们的S3Net的有效性。我们的模型将语义分割中的mIoU从61.38提高到67.39，并将视差估计中的D1误差和平均端点误差（EPE）分别从10.051降低到9.579和从1.439降低到1.403，超过了现有的竞争方法。我们的代码可在以下网址查阅：https://github.com/CVEO/S3Net。

摘要: Stereo matching and semantic segmentation are significant tasks in binocular
satellite 3D reconstruction. However, previous studies primarily view these as
independent parallel tasks, lacking an integrated multitask learning framework.
This work introduces a solution, the Single-branch Semantic Stereo Network
(S3Net), which innovatively combines semantic segmentation and stereo matching
using Self-Fuse and Mutual-Fuse modules. Unlike preceding methods that utilize
semantic or disparity information independently, our method identifies and
leverages the intrinsic link between these two tasks, leading to a more
accurate understanding of semantic information and disparity estimation.
Comparative testing on the US3D dataset proves the effectiveness of our S3Net.
Our model improves the mIoU in semantic segmentation from 61.38 to 67.39, and
reduces the D1-Error and average endpoint error (EPE) in disparity estimation
from 10.051 to 9.579 and 1.439 to 1.403 respectively, surpassing existing
competitive methods. Our codes are available at: https://github.com/CVEO/S3Net.
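Several abstracts in this section report mIoU (mean intersection over union), so a small reference sketch of the metric may help when comparing the numbers. It uses the standard definition over flat label lists; the function name and the choice to skip classes that appear in neither prediction nor ground truth are assumptions of this example, and individual benchmarks may handle absent classes differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two equal-length lists of integer labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: intersection 1, union 2. Class 1: intersection 2, union 3.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 1], num_classes=2)
```

Papers usually report this value scaled by 100, which is why gains are quoted in "mIoU points".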
