最近在学udacity的机器人课程, 陆续会将学到的内容总结成笔记发上来, 今天介绍第一个作业项目。
要完成这一项目需要以下准备工作:
本项目的任务是在模拟器中操作小车自动导航并寻找和采集地图中的矿石。本章分为两个部分:第一部分介绍感知, 通过小车的摄像头记录当前位置的路况景象,对图像处理标注障碍物,道路和矿石的坐标位置;第二部分介绍决策和控制, 主要讲使用第一部分的输出进行决策并控制小车在地图中采集矿石的过程。
# Identify pixels above the threshold
# Threshold of RGB > 160 does a nice job of identifying ground pixels only
def color_thresh(img, rgb_thresh=(160, 160, 160)):
# Create an array of zeros same xy size as img, but single channel
navigable_layer = np.zeros_like(img[:,:,0])
obstacle_layer = np.zeros_like(img[:,:,0])
sampleRocks_layer = np.zeros_like(img[:,:,0])
# Threshold the image
sampleRocks_thresh = (20,100,100)
hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
navigable = (img[:,:,0] > rgb_thresh[0]) \
& (img[:,:,1] > rgb_thresh[1]) \
& (img[:,:,2] > rgb_thresh[2])
sampleRocks = (hsv[:, :, 0] > sampleRocks_thresh[0]) \
& (hsv[:, :, 1] > sampleRocks_thresh[1]) \
& (hsv[:, :, 2] > sampleRocks_thresh[2])
obstacle = (img[:, :, 0] <= rgb_thresh[0]) \
& (img[:, :, 1] <= rgb_thresh[1]) \
& (img[:, :, 2] <= rgb_thresh[2])
# Return the binary images for ground, rock and obstacles
navigable_layer[navigable] = 1
return navigable_layer, obstacle, sampleRocks
由于矿石,道路和山脉的颜色对比比较鲜明,所以设置合适的阈值后,很容易将三者分离出来。为了规划小车行进的路线,需要从摄像头视角转换为更容易控制小车导航的地图视角,即根据地图坐标系标注小车位置,用极坐标控制小车前进的方向。
# Define a function to perform a perspective transform
# I've used the example grid image above to choose source points for the
# grid cell in front of the rover (each grid cell is 1 square meter in the sim)
# Define a function to perform a perspective transform
def perspect_transform(img, src, dst):
M = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(img, M, (img.shape[1], img.shape[0]))# keep same size as input image
return warped
2. 以车为中心的笛卡尔平面
把鸟瞰图转换为以车为中心的笛卡尔坐标,通过这一步转换,得到以小车为原点的周围路况的坐标分布,方便小车决策以调整行驶角度和速度。
# Define a function to convert from image coords to rover coords
def rover_coords(binary_img):
# Identify nonzero pixels
ypos, xpos = binary_img.nonzero()
print(ypos, xpos)
# Calculate pixel positions with reference to the rover position being at the
# center bottom of the image.
x_pixel = -(ypos - binary_img.shape[0]).astype(np.float)
y_pixel = -(xpos - binary_img.shape[1] / 2).astype(np.float)
return x_pixel, y_pixel
# Grab another random image
idx = np.random.randint(0, len(img_list)-1)
image = mpimg.imread(img_list[idx])
warped = perspect_transform(image, source, destination)
navigable,obstacle,sampleRocks = color_thresh(warped)
# Calculate pixel values in rover-centric coords and distance/angle to all pixels
xpix, ypix = rover_coords(navigable)
# Define a function to map rover space pixels to world space
def rotate_pix(xpix, ypix, yaw):
# Convert yaw to radians
yaw_rad = yaw * np.pi / 180
xpix_rotated = (xpix * np.cos(yaw_rad)) - (ypix * np.sin(yaw_rad))
ypix_rotated = (xpix * np.sin(yaw_rad)) + (ypix * np.cos(yaw_rad))
# Return the result
return xpix_rotated, ypix_rotated
def translate_pix(xpix_rot, ypix_rot, xpos, ypos, scale):
# Apply a scaling and a translation
xpix_translated = (xpix_rot / scale) + xpos
ypix_translated = (ypix_rot / scale) + ypos
# Return the result
return xpix_translated, ypix_translated
# Define a function to apply rotation and translation (and clipping)
# Once you define the two functions above this function should work
def pix_to_world(xpix, ypix, xpos, ypos, yaw, world_size, scale):
# Apply rotation
xpix_rot, ypix_rot = rotate_pix(xpix, ypix, yaw)
# Apply translation
xpix_tran, ypix_tran = translate_pix(xpix_rot, ypix_rot, xpos, ypos, scale)
# Clip to world_size
x_pix_world = np.clip(np.int_(xpix_tran), 0, world_size - 1)
y_pix_world = np.clip(np.int_(ypix_tran), 0, world_size - 1)
# Return the result
return x_pix_world, y_pix_world
# Define a function to apply rotation and translation (and clipping)
# Once you define the two functions above this function should work
def pix_to_world(xpix, ypix, xpos, ypos, yaw, world_size, scale):
# Apply rotation
xpix_rot, ypix_rot = rotate_pix(xpix, ypix, yaw)
# Apply translation
xpix_tran, ypix_tran = translate_pix(xpix_rot, ypix_rot, xpos, ypos, scale)
# Clip to world_size
x_pix_world = np.clip(np.int_(xpix_tran), 0, world_size - 1)
y_pix_world = np.clip(np.int_(ypix_tran), 0, world_size - 1)
# Return the result
return x_pix_world, y_pix_world
image = mpimg.imread('angle_example.jpg')
warped = perspect_transform(image) # Perform perspective transform
colorsel = color_thresh(warped, rgb_thresh=(160, 160, 160)) # Threshold the warped image
xpix, ypix = rover_coords(colorsel) # Convert to rover-centric coords
distances, angles = to_polar_coords(xpix, ypix) # Convert to polar coords
avg_angle = np.mean(angles) # Compute the average angle
avg_angle_degrees = avg_angle * 180/np.pi
steering = np.clip(avg_angle_degrees, -15, 15)
以上就是第一部分感知的内容, 核心的任务就是把摄像头视角的图片中的道路和矿石映射到以小车为中心的极坐标系中,最终得到小车前进的方向和方向上可以前进的距离作为参数供第二部分决策。
在做决策之前,先创造一个小车类并定义一些辅助做决策的属性。
# Define RoverState() class to retain rover state parameters
class RoverState():
def __init__(self):
self.start_time = None # To record the start time of navigation
self.total_time = None # To record total duration of naviagation
self.img = None # Current camera image
self.pos = None # Current position (x, y)
self.yaw = None # Current yaw angle
self.pitch = None # Current pitch angle
self.roll = None # Current roll angle
self.vel = None # Current velocity
self.steer = 0 # Current steering angle
self.throttle = 0 # Current throttle value
self.brake = 0 # Current brake value
self.nav_angles = None # Angles of navigable terrain pixels
self.nav_dists = None # Distances of navigable terrain pixels
self.ground_truth = ground_truth_3d # Ground truth worldmap
self.mode = 'forward' # Current mode (can be forward or stop)
self.throttle_set = 0.2 # Throttle setting when accelerating
self.brake_set = 10 # Brake setting when braking
# The stop_forward and go_forward fields below represent total count
# of navigable terrain pixels. This is a very crude form of knowing
# when you can keep going and when you should stop. Feel free to
# get creative in adding new fields or modifying these!
self.stop_forward = 50 # Threshold to initiate stopping
self.go_forward = 500 # Threshold to go forward again
self.max_vel = 2 # Maximum velocity (meters/second)
# Image output from perception step
# Update this image to display your intermediate analysis steps
# on screen in autonomous mode
self.vision_image = np.zeros((160, 320, 3), dtype=np.float)
# Worldmap
# Update this image with the positions of navigable terrain
# obstacles and rock samples
self.worldmap = np.zeros((200, 200, 3), dtype=np.float)
self.samples_pos = None # To store the actual sample positions
self.samples_found = 0 # To count the number of samples found
self.near_sample = False # Set to True if within reach of a rock sample
self.pick_up = False # Set to True to trigger rock pickup
做决策的逻辑简单,但代码冗长,所以就不贴源码了,感兴趣的可以下载代码查看,简单总结决策的逻辑就是小车有两种工作模式,运动和停止,在运动模式下当前面的可通行路长度超过stop_forward,则继续前进,否则停止,进入停止模式如果速度大于0.5 则踩刹车并左转,当速度小于0.5只转向,直到可视范围内的道路长度大于500米,开始加速进入前进模式。
根据决策系统发出的angle, throttle和brake指令就可以控制小车在地图中行驶了,在行驶的过程中,同时运行第一部分探测矿石的函数, 当探测到矿石时,到达矿石位置,并为模拟器发送采集矿石命令。如果小车能成功在地图中自动驾驶并采集矿石,则完成了本项目的全部任务。
def telemetry(sid, data):
if data:
global Rover
# Initialize / update Rover with current telemetry
Rover, image = update_rover(Rover, data)
if np.isfinite(Rover.vel):
# Execute the perception and decision steps to update the Rover's state
Rover = perception_step(Rover)
Rover = decision_step(Rover)
# Create output images to send to server
out_image_string1, out_image_string2 = create_output_images(Rover)
# The action step! Send commands to the rover!
commands = (Rover.throttle, Rover.brake, Rover.steer)
send_control(commands, out_image_string1, out_image_string2)
# If in a state where want to pickup a rock send pickup command
if Rover.pick_up and not Rover.picking_up:
send_pickup()
# Reset Rover flags
Rover.pick_up = False
# In case of invalid telemetry, send null commands
else:
# Send zeros for throttle, brake and steer and empty images
send_control((0, 0, 0), '', '')
# If you want to save camera images from autonomous driving specify a path
# Example: $ python drive_rover.py image_folder_path
# Conditional to save image frame if folder was specified
if args.image_folder != '':
timestamp = datetime.utcnow().strftime('%Y_%m_%d_%H_%M_%S_%f')[:-3]
image_filename = os.path.join(args.image_folder, timestamp)
image.save('{}.jpg'.format(image_filename))
else:
sio.emit('manual', data={}, skip_sid=True)