Vision Framework - Building on Core ML

What You Can Do with Vision

  1. Face Detection
    • Small Faces
    • Strong Profiles
    • Partially Occluded
    • Hats and Glasses
    • Face Landmarks
  2. Image Registration
  3. Rectangle Detection
  4. Barcode Detection
  5. Text Detection
  6. Object Tracking
    • faces
    • rectangles
    • general templates
  7. Integration with Core ML
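
As a hedged illustration of item 7, the sketch below runs a Core ML model through Vision. The helper name classify(imageAt:using:) and the .centerCrop scaling option are assumptions made for this example, not something these notes prescribe. It also assumes the model is an image classifier, so results are read as VNClassificationObservation values.

```swift
import CoreML
import Foundation
import Vision

/// Hypothetical helper: classifies the image at `url` with a caller-supplied Core ML model.
func classify(imageAt url: URL, using mlModel: MLModel) throws -> [VNClassificationObservation] {
    // Wrap the Core ML model so that Vision can drive it.
    let visionModel = try VNCoreMLModel(for: mlModel)

    // Vision scales and crops the input to the size the model expects.
    let request = VNCoreMLRequest(model: visionModel)
    request.imageCropAndScaleOption = .centerCrop  // assumption: center crop suits the model

    let handler = VNImageRequestHandler(url: url, options: [:])
    try handler.perform([request])

    // A non-classifier model would yield a different observation type and an empty array here.
    return request.results as? [VNClassificationObservation] ?? []
}
```

Each returned observation carries an identifier and a confidence, so a caller would typically sort by confidence and show the top few labels.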

On Device vs. Cloud

  1. Privacy
    • Images and video stay on device
  2. Cost
    • No usage fees
    • No data transfer
  3. Real-time use cases
    • No network latency, fast execution

Vision Concepts

  1. Analyzing an Image


  2. Tracking in a Sequence


  3. Image Request Handler
    • For interactive exploration of an image
    • Holds on to the image for its lifecycle
    • Allows optimization of various requests performed on an image
  4. Sequence Request Handler
    • For anything that looks at images in a sequence like tracking
    • Does not optimize for multiple requests on an image
  5. Putting It into Code
import Vision

// Create the request
let faceDetectionRequest = VNDetectFaceRectanglesRequest()

// Create the request handler (fileURL is assumed to be the URL of an image on disk)
let myRequestHandler = VNImageRequestHandler(url: fileURL, options: [:])

// Send the requests to the request handler; perform(_:) throws on failure
try myRequestHandler.perform([faceDetectionRequest])

// Do we have a face?
for observation in faceDetectionRequest.results as? [VNFaceObservation] ?? [] {
    // Do something with observation.boundingBox
}

// Create a sequence request handler
let requestHandler = VNSequenceRequestHandler()

// Start the tracking with the observations from an earlier detection request
// (detectionRequest is assumed to be a request that has already been performed)
let observations = detectionRequest.results as? [VNDetectedObjectObservation] ?? []
let objectsToTrack = observations.map { VNTrackObjectRequest(detectedObjectObservation: $0) }

// Run the requests on the next frame (pixelBuffer is assumed to be a CVPixelBuffer)
try requestHandler.perform(objectsToTrack, on: pixelBuffer)

// Let's look at the results
for request in objectsToTrack {
    for observation in request.results as? [VNDetectedObjectObservation] ?? [] {
        // Do something with observation.boundingBox
    }
}
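
The point that an image request handler allows optimization across several requests can be sketched as follows: one handler holds the image, and several requests are performed against it in a single call. The ImageSummary type and the summarize(imageAt:) helper are hypothetical names introduced for this example only.

```swift
import Foundation
import Vision

/// Hypothetical summary type for this example.
struct ImageSummary {
    let faceCount: Int
    let rectangleCount: Int
    let barcodeCount: Int
}

/// Hypothetical helper: runs three detection requests against one image.
func summarize(imageAt url: URL) throws -> ImageSummary {
    let faces = VNDetectFaceRectanglesRequest()
    let rectangles = VNDetectRectanglesRequest()
    let barcodes = VNDetectBarcodesRequest()

    // One handler holds the image; all three requests are performed in one call.
    let handler = VNImageRequestHandler(url: url, options: [:])
    try handler.perform([faces, rectangles, barcodes])

    // results is nil when a request produced nothing, so fall back to zero.
    return ImageSummary(
        faceCount: faces.results?.count ?? 0,
        rectangleCount: rectangles.results?.count ?? 0,
        barcodeCount: barcodes.results?.count ?? 0
    )
}
```

Passing the requests together, rather than creating a new handler per request, is what lets the handler reuse the image it already holds.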

Which Image Type Is Right for Me?

  1. Vision supports various image types
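    • CVPixelBufferRef: frames from a video data output such as AVCaptureVideoDataOutput
    • CGImageRef: obtained from a UIImage
    • CIImage: images that come from Core Image
    • NSURL: an image file on disk
    • NSData: image data downloaded from the web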
  2. The image type to choose depends on where the image comes from
  3. You shouldn’t have to pre-scale the image
  4. Make sure to pass in the EXIF orientation of the image
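
A minimal sketch of passing the orientation along when the image comes from a UIImage. UIImage.Orientation and CGImagePropertyOrientation use different raw values, so the cases are mapped by name rather than by number. The makeRequestHandler(for:) helper is a hypothetical name for this example, and falling back to .up for unknown future cases is an assumption.

```swift
import ImageIO
import UIKit
import Vision

extension CGImagePropertyOrientation {
    /// Maps a UIKit orientation to the matching EXIF orientation, case by case.
    init(_ uiOrientation: UIImage.Orientation) {
        switch uiOrientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default: self = .up  // assumption: treat unknown cases as upright
        }
    }
}

/// Hypothetical helper: builds a request handler for a UIImage and forwards its orientation.
func makeRequestHandler(for image: UIImage) -> VNImageRequestHandler? {
    // A UIImage backed only by a CIImage has no cgImage; this sketch gives up in that case.
    guard let cgImage = image.cgImage else { return nil }
    return VNImageRequestHandler(
        cgImage: cgImage,
        orientation: CGImagePropertyOrientation(image.imageOrientation),
        options: [:]
    )
}
```

When the image is loaded from a URL or from data, the handler can read the orientation from the file's own metadata, so this mapping mainly matters for CGImage and pixel buffer inputs.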

What Am I Going to Do with the Image?

  1. Interactively explore the image
    • Use VNImageRequestHandler and hold onto it
    • Remember that the input image is immutable
  2. Tracking an observation
    • Use VNSequenceRequestHandler
    • Tracking state is kept in the VNSequenceRequestHandler
    • Lifecycle of images is not tied to the life of the VNSequenceRequestHandler
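
The tracking case can be sketched as a small class that keeps one VNSequenceRequestHandler alive across frames and feeds each result back in as the input for the next frame. The ObjectTracker class and its method names are hypothetical and exist only for this example; the frames themselves are passed in one at a time and are not retained.

```swift
import CoreGraphics
import CoreVideo
import Vision

/// Hypothetical wrapper: tracks a single object across successive frames.
final class ObjectTracker {
    // The sequence handler carries the tracking state, so it must outlive each frame.
    private let sequenceHandler = VNSequenceRequestHandler()

    // The most recent observation seeds the request for the next frame.
    private(set) var lastObservation: VNDetectedObjectObservation

    init(initialObservation: VNDetectedObjectObservation) {
        self.lastObservation = initialObservation
    }

    /// Returns the tracked object's normalized bounding box in the new frame, if any.
    func track(in pixelBuffer: CVPixelBuffer) throws -> CGRect? {
        let request = VNTrackObjectRequest(detectedObjectObservation: lastObservation)
        try sequenceHandler.perform([request], on: pixelBuffer)

        guard let observation = request.results?.first as? VNDetectedObjectObservation else {
            return nil
        }
        lastObservation = observation
        return observation.boundingBox
    }
}
```

A caller would create the tracker from a detection result, for example a face or rectangle observation, and then call track(in:) once per incoming frame.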

What Performance Do I Need or Want?

Vision tasks can be time-consuming and processing-intensive

  • Dispatch your work on a queue with an appropriate QoS
  • Use the completion handler to work with the results
  • The completion handler is called on the same queue on which the request is performed
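
The three points above can be sketched as follows, assuming a face detection request and the .userInitiated QoS class; both choices, along with the detectFaces(at:onFaces:) helper name, are assumptions for illustration. Because the completion handler runs on the queue that performs the request, the sketch hops to the main queue before handing results to the caller.

```swift
import Foundation
import Vision

/// Hypothetical helper: detects faces off the main thread and reports them on the main queue.
func detectFaces(at fileURL: URL, onFaces: @escaping ([VNFaceObservation]) -> Void) {
    // Assumption: .userInitiated fits work the user is actively waiting for.
    DispatchQueue.global(qos: .userInitiated).async {
        let request = VNDetectFaceRectanglesRequest { request, error in
            // Called on the same background queue that performs the request.
            guard error == nil, let faces = request.results as? [VNFaceObservation] else {
                return
            }
            // Hop to the main queue before touching any UI.
            DispatchQueue.main.async { onFaces(faces) }
        }

        let handler = VNImageRequestHandler(url: fileURL, options: [:])
        do {
            try handler.perform([request])
        } catch {
            // In this sketch onFaces is only called on success; failures are just logged.
            print("Vision request failed: \(error)")
        }
    }
}
```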
