Milan Todorovic
Let your iOS app read texts
#1 about 4 minutes
Introduction to the Vision framework for text recognition
The Vision framework simplifies incorporating optical character recognition (OCR) into iOS and macOS applications using Swift.
#2 about 4 minutes
Understanding the core Vision request workflow
The fundamental process involves creating an image request handler, defining a request, and then asking the handler to perform that request to get results.
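The three steps of that workflow can be sketched in Swift as follows. This is a minimal illustration, not code from the talk: the helper name `perform(_:on:)` is hypothetical, and `VNDetectTextRectanglesRequest` is used only as an example of a request that locates text regions without reading them.

```swift
import Vision
import CoreGraphics

// Hypothetical helper (not from the talk): runs one Vision request on one image.
func perform(_ request: VNRequest, on image: CGImage) throws {
    // Step 1: the handler wraps the image to be analysed.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    // Step 3: perform(_:) runs synchronously and can throw.
    try handler.perform([request])
}

// Step 2: define a request. Any VNRequest subclass fits here;
// this one only finds regions that contain text.
let detectRegions = VNDetectTextRectanglesRequest { request, error in
    if let error = error {
        print("Vision request failed: \(error)")
        return
    }
    let regions = request.results as? [VNTextObservation] ?? []
    print("Found \(regions.count) text regions")
}
```

Calling `try perform(detectRegions, on: someCGImage)` triggers the completion handler before the call returns, so it is usually kept off the main thread for large images.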
#3 about 2 minutes
Simplifying text recognition with VNRecognizeTextRequest
The modern API streamlines text recognition through the VNRecognizeTextRequest class, which returns candidate strings directly.
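As a rough sketch of what "candidate strings" means in practice: each result is a `VNRecognizedTextObservation`, and `topCandidates(_:)` returns its ranked guesses. The helper name `bestStrings(from:)` is an assumption made for this example.

```swift
import Vision

// Hypothetical helper: keeps the highest-ranked candidate of each observation.
func bestStrings(from observations: [VNRecognizedTextObservation]) -> [String] {
    // topCandidates(1) returns at most one VNRecognizedText per observation.
    observations.compactMap { $0.topCandidates(1).first?.string }
}

let textRequest = VNRecognizeTextRequest { request, error in
    guard let observations = request.results as? [VNRecognizedTextObservation] else {
        return
    }
    // Each string is one recognized line or fragment of text.
    bestStrings(from: observations).forEach { print($0) }
}
```

Asking for more than one candidate, for example `topCandidates(3)`, lets an app offer alternatives when the first guess looks wrong.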
#4 about 3 minutes
Choosing between fast and accurate recognition modes
A comparison of the 'fast' mode, which uses character detection, and the 'accurate' mode, which uses a neural network for whole-word recognition.
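Switching between the two modes is a single property on the request. The factory function below is a hypothetical convenience for illustration; only `recognitionLevel` and its `.fast` and `.accurate` values come from the Vision API.

```swift
import Vision

// Hypothetical factory: picks the recognition mode from a single flag.
func makeTextRequest(preferSpeed: Bool) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest()
    // .fast suits live camera frames; .accurate suits still images and documents.
    request.recognitionLevel = preferSpeed ? .fast : .accurate
    return request
}
```

A scanner app might use `makeTextRequest(preferSpeed: true)` for the live preview and the accurate variant once the user captures a photo.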
#5 about 4 minutes
Implementing the full workflow with advanced options
A complete code walkthrough shows how to set up the request, handle completion, and improve results with language correction and custom lexicons.
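A possible shape for the full workflow is sketched below. It is not the sample project's code: the function name `recognizeText(in:customWords:)`, the `"en-US"` language choice, and the decision to return plain strings are all assumptions made for the example.

```swift
import Vision
import CoreGraphics

// Hypothetical wrapper: returns the best string for each recognized line.
func recognizeText(in image: CGImage, customWords: [String] = []) throws -> [String] {
    var lines: [String] = []

    let request = VNRecognizeTextRequest { request, error in
        guard error == nil,
              let observations = request.results as? [VNRecognizedTextObservation] else {
            return
        }
        lines = observations.compactMap { $0.topCandidates(1).first?.string }
    }

    request.recognitionLevel = .accurate
    request.recognitionLanguages = ["en-US"]  // assumed language for this sketch
    request.usesLanguageCorrection = true     // applies language-based correction
    request.customWords = customWords         // ignored when correction is off

    // perform(_:) is synchronous, so the completion handler has
    // already filled `lines` by the time it returns.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    return lines
}
```

Passing domain terms such as product names in `customWords` gives them precedence over the standard lexicon, which can help with text the language model would otherwise "correct" away.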
#6 about 6 minutes
Live demo of scanning printed text from a book
A practical demonstration using a sample app to scan a page from a printed book, showing the high accuracy of the Vision framework.
#7 about 3 minutes
Demonstrating business card and receipt scanning
The demo continues by scanning a business card and a multi-language receipt, highlighting both successes and potential challenges with complex layouts.
#8 about 3 minutes
Recognizing handwritten text and a brief code overview
The final demo shows the framework's capability to recognize handwritten text, followed by a quick look at the relevant Swift code in the sample project.
#9 about 5 minutes
Resources and other capabilities of the Vision framework
Learn where to find documentation and tutorials, and discover other Vision features like hand and body pose detection or image classification.
#10 about 3 minutes
On-device processing and cross-platform considerations
The talk closes with the benefits of on-device processing for speed, security, and privacy, along with potential alternatives for Android and Flutter developers.