WWDC 2019 presented features, tools and devices that have the potential to change the industry. From One-Shot Object Detection to RealityKit and ARKit 3, let’s get into the best of the five-day event.
The event started with a Keynote presentation, where Apple presented the highlights. After that, four conference halls were jam-packed from 9 a.m. to 6 p.m. every day. The topics catered to both the expert developers and the curious users in the audience.
As a software design and engineering company, we’ll be focusing on the biggest tech that will affect development, and will define our approach to delivering top-notch products for our clients.
STRV team at WWDC: Jakub Vodak, Jan Schwarz, Matej Obrtanec, Alexandre Tavares
ONE-SHOT OBJECT DETECTION
About Turi Create
Turi Create is a cross-platform, open-source framework that simplifies the development of custom machine learning models that can be exported into the native Core ML format. Exported models can then be used in iOS, macOS, watchOS and tvOS applications to enhance their behavior or to support tasks that were previously impossible, such as recommending favorite movies or detecting objects and their location within a provided image. To train a simple object detector, we only need a set of images and as few as six lines of code.
At WWDC 2019, Turi Create gained two entirely new toolkits alongside the existing ones: One-Shot Object Detection and Drawing Classification.
Object Detection Using Object Detector
The current Object Detection toolkit allows you to train a model that provides predictions about what kind of objects are present in the image, as well as their location in the image. For example, a model trained to detect cards from a deck of 52 playing cards would provide the following prediction:
To train our object detection model to perform reasonably well, we would have to provide at least 30 images for each of the 52 playing cards (more than 1,560 images in total), along with annotations describing the location of the card in each image. Collecting and annotating this many images would be a long and tedious process. Luckily, One-Shot Object Detection comes to the rescue.
One-Shot Object Detection
With One-Shot Object Detection, you can dramatically reduce the amount of data needed to train your model. Compared to our Object Detection example, with its 1,560+ manually collected and manually annotated images, you can achieve similar or even better results using One-Shot Object Detection with as few as one image per card type, and no annotations.
In this case, you would start by providing one image per card type, containing only the object you wish to detect (in our case, the card). The One-Shot Object Detection algorithm then places these images onto thousands of real-world images, artificially generating a large training set. The model trained on the generated images then provides great predictions with very little input data and, most importantly, with no manual annotations needed.
source: WWDC 2019 - Presentation Slides (example of artificially generated images with card images)
Before starting with One-Shot Object Detection, keep in mind that it only supports 2D objects. For 3D objects, a more traditional Object Detection toolkit is recommended.
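To make the idea of artificially generated training data more concrete, here is a minimal sketch in plain Python. It is not Turi Create's implementation: the function names (composite, generate_dataset) and the tiny list-based "images" are our own assumptions for illustration. It only shows why no manual annotation is needed: because the code decides where each card is pasted, it already knows the bounding box.

```python
import random


def composite(background, card, rng):
    """Paste `card` onto a copy of `background` at a random position.

    Images are plain 2D lists of pixel values (a stand-in for real images).
    Returns the new image and the bounding box of the pasted card.
    """
    bg_h, bg_w = len(background), len(background[0])
    card_h, card_w = len(card), len(card[0])
    # Pick a top-left corner so the card fits entirely inside the background.
    x = rng.randint(0, bg_w - card_w)
    y = rng.randint(0, bg_h - card_h)
    image = [row[:] for row in background]  # copy, keep the original intact
    for dy in range(card_h):
        for dx in range(card_w):
            image[y + dy][x + dx] = card[dy][dx]
    # The annotation comes for free: we chose the position ourselves.
    return image, {"x": x, "y": y, "width": card_w, "height": card_h}


def generate_dataset(backgrounds, cards, samples_per_card, seed=0):
    """Create labeled, annotated samples from one image per card type."""
    rng = random.Random(seed)
    dataset = []
    for label, card in cards.items():
        for _ in range(samples_per_card):
            background = rng.choice(backgrounds)
            image, box = composite(background, card, rng)
            dataset.append({"image": image, "label": label, "box": box})
    return dataset


# Hypothetical toy data: two 8x6 backgrounds and a single 2x3 "card".
backgrounds = [[[0] * 8 for _ in range(6)], [[1] * 8 for _ in range(6)]]
cards = {"ace_of_spades": [[9, 9], [9, 9], [9, 9]]}
dataset = generate_dataset(backgrounds, cards, samples_per_card=5)
```

A real pipeline would presumably also vary scale, rotation and lighting, which this sketch leaves out.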
Drawing Classification is a newly introduced toolkit that allows you to train your models using Apple Pencil input.
Benefits of Using Drawing Classification
- The resulting Core ML model is less than 500 kB in size on the device.
- As few as 30 source images per class are needed, because the toolkit builds on a model pre-trained on millions of drawings.
Along with Drawing Classification, PencilKit provides an API that allows you to extract a cropped image of your drawing. The cropped image can then be passed to the Vision API for interaction with the Core ML model.
Just to provide a real-world example, you can use Drawing Classification to recognize strokes done with Apple Pencil and automatically replace them with predefined images. (Notice the star drawing being replaced by the star image.)
source: WWDC 2019 - Presentation Slides
One of the greatest things about WWDC? The labs. It’s where you get the chance to meet and talk to Apple engineers, discussing your project or an issue that you're facing.
The labs ran for the whole week, with multiple topics every day. Signing up in advance was recommended to skip the queue. With around 12 engineers available for each topic, everyone had a chance to ask questions.
Our STRV crew discussed a number of our current projects with the engineers. Currently, we have a couple of IoT and “smart devices” projects in progress, so we wanted to discuss BLE connectivity, specifically background execution. An app can be woken up by the BLE device and has around 10 seconds to complete a task. In that time, you can discover and connect to a device as well as transfer your data. And as we are able to perform changes in the BLE device's firmware, it was great to discuss the data exchange options.
Apple has also introduced improvements to the Core Bluetooth framework, such as privacy enhancements, LE 2 Mbps (a Bluetooth 5.0 feature that raises the speed from 1 Mbps), advertising extensions, and support for BR/EDR and dual-mode devices. We asked about it all, ensuring we understand how to adopt everything in the future.
Another big topic for STRV is live video streaming, as we have been dealing with apps using this technology. When speaking about streaming, it's always a matter of latency and how much the delay will affect the user experience. Apple introduced HLS (HTTP Live Streaming) about a decade ago, but its latency could grow up to 20 seconds, which is too long for interactive use cases. However, Apple is now offering a new Low-Latency mode, in which latencies of less than two seconds are achievable over public networks at scale. This lets HLS compete with WebRTC, another low-latency streaming technology that is used by Facebook Messenger, Google Hangouts and other video chat and live streaming platforms. It was great to see it in action.
Augmented reality is a huge topic. It looks cool, it brings virtual content to the real world, and it is challenging and therefore appealing to developers. A few years ago, Apple started to invest its energy into incorporating augmented reality into apps, and at WWDC 2017 it announced the ARKit framework, with an initial set of features that allowed basic understanding of and interaction with the surrounding world. Since then, new features and improvements have been announced every year.
Some of the most outstanding updates in ARKit 3 are people occlusion and body motion capture. People occlusion is demonstrated in the following picture.
Before ARKit 3, the mechanical soldier behind the woman would cover her, which was obviously wrong because he is placed farther from the camera. Thanks to machine learning algorithms, ARKit 3 can recognize people in a scene and understand how far they are from the camera, which allows us to create a much more realistic feel in a scene with people.
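The principle behind people occlusion can be sketched as a per-pixel depth comparison. The snippet below is a conceptual illustration in plain Python, not ARKit's API: the function name, the list-based "frames" and the depth values are all assumptions made up for this example. A virtual pixel is drawn only if no detected person is closer to the camera at that pixel.

```python
def composite_with_occlusion(camera, virtual, virtual_depth, person_depth):
    """Draw virtual content only where it is closer than any detected person.

    All arguments are 2D lists of the same size. `virtual` holds None where
    the virtual object does not cover a pixel; `person_depth` holds None where
    no person was detected. Depths are distances from the camera in meters.
    """
    output = []
    for y, row in enumerate(camera):
        out_row = []
        for x, camera_pixel in enumerate(row):
            virtual_pixel = virtual[y][x]
            if virtual_pixel is None:
                # Nothing virtual here, keep the camera image.
                out_row.append(camera_pixel)
                continue
            depth = person_depth[y][x]
            if depth is not None and depth < virtual_depth[y][x]:
                # A person stands in front of the virtual object.
                out_row.append(camera_pixel)
            else:
                out_row.append(virtual_pixel)
        output.append(out_row)
    return output


# Hypothetical one-row frame: the soldier is 3 m away, the woman 1.5 m away.
camera = [["background", "woman", "woman"]]
virtual = [["soldier", "soldier", None]]
virtual_depth = [[3.0, 3.0, None]]
person_depth = [[None, 1.5, 1.5]]
frame = composite_with_occlusion(camera, virtual, virtual_depth, person_depth)
```

Without the person depth (the situation before ARKit 3), the middle pixel would show the soldier drawn over the woman.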
But Apple didn't stop there. ARKit is now able to not only detect people in a scene but to also understand body motions and provide developers with information about the position of a body in the scene, and the positions of all main joints relative to the center of the body. This feature unlocks many opportunities. For example, we are now able to make a virtual character follow the movements of a real person. We can place virtual clothes on a customer who wants to try them on before buying them in an e-shop. We can train a machine-learning model to understand body movements and let us respond to them. There are countless use cases where this feature can be utilized.
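Since joint positions are reported relative to the center of the body, placing something at a joint in the scene means combining two transforms. Here is a minimal sketch of that math in plain Python, assuming 4x4 row-major transform matrices; the helper names and the example numbers are ours and do not come from ARKit.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices given as row-major nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def translation(tx, ty, tz):
    """Build a 4x4 transform that only moves a point by (tx, ty, tz)."""
    return [
        [1, 0, 0, tx],
        [0, 1, 0, ty],
        [0, 0, 1, tz],
        [0, 0, 0, 1],
    ]


def joint_world_position(body_transform, joint_transform):
    """Return the joint's position in scene coordinates.

    `body_transform` places the body center in the scene, and
    `joint_transform` places the joint relative to the body center.
    """
    world = mat_mul(body_transform, joint_transform)
    return (world[0][3], world[1][3], world[2][3])


# Hypothetical body: centered at (1, 0, -2) and turned 90 degrees around Y.
body_transform = [
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [-1, 0, 0, -2],
    [0, 0, 0, 1],
]
# Hypothetical hand joint, 0.5 m to the side of and 0.25 m above the center.
hand_transform = translation(0.5, 0.25, 0)
hand_position = joint_world_position(body_transform, hand_transform)
```

With these numbers the hand ends up at (1.0, 0.25, -2.5) in the scene, which is where a virtual character's hand or a piece of virtual clothing would have to be drawn.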
But that’s not all. One small but nice enhancement is that unlike the previous versions, ARKit 3 is capable of tracking more than one face. We can now easily build an app where you can try new lipstick shades on with your friends to see which shade matches which person best.
Another slightly hidden but huge addition is collaborative sessions. Before ARKit 3, we were able to capture all known information about the environment around a user and either share it with other devices or reload it on the app's next launch. Meaning: if you used a virtual blackboard, it was possible to save everything you drew on that blackboard together with the exact information of where it was placed in the world. And when you launched the app the next time, your content appeared right where it was before.
There hadn’t been an easy way to do this in real time with two users writing on one blackboard. But thanks to ARKit 3, this limitation is gone, as multiple devices are now able to combine their understanding of the environment around them, and ARKit can even tell where the other devices are in relation to you. This obviously has huge potential in gaming because it is now simple to create a game where you compete with your friends in the real world.
Those are some of the most interesting changes in ARKit 3. But among everything this year's WWDC had to offer, one big announcement regarding augmented reality really stood out.
As part of Apple’s strategy to make every tool as easy to use as possible, the company decided to bring AR closer to even those developers who are still apprehensive about it.
RealityKit and Reality Composer
If you’ve ever created an app with an AR experience, you probably noticed that it is extremely easy to bring your virtual content to a scene, but it is very challenging to make the virtual content look natural in the real world. You have to add matching lighting to the scene, set up shadows for all objects, define how they will interact with each other, blur objects that are out of camera focus, and so on. Well, RealityKit solves this for us.
RealityKit is a high-level rendering framework built on top of ARKit. In layman’s terms, RealityKit consumes information coming from ARKit about detected planes, faces and bodies, about camera positioning in the world, about lighting conditions and more, and renders highly realistic virtual content on top of it. Everything is, of course, accessible through a very simple and comprehensive API, so developers can achieve amazing results with just a few lines of code.
Along with RealityKit, Apple also introduced Reality Composer, an easy-to-use studio where you can design all the virtual content and its behavior. The result can later be placed into the real world with RealityKit.
Like everything, this new tool has its pros and cons. On the one hand, a high-level framework is probably not usable for applications with specific requirements for custom behavior. On the other hand, RealityKit opens up the world of augmented reality to any developer. Whether you want to create a game for playing soccer on a table, demonstrate simple physics experiments to students, or just see a visualization of your future living room, you no longer need rocket scientists to create this for you. Any qualified iOS developer is now capable of making it happen.
IPAD APPS FOR MACOS
For apps designed with the iPad in mind, one nice addition would be porting them to macOS. But that involves rewriting a bunch of code, as macOS doesn't use the UIKit APIs.
That changes with Xcode 11. If you already have a great iPad app, all you need to do is tick a checkbox and you can have your app on both platforms. That will make even more sense as iPadOS gets closer to macOS.
Games like Asphalt 9, which was shown at WWDC, looked pretty good running on a Mac. I’d bet we will see many more iPad games on the Mac App Store soon.
There are also many apps that are currently being used in web browsers, like tools for task management, document editing or real-time communication. These are some examples of websites where users have to actively interact with content; providing them in the form of native apps would create a much better user experience. Luckily, if these services have apps optimized for iPads, they can now be brought to macOS.
Of course, not all iPad apps make sense on a Mac; apps that rely on features like AR, the rear camera or GPS navigation have little reason to be available for macOS. And while the heavy work is handled automatically, it's always worth considering macOS-only features, like the menu bar and keyboard shortcuts.
Siri Shortcuts was introduced in iOS 12. Now, in iOS 13, a number of changes may make you want to start using it in your app.
First of all, the Shortcuts app will come pre-installed on iOS 13 and iPadOS, so that’s one step less for the users, which means that more people are going to try it. Secondly, it has become easier for users to create shortcuts. Users no longer need to record a phrase for Siri to understand; the app can suggest a predefined phrase for a Shortcut, or the user can type it.
Additionally, with Apple adding parameters to Shortcuts, the user can interact with the app in a conversational manner. The app can create custom intents for common user actions, like booking a table at a restaurant or ordering food, and Siri will be able to talk to the user to get enough information to complete the intent.
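The conversational flow described above boils down to asking follow-up questions until every required parameter of the intent is known. The sketch below illustrates that idea in plain Python; it is not the SiriKit API, and the names (resolve_intent, BOOK_TABLE) as well as the prompts are hypothetical.

```python
def resolve_intent(required_parameters, provided, ask):
    """Fill in missing intent parameters by asking follow-up questions.

    `required_parameters` maps a parameter name to the question to ask,
    `provided` holds what the user already said, and `ask` is a callable
    that poses a question and returns the user's answer.
    """
    resolved = dict(provided)
    for name, prompt in required_parameters.items():
        if name not in resolved:
            resolved[name] = ask(prompt)
    return resolved


# Hypothetical intent definition for booking a table at a restaurant.
BOOK_TABLE = {
    "restaurant": "Which restaurant?",
    "party_size": "For how many people?",
    "time": "At what time?",
}

# Stand-in for the voice assistant: scripted answers instead of speech.
scripted_answers = iter(["4", "7 pm"])
questions = []


def ask(prompt):
    questions.append(prompt)
    return next(scripted_answers)


# The user already named the restaurant, so only two questions are asked.
booking = resolve_intent(BOOK_TABLE, {"restaurant": "Luigi's"}, ask)
```

In a real app, the intent and its parameters are defined by the developer and the questions are asked by Siri; the loop above only mirrors the order of events.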
That’s it! Hope you enjoyed STRV’s selection of the most intriguing information, features and tools from WWDC 2019. We chose to skip topics like Dark Mode, the new Mac Pro, Sign in with Apple, SwiftUI and others, but I believe that this article covers the most crucial changes: those that will bring apps for iPhones, iPads and Macs to a whole new level.