Vision Tasks Overview¶
MediaPipe4U internally uses a unified and general-purpose pipeline to handle vision tasks, including but not limited to motion capture and facial expression capture.
Vision Task Workflow¶
Vision tasks refer to tasks that apply AI algorithms to images. We have defined a unified vision task processing pipeline to accomplish these tasks.
Tip
Motion capture and facial expression capture are typical vision tasks.
The vision task processing workflow is defined in the MediaPipe4U plugin.
The motion capture components are defined in the MediaPipe4UMotion plugin.
The facial expression capture components are defined in the MediaPipe4ULiveLink plugin.
sequenceDiagram
autonumber
Image Source->>Image Consumer: Get one image frame
loop For each consumer
Image Consumer->>Image Consumer: Process one frame
end
Note right of Image Consumer: Synchronous processing, one consumer at a time
Image Consumer->>Unreal Engine: Integration into Unreal Engine
Image Consumer-->>Image Source: Poll next frame!
Brief workflow explanation:
- Image Source is responsible for obtaining an image frame from various media sources.
- Image Consumer is responsible for processing an image frame. There can be multiple Image Consumers in the pipeline, and the same frame is distributed to each Image Consumer in turn.
- Image Consumer integrates with Unreal Engine and sends the processed results back to Unreal Engine.
- After all Image Consumers have finished processing, Image Source pulls the next frame through a Poll operation to begin the next processing cycle.
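The cycle described above can be sketched in plain C++ as a poll-driven loop. This is only an illustration of the control flow, not the plugin's implementation: `Frame`, `ImageSource`, `ImageConsumer`, `CountingSource`, `RecordingConsumer`, and `RunPipeline` are hypothetical names invented for this sketch, and the log entry written by each consumer stands in for the hand-off to Unreal Engine.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT MediaPipe4U types.
struct Frame {
    int Index;
    std::vector<std::uint8_t> Pixels;
};

// Produces frames on demand. Poll returns false once the media has ended.
class ImageSource {
public:
    virtual ~ImageSource() = default;
    virtual bool Poll(Frame& OutFrame) = 0;
};

// Processes one frame and passes its result on (here: to a log).
class ImageConsumer {
public:
    virtual ~ImageConsumer() = default;
    virtual void Process(const Frame& InFrame) = 0;
};

// A source that yields a fixed number of blank 4-pixel frames.
class CountingSource : public ImageSource {
public:
    explicit CountingSource(int InTotal) : Total(InTotal), Next(0) {}
    bool Poll(Frame& OutFrame) override {
        if (Next >= Total) return false;
        OutFrame.Index = Next++;
        OutFrame.Pixels.assign(4, 0);
        return true;
    }
private:
    int Total;
    int Next;
};

// A consumer that records "<name>:<frame index>" for every frame it sees.
class RecordingConsumer : public ImageConsumer {
public:
    RecordingConsumer(std::string InName, std::vector<std::string>& InLog)
        : Name(std::move(InName)), Log(InLog) {}
    void Process(const Frame& InFrame) override {
        Log.push_back(Name + ":" + std::to_string(InFrame.Index));
    }
private:
    std::string Name;
    std::vector<std::string>& Log;
};

// Polls one frame, gives it to every consumer in turn, then polls again.
// Returns the number of frames that went through the loop.
int RunPipeline(ImageSource& Source, const std::vector<ImageConsumer*>& Consumers) {
    int Processed = 0;
    Frame Current;
    while (Source.Poll(Current)) {
        for (ImageConsumer* Consumer : Consumers) {
            assert(Consumer != nullptr);
            Consumer->Process(Current);  // synchronous: one consumer at a time
        }
        ++Processed;
    }
    return Processed;
}
```

With two consumers named "pose" and "face" and a two-frame source, the log reads pose:0, face:0, pose:1, face:1. Every consumer finishes the current frame before the next one is polled.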
Components and Abstractions¶
- Image Source: Corresponds to the ImageSourceComponent interface in MediaPipe4U.
- Image Consumer: Corresponds to the IImageConsumer or IImageConsumerProvider interface in MediaPipe4U.
Built-in Components¶
MediaPipe4U benefits from a unified vision task processing pipeline and comes with several built-in Image Source components and two main Image Consumer components:
- ImageSourceComponent: MediaPipe4U includes a variety of image providers capable of acquiring images from common types of media. For more information about Image Source, please read the Image Source documentation.
- MediaPipeHolisticComponent: An Image Consumer that uses Google's MediaPipe algorithm to process images and calculate character joint rotations, which are then applied to 3D skeletal meshes in Unreal Engine.
- MediaPipeFaceLinkActor: An Image Consumer that uses algorithms (e.g., MediaPipe, Nvidia Maxine AR) to compute facial expression coefficients from images and transmits these coefficients to Unreal Engine via the Live Link data protocol.
Extension Support (C++)¶
Image Consumer
MediaPipe4U allows you to register your own Image Consumer to use our vision task pipeline for handling additional vision tasks.
For more information about Image Consumer, please refer to the Image Consumer documentation.
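To give a feel for what a custom vision task might look like, the following is a minimal, self-contained sketch of a consumer that computes the mean brightness of each frame and is registered with a small dispatcher. All names here (`GrayFrame`, `FrameConsumer`, `BrightnessConsumer`, `ConsumerRegistry`) are hypothetical and exist only for this example. The actual registration API and the signatures of IImageConsumer and IImageConsumerProvider are described in the Image Consumer documentation and may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical frame type for illustration; not a MediaPipe4U type.
struct GrayFrame {
    std::vector<std::uint8_t> Pixels;
};

// Hypothetical consumer interface; the real one is IImageConsumer.
class FrameConsumer {
public:
    virtual ~FrameConsumer() = default;
    virtual void Process(const GrayFrame& InFrame) = 0;
};

// Custom vision task: mean brightness of the most recent frame.
class BrightnessConsumer : public FrameConsumer {
public:
    void Process(const GrayFrame& InFrame) override {
        if (InFrame.Pixels.empty()) {
            LastMean = 0.0;
            return;
        }
        const double Sum =
            std::accumulate(InFrame.Pixels.begin(), InFrame.Pixels.end(), 0.0);
        LastMean = Sum / static_cast<double>(InFrame.Pixels.size());
    }
    double GetLastMean() const { return LastMean; }
private:
    double LastMean = 0.0;
};

// Hypothetical registry standing in for the pipeline's registration point.
class ConsumerRegistry {
public:
    // Null consumers are ignored rather than stored.
    void Register(std::shared_ptr<FrameConsumer> InConsumer) {
        if (InConsumer) Consumers.push_back(std::move(InConsumer));
    }
    // Hands the same frame to every registered consumer in turn.
    void Dispatch(const GrayFrame& InFrame) const {
        for (const auto& Consumer : Consumers) {
            assert(Consumer != nullptr);
            Consumer->Process(InFrame);
        }
    }
    std::size_t Num() const { return Consumers.size(); }
private:
    std::vector<std::shared_ptr<FrameConsumer>> Consumers;
};
```

The design point is that the custom consumer only implements the per-frame processing step. Frame acquisition and the order in which consumers run remain the pipeline's responsibility.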