
From the Arch Linux wiki:
PipeWire is a new low-level multimedia framework. It aims to offer capture and playback for both audio and video with minimal latency.
PipeWire is the default audio server on many recent Linux distributions. But what is an audio server?
Audio Server
On a modern Linux system, multiple applications can produce audio at any given time, for example the VLC video player and the Firefox browser. These applications are called audio sources. In order to play the audio, the digital audio data must be passed to the computer’s sound card or HDMI output (which feeds, for example, the audio jack on your monitor). The devices that convert digital audio information into sound waves that we can hear are called audio sinks.
The audio server performs two functions: routing and mixing.
Sources and sinks are in a many-to-many (n:m) relationship. Multiple sources can be sent to one sink; for example, you can play a game and listen to music at the same time. Though rare, the same source can also be sent to multiple sinks. One example would be two-person audio/video editing. Sources and sinks do not know by themselves which peers to connect to. The audio server is responsible for routing each source to its intended sink.
We’re all accustomed to hearing different sounds played from the same speaker. Have you ever listened to music during a boring meeting? Under the hood, it’s more complicated than that, though. An audio device can only accept one data stream at a time, so you cannot send two streams of audio data to one sound card, just as you cannot plug one pair of headphones into two MP3 players: it only has one plug! This is where the audio server steps in. The audio server mixes multiple audio streams into one and sends the resulting stream to the audio sink. If the sink’s sample rate differs from that of the sources, the audio server will also upsample or downsample the audio.
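To make the mixing step concrete, here is a minimal sketch in Python. It assumes every stream already has the same sample rate and uses floating-point samples between -1.0 and 1.0. The function name `mix` and the sample values are made up for illustration; this is not how PipeWire itself implements mixing.

```python
def mix(streams, lo=-1.0, hi=1.0):
    """Mix several equal-rate streams of float samples into one stream."""
    length = max((len(s) for s in streams), default=0)
    mixed = []
    for i in range(length):
        # Add up the i-th sample of every stream that is still playing.
        total = sum(s[i] for s in streams if i < len(s))
        # Clamp the sum so it stays inside the valid sample range.
        mixed.append(max(lo, min(hi, total)))
    return mixed


music = [0.5, 0.5, -0.5, -0.5]
game = [0.25, 0.75, -0.25]
print(mix([music, game]))  # [0.75, 1.0, -0.75, -0.5]
```

The second sample would be 1.25 without clamping, which is outside the valid range, so the sketch clips it to 1.0. A real mixer may handle overload differently.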
Since the audio server automatically mixes the audio, we will focus on routing.
PipeWire audio graph
We can represent the relationship between sources and sinks as a graph. Each audio source or sink is a node (a vertex), and the connections between nodes are edges.
In PipeWire, this graph consists of:
- Node: an audio source, an audio sink, or a filter.
- Port: a sound channel of a source or sink. Stereo nodes have two ports and mono nodes have one. A port has a direction: outward for sources and inward for sinks.
- Link: the connection between an outward port and an inward port.
Each object in the graph has a unique object ID. To better understand the PipeWire audio graph, install qpwgraph, a graphical patchbay, and see for yourself.
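The three object types can be sketched as plain records. The following Python is only a model of the description above: the class fields, the IDs, and the node and port names are all invented for illustration and do not come from the PipeWire API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    id: int
    name: str


@dataclass
class Port:
    id: int
    node_id: int    # the node this port belongs to
    name: str
    direction: str  # "out" for sources, "in" for sinks


@dataclass
class Link:
    id: int
    output_port: int  # ID of an outward port
    input_port: int   # ID of an inward port


# A stereo player connected to a stereo sound card (all values hypothetical).
nodes = [Node(30, "music-player"), Node(40, "sound-card")]
ports = [
    Port(31, 30, "output_FL", "out"),
    Port(32, 30, "output_FR", "out"),
    Port(41, 40, "playback_FL", "in"),
    Port(42, 40, "playback_FR", "in"),
]
links = [Link(50, 31, 41), Link(51, 32, 42)]
```

Note that a stereo connection needs two links, one per channel, because links connect ports rather than nodes.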
Manipulate the graph
At first glance, the PipeWire graph appears to be object-oriented: everything is an object with its own properties. However, it does not behave like a typical object-oriented API. To add a new connection, you can’t call a “connectToPort(Port a)” method on a port, nor is there a function called “connectPorts(Port a, Port b)”. Instead, you create a new link object, setting its input and output ports to the respective objects.
To disconnect a source, simply delete all of its Link objects.
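The idea that connecting means "create a link object" and disconnecting means "delete link objects" can be sketched with an in-memory model. The `Graph` class, its methods, and the ID allocator below are hypothetical; in a real system the server assigns object IDs and you would go through the PipeWire API instead.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Link:
    id: int
    output_port: int
    input_port: int


class Graph:
    def __init__(self):
        self.links = {}         # link ID -> Link
        self._ids = count(100)  # hypothetical stand-in for server-assigned IDs

    def create_link(self, output_port, input_port):
        """Connect two ports by creating a new link object."""
        link = Link(next(self._ids), output_port, input_port)
        self.links[link.id] = link
        return link

    def disconnect(self, output_ports):
        """Delete every link that starts at one of the given output ports."""
        doomed = [l.id for l in self.links.values()
                  if l.output_port in output_ports]
        for link_id in doomed:
            del self.links[link_id]
        return doomed


graph = Graph()
graph.create_link(31, 41)  # player left  -> card left
graph.create_link(32, 42)  # player right -> card right
graph.create_link(61, 41)  # another source, same sink port
removed = graph.disconnect({31, 32})
print(removed)              # [100, 101]
print(sorted(graph.links))  # [102]
```

Disconnecting the player leaves the other source's link untouched, since only links attached to the player's ports are deleted.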
Construct the graph
Of course, before we can do anything with the graph, we need to construct it in our application. PipeWire provides a way to iterate through every object in the graph and to listen for graph changes. The official example only enumerates the objects. In a real-world application, you would certainly want to store the information in some way.
Surprisingly, the most efficient way to store the PipeWire graph is to store the objects in a HashMap with the object ID as the key. Yes, we flatten the graph. Even though links belong to ports and ports belong to nodes, we ignore the hierarchy. PipeWire’s graph updating callback only provides the ID of the updated object. If we store links as members of ports and ports as members of nodes, we’ll have a hard time figuring out where an object resides given only its ID.
The total number of objects ranges from the tens to the hundreds, so a simple data structure is sufficient.
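A minimal sketch of such a flat store, using a Python dict as the hash map. The class, the callback names `on_added` and `on_removed`, and the property keys are assumptions made for illustration; they stand in for whatever callbacks your PipeWire binding provides.

```python
class GraphStore:
    """Flat store: every node, port, and link lives in one dict keyed by ID."""

    def __init__(self):
        self.objects = {}  # object ID -> (kind, properties)

    def on_added(self, obj_id, kind, props):
        self.objects[obj_id] = (kind, dict(props))

    def on_removed(self, obj_id):
        # Only the ID is known here; the flat dict makes this a direct lookup.
        self.objects.pop(obj_id, None)

    def ports_of(self, node_id):
        # The hierarchy is recovered on demand with a scan, which is cheap
        # for a graph of tens to hundreds of objects.
        return sorted(
            obj_id
            for obj_id, (kind, props) in self.objects.items()
            if kind == "port" and props.get("node_id") == node_id
        )


store = GraphStore()
store.on_added(30, "node", {"name": "music-player"})
store.on_added(31, "port", {"node_id": 30, "direction": "out"})
store.on_added(32, "port", {"node_id": 30, "direction": "out"})
store.on_added(50, "link", {"output_port": 31, "input_port": 41})
store.on_removed(50)
print(store.ports_of(30))   # [31, 32]
print(50 in store.objects)  # False
```

With a nested layout, `on_removed(50)` would have to search every node and port to find the link. Here it is a single dictionary operation.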