Facebook develops sound source separation via ML


Post by Mike-air »

https://tech.fb.com/one-track-minds-usi ... eparation/
Today, the most commonly used AI-powered music source-separation techniques work by analyzing spectrograms, which are heat map-like visualizations of a song’s different audio frequencies. “They are made by humans for other humans, so they are technically easy to create and visually easy to understand,” says Defossez.
Spectrograms, which can only represent sound waves as a montage of time and frequency, cannot capture such nuances. Consequently, they process a drumbeat or a slapped bass note as several noncontiguous vertical lines rather than as one neat, seamless sound. That is why drum and bass tracks that have been separated via spectrogram often sound muddy and indistinct.
AI-based waveform models avoid these problems because they do not attempt to push a song into a rigid structure of time and frequency. Defossez explains that waveform models work in a similar way to computer vision, the AI research field that aims to enable computers to learn to identify patterns from digital images so they can gain a high-level understanding of the visual world.
Defossez says his system can also be likened to the seismographic tools that detect and record earthquakes. During an earthquake, the base of the seismograph moves but the weight hanging above it does not, which allows a pen attached to that weight to draw a waveform that records the ground’s motion. An AI model can detect several different earthquakes happening at the same time and then infer detail about each one’s seismic magnitude and intensity. Likewise, Defossez’s system analyzes and separates a song as it actually is, rather than chopping it up according to the preconceived structure of a spectrogram.
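To make the contrast in the quoted article concrete, here is a minimal toy sketch of the spectrogram-masking approach it describes as the common baseline. Everything here is illustrative and hand-made (two steady sine tones, a binary frequency mask); it is not code from Facebook's system, and real separators learn their masks rather than hard-coding a cutoff.

```python
import numpy as np

# Toy spectrogram separation: mix a "bass" tone and a "lead" tone,
# then isolate the bass with a binary mask on the spectrogram.
sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 220 * t)    # "bass" tone
high = np.sin(2 * np.pi * 1760 * t)  # "lead" tone
mix = low + high

# Short-time Fourier transform via overlapping windowed frames.
n_fft, hop = 512, 128
win = np.hanning(n_fft)
frames = np.stack([mix[i:i + n_fft] * win
                   for i in range(0, len(mix) - n_fft, hop)])
spec = np.fft.rfft(frames, axis=1)

# Binary mask: keep only bins below 1 kHz to isolate the low tone.
freqs = np.fft.rfftfreq(n_fft, 1 / sr)
mask = freqs < 1000.0
low_spec = spec * mask

# Inverse STFT with windowed overlap-add reconstruction.
recon = np.zeros(len(mix))
norm = np.zeros(len(mix))
for k, frame in enumerate(np.fft.irfft(low_spec, n=n_fft, axis=1)):
    i = k * hop
    recon[i:i + n_fft] += frame * win
    norm[i:i + n_fft] += win ** 2
recon[norm > 1e-8] /= norm[norm > 1e-8]

# Compare the reconstruction with the isolated low tone (ignore edges).
mid = slice(n_fft, len(mix) - n_fft)
corr = float(np.corrcoef(recon[mid], low[mid])[0, 1])
```

For two clean tones the mask works almost perfectly; the article's point is that transients like drum hits smear across many frequency bins at once, so no such mask recovers them cleanly.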

Listen to examples in the full article:
https://tech.fb.com/one-track-minds-usi ... eparation/


Post by Christoffer I. N. »

Thanks, definitely interesting.
Unfortunately, so far it sounds just as bad as, or worse than, for example iZotope RX or Accusonus. It doesn't help that the audio has to be streamed, either; all the sound clips have that nasty streaming sound to them.


Post by Mike-air »

Well, that I wouldn't know anything about (I've never used the tools mentioned :-) ). But maybe it's an area where one should keep a bit of an eye on developments. If there's something to the point about spectrograms vs. the tensor format they use, it could well suggest that the results will improve with more data and possibly a different architecture (for the neural network being used). That's roughly the direction things have been moving in recent years within computer vision, as it's called.
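The computer-vision parallel mentioned above can be sketched in a few lines: waveform models apply convolutional filters directly to raw samples, the 1-D analogue of the 2-D filters used on images. This is a hand-made toy (a fixed windowed-sinc filter instead of a learned one) and not Facebook's actual model, just an illustration of the operation such networks stack and train.

```python
import numpy as np

# Two-component toy signal, as a stand-in for a "bass" and a "lead" part.
sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 220 * t)    # "bass" component
high = np.sin(2 * np.pi * 1760 * t)  # "lead" component
mix = low + high

# A windowed-sinc low-pass kernel standing in for one learned filter
# in a convolutional layer (real models learn these weights from data).
taps = 101
cutoff = 1000 / sr
n = np.arange(taps) - taps // 2
kernel = 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(taps)
kernel /= kernel.sum()  # unity gain in the passband

# "Forward pass" of a single conv channel over the raw waveform:
# no time/frequency grid is ever imposed on the signal.
out = np.convolve(mix, kernel, mode="same")

# Gain seen by each component (inner-product trick; sin^2 averages to 0.5).
low_gain = 2 * np.mean(low * out)
high_gain = 2 * np.mean(high * out)
```

The single filter passes the low component at roughly unity gain and strongly attenuates the high one, all while operating sample-by-sample on the waveform itself.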


Post by Hald »

That is really cool. We agree that the audio loses some quality, but imagine being able to take a track, do drum replacement, map various MIDI tracks and replace them, and then put the original vocalist back on top, or maybe make an instrumental that follows their performance.

Or what I would use it for: take a track, remove the drums, and use it to practice along to.
"Knobs? Where we're going, we don't need knobs!" - 14 years with my ears in Lydmaskinen -
