Computer News: Scaling audio-visual learning without labels

Researchers from MIT, the MIT-IBM Watson AI Lab, IBM Research, and elsewhere have developed a new technique for analyzing unlabeled audio and visual data that could improve the performance of machine-learning models used in applications like speech recognition and object detection. The work, for the first time, combines two architectures of self-supervised learning, contrastive learning and masked data modeling, in an effort to scale machine-learning tasks like event classification in single- and multimodal data without the need for annotation.
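To make that combination concrete, here is a minimal, hypothetical PyTorch sketch of how the two objectives the article describes might be trained jointly: a contrastive term that aligns paired audio and video clip embeddings, plus a masked-reconstruction term over hidden audio frames. This is not the researchers' actual model; the encoder sizes, mask ratio, temperature, and loss weighting are all illustrative assumptions.

```python
# Toy sketch (not the paper's implementation): one forward pass that returns a
# contrastive audio-visual loss plus a masked-reconstruction loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyAudioVisualSSL(nn.Module):
    def __init__(self, dim=256, mask_ratio=0.75):
        super().__init__()
        self.mask_ratio = mask_ratio
        # Stand-in encoders; a real system would use transformer patch encoders.
        self.audio_enc = nn.Linear(128, dim)   # per-frame spectrogram features (assumed size)
        self.video_enc = nn.Linear(512, dim)   # per-frame visual features (assumed size)
        self.decoder = nn.Linear(dim, 128)     # reconstructs masked audio frames

    def forward(self, audio, video):
        # audio: (B, T, 128) spectrogram frames; video: (B, T, 512) frame features
        a = self.audio_enc(audio)              # (B, T, dim)
        v = self.video_enc(video)              # (B, T, dim)

        # Contrastive term: pull matching audio/video clips together,
        # push non-matching pairs in the batch apart.
        a_clip = F.normalize(a.mean(dim=1), dim=-1)   # (B, dim) clip-level embedding
        v_clip = F.normalize(v.mean(dim=1), dim=-1)
        logits = a_clip @ v_clip.t() / 0.07           # (B, B) similarity matrix
        targets = torch.arange(audio.size(0), device=audio.device)
        contrastive = 0.5 * (F.cross_entropy(logits, targets)
                             + F.cross_entropy(logits.t(), targets))

        # Masked-modeling term: hide a fraction of audio frames and
        # reconstruct them from the visible ones.
        mask = torch.rand(a.shape[:2], device=a.device) < self.mask_ratio  # (B, T)
        recon = self.decoder(a * (~mask).unsqueeze(-1))
        masked_mse = ((recon - audio) ** 2)[mask].mean()

        # Weighted sum of the two self-supervised objectives (weight is arbitrary here).
        return contrastive + 1.0 * masked_mse


if __name__ == "__main__":
    model = ToyAudioVisualSSL()
    loss = model(torch.randn(4, 10, 128), torch.randn(4, 10, 512))
    loss.backward()
    print(f"combined self-supervised loss: {loss.item():.3f}")
```

The point the article emphasizes is that neither term needs human labels: the contrastive term uses the natural pairing of audio and video from the same clip as its supervision signal, and the masked term uses the data itself as the reconstruction target.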
Posted by bob_turner