FOSSMeet 2016

FOSSMeet is an annual event at NIT Calicut that brings together the Free and Open Source Community from around the country.

Kosha - building an open music database for Indian classical music

Submitted by Srihari Sriraman (@ssrihari) on Wednesday, 20 January 2016

Technical level: Beginner

Objective

Learn about a technically stimulating open source project in the music domain that lets you contribute to the community directly and immediately.

Description

Introduction

Why isn’t there a standard dictionary of ragas? Why isn’t there a standard list of compositions of the trinity of Carnatic Music?

For a form of music that is centuries old and revolves around ragas and compositions, there is surprisingly little information on the internet that is structured and coherent. While the question of “Why?” can be daunting, “How do we solve it?” is easily answered: we build an open music database, built by the community, for the community.

And while we’re at it, we’ll build adaptive and intelligent internet crawlers in Clojure, implement a ‘soundex’ algorithm for Indic languages within Postgres, analyse music programmatically to rate its quality, and even find patterns in it!
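
As a rough illustration of the ‘soundex’ idea, the sketch below reduces different romanised spellings of the same Indic word to a single key. It is written in Python rather than inside Postgres, and the function name `phonetic_key` and its folding rules are assumptions made for this example, not the project's actual algorithm.

```python
import re


def phonetic_key(word):
    """Return a rough phonetic key for a romanised Indic word.

    The rules below are illustrative assumptions, not the project's rules.
    """
    # Lowercase and keep ASCII letters only (drops spaces, hyphens, diacritics).
    s = re.sub(r"[^a-z]", "", word.lower())
    # Fold aspirated spellings onto the plain consonant: bh -> b, th -> t, sh -> s.
    s = re.sub(r"([bcdgkpst])h+", r"\1", s)
    # Collapse doubled letters so long vowels and geminates match: aa -> a, tt -> t.
    s = re.sub(r"(.)\1+", r"\1", s)
    if not s:
        return ""
    # Keep the first letter, then drop vowels from the rest, as classic Soundex does.
    return s[0] + re.sub(r"[aeiou]", "", s[1:])


if __name__ == "__main__":
    for spelling in ("Shankarabharanam", "Sankarabaranam", "Shankaraabharanam"):
        print(spelling, phonetic_key(spelling))
```

With these rules, all three spellings above share one key, so a search for any of them can match the others. A heuristic this coarse will also merge some genuinely different words, which is the usual trade-off with Soundex-style keys.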

Skeleton of the talk

  1. The data is out there

    • Details of the different kinds of sources for the database: the most promising websites and books, the biggest reserves of recorded music, the varying quality of those recordings, and their usability for research purposes.
    • [6 minutes]
  2. Mining the data into one place - a database

    • What has been accomplished so far in the kosha repository, the ragavardhini repository, and r4g4.com.
    • Building an intelligent crawler/scraper that can automatically tag and categorize the data it scrapes, so we don’t have to write a scraper by hand for every website.
    • Mining data from books (digital and physical), and retrieving music from known large reserves (AIR, academy, etc.).
    • [10 minutes]
  3. Cleaning, knitting and organizing

    • Data from multiple sources will be duplicated, noisy, incomplete, and inaccurate. We’ll engineer ways to deduplicate, denoise, and connect pieces of information so that we fill in the gaps. This is non-trivial, and enters the big data realm.
    • To find related chunks of data, we’ll have to improve the keyword search algorithm, given that Indic-language words are being written in English.
    • [10 minutes]
  4. Using the mined data

    • What has been accomplished so far with r4g4’s APIs and interface.
    • Building an interface to search the contents by keyword or free-form text and play music with ease, while having access to all the information about the kriti (meaning, notation, etc.).
    • Building APIs so other applications can use the database for other purposes (research).
    • [10 minutes]
  5. Why I think this is a big step forward

    • Enter fantasy land (not really).
    • Eliminating textbooks in music classes, removing the need to remember the scales of thousands of ragas, embracing compositions with their meaning every time, learning at home, and having access to reliable information in our hands.
    • [5 minutes]
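
The deduplicate-and-connect idea from step 3 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the record fields (`kriti`, `raga`, `composer`), the helper names `simple_key` and `merge_records`, and the sample records are all hypothetical, and the real project may organise its data quite differently.

```python
from collections import defaultdict


def simple_key(name):
    """Placeholder normaliser (an assumption): lowercase, letters only."""
    return "".join(ch for ch in name.lower() if ch.isalpha())


def merge_records(records, key_fn=simple_key):
    """Group records whose kriti names share a key, then fill gaps across them."""
    groups = defaultdict(list)
    for rec in records:
        groups[key_fn(rec["kriti"])].append(rec)

    merged = []
    for recs in groups.values():
        combined = {}
        for rec in recs:
            for field, value in rec.items():
                # Keep the first non-empty value seen for each field.
                if value and not combined.get(field):
                    combined[field] = value
        merged.append(combined)
    return merged


if __name__ == "__main__":
    # Illustrative records, as if scraped from two different sources.
    scraped = [
        {"kriti": "Endaro Mahanubhavulu", "raga": "Sri", "composer": None},
        {"kriti": "endaro mahanubhavulu", "raga": None, "composer": "Tyagaraja"},
        {"kriti": "Vatapi Ganapatim", "raga": "Hamsadhwani",
         "composer": "Muthuswami Dikshitar"},
    ]
    for record in merge_records(scraped):
        print(record)
```

Passing the key function as a parameter keeps the grouping logic separate from the matching rule, so a phonetic key for transliterated words could replace `simple_key` without touching the merge step. Taking the first non-empty value is the simplest possible policy; conflicting values from different sources would need a real resolution strategy.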

Speaker bio

Srihari is a FOSS enthusiast. He has contributed to GIMP, Eclipse, and Diaspora, and is excited about opportunities to give back. Over the last couple of years, he has worked on building an experimentation platform, delving into a particularly dense domain, meeting tight latency SLAs, and engineering assembly lines in software using Clojure.

He sings, and does music things in technology: he has worked on synthesizing gamakas, and has been building an open Carnatic music database (the ragavardhini repo, r4g4.com) in his spare time.

He is a partner at nilenso, a hippie, tree-hugging, bicycle-riding software cooperative based in Bangalore. He blogs, plays basketball, and performs Carnatic music occasionally.

Comments

  • Jaseem abid (@jaseemabid) 3 years ago

    Heyo! Can you think of a mini workshop or BOF for students interested in contributing code? Even a code walkthrough for interested students would be really valuable.

  • Srihari Sriraman (@ssrihari) Proposer 3 years ago

    Yeah, we could do this after the talk for whoever’s interested in this kind of work.

  • Harsha Galgali (@harshagalgali) 2 years ago

    I would really be interested to work on this.

  • Murty BVNS (@murtybvns) 11 months ago (edited 11 months ago)

    Excellent proposal. I have been using the Internet since 1999 and have been involved as a tech volunteer in many international projects, starting with Gutenberg and Dmoz.org, to mention just a few. I have spent many thousands of hours classifying data, but there is no database for Indian classical music. In fact, when I started using the Internet there was no classification of music as Carnatic or Hindustani; it was simply World Music. Please let me know what to do next. I am using MusicBrainz and IMDB to update information but, as you know, they do not contain the essentials that Indian music needs.
