Projects

To find out more about any project, click the link in its title (if one exists).

Wikipins - A Visual Wikipedia navigation tool

What would it be like to browse Wikipedia in a Pinterest-like style? Wikipins tries to unravel this mystery!

algorithm-js - Data Structures & Algorithms in JavaScript

With node.js becoming popular among web developers, I felt the need for a library (basically a collection) of commonly used data structures and algorithms. The library currently contains data structures such as Trie, Stack, Queue, AVLTree (a height-balanced BST), Binary Heap and Min-Max-Heap, and algorithms such as lower_bound, upper_bound, binary search, partition and selection (bisection related).
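To illustrate what lower_bound and upper_bound compute, here is a minimal sketch in Python. It is not algorithm-js's API; the function names and signatures are assumptions chosen to mirror the names listed above, and count_occurrences is a hypothetical helper added only to show how the two fit together.

```python
def lower_bound(items, target):
    """Return the first index i with items[i] >= target (items sorted ascending)."""
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1   # everything up to mid is too small
        else:
            hi = mid       # mid may be the answer, keep it in range
    return lo


def upper_bound(items, target):
    """Return the first index i with items[i] > target (items sorted ascending)."""
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] <= target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def count_occurrences(items, target):
    """Hypothetical helper: number of elements equal to target."""
    return upper_bound(items, target) - lower_bound(items, target)
```

Both functions halve the search range on every step, so each lookup takes O(log n) comparisons on a sorted list.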

node-xmpp-bosh - A BOSH connection manager

node-xmpp-bosh is an XMPP BOSH server (connection manager) written in JavaScript for node.js. It was written mainly to support request and response acknowledgements and multiple streams. It works with many XMPP servers and clients.

tddb - The Distributed Database

How do you automatically partition data and then execute queries on all replicas and provide a consolidated result set? TDDB is an experiment with databases and distributed query processing. This was my final year project as an undergraduate student. Made a man out of me!
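The question above can be made concrete with a small scatter-gather sketch in Python. This is not TDDB's design; it assumes hash partitioning on a key field, in-memory lists standing in for database nodes, and hypothetical function names (partition_for, build_partitions, scatter_gather).

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition index. crc32 is stable across runs,
    unlike Python's built-in hash() for strings."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def build_partitions(rows, key_field, num_partitions):
    """Distribute rows (dicts) across partitions by hashing key_field."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[partition_for(row[key_field], num_partitions)].append(row)
    return partitions


def scatter_gather(partitions, predicate, sort_key):
    """Run the same filter on every partition, then merge the partial
    results into one consolidated, ordered result set."""
    partial_results = [[row for row in part if predicate(row)]
                       for part in partitions]
    merged = [row for part in partial_results for row in part]
    return sorted(merged, key=sort_key)
```

In a real system each partition would live on a different machine and the per-partition filters would run in parallel; only the merge step happens at the coordinator.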

pymq - Python Distributed Message Queue

Can a message queue support millions of queues (one per user) and still remain responsive? Can such a system be self-monitoring and exhibit smooth failover characteristics? My evaluation of existing message queues turned up no product that satisfied these requirements. pymq is intended to fill this gap.

lib-face - Fast auto-complete server

What do you do when you have 30 million phrases that people may search for and you want to help them by suggesting a set of 'k' possible completions based on what they have already typed? Furthermore, the completions should be ordered by a pre-defined rank.

Most current implementations (including Apache Solr) use an O(n log n) algorithm to get the candidate list and sort it by score. lib-face, on the other hand, returns the results in guaranteed O(k log n) time, making it very attractive for large-scale deployments.
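One way to reach an O(k log n) bound is sketched below in Python. It is offered as an illustration under stated assumptions, not as lib-face's actual implementation: phrases are kept sorted, the prefix range is found by binary search, and a segment tree answering "which entry in this range has the highest rank?" is combined with a heap of sub-ranges. The Completer class and its method names are hypothetical, and the sketch assumes no phrase contains the character U+10FFFF.

```python
import bisect
import heapq


class Completer:
    def __init__(self, phrases_with_rank):
        # phrases_with_rank: iterable of (phrase, rank); higher rank is better.
        entries = sorted(phrases_with_rank)
        self.phrases = [p for p, _ in entries]
        self.ranks = [r for _, r in entries]
        n = len(self.ranks)
        self.size = 1
        while self.size < max(n, 1):
            self.size *= 2
        # Each tree node stores the index of the best-ranked entry in its
        # segment, or -1 if the segment is empty.
        self.tree = [-1] * (2 * self.size)
        for i in range(n):
            self.tree[self.size + i] = i
        for node in range(self.size - 1, 0, -1):
            self.tree[node] = self._better(self.tree[2 * node],
                                           self.tree[2 * node + 1])

    def _better(self, a, b):
        if a == -1:
            return b
        if b == -1:
            return a
        return a if self.ranks[a] >= self.ranks[b] else b

    def _argmax(self, lo, hi):
        """Index of the best-ranked entry in [lo, hi), in O(log n)."""
        best = -1
        lo += self.size
        hi += self.size
        while lo < hi:
            if lo & 1:
                best = self._better(best, self.tree[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                best = self._better(best, self.tree[hi])
            lo //= 2
            hi //= 2
        return best

    def complete(self, prefix, k):
        """Return up to k (phrase, rank) pairs starting with prefix,
        ordered from highest to lowest rank."""
        lo = bisect.bisect_left(self.phrases, prefix)
        hi = bisect.bisect_left(self.phrases, prefix + chr(0x10FFFF))
        heap, results = [], []

        def push(a, b):
            # Queue the best entry of the non-empty range [a, b).
            if a < b:
                i = self._argmax(a, b)
                heapq.heappush(heap, (-self.ranks[i], i, a, b))

        push(lo, hi)
        while heap and len(results) < k:
            _, i, a, b = heapq.heappop(heap)
            results.append((self.phrases[i], self.ranks[i]))
            # Split the range around the winner and queue both halves.
            push(a, i)
            push(i + 1, b)
        return results
```

Each returned completion costs two range-maximum queries and a few heap operations, so k completions cost O(k log n) no matter how many phrases share the prefix.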

p2p-fs - Peer 2 Peer File System

NFS lets you share files across the network. However, it suffers from the following shortcomings:

Each share needs to be separately mounted by each client
In case the machine serving a share goes down, the client needs to wait for it to come back up even though the same files may be available on another share
A popular server may be unduly overloaded because too many clients connect to it, even though the same content may also be replicated on other, less popular machines

p2p-fs tries to mitigate all of the above by using a peer-to-peer model for file sharing (much like BitTorrent).

liblyric - Lyrics Search Library

Searching for song lyrics on popular search engines returns many mostly relevant results. However, the target pages are filled with ads, videos and images. Anyone searching for the lyric text would not be interested in all that paraphernalia.

liblyric is an attempt to automate the process of scanning these individual result pages and extracting the common textual content from them, in the hope that the common parts will be just the song's lyric text.

The techniques that liblyric employs have turned out to give accurate results more than 90% of the time.
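The idea of keeping only the text shared by several result pages can be sketched as follows in Python. This is a simplified illustration, not liblyric's actual technique: it assumes each page has already been reduced to plain text with one candidate line per row, and the names normalize, common_lines and min_fraction are hypothetical.

```python
import re
from collections import Counter


def normalize(line):
    """Lowercase a line and collapse punctuation and whitespace so that
    small formatting differences between pages do not matter."""
    return re.sub(r"[^a-z0-9]+", " ", line.lower()).strip()


def common_lines(pages, min_fraction=0.6):
    """Return the lines that appear on at least min_fraction of the pages,
    in the order of the page that contains the most of them."""
    if not pages:
        return []
    counts = Counter()
    for page in pages:
        # Count each distinct line once per page.
        counts.update({normalize(line) for line in page.splitlines()} - {""})
    threshold = min_fraction * len(pages)
    best = []
    for page in pages:
        kept = [line.strip() for line in page.splitlines()
                if counts[normalize(line)] >= threshold]
        if len(kept) > len(best):
            best = kept
    return best
```

Ads, navigation text and comment widgets tend to differ from site to site, so they fall below the threshold, while lines repeated across most pages survive.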

CAE - Email Context Analysis

What does this user's email talk about? What products would he/she be interested in buying at this point in time? These are the broad questions that need to be answered for a user, given his/her email, so that relevant, monetizable advertisements can be shown alongside the email. CAE tries to do this as accurately as possible.