If you are browsing the web using the HTTP protocol (which is the only protocol most sites today support), then you are essentially not safe from packet sniffing. This means that hackers on the same network as you can see the traffic going to and from your computer. I've written a short post about this here, explaining what the problem is and how you can go about solving it.
You can try out the accompanying proxy, which enables you to make HTTPS connections to a trusted host that will proxy all your requests to the real server you are trying to contact, thus eliminating the risk of your packets being sniffed on the local network.
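For flavour, here is a minimal sketch of the relaying core of such a proxy, assuming plain POSIX sockets; it omits the TLS termination that the trusted host performs, and connect_upstream and relay are illustrative names, not code from the actual proxy.

    #include <netdb.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <unistd.h>

    // Connect to the real server the client asked for. Returns a connected
    // socket, or -1 on failure.
    int connect_upstream(const char *host, const char *port) {
        addrinfo hints{}, *res;
        hints.ai_socktype = SOCK_STREAM;
        if (getaddrinfo(host, port, &hints, &res) != 0) return -1;
        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd >= 0 && connect(fd, res->ai_addr, res->ai_addrlen) != 0) {
            close(fd);
            fd = -1;
        }
        freeaddrinfo(res);
        return fd;
    }

    // Shuttle bytes between the client and the real server until either
    // side closes: this is the "proxy all your requests" part. client_fd
    // is assumed to be the already-accepted (and TLS-terminated) client.
    void relay(int client_fd, int server_fd) {
        char buf[4096];
        for (;;) {
            fd_set fds;
            FD_ZERO(&fds);
            FD_SET(client_fd, &fds);
            FD_SET(server_fd, &fds);
            int maxfd = client_fd > server_fd ? client_fd : server_fd;
            if (select(maxfd + 1, &fds, nullptr, nullptr, nullptr) <= 0) return;
            int from = FD_ISSET(client_fd, &fds) ? client_fd : server_fd;
            int to = (from == client_fd) ? server_fd : client_fd;
            ssize_t n = read(from, buf, sizeof buf);
            if (n <= 0) return;
            if (write(to, buf, n) != n) return;
        }
    }

    // Usage (hypothetical): relay(client_fd, connect_upstream("example.com", "80"));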
I've been experimenting with alternate UIs for searching the web more effectively. This is another such experiment in that direction.
FHN is an initiative started by Sandeep Shetty, one of the tech leads at Directi. The agenda for the night is to work on anything you like: something work-related you haven't had time to work on due to deadlines, or your own personal projects. It happens (almost) every alternate Friday and is generally accompanied by pizza and beer. Two other very active FHN members are Rakesh Pai and Vishnu Iyengar.
I've worked on 4 major projects in the 4 FHN meets that I've attended:
Searching for song lyrics is something I've been fascinated with for a few years now. Even though there are so many sites that serve lyrics online, most are not easily searchable, nor do they present the lyrics in an enjoyable manner: either their indexes are incomplete or the interfaces are filled with videos, ads and all sorts of distractions.
To overcome these drawbacks, I implemented a lyric search library that uses search engines and fetches lyrics from the internet using a fuzzy document-matching and intersection-extraction algorithm.
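As a rough illustration of the intersection-extraction idea (a sketch, not the library's actual algorithm; extract_intersection is a made-up name): fetch the same song's lyrics from several sites, then keep only the lines that a majority of the pages agree on, so per-site ads and navigation drop out.

    #include <algorithm>
    #include <iostream>
    #include <sstream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    // Keep only the lines that appear in a majority of the candidate
    // documents; lines unique to one site (ads, navigation) drop out.
    std::vector<std::string> extract_intersection(
            const std::vector<std::string> &candidates) {
        std::unordered_map<std::string, int> counts;
        for (const std::string &doc : candidates) {
            std::istringstream in(doc);
            std::string line;
            std::vector<std::string> seen;   // count each line once per doc
            while (std::getline(in, line)) {
                if (std::find(seen.begin(), seen.end(), line) == seen.end()) {
                    seen.push_back(line);
                    ++counts[line];
                }
            }
        }
        // Walk one candidate in order so the result keeps the original
        // line order of the lyrics.
        std::vector<std::string> result;
        std::istringstream in(candidates.front());
        std::string line;
        const int majority = static_cast<int>(candidates.size()) / 2 + 1;
        while (std::getline(in, line))
            if (counts[line] >= majority)
                result.push_back(line);
        return result;
    }

    int main() {
        std::vector<std::string> pages = {
            "First lyric line\nSecond lyric line\nAds by site A",
            "Site B navigation\nFirst lyric line\nSecond lyric line",
            "First lyric line\nSecond lyric line\nSite C banner",
        };
        for (const std::string &l : extract_intersection(pages))
            std::cout << l << '\n';   // prints the two shared lyric lines
    }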
I also recently wrote a lyrics scraper for Duck Duck Go. Try it here.
The C++ STL is an absolutely wonderful collection of generic algorithms and containers. If you want to learn how to write top-quality algorithms and data structures, I would highly recommend learning C++ just so you can read the STL and marvel at its beauty and simplicity.
C++ features the EBO (empty base optimization): if a base class is empty, it won't take up any space in the derived object. To exploit this optimization, libstdc++ containers were made to inherit from the Allocator class. Due to this decision, if the Allocator class defines clear() (or any other container member function) as virtual and uses it internally, the call will actually dispatch to the container's corresponding method, and not the allocator's own method that the allocator writer would expect. I fixed this behaviour since it could be potentially disastrous.
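Here's a minimal sketch of the hazard, with illustrative names rather than the real libstdc++ internals: the container inherits from its allocator (the EBO trick), the allocator declares clear() virtual and calls it internally, and the call silently lands on the container.

    #include <iostream>

    // A user-written allocator that (unwisely) declares clear() virtual
    // and calls it from another of its own member functions. The virtual
    // function also makes the base non-empty, defeating the EBO anyway.
    struct Allocator {
        virtual ~Allocator() = default;
        virtual void clear() { std::cout << "Allocator::clear\n"; }
        void deallocate_all() {
            clear();   // virtual dispatch: may not reach Allocator::clear!
        }
    };

    // A container inheriting from its allocator to get the empty base
    // optimization; like any container, it also has a clear().
    struct Container : Allocator {
        void clear() override { std::cout << "Container::clear\n"; }
    };

    int main() {
        Container c;
        // The allocator's internal call now invokes the container's
        // clear(), emptying the container instead of touching the
        // allocator's own state.
        c.deallocate_all();   // prints "Container::clear"
    }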
One of my applications exhibited the following usage pattern with linked
lists. It read a bunch of strings from a file into a linked list, sorted
them, did some processing and then freed the list. It repeated this process
for many other files. I noticed that the first file was always processed
the fastest whereas all subsequent files took longer than the first file
to be processed. This was counter-intuitive, since I would expect the
cache to get hot after the first use! I pinned this down to the fact
that memory was being freed in a random order since sorting a linked list
shuffled the nodes around and then freeing them added them back to the
free list in that random order. This meant that a lot of the CPU cache was
being misused to store data that was never going to be used. Furthermore,
unnecessary cache misses were slowing the application down. A brilliant tool called cachegrind (which is a valgrind extension) helped me verify my claims.
I wrote an allocator that avoids this pattern, wherein non-adjacent blocks of free memory end up next to each other in the free list, and contributed it back to libstdc++ as the bitmap allocator, so named because it uses bitmaps to keep track of used and free memory blocks. It also has a lower per-object overhead than the default allocator.
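The core bookkeeping idea looks roughly like the toy below (BitmapPool is an illustrative stand-in, far simpler than the real bitmap allocator): one bit per fixed-size block, so blocks are handed out in address order no matter how they were freed.

    #include <cstddef>
    #include <cstdint>
    #include <iostream>

    // Toy fixed-size-block pool using a bitmap: bit i set means block i is
    // in use. Freed blocks are found again in address order on the next
    // scan, not in the (possibly shuffled) order they were freed in.
    class BitmapPool {
        static const std::size_t kBlocks = 64;
        static const std::size_t kBlockSize = 32;   // bytes per block
        std::uint64_t used_ = 0;                    // one bit per block
        char storage_[kBlocks * kBlockSize];

    public:
        void *allocate() {
            for (std::size_t i = 0; i < kBlocks; ++i) {
                if (!(used_ & (std::uint64_t(1) << i))) {
                    used_ |= std::uint64_t(1) << i;   // mark block used
                    return storage_ + i * kBlockSize;
                }
            }
            return nullptr;                           // pool exhausted
        }

        void deallocate(void *p) {
            std::size_t i =
                (static_cast<char *>(p) - storage_) / kBlockSize;
            used_ &= ~(std::uint64_t(1) << i);        // mark block free
        }
    };

    int main() {
        BitmapPool pool;
        void *a = pool.allocate();
        void *b = pool.allocate();
        pool.deallocate(b);   // free in any order...
        pool.deallocate(a);
        // ...the next allocation still returns the lowest-address block.
        std::cout << (pool.allocate() == a) << '\n';   // prints 1
    }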
CC-NUMA machines exhibit non-uniform memory access times to different memory locations. The Ingres database server slows down when run on CC-NUMA machines because statistics for every query are maintained at a single place in the process image. This means that processes executing on CPUs other than the one whose memory bank houses the statistics data experience a slow-down every time they update the statistics. This problem was solved by maintaining statistics separately for each running process and aggregating the data only when required (typically when the aggregate information is requested by a user).
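The pattern, sketched with threads standing in for the server processes (these are not Ingres's actual data structures): each worker updates only its own statistics slot, and the slots are summed only when somebody asks for the total.

    #include <atomic>
    #include <iostream>
    #include <thread>
    #include <vector>

    // One statistics slot per worker, padded to its own cache line so
    // updates stay local and uncontended (the moral equivalent of keeping
    // them in the worker's own memory bank).
    struct alignas(64) StatSlot {
        std::atomic<long> queries{0};
    };

    int main() {
        const int kWorkers = 4;
        std::vector<StatSlot> slots(kWorkers);

        std::vector<std::thread> workers;
        for (int w = 0; w < kWorkers; ++w)
            workers.emplace_back([&slots, w] {
                for (int i = 0; i < 100000; ++i)   // each worker touches
                    slots[w].queries.fetch_add(    // only its own slot
                        1, std::memory_order_relaxed);
            });
        for (std::thread &t : workers) t.join();

        // Aggregate only when the total is actually requested.
        long total = 0;
        for (const StatSlot &s : slots) total += s.queries.load();
        std::cout << "total queries: " << total << '\n';   // 400000
    }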
A lot of pages seem to vanish from the internet over time. StickyLinks is an attempt to make a web link last forever. This is similar to the Internet Archive and WebCite.
Fixed a bug in XMMS which caused it to crash if you deleted a song that was queued in the playlist.
Added a progress bar to the cp(1) UNIX utility. I haven't released this since there is sufficient disagreement about it in the community ;)
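The idea, as a standalone toy rather than the actual patch: find the source file's size up front, then print a percentage as the copy loop advances.

    #include <cstdio>
    #include <fstream>
    #include <iostream>

    // Toy file copy that reports progress; the real cp(1) works on
    // read(2)/write(2), but the structure is the same.
    int main(int argc, char **argv) {
        if (argc != 3) {
            std::cerr << "usage: " << argv[0] << " <src> <dst>\n";
            return 1;
        }
        std::ifstream src(argv[1], std::ios::binary | std::ios::ate);
        std::ofstream dst(argv[2], std::ios::binary);
        if (!src || !dst) {
            std::cerr << "cannot open files\n";
            return 1;
        }
        const long long total = src.tellg();   // opened at end: file size
        src.seekg(0);

        char buf[1 << 16];
        long long copied = 0;
        while (src.read(buf, sizeof buf), src.gcount() > 0) {
            dst.write(buf, src.gcount());
            copied += src.gcount();
            // '\r' rewinds to the start of the line: a live percentage.
            std::printf("\r%3lld%%", total ? copied * 100 / total : 100);
            std::fflush(stdout);
        }
        std::printf("\n");
    }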