Dhruv Matani

Mobile: (631) 403-7653 | E-mail: dhruvbird@gmail.com | GitHub: github.com/dhruvbird | Blog: dhruvbird.blogspot.com


Profile Summary:

A software engineer and architect with 14+ years of experience in large-scale distributed systems and performance-critical services, and a passion for creating developer-friendly APIs and experiences.


Work Experience:


Facebook Inc., Menlo Park

March 2013 - Present


Logger (Product Infrastructure) – Engineering Manager and Technical Lead

·        Supporting the Logging Team at Facebook, which is responsible for all analytics logging at the company. This logging powers critical company-wide metrics such as time spent, DAU, and advertiser revenue, and is the data backbone of various analytical workflows, debugging tools, and customer insight dashboards. Focusing on key outcomes related to reliability, usability, efficiency, and privacy

·        Responsible for the continued wellbeing, growth, and productivity of the team

·        Leading a transformation within the framework from being user-focused to being both user- and framework-focused

·        Leading the transformation to support semantically rich types across major frameworks at FB (such as Ent, Logger)


Logger (Product Infrastructure) – Technical Lead

·        Led the Logging Infrastructure team at Facebook. This includes logging from mobile devices, web browsers, web services, and various backend services

·        Mentored 9 developers on the team, and served as point of contact for all cross-functional initiatives regarding warehouse data logging

·        Achieved significant efficiency gains at various levels by re-architecting the on-wire serialization format for loading data into the Facebook Data Warehouse. This affects both batch and real-time workloads

·        Re-architected the PHP Logger framework to leverage the type safety and runtime efficiency offered by Hack and HHVM. This resulted in single-digit percentage wins across the entire fleet of machines

·        Leading the implementation of an intent-driven sampling initiative for logging data into the warehouse, to allow data growth to scale independently of the popularity of the platform

·        Led the backend C++ Logger initiative to ensure high quality data logging from services at Facebook to various downstream data stores. Changed the development paradigm for logging by providing static type checking (instead of run-time failures and alerts) against downstream table schemas in C++ code, greatly increasing developer productivity

·        Led the cross-language C++/PHP/Python Reader initiative to provide a consistent data ingestion API for real-time workloads across Facebook. This involves integrating with multiple real-time solutions at Facebook such as Stylus and Puma. The focus here is also to have downstream table schemas bubble up into code so that the code fails fast, developers can iterate quickly, and applications are more reliable. Additionally, the reader frees developers from worrying about the positional nature of columns in their data streams, and provides them the convenience of keyed column lookup without the related performance overhead


Ads Targeting and Custom Audiences – Software Engineer

·        Led the effort to platformize the custom audiences backend (for use in other targeting products) and improve targeting by enabling matching on a broader category of traits


Scuba (Data Infrastructure) – Software Engineer

·        Part of the team that implemented a performant in-memory columnar storage engine that results in increased data compression

·        Improved data ingestion CPU utilization by 4x (have you heard of merge sort?)

·        Improved overall reliability and availability of Scuba (i.e. added another 9)

·        Part of the team that implemented Fast Database Restarts in Scuba, bringing restart times down to 120 seconds from 2.5 hours (yes, that’s 1.3% of what it was)

·        Migrated the running service (which runs on thousands of machines) to a different region with zero downtime, over a period of 2 months; i.e. rolling migration without service disruption (ask me why this was challenging, and how I used scp on in-memory data)

·        Led the disaster recovery effort, which replicates data in real time across machines in multiple geographic regions


Facebook Inc., Menlo Park

May ’12 – Aug ’12


Software Engineer Intern

·        Developed a per-table backup and restore tool (for MySQL databases), which lets you restore individual tables from a complete database backup without restoring other unrelated tables. This tooling enabled much faster table restores in case of MySQL table corruption


Directi Pvt. Ltd., Mumbai, India

Jan ’09 – Jun ’11


Senior Software Engineer

·        Authored, tested, and integrated an XMPP BOSH Server for use in the talk.to project. This was later released as an Open Source project, which is used by many globally as a standalone BOSH proxy. https://github.com/dhruvbird/node-xmpp-bosh

·        Responsible for problem-setting and training interviewers for the IIT graduate hiring process

·        Developed and owned the online update logic for the .pw desktop chat client. This handles creation, shipping, and fail-safe application of differential updates (deltas), which update the application in a way that minimizes network data utilization

·        Handled setup, administration, and question setting for the Directi Online Test (DOT), which is used for the recruitment of software developers and engineers

·        Developed the Business Logic Layer for administering the chat server and related services

·        Led the project to develop and deploy an online community driven translation tool to help internationalize web applications. http://sofi.directi.com/

·        Email Context Analysis Engine (CAE): Developed an email context analyser to determine the most relevant commercial keywords/phrases for a given email


Mukesh Patel School of Technology Management & Engineering, Mumbai, India

Jul ’08 – Jan ’09



·        Taught a course on Systems Architecture & Programming

·        Taught a course on Operating Systems-II


Calsoft Pvt. Ltd., Pune, India

Jul ’06 – Apr ’08


Senior Development Engineer

·        Developed an HTTP Caching Proxy to cache YouTube flash video content for Umber Media systems

·        Prototyped an RTSP Caching Proxy to cache YouTube 3GP video content for Umber Media systems. It also re-encoded and down-sampled the video and audio content, and could insert advertisements at various places in the original video. It was prototyped for use as a video ad-serving proxy

·        Developed the DiskImage tool, which is used by ScaleMP for fully automated installation of Linux (distribution hosted on TFTP servers) on machines

·        Performance tuned the Ingres Database Server's statistics collection module for ScaleMP's VSMP architecture (which is a glorified NUMA architecture)

·        Optimized the logpar (log parser) application for ScaleMP, which is used for parsing the performance logs produced by their system, and fixed long-standing bugs in it




Education:


Master of Science (M.S.), Computer Science

Aug ’11 – Dec ’12, GPA 3.82/4.00

Stony Brook University, Stony Brook (NY)


Bachelor of Engineering (B.E.), Computer Engineering

Aug ’02 – Jun ’06, Aggregate: 62.6%

University of Mumbai, Mumbai, India





Publications:


· Fast Database Restarts at Facebook (25 citations)



· Avoiding locks and atomic instructions in shared-memory parallel BFS using optimistic parallelization (7 citations)

· Partial deamortization of the Packed Memory Array



· Compressing the human genome against a reference (6 citations)



· An O(k log n) algorithm for prefix based ranked autocomplete (12 citations)



· An O(1) algorithm for LFU (Least Frequently Used) cache replacement (24 citations)



· A technique for extracting song lyrics from web pages without knowing their structure



· A distributed approach for solving a system of linear equations (2 citations)





Programming Languages:

C, C++, Python, Hack, JavaScript, PHP, Bash, Object Pascal, SQL

RPC/API paradigms:

Thrift, JSON, CSV

Presentation technologies:



Operating Systems:

Linux, MS-DOS, Windows


Databases:

MySQL, SQLite, Hive, PostgreSQL, Oracle


Tools:

Jupyter Notebooks, Node.js, Mercurial, Git



·        Architected and built the infrastructure powering the analysis in the book “Who’s Bigger” by Steven Skiena and Charles Ward

·        Provided key insights into the complexity analysis of the R1Q algorithm

·        Added a cache-optimized single-object allocator (bitmap_allocator) to libstdc++

·        Contributor to libstdc++-v3 (GCC’s implementation of the C++ Standard Library), which is used to power a vast majority of applications built using C++