GENERALIST
Ashwanth Kumar

Fractional Software Architect

About Me

With 12 years of professional experience across startups and public companies, I'm not just a software engineer; I'm a generalist who thrives on diverse challenges. My journey, shaped by exceptional mentors, has instilled in me a passion for pragmatic solutions and a commitment to paying it forward through mentorship.

I favor pragmatic decisions and sound engineering practices that pay compounding returns over time, rather than cutting corners to meet today's deadlines. Having benefited from excellent mentorship myself, I take mentoring, and giving and receiving feedback, very seriously.

Beyond the digital realm, my curiosity drives me to explore the physical world through welding, dive into the intricacies of stock options trading, and even build an electric car from scratch in my parking lot.

Skills

Big Data
Hadoop (YARN and HDFS)
HBase
Spark
ZooKeeper
Kafka
Distributed Systems
Large Scale Data Management
Scala
Java
Go
Python
TypeScript/JavaScript
AWS
Cost Optimization
Kubernetes
Docker
Terraform
Basic Linux Administration

Experience

Director & Fractional Software Architect

AK Labs - May 2022 - Present

AK Labs is my personal company, through which I provide consulting services to my clients. I wear multiple hats, including:

  • Building a pipeline of work
  • Conducting sales calls with leads
  • Managing finances for contractors when I hire them
  • Delivering value to my clients as a Software Architect on their team(s)

Key Clients:

  • Hella Infra Market Pvt. Ltd: Manage and support their internal Platform Engineering team. Also serve as part-time architect for their Data Platform team, which supports internal data pipelines for ETL, analytics and reporting.
  • Astranova Labs: Built their end-to-end infrastructure automation for software delivery on GCP, Kubernetes and Cloudflare.

Principal Engineer

Avalara Inc. - Feb 2019 - Apr 2022

Content Sourcing Platform: Automated detection of Sales and Use Tax changes across US states, reducing the time needed from 3+ weeks of manual effort to a few minutes across many jurisdictions. Architected a no-code, self-serve platform that lets Tax Researchers within Avalara automate their workflows for identifying important content changes.

Principal Engineer & Mentor

Indix - Oct 2015 - Jan 2019
  • Data Services Platform: Architected a Data Services Platform (Carol) to run our internal Big Data workloads. This tool orchestrated our internal Spark-based platform, which ran on EMR clusters created on the fly.
  • Product Matching: Consulted for the Product Matching team to reduce their workload runtime across a dataset of 2+ billion products from multiple days to about 7-8 hours. This significantly reduced infrastructure costs and let us run experiments faster to improve our model metrics (precision and recall).
  • Data Ingestion: Led the Data Ingestion team responsible for building the tools that ingest all the data required for building our catalogue. In numbers: the Crawler crawled >20 million webpages a day and the Parser processed >30 million webpages a day, all in real time. This team was also responsible for managing all the Master (Raw) Data that flows through our data pipeline.
  • Finder: HTML archive storage system for storing over 3 billion HTML documents. It was built using Suuchi, a library I helped develop that provides a set of abstractions for building distributed data systems.
  • Infrastructure: When we revamped our internal DevOps team, I joined as its first member to help build the team, identify tools, and set up automation and processes. After the first year, this work saved over a million dollars a year on AWS by intelligently using Spot instances and Auto Scaling for 70% of production workloads.

Software Engineer

Indix - Jun 2012 - Sept 2015
  • Data Pipeline: Part of the team that built the first generation of the Indix Data Pipeline based on λ-architecture principles. Contributed by writing MapReduce jobs, setting up S3 file layouts for faster MR job startup times, and monitoring the overall health of the data pipeline.
  • Data Ingestion & Data Infrastructure: Responsible for key components like Crawler, Scheduler, Parser, and Master Data Management on S3.

Big Data Intern

Mu Sigma - Jan 2012 - May 2012
  • Evaluated AWS for migrating to the cloud from custom data centers
  • Worked on running R code on Storm

Notable Projects

Content Sourcing Platform

Automated detection of Sales and Use Tax changes across US states, reducing processing time from weeks to minutes.

Data Services Platform (Carol)

Architected a platform to run internal Big Data Workloads efficiently, orchestrating Spark-based processing on dynamically created EMR clusters.

Finder

HTML Archive Storage system for storing over 3 billion HTML documents, built using the Suuchi library.

AWS Infrastructure Optimization

Led the revamp of DevOps processes, resulting in over a million dollars in annual savings through intelligent use of Spot instances and Auto Scaling.

Open Source Contributions

I'm passionate about contributing to open source projects. Here are some of my notable contributions:

Indix OSS

Created an internal community at Indix to encourage engineers to build and contribute to open source tools used daily.

GoCD

Contributed several plugins including Github PR Plugin, Github Slack Notifier, and GoCD Janitor. Active participant in the mailing list.

Suuchi

A toolkit for building distributed data systems using gRPC. It provides pluggable components for easily constructing data systems with the desired characteristics.

For a complete list of my open source contributions, please visit my GitHub profile.

Conference Talks

Why we built a distributed system

April 2018

DSConf 2018, Pune, India

Lessons scaling operations to everyone @indix

November 2017

Mini Cloud Conf, Chennai, India

Using Monoids for Large Scale Aggregates

November 2017

Scala.io, Lyon, France

Suuchi - Distributed System Primitives

April 2017

Devday, Chennai

Lessons from managing Hadoop clusters on AWS @indix

November 2016

DevopsDays, India

Lessons from Building Distributed RocksDB

November 2016

Geeknight Chennai

Lessons from running production infra on AWS Spot Reliably

April 2016

Geeknight Chennai

Education

Bachelor of Technology in Computer Science

SASTRA University, Tamil Nadu, India - July 2008 - May 2012

GPA: 8.12/10.0