Atul Kumar — Software Engineer · AI Agents, LangGraph, AWS, Apache Spark, Distributed Systems

Software Engineer · AI Agents · Distributed Systems · Backend at Scale

I build production AI agents and large-scale distributed systems — from LangGraph + SageMaker retrieval pipelines to high-throughput Apache Spark systems processing millions of records and server-side streaming APIs sustaining tens of thousands of records per second.

↓ Download Resume (PDF)

About

Software Engineer focused on production AI/ML systems and distributed infrastructure. I have shipped a LangGraph + AWS SageMaker agent that deflects 80% of internal travel & expense support queries with sub-20-second responses over a 40-document corpus, built a server-side streaming API that exports 5M+ records per request at ~50K records/sec, sped up a critical Spark pipeline 6× (12 hrs → 2 hrs), and built shared Spark libraries that cut new-pipeline build time by ~75% across the team. Currently an Engineering Associate at Goldman Sachs.

I own systems end-to-end across AWS (Glue, Spark, ECS, Step Functions, Aurora, SageMaker) and enjoy the full spectrum, from agent design and RAG to streaming APIs, OCR, and near-real-time data infrastructure.

Skills & Expertise

Languages

Java · Python · C++ · SQL · JavaScript/TypeScript · Shell/Bash

Backend

Spring Boot · Flask · REST APIs · Server-side streaming · Microservices

Frontend

ReactJS · HTML · CSS

AI / ML

LangGraph · AWS SageMaker · RAG · Embeddings & Vector Retrieval · OCR · LLM Application Design

Data & Pipelines

Apache Spark 3 · Kafka · AWS Glue · Snowflake · Sybase IQ · Data Modeling · NRT Pipelines

Cloud & Infra

AWS Lambda · ECS · Step Functions · S3 · Aurora · API Gateway · Route 53 · VPC · ELB · Terraform · CloudFormation · Docker

Other Tools

Elasticsearch · MongoDB · PingFederate · SkyFoundry · Cloud FastTrack · P3 · Procmon

Professional Experience

Engineering Associate

Goldman Sachs, Bangalore · Jan 2025 - Present

TINA — Travel Insights & Navigation Agent

Deflected 80% of travel & expense support queries from human agents by building TINA, an internal AI agent that resolves policy questions (missed flights, reimbursements, approvals) end-to-end.
Cut average policy-answer latency to under 20 seconds across 40 policy documents by building a RAG pipeline that ingests document embeddings into AWS SageMaker and orchestrates multi-step retrieval with a LangGraph agent.

Travel & Expense Platform

Enabled streaming export of 5M+ live transaction, report, and ledger records per request by engineering a server-side API that sustains ~50K records/sec within a 10-minute window, eliminating the timeouts and OOM failures from the prior batch approach.
Eliminated daily manual reconciliation by the operations team across 80 global markets by building an auto/manual voucher creation flow for vendor (Amex) payments with auto-generation of ledgers, statements, and voucher reconciliation.
Met Goldman Sachs' Tech Raise Bar quality standard for the greenfield Travel & Expense platform by authoring an end-to-end integration test suite covering AWS Step Functions, ECS tasks and services, S3, and Aurora.

AI-Driven Invoice Lifecycle Management (POC)

Eliminated manual data entry on every invoice submission by building an OCR layer that pre-fills 100% of form fields from invoice attachments for downstream human verification.
Built an intelligent reviewer-assignment engine that routes invoices based on reviewer expertise, calendar availability, and invoice criticality, with cycle-time metrics fed back into the assignment model for continuous improvement.
Prevented duplicate payments across $2M+ in average daily invoice volume by adding a duplicate-detection layer over historical submissions.

Slate — Inter-Affiliate Outsourcing Agreements

Drove data-backed feature prioritization for the next platform build by integrating GS Analytics into the Slate frontend, capturing feature-level telemetry from 800 users across 10 workflows in the agreement-creation pipeline.

High-Throughput Spark Pipeline on AWS Glue (FFIEC 009)

Engineered a high-performance Apache Spark 3 pipeline on AWS Glue that ingests and processes data from 12 upstream sources under strict latency and correctness budgets.
Reduced new-pipeline build time across the team by ~75% by designing shared libraries that standardize data processing and validation.

Engineering Analyst

Goldman Sachs, Bangalore · Jun 2022 - Dec 2024

Large-Scale Data Pipeline Re-Architecture (TIC B)

Cut end-to-end pipeline runtime by 83% (12 hrs → 2 hrs) by re-architecting the workload on Apache Spark 3.
Designed a versioning system that captures the precise inputs and logic of every run, enabling exact reproduction of any historical calculation across 60 onboarded data products.
Replaced manual hand-offs with a fully automated end-to-end data pipeline and downstream generation flow.
Advanced downstream data availability by 9 days each cycle by removing legacy system dependencies and building a new data model.

Kafka-Driven NRT Pipelines (TIC SLT / SHCA / SHLA)

Cut data-quality issue resolution from 4–5 days of manual email follow-up to near-real-time by building a UI over Kafka-driven NRT pipelines that surfaces issues as they occur.
Advanced downstream data availability by 17 days each cycle by shifting end-of-cycle batch processing to daily incremental processing.

Technology Intern

American Express, Bangalore · Jan 2021 - Jun 2021

Natural Language Query Parser for ML Studio

Removed the learning curve for Logstash in Amex's ML Studio by building an end-to-end natural language query parser that translates plain-English questions into executable Logstash queries, an early NLP-to-DSL system.

Education

B.E. Computer Science

Punjab Engineering College, Chandigarh · Aug 2018 - Jun 2022

CGPA: 8.45/10

Class 12 (CBSE)

Springdale Public School, Kurukshetra · 2018

Score: 89.2%

Class 10 (CBSE)

Springdale Public School, Kurukshetra · 2016

CGPA: 10/10

Achievements & Leadership

Owned multiple regulatory reports end-to-end (including TIC B), with full delivery responsibility from raw data pipeline to final published report.
Drove student engagement at PECfest by organizing technical PEC-ACM events with 45+ participants across 15 teams.
Contributed as a volunteer researcher on the Indian-Origin Academicians Abroad project under DST.

Get In Touch

Email: atul.jawa857@gmail.com

Phone: +91 85720 21225

LinkedIn: linkedin.com/in/atul-jawa

GitHub: github.com/AtulKumar2009