Sandeep Joshi
Data science/Engineering at Kognitos
Bio
Experience
-
Kognitos
-
United States
-
Software Development
-
1 - 100 Employees
-
Data science/Engineering
-
Nov 2021 - Present
Semantic parsing / Natural language understanding
-
-
-
-
Independent Consultant
-
Nov 2020 - Dec 2021
Social media startup: Built a social graph analyser (patent pending). Did synthetic data generation. Developed a feature for trending topics. Did NLP-based analysis (spaCy, BERT) over a large corpus of financial documents. Ed-tech startup in Southeast Asia: Did software design consulting (design reviews, interviews, code analysis). Major hyper-converged cloud provider: Built a cluster sizing tool. Worked on porting the hypervisor.
-
-
-
Amazon
-
United States
-
Software Development
-
700 & Above Employees
-
Senior SDE
-
Nov 2018 - Oct 2020
Amazon Prime Video. Built various features to assess video and caption quality. Together with Applied Scientists, built a voice activity detector (VAD ML model) and a classifier that detects subtitles that are out of sync with the audio. Developed an async processing pipeline using SageMaker and Lambda. Built an ML model to detect the language spoken in the video. Experimented with TTS (text-to-speech) models to generate controllable speech. Patent: https://patents.google.com/patent/US10945041B1/ Gave a talk on "Operational Pitfalls in SOA (Service Oriented Architecture)" at Amazon's internal conference DevCon APAC 2019. Gave a talk on "Using the back of the envelope (Queueing theory)" at Amazon's internal conference DevCon APAC 2020.
-
-
Senior SDE
-
Jan 2018 - Oct 2018
Amazon Appstore. Analyzed Android apps for policy violations, malware, fraud, etc. Apps submitted by developers go through a series of automated and manual tests to detect functional as well as policy violations (vendor fraud, IP infringement). Worked with the Content Operations team to detect and mitigate policy violations and security escalations. Rewrote a slow keyword search. Evaluated integration with Knowledge graph and external app engines for improved app metadata scanning.
-
-
-
DC Engines Ltd
-
India
-
Software Development
-
Senior Software Engineer
-
Nov 2015 - Jan 2018
1. Built a prototype to replicate RocksDB to a secondary server, to work as part of the mongo-rocks extension.
2. Built a fast read bypass using Accelio to read from the Alba object store (https://github.com/sanjosh/gobjfs).
3. Built an FSAL for NFS-Ganesha. Designed and implemented NFS crash recovery in the FSAL, enabling seamless HA and vMotion in VMware.
4. Experimented with query pushdown from Apache Spark to an in-development database product. Wrote an expression evaluator that evaluates Spark predicates. Experimented with LLVM to speed up expression evaluation using JIT.
5. Evaluated Cap'n Proto, Scylla Seastar, Tarantool, and Facebook Folly for integration with the in-development database product.
6. Worked on bitmap indexing and text indexing.
-
-
-
Facebook
-
Software Development
-
700 & Above Employees
-
Software Engineer
-
Feb 2015 - Oct 2015
SIEM system built using PHP/Hack, C++, Python, Thrift, JSON. 1. Migration of log storage to Elasticsearch. 2. Hadoop-Hive job to copy tables from Hive to Elasticsearch.
-
-
-
Stealth Mode Startup
-
United States
-
Computer Games
-
100 - 200 Employees
-
Sr Principal SDE
-
Mar 2013 - Jan 2015
Software-defined storage, storage virtualization cluster.
1. Extended wrapfs to support NFS operations.
2. Built a module to do zero-copy transfer between a kernel-mode driver and userspace using netlink and a custom mmap kernel driver.
3. Built a fast snapshot module using asynchronous IO and a merge sort using STL priority queues to do simultaneous high-bandwidth IO to several distributed disk drives.
4. Developed a plugin for NFS-Ganesha.
5. Wrote a zeromq-based client-server framework for cluster communication.
6. Integrated support for erasure codes (Jerasure).
7. Wrote a QEMU block driver to optimize the storage path for virtual machines running on KVM.
8. Created a prototype to store cluster metadata in ledisdb (redis in Go) using the Cap'n Proto RPC framework.
9. Added TIPC support to golang's networking package, ledisdb and hiredis (the Redis C client).
-
-
-
Lucent Technologies
-
United States
-
Telecommunications
-
200 - 300 Employees
-
Member of Technical Staff
-
2001 - 2002
iSCSI is an IETF-approved storage standard for transmitting SCSI commands over an IP network. Single-handedly developed the iSCSI protocol stack (target and initiator) on a FreeBSD system, first in user space and later in kernel space. Participated in the IETF forum deliberations [1]. Participated in the first iSCSI interop held at the Univ. of New Hampshire in July 2001. The stack I developed was among the first to work flawlessly over multiple TCP connections [2].
[1] http://www.pdl.cmu.edu/mailinglists/ips/mail/author18.html
[2] http://www.pdl.cmu.edu/mailinglists/ips/mail/msg05418.html
-
-
Member Of Technical Staff
-
1996 - 2001
Datablitz is a main memory database which provides concurrency, transaction management, recovery, replication and archiving. It was derived from Dali, a prototype built by the Database Research Department in Bell Labs. Datablitz was demonstrated at VLDB 98 and won the Bell Labs President’s award for 1999. This database is still in use within Alcatel-Lucent switches and is being maintained by Mascon Global Ltd. I was one of the core developers of this database, primarily responsible for the memory allocation, recovery, transaction logger and multi-site replication features. Worked on making the Datablitz API library thread-safe, and on porting it to Windows NT and to 64-bit Solaris 7. Wrote a Java wrapper library (JNI) on top of the Datablitz C++ API.
-
-
Education
-
Pune Institute of Computer Technology
Bachelor's degree, Computer Engineering