Kamaldeep Singh


Data Scientist



mailkamaldeep@gmail.com | +918105803335 | https://www.linkedin.com/in/mailkamaldeep | github: @kamaldeepebay, @kamaldsingh


M.Sc. (Tech.) Information Systems, CGPA: 8.01 / 10


Technical Skills & Abilities

· Programming languages: Scala, Java, Shell, C, JS, R, Cypher

· Distributed Systems: Apache Kafka, Storm, Amazon Kinesis, Hadoop, MapReduce, HBase, Hive, Pig, Mahout, Redis, Spark, Ambari, Sqoop, Camus, ZooKeeper, Kylin, OLAP, DataFu Hourglass, Ansible, OpenStack, Nova, AWS, Akka Actors

· Data Science: Weka, Stanford CoreNLP, DL4J, Keras, MLlib, JavaBayes, Alchemy, Sentiment Analysis, Bayesian Networks, Context-Free Grammar, Spam Detection, AWS ML

· Search: Apache Lucene, Contextual Autocomplete, Word2Vec, GloVe, Web Ontologies, Search Impressions

· Data Stores: Graph (Neo4j), MySQL, HBase, Redis, MongoDB, Hive, Kylin

· Monitoring/Analytics: Mixpanel, Graphite, OpenTSDB, Riemann, HProf, New Relic, Nagios, Google Analytics, AdWords

· Schedulers / Authentication / APIs: DSLs, JSON AST, Parboiled, Job Schedulers, Quartz, Kerberos, Crypto, Blocking queues, Brokers, JSch, TestNG, JAX-RS, JAXB, Servlets, JDBC, Twitter4J, FB Graph Search, Siddhi CEP, JUnit, JMeter

· Open Source Contributions: Apache Lucene, Kylin, Eagle, Hadoop, Amazon KCL


Experience and Projects

Data Scientist, Practo Technologies, Bangalore, Oct. 2015 – Present

· Building a Contextual Autocomplete Engine and Free Text Search Platform using Lucene/Stanford NLP

· Revamping the Practo Ranking Algorithm for scoring doctors and practices.

· Building Synonym Search by modelling external data sources such as SNOMED as graph entities in Neo4j and TinkerPop.

· Designing a real-time Lucene index update system using Kinesis, Akka Actors and S3

· Improving Search and Autocomplete performance (~100 ms and ~20 ms respectively)

Software Engineer 2, Hadoop Services, eBay Inc., Bangalore, Dec. 2013 – Oct. 2015

· Real-time Event Processing Platform with Kafka, HBase and Storm for Hadoop Security

· HDFS Metadata Streaming Platform for Hive security with a Cache manager built in Redis

· HBase Diagnostic Tool

· Hadoop / Hive job performance optimizations, HBase and Hadoop capacity management, performance testing.

· Upgraded the world's largest Hadoop 2 cluster (per HWX), introducing and benchmarking its HA architecture, and built a test suite

· Open Source Contributions to Apache Kylin (Spark on Kylin, metadata backup) and Hadoop (Hadoop TSDB, decommissioning patch)

· Hadoop as a service – Ambari with OpenStack, and a Hadoop connector for OpenTSDB (open sourced)

· POCs around incremental MR processing, JVM profiling and alert correlation in the monitoring framework at eBay

Associate Software Engineer, C.A. Technologies, Hyderabad, Jul. 2013 – Nov. 2013

· Spectrum Device Certification Automation – Spectrum enhancement for automatic alarm support for new devices, based on text classification of the MIB description using a Bayesian classifier.

· LACP Device Support in Spectrum

Intern, Amazon Development Centre, Hyderabad, Jul. 2012 – Dec. 2012

· Automated Data Comparator engine – a scalable, automated tool for validating a redesigned service against its previous version by computing the diff between the backend databases using MapReduce and Hadoop.

· Structured Data Retrieval Service – RESTful service to automate the creation of JSON objects for unit testing.

Term projects (BITS Hyderabad)

· Scalable framework for detecting network threats based on machine learning

· a) Designed a distributed packet sniffer on top of dumpcap and tshark and minimized packet loss. b) Used Hive to extract features for the machine learning module, built on Mahout, that classifies malicious packets.

· Optimizing search in Unstructured P2P Networks – Evaluated a parallel flooding search technique built on ideas from MapReduce and analyzed results against existing P2P applications.

Other projects / Hackathons

· Built REST web services interacting with data from social networking sites (Facebook/Twitter), an IoT project on devices interacting with e-commerce sites, recommender systems, and analysis of large volumes of clickstream data.

· Android apps – “Pearl 2012” (http://bit.Ly/wo6oth), iHelp (Google hackathon hack 2012), location-aware blood donation app (Opportunity Hack, eBay 2014), Housing hackathon (2015)

Publications / Patents

· Big Data Analytics Framework for peer-to-peer botnet detection using random forests, Journal of Information Science, volume 278, pages 1914 (10 September 2014)


· HDFS Metadata Stream (Pending Approval 2015.1552)


Awards / Achievements

· Rookie Star Award at GDI Global All Hands, San Jose, California, eBay Inc. (2014)

· Going Extra Mile Award, Best Poster Award, Best Team Award, Spot Award, eBay Inc. (2014–2015)

· BITS Merit Scholarship 2009–2013

· Winner, International Informatics Olympiad in 2009
