Kamaldeep Singh
mailkamaldeep@gmail.com | +918105803335 | https://www.linkedin.com/in/mailkamaldeep | github: @kamaldeepebay, @kamaldsingh
Education
M.Sc. (Tech.) Information Systems CGPA: 8.01 / 10
BITS PILANI HYDERABAD CAMPUS, HYDERABAD 2009 – 2013
Technical Skills & Abilities
· Programming languages: Scala, Java, Shell, C, JS, R, Cypher
· Distributed Systems: Apache Kafka, Storm, Amazon Kinesis, Hadoop, MapReduce, HBase, Hive, Pig, Mahout, Redis, Spark, Ambari, Sqoop, Camus, ZooKeeper, Kylin, OLAP, DataFu Hourglass, Ansible, OpenStack, Nova, AWS, Akka Actors
· Data Science: Weka, Stanford CoreNLP, DL4J, Keras, MLlib, JavaBayes, Alchemy, Sentiment Analysis, Bayesian Networks, Context-Free Grammar, Spam Detection, AWS ML
· Search: Apache Lucene, Contextual Autocomplete, Word2Vec, GloVe, Web Ontologies, Search Impressions
· Data Stores: Graph (Neo4j), MySQL, HBase, Redis, MongoDB, Hive, Kylin
· Monitoring/Analytics: Mixpanel, Graphite, OpenTSDB, Riemann, HProf, New Relic, Nagios, Google Analytics, AdWords
· Schedulers / Authentication / APIs: DSLs, JSON AST, Parboiled, Job Schedulers, Quartz, Kerberos, Crypto, Blocking queues, Brokers, JSch, TestNG, JAX-RS, JAXB, Servlets, JDBC, Twitter4j, FB Graph Search, Siddhi CEP, JUnit, JMeter
· Open Source Contributions: Apache Lucene, Kylin, Eagle, Hadoop, Amazon KCL
Experience and Projects
Data Scientist, Practo Technologies, Bangalore Oct. 2015 – Present
· Building a Contextual Autocomplete Engine and Free Text Search Platform using Lucene/Stanford NLP
· Revamping the Practo Ranking Algorithm for scoring doctors and practices.
· Building Synonym Search by modelling external data sources like SNOMED as Graph Entities in Neo4j and TinkerPop.
· Designing a real-time Lucene index update system using Kinesis, Akka Actors and S3
· Improving Search and Autocomplete Performance (~100 ms and ~20 ms respectively)
Software Engineer 2, Hadoop Services, eBay Inc., Bangalore Dec. 2013 – Oct. 2015
· Realtime Event Processing Platform with Kafka, HBase, Storm for Hadoop Security
· HDFS Metadata Streaming Platform for Hive security with a Cache manager built in Redis
· HBase Diagnostic Tool
· Hadoop / Hive Job Performance Optimizations, HBase and Hadoop Capacity Management, Performance testing.
· Upgraded the world's largest Hadoop 2 cluster (per HWX), introducing and benchmarking its HA architecture, and built a test suite
· Open Source Contributions to Apache Kylin (Spark on Kylin, metadata backup) and Hadoop (Hadoop TSDB, decommissioning patch)
· Hadoop as a service – Ambari with OpenStack & Hadoop connector with OpenTSDB (open sourced)
· POCs around Incremental MR Processing, JVM Profiling and Alert Correlations in the monitoring framework at eBay
Associate Software Engineer, C.A. Technologies, Hyd. Jul 2013 – Nov 2013
· Spectrum Device Certification Automation – Spectrum enhancement for automatic alarm support for new devices, based on text classification of the MIB Description using a Bayesian classifier.
· LACP Device Support in Spectrum
Intern, Amazon Development Centre, Hyd. Jul 2012 – Dec 2012
· Automated Data Comparator engine – a scalable, automated tool for validating a redesigned service against its previous version by finding the diff between the backend databases using MapReduce and Hadoop.
· Structured Data Retrieval Service – RESTful service to automate the creation of JSON objects for unit testing.
Term projects (BITS Hyderabad)
· Scalable framework for detecting network threats based on machine learning
a) Designed a distributed packet sniffer on top of dumpcap and tshark and minimized the packet loss.
b) Used Hive for extracting features for the Machine Learning module on Mahout for classification of malicious packets.
· Optimizing search in Unstructured P2P Networks – Evaluated a parallel flooding search technique built on ideas from MapReduce and analyzed the results against existing P2P applications.
Other projects / Hackathons
· Built REST web services interacting with data from social networking sites (Facebook/Twitter), an IoT project on devices interacting with e-commerce sites, recommender systems, and analysis of large volumes of clickstream data.
· Android apps – “Pearl 2012” (http://bit.Ly/wo6oth), iHelp (Google hackathon hack 2012), Location-aware blood donation app (Opportunity Hack, eBay 2014), Housing hackathon (2015)
Publications / Patents
· Big Data Analytics Framework for peer-to-peer botnet detection using random forests
Journal of information science, volume 278, pages 1914 (10 September 2014)
http://www.sciencedirect.com/science/article/pii/S0020025514003570
· HDFS Metadata Stream (Pending Approval 2015.1552)
Awards / Achievements
· Rookie Star Award at GDI Global All Hands, San Jose, California, eBay Inc. (2014)
· Going Extra Mile Award, Best Poster Award, Best Team Award, Spot Award, eBay Inc. (2014–2015)
· BITS Merit Scholarship 2009–2013
· Winner, International Informatics Olympiad in 2009