No experience
Employment Type:
Full time
Job Category:
Software Engineer (Infra - Core Services)
(This job is no longer available)
Twitter | San Francisco, CA

Job Description

As a member of the Core Services team at TellApart, you will be responsible for building out the entity layer of our system, which is where all of our data pipelines are created. One of the main strengths of our business is our ability to track consumers across devices. The cross-device identity services we provide are used by every solution we offer today. Essentially, any data-driven capability that's common across all of our products is within our domain.

2+ years of industry experience using big data systems and/or using big data for data analysis (internships count toward this requirement)
Hands-on experience with distributed system concepts used in scaling big data technologies with exponential growth of data and speeding up queries
Understanding of the inner workings of technologies like Hadoop, MapReduce, HBase, Voldemort, MongoDB, Cloudera CDH, Spark, Parquet, Scalding, Kafka, ZooKeeper, Eureka, etc.
Ability to mentor others and thrive in a dynamic, fast-paced, collaborative, and high-growth start-up environment

Bonus Points:
Architecting and deploying asynchronous work queues, high-volume storage systems, and high-throughput systems excites you
Active contributor / committer to a well-known open source project and/or interest in becoming one in the future
Demonstrated ability to excel in whatever you pursue (whether it's work, school, competitions, open source contributions, personal projects, etc.; you've always stood out and succeeded)

Current Focus:
Play a key role in scaling our existing data pipeline to handle 10x the data to match our current growth trajectory
Extend the functionality of a real-time data pipeline that produces up-to-the-minute analyses of PB-scale datasets
Consider next-generation hardware choices and configurations for our Hadoop clusters to optimize performance and reliability
Build or choose open source tools to help other engineers access complex data more efficiently
Most importantly, you are excited by big data technologies and would like to make meaningful contributions toward the next advances in big data

Recently Completed Projects:
Built an identity management platform that associates a single user identity across devices and supports an anonymous profile service to derive attributes like gender and purchase preferences from user history
Built a data pipeline that ingests > 100M user activities per day and creates a cumulative user history for use in training models and building datasets for real-time usage
Built a pluggable monitoring platform that allows users and services to retrieve, compare, and alert on metrics recorded in OpenTSDB, CloudWatch, and other metric stores
Built a query system from Spark, Tachyon, and Parquet that caches terabytes of data in memory and increased our query speeds by 100x. We offered contributions back to all projects in the course of development
Built a data extraction framework to enable simple and consistent data extraction from complex structures across multiple MapReduce platforms such as Hadoop, Hive, and Spark

About Twitter

Twitter Inc. was founded in 2006 and has grown rapidly to more than 6 million users. The company has recently gotten plenty of Hollywood buzz as A-list actors sign up and post tweets to control their image, widen their appeal and communicate directly with fans.