Aleksander Luiz Lada Arruda, Developer in São Paulo, Brazil

Aleksander Luiz Lada Arruda

Verified Expert in Engineering

Site Reliability Engineering (SRE) Developer

Location
São Paulo, Brazil
Toptal Member Since
February 8, 2019

Aleksander is a DevOps and site reliability engineer with extensive experience in cloud-native technologies. He holds a bachelor's degree in computer science, has deployed and managed production-grade clusters such as Kubernetes, Kafka, and Elasticsearch, and has worked on microservice architecture and everything that comes with it, including container orchestration, service discovery, message queues, monitoring, logging, and tracing.

Portfolio

HMBradley
Site Reliability Engineering (SRE), PostgreSQL, Kubernetes Operations (kOps)...
Toptal
Amazon Web Services (AWS), Google Cloud Platform (GCP)...
Pypestream
Site Reliability Engineering (SRE), Rancher, LDAP, OpenStack, Harbor...

Experience

Availability

Part-time

Preferred Environment

Linux, MacOS, iTerm2, Bash, Shell Scripting, Git

The most amazing...

...thing I’ve written was a multi-cluster Kafka setup providing very high availability to receive incoming app data from a company with over a billion downloads.

Work Experience

Site Reliability Engineering Manager

2020 - PRESENT
HMBradley
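  • Built a future-proof, cloud-native infrastructure from scratch, managing multiple Kubernetes clusters across different environments and running self-contained, replaceable components maintained with infrastructure as code.
  • Implemented a scalable and highly available stack for centralizing logs and metrics with Loki and Cortex, sending automated alerts to different channels according to their severity level.
  • Built the company's data infrastructure running on Kubernetes, managing clusters such as Kafka, Elasticsearch, and Cassandra; created components to extract data from different sources into Redshift and Snowflake.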
  • Introduced security best practices such as AWS CIS Benchmarks as well as intrusion detection and prevention techniques, targeting SOC 2 compliance; implemented granular access control across the systems, including AWS and Kubernetes.
  • Automated the build and deployment of infrastructure components and applications across all environments, combining continuous delivery with infrastructure as code.
  • Developed a small piece of software that extracts detailed hourly billing data from AWS, tagging and shipping it to Prometheus and Cortex, thus allowing real-time visualization of granular infrastructure costs.
Technologies: Site Reliability Engineering (SRE), PostgreSQL, Kubernetes Operations (kOps), Kubernetes, Redis, Cassandra, Vault, Apache ZooKeeper, Falcon, Prometheus, Grafana, Elasticsearch, Apache Kafka, Terraform, Amazon Web Services (AWS), Redshift, Snowflake, AWS Database Migration Service, Data Engineering, Data Warehousing, Shell Scripting, CI/CD Pipelines, SQL, Amazon EC2, Amazon S3 (AWS S3), Amazon Virtual Private Cloud (VPC), AWS IAM, Amazon CloudWatch, AWS Certified SysOps Administrator, GitHub, Git, Ansible, Infrastructure as Code (IaC), Cloud Infrastructure, SecOps, Amazon CloudFront CDN, Amazon RDS, Amazon DynamoDB, Cloudflare, Continuous Delivery (CD), Flask, Containerization, Architecture, Bash, Containers, Load Balancers, VPN, DevOps, Technical Leadership, AWS Cloud Architecture

DevOps Technical Screener

2019 - 2021
Toptal
  • Served on Toptal's screening team, handling all types of applicants for the DevOps vertical.
  • Vetted candidates, approving only the top 3% of applicants.
  • Worked on polishing the interview process, proposing new technical questions and tasks, as well as improving the existing ones.
  • Advised applicants on how to improve their skills as DevOps engineers, what technologies they should seek to learn, and what certifications they should pursue according to their goals.
  • Assisted the approved candidates by building their profiles in a way that would improve their chances of getting hired.
Technologies: Amazon Web Services (AWS), Google Cloud Platform (GCP), Infrastructure Architecture, DevOps, Site Reliability Engineering (SRE), Shell Scripting, CI/CD Pipelines, SQL, Amazon EC2, Amazon S3 (AWS S3), Amazon Virtual Private Cloud (VPC), AWS IAM, Amazon CloudWatch, AWS Certified SysOps Administrator, GitHub, Git, Infrastructure as Code (IaC), Cloud Infrastructure, SecOps, Amazon CloudFront CDN, Amazon RDS, Amazon DynamoDB

Senior Site Reliability Engineer

2019 - 2020
Pypestream
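  • Deployed and upgraded well-known production clusters and databases, such as Kubernetes, Elasticsearch, PostgreSQL, and Ceph.
  • Fine-tuned our Elasticsearch cluster, which ingested around 300GB of data per day, by applying best practices that take into account the underlying implementation of Apache Lucene, which improved its performance and allowed us to downsize it.
  • Owned the implementation of security components and best practices such as AWS CIS Benchmarks and intrusion detection and prevention tooling, which earned the company a SOC 2 certification.
  • Provided 24/7 on-call support, handling a variety of incidents on the production infrastructure.
  • Created several Jenkins pipelines with Groovy and Bash for deploying both infrastructure components and applications and worked with Jenkins Configuration as Code (JCasC), ensuring that the entire continuous delivery stack was easy to reproduce.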
  • Containerized several applications, creating CI/CD pipelines not only for building and deploying but also for performing code checks and security scans.
  • Implemented different solutions for backing up different systems that enabled the development of an expeditious disaster recovery plan.
Technologies: Site Reliability Engineering (SRE), Rancher, LDAP, OpenStack, Harbor, GitLab CI/CD, Grafana, Prometheus, Ansible, Jenkins, Ceph, Elasticsearch, Kubernetes, Security, Docker, Kubernetes Operations (kOps), Amazon Web Services (AWS), Shell Scripting, CI/CD Pipelines, SQL, Amazon EC2, Amazon S3 (AWS S3), Amazon Virtual Private Cloud (VPC), AWS IAM, Amazon CloudWatch, AWS Certified SysOps Administrator, GitHub, Git, Infrastructure as Code (IaC), HIPAA Compliance, Cloud Infrastructure, SecOps, Amazon CloudFront CDN, Amazon RDS, Amazon DynamoDB, Continuous Delivery (CD), Flask, Containerization, Architecture, Bash, Containers, Load Balancers, VPN, DevOps, AWS Cloud Architecture

DevOps Consultant

2018 - 2019
Audsat
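  • Set up three Kubernetes clusters for the development, staging, and production environments. All clusters were multi-AZ and had autoscaling. Monitoring was done with Datadog and PagerDuty.
  • Implemented GoCD with custom elastic agents for deploying applications to all Kubernetes clusters. Containerized the applications and deployed them as Helm charts.
  • Implemented automatic provisioning and renewal of Let's Encrypt TLS certificates using cert-manager.
  • Deployed a Fluentd DaemonSet to aggregate the logs of all applications into Elasticsearch. Also deployed Elasticsearch Curator to clean up old logs.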
  • Set up the automatic monitoring of all Java applications deployed in the cluster by running them with sidecar containers exposing metrics retrieved from the application's JMX interface.
  • Spearheaded the project Navalis, a web application designed to let developers deploy, monitor, and scale applications easily across multiple Kubernetes clusters. It was developed with Golang and Vue.js.
  • Scaled Kubernetes to 300 nodes in order to process a massive amount of data in a few hours, taking into consideration the network and I/O limitations of both the local instances and the data source.
Technologies: Java, PagerDuty, Datadog, GoCD, Fluentd, Elasticsearch, Kubernetes, Amazon Web Services (AWS), Shell Scripting, CI/CD Pipelines, SQL, Amazon EC2, Amazon S3 (AWS S3), Amazon Virtual Private Cloud (VPC), AWS IAM, Amazon CloudWatch, AWS Certified SysOps Administrator, GitHub, Git, Infrastructure as Code (IaC), Cloud Infrastructure, SecOps, Amazon CloudFront CDN, Amazon RDS, Amazon DynamoDB, Grafana, Continuous Delivery (CD), Containerization, Architecture, Bash, Containers, Load Balancers, VPN, DevOps, AWS Cloud Architecture

DevOps Engineer

2017 - 2018
Wildlife Studios
  • Partnered with the data engineering team to develop a new Kafka cluster for the company, inspired by Netflix’s way of orchestrating and monitoring Kafka. It consisted of several interconnected Kafka clusters to prevent data loss.
  • Developed a system for monitoring backups consisting of a Python and Flask server and a client written in Go. The system would centralize the status of the backups across the whole infrastructure and notify our team whenever a backup was missing.
  • Solved an issue in which a large Elasticsearch cluster crashed every morning. The issue was caused by misconfigured Logstash instances that flooded the cluster with requests for creating new shards.
  • Developed a tool with Go for cross-validating the Kubernetes network which would establish a route between every machine in Kubernetes generating a complete graph or pointing out issues in the network.
  • Created redundant VPNs between availability zones (US and AP) in AWS using VyOS.
  • Helped instrument our most important servers with Jaeger APM.
  • Deployed a Kubernetes cluster with autoscaling as a proof-of-concept to test how well a Kafka cluster would scale within Kubernetes.
  • Solved an issue in which our Kafka cluster would crash because of unexpected behavior of a tool someone had installed to monitor ZooKeeper, Netflix's Exhibitor.
  • Deployed multiple MongoDB clusters for collecting data during high-traffic events.
  • Deployed a Kubernetes cluster the hard way, without any tools such as Kubernetes Operations (kOps) or kubeadm, to learn deeper concepts of its architecture.
Technologies: Hyperledger Burrow, Apache ZooKeeper, Apache Kafka, Datadog, Elasticsearch, Jenkins, Helm, Kubernetes, VyOS, MongoDB, PagerDuty, Amazon Web Services (AWS), Go, Python, Docker, Terraform, Chef, Shell Scripting, CI/CD Pipelines, SQL, Amazon EC2, Amazon S3 (AWS S3), Amazon Virtual Private Cloud (VPC), AWS IAM, Amazon CloudWatch, AWS Certified SysOps Administrator, GitHub, Git, Infrastructure as Code (IaC), Cloud Infrastructure, SecOps, Amazon CloudFront CDN, Amazon RDS, Amazon DynamoDB, Continuous Delivery (CD), Containerization, Architecture, Bash, Containers, Load Balancers, VPN, DevOps, AWS Cloud Architecture
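
As an illustration of the network cross-validation idea mentioned in the bullets above (checking a route between every pair of machines so that a healthy network forms a complete graph), here is a minimal sketch. The original tool was written in Go; this Python version is only an assumption about the general approach. The names `find_broken_routes` and `can_reach` are hypothetical, and `can_reach` stands in for whatever real connectivity probe would be used.

```python
from itertools import permutations
from typing import Callable, List, Sequence, Tuple


def find_broken_routes(
    nodes: Sequence[str],
    can_reach: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Probe every ordered pair of nodes and return the pairs that fail.

    An empty result means the connectivity graph is complete: every
    node can reach every other node.
    """
    return [
        (src, dst)
        for src, dst in permutations(nodes, 2)
        if not can_reach(src, dst)
    ]


# Hypothetical usage with a fake probe: node-a cannot reach node-c.
blocked = {("node-a", "node-c")}
broken = find_broken_routes(
    ["node-a", "node-b", "node-c"],
    lambda src, dst: (src, dst) not in blocked,
)
print(broken or "network graph is complete")
```

Probing ordered pairs rather than unordered ones lets the check surface asymmetric failures, where traffic flows in one direction but not the other.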

DevOps Engineer

2017 - 2017
MAV Technology
  • Centralized all incoming requests in an HAProxy cluster, since the infrastructure didn’t have a proper entry point (i.e., the DNS pointed to many different entry points), thus avoiding single points of failure.
  • Fixed multiple bugs in Node.js servers, among them a critical one which forced us to restart production containers from time to time because of a progressive decay of performance.
  • Solved multiple bugs in Objective-C servers by creating a system for debugging multiple servers in real time, attaching multiple GDBs to multiple processes distributed amongst nodes and capturing eventual stack traces—allowing us to quickly fix bugs that would only occur in the production environment.
  • Developed a Node.js server that would hold thousands of connections open as a fronting proxy for a legacy server that was not able to receive too many simultaneous connections.
  • Stopped an ongoing brute-force password attack, which I was able to detect because of a significant increase in the number of failed authentications in Datadog. I stopped the attack by blocking the attacker's IP addresses in HAProxy.
  • Solved a critical issue that could cause Ceph to crash. We tracked the problem down to a bug related to the specific software version we were using.
Technologies: Ceph, MongoDB, MySQL, Datadog, Consul, HAProxy, Node.js, Shell Scripting, CI/CD Pipelines, SQL, Nagios, GitHub, Git, SecOps, Containerization, Architecture, Bash, Containers, Load Balancers, DevOps

Software Engineering Intern

2015 - 2016
Synopsys, Inc.
  • Developed a tool in Python for automatically generating C++ code that would bind hardware transactors written in C++ to TCL.
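  • Built a tool that extracted statistics from a hardware emulation platform and generated D3.js charts.
  • Fixed a major C++ bug caused by a race condition between GTK and a hardware processor.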
  • Worked for a month at Synopsys' headquarters in Mountain View where I learned a lot about electronic design automation.
Technologies: EDA, D3.js, Tcl, Python, C++, Verilog, GitHub, Git, Bash

Junior Back-end Engineer

2012 - 2014
MAV Technology
  • Developed a substantial part of the back end of a corporate email service; it was written in C++ with language bindings to Lua. I utilized MongoDB for storing the email metadata, GridFS for storing the email bodies, and MySQL for storing relational user data. Used REST interfaces throughout the overall architecture.
  • Built part of their front end with Java and Google Web Toolkit.
  • Constructed IMAP and POP3 proxies to route new users from other email service providers to their old servers while capturing their passwords and transparently migrating their accounts to our servers.
  • Developed HTTP and SMTP servers from scratch in C++.
  • Supported the development of the company’s ERP system, which was built with CakePHP and Bootstrap.
Technologies: Bootstrap, CakePHP, GWT, Java, MySQL, MongoDB, Lua, C++, SQL, Nagios, GitHub, Git, Bash, Load Balancers

Navalis

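Navalis is a platform that enables developers to deploy and visualize applications in Kubernetes with ease. It also checks the cluster for inconsistencies and continuously monitors its health. It consists of an API written in Go and a front end written in Vue.js.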

Flux Control Language Compiler

http://github.com/aleksanderllada/FCL-Compiler
FCL is a programming language I designed as my undergraduate final project. Its goal is to allow scientists who are not familiar with low-level programming languages to dynamically control a pipetting robot.

This project is the compiler I wrote in Java and ANTLR4 to generate p-code for FCL, based on the formal grammar I wrote for the language.

Flux Control Language Interpreter

http://github.com/aleksanderllada/FCL-Interpreter
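FCL is a programming language I designed as my undergraduate final project. Its goal is to allow scientists unfamiliar with low-level programming languages to dynamically control a pipetting robot.

This project is the interpreter I wrote for the language's p-code, which is generated by the FCL Compiler. It works like a stack machine, similar to the interpreters of Python and Lua.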
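
To illustrate what "works like a stack machine" means, here is a minimal sketch in Python. It is not FCL's actual p-code: the instruction names (`PUSH`, `ADD`, `SUB`, `MUL`, `PRINT`) and the `run` function are hypothetical, chosen only to show how operands are pushed onto a stack and consumed by later instructions.

```python
from typing import List, Sequence, Tuple


def run(program: Sequence[Tuple]) -> List[float]:
    """Execute a list of (opcode, *args) tuples on a value stack.

    Returns the values emitted by PRINT instructions, in order.
    """
    stack: List[float] = []
    output: List[float] = []
    for op, *args in program:
        if op == "PUSH":
            stack.append(args[0])
        elif op in ("ADD", "SUB", "MUL"):
            # Binary operators pop the right operand first, then the left.
            right = stack.pop()
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            else:
                stack.append(left * right)
        elif op == "PRINT":
            output.append(stack.pop())
        else:
            raise ValueError(f"unknown instruction: {op}")
    return output


# Hypothetical program computing (2 + 3) * 4.
program = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD",),
    ("PUSH", 4),
    ("MUL",),
    ("PRINT",),
]
print(run(program))  # prints [20]
```

Because every instruction takes its operands from the stack, the compiler can emit code for nested expressions with a simple post-order walk of the syntax tree, with no register allocation needed.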

Languages

Bash, Go, JavaScript, C++, Python, SQL, Java, Lua, Verilog, Tcl, Falcon, Java 8, Transaction Control Language (TCL), Snowflake

Tools

Jenkins, Terraform, Amazon Virtual Private Cloud (VPC), AWS IAM, GitHub, Git, Ansible, Vault, Chef, NGINX, Grafana, Amazon CloudWatch, Amazon CloudFront CDN, VPN, Helm, GitLab CI/CD, ANTLR 4, Kong, Fluentd, Apache ZooKeeper, MirrorMaker, Nagios

Paradigms

Continuous Integration (CI), Continuous Delivery (CD), Distributed Computing, DevOps, Scrum, Design Patterns, HIPAA Compliance

Platforms

Kubernetes, Linux, Apache Kafka, Amazon Web Services (AWS), Docker, Amazon EC2, PagerDuty, Google Cloud Platform (GCP), Heroku, Hyperledger Burrow, Harbor, OpenStack, Rancher, MacOS

Storage

Elasticsearch, Datadog, Amazon S3 (AWS S3), MongoDB, MySQL, PostgreSQL, Amazon DynamoDB, Cassandra, Redis, Ceph, Redshift

Other

Kubernetes Operations (kOps), Site Reliability Engineering (SRE), GoCD, Prometheus, AWS DevOps, Shell Scripting, CI/CD Pipelines, Infrastructure as Code (IaC), Cloud Infrastructure, SecOps, Amazon RDS, Containerization, Architecture, Containers, Load Balancers, AWS Cloud Architecture, Distributed Tracing, HAProxy, APM, AWS Certified SysOps Administrator, Cloudflare, Technical Leadership, EDA, LDAP, Infrastructure Architecture, Computer Science, Compilers, Programming Languages, iTerm2, Consul, VyOS, AWS Database Migration Service, Data Engineering, Data Warehousing

Frameworks

Qt 5, Flask, Express.js, GWT, CakePHP, Bootstrap, Spring

Libraries/APIs

Node.js, POCO C++, Vue, D3.js

Industry Expertise

Security

2011 - 2017

Bachelor of Science Degree in Computer Science

Federal University of Minas Gerais - Belo Horizonte, Minas Gerais, Brazil

FEBRUARY 2020 - FEBRUARY 2023

AWS Certified SysOps Administrator

Amazon Web Services