Posts

Showing posts with the label Cloud

[Cloud Computing] Kafka

What is the paper trying to do? The paper introduces Kafka, a novel messaging system for log processing. Kafka combines the benefits of traditional log aggregators and messaging systems: it can process huge volumes of log data streams through a messaging-system interface, and it achieves higher throughput than other messaging systems because it is specialized for log processing applications. It also has built-in distributed support and can scale out.

What do you think is the contribution of the paper? I think the main contribution of Kafka is the set of design decisions its developers made. For example, to make the system efficient, Kafka has a very simple storage layout, where each partition of a topic corresponds to a logical log. Another interesting decision is making each consumer keep track of how much it has consumed, as opposed to traditional methods where the broker does this. This reduces a lot of complexity...
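
The consumer-tracked-offset decision can be illustrated with a minimal sketch in plain Python (this is not the Kafka API; `PartitionLog` and `Consumer` are hypothetical names for illustration):

```python
class PartitionLog:
    """Broker side: an append-only log that keeps no per-consumer state."""
    def __init__(self):
        self.messages = []

    def append(self, msg):
        self.messages.append(msg)

    def read(self, offset, max_count=10):
        # The broker simply serves whatever range the consumer asks for.
        return self.messages[offset:offset + max_count]


class Consumer:
    """Consumer side: owns its offset and advances it after each pull."""
    def __init__(self, log):
        self.log = log
        self.offset = 0  # position tracked by the consumer, not the broker

    def poll(self, max_count=10):
        batch = self.log.read(self.offset, max_count)
        self.offset += len(batch)  # "acknowledge" by advancing our own offset
        return batch


log = PartitionLog()
for i in range(5):
    log.append(f"event-{i}")

c = Consumer(log)
first = c.poll(max_count=3)   # ["event-0", "event-1", "event-2"]
second = c.poll(max_count=3)  # ["event-3", "event-4"]
```

Because the broker holds no acknowledgement state, a consumer can also rewind by simply resetting its own offset.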

[Cloud Computing] Supporting Security Sensitive Tenants in a Bare Metal Cloud Review

What is the paper trying to do? The paper, “Supporting Security Sensitive Tenants in a Bare Metal Cloud”, presents Bolted, an architecture for a bare metal cloud. Bolted is special in that it satisfies the needs of both security sensitive and insensitive tenants: security sensitive tenants can control their own security, while insensitive tenants can use the default security.

What do you think is the contribution of the paper? What are its major strengths?
- Allows security sensitive tenants to control their security (assuming physical security and availability are not an issue).
- Does not impose overhead on security insensitive tenants, and does not add extra cost to the flexibility and operational efficiency of the provider.
- Eliminates the need to trust the provider to scrub disks, via disk-less provisioning.
- “Remote attestation” allows tenants to inspect the source code used to generate the firmware a server runs on.

Performance evaluation: can rapidly set up secure servers with com...
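
The core idea behind attestation is comparing a measurement (cryptographic hash) of what a server actually runs against what the tenant expects. A minimal sketch in plain Python, with a made-up firmware blob (this is only the hash-comparison idea, not Bolted's actual attestation protocol):

```python
import hashlib

# Hypothetical firmware image bytes, for illustration only.
firmware = b"example-firmware-image-v1"

# The tenant builds the firmware from source it has inspected and
# records the expected measurement (a hash of the resulting image).
expected = hashlib.sha256(firmware).hexdigest()

def attest(reported_measurement, expected_measurement):
    """Accept the server only if its reported measurement matches."""
    return reported_measurement == expected_measurement

ok = attest(hashlib.sha256(firmware).hexdigest(), expected)        # True
bad = attest(hashlib.sha256(b"tampered-image").hexdigest(), expected)  # False
```

In a real system the measurement is taken and signed by trusted hardware (e.g. a TPM) rather than reported by the software being measured.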

[Cloud Computing] Dynamo (a highly available key-value storage system)

What is the paper trying to do? This paper “present[s] the design and implementation of Dynamo, a highly available key-value storage system” used in Amazon’s core services. By sacrificing consistency under certain failure scenarios, Dynamo achieves an “always-on” experience in terms of availability. This allows it to handle server failures, data center failures, and network partitions. Additionally, Dynamo is incrementally scalable and can scale up and down while it is running. Dynamo uses a combination of technologies, including extensive use of object versioning, application-assisted conflict resolution, and data partitioning and replication via consistent hashing. During updates, a quorum-like technique and a decentralized replica synchronization protocol are used to maintain consistency among replicas.

What do you think is the contribution of the paper? The main contribution of this paper is “the evaluation of how differ...
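
To make the partitioning/replication idea concrete, here is a minimal sketch of a consistent hash ring with virtual nodes, where a key's replicas are the next N distinct nodes clockwise on the ring (names like `Ring` and `preference_list` follow Dynamo's terminology but the code is an illustrative toy, not Amazon's implementation):

```python
import bisect
import hashlib

def h(key):
    """Hash a string to a point on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=8):
        # Each physical node gets several virtual positions on the ring,
        # which smooths out load and eases incremental scaling.
        self.ring = sorted((h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes))
        self.points = [p for p, _ in self.ring]

    def preference_list(self, key, n=3):
        """The first n distinct nodes clockwise from the key's position."""
        i = bisect.bisect(self.points, h(key)) % len(self.ring)
        nodes = []
        while len(nodes) < n:
            node = self.ring[i % len(self.ring)][1]
            if node not in nodes:
                nodes.append(node)
            i += 1
        return nodes

r = Ring(["A", "B", "C", "D"])
replicas = r.preference_list("user:42")  # 3 distinct nodes for this key
```

Adding or removing one node only remaps the keys adjacent to its virtual positions, which is what makes the scheme incrementally scalable.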

[Cloud Computing] Spark: Cluster Computing with Working Sets

What is the paper trying to do? What do you think is the contribution of the paper? The paper, “Spark: Cluster Computing with Working Sets”, explains Spark, a new framework that retains scalability and fault tolerance similar to MapReduce while also supporting applications that reuse a working set of data across multiple parallel operations. MapReduce handles this type of application poorly because it only works well with acyclic data flow graphs; Spark handles it well. Spark achieves this through an abstraction called “resilient distributed datasets (RDDs)”: a “read-only collection of objects partitioned across a set of machines that can be rebuilt if a partition is lost”. Although Spark is still a prototype, the authors demonstrate that Spark can outperform “Hadoop by 10x in iterative machine learning workloads and can be used interactively to scan a 39GB dataset with sub-second latency”.

What are its major strengths? ...
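
Why reusing a working set matters can be shown with a toy stand-in for an RDD in plain Python (the `Dataset` class and its `builds` counter are illustrative inventions, not the Spark API): a dataset is defined by a recipe that can rebuild it, and caching keeps it in memory across iterations instead of recomputing it each time.

```python
class Dataset:
    """Toy stand-in for an RDD: a (re)build recipe plus an optional cache."""
    def __init__(self, compute):
        self.compute = compute   # lineage: how to (re)build this dataset
        self.cached = False
        self._data = None
        self.builds = 0          # how many times the recipe actually ran

    def cache(self):
        self.cached = True
        return self

    def collect(self):
        if self.cached and self._data is not None:
            return self._data    # served from memory, no recompute
        self.builds += 1
        data = self.compute()
        if self.cached:
            self._data = data
        return data


points = Dataset(lambda: [float(x) for x in range(1000)]).cache()

# Iterative workload: every iteration re-reads the same working set.
w = 0.0
for _ in range(10):
    data = points.collect()
    w += sum(data) / len(data)

points.builds  # 1 with cache(); it would be 10 without
```

In MapReduce-style acyclic data flow, each iteration would re-read the input from stable storage, which is what the 10x iterative-workload speedup reflects.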

[Cloud] OpenShift vs Kubernetes

OpenShift vs Kubernetes

OpenShift is based on Kubernetes and Docker. In other words, OpenShift is a modded version of Kubernetes. One example is the namespace component of Kubernetes: OpenShift replaces it, along with some other original Kubernetes components, with its own. A more extensive list of differences: https://www.whizlabs.com/blog/wp-content/uploads/2019/08/openshift-vs-kubernetes-table.png

[Cloud] OpenShift vs OpenStack

1. OpenShift vs OpenStack

OpenStack turns servers into a cloud. It can be used to automate resource allocation so customers can provision virtual resources. OpenShift is a container-centric model that leverages the core concepts of Kubernetes and packages them in a neat way for developers to deploy applications on the cloud.

1.1. Concerning Containers

OpenStack typically uses hypervisors like KVM, Xen, or VMware to spin up virtual machines. On the other hand, OpenShift can run on bare metal or on virtual machines, but it always uses containers on top of them. The containerization technology it uses is almost exclusively Docker. (Note: OpenStack does offer containerization support as well, but it is meant to be used more or less like a VPS and is optional.)

1.2. Distributed System

OpenStack is not exclusively a distributed system. It can take control of an entire data center, but that is nowhere near as global as a Kubernetes cluster. You would need a lot of e...

[Cloud] OpenStack, Magnum, OpenShift

1. OpenStack

OpenStack is an open-source cloud operating system that turns your servers into cloud environments. In other words, it provides an open alternative to the top cloud providers. It is IaaS, and it can be used to automate resource allocation so customers can provision virtual resources such as VPSs, block storage, and object storage, among other things.

2. Magnum

Magnum is an OpenStack API service that makes container orchestration engines, such as Docker Swarm, Kubernetes, and Mesos, available as first-class resources in OpenStack. Magnum uses Heat to orchestrate an OS image, which contains Docker and Kubernetes, and runs that image in either virtual machines or on bare metal in a cluster configuration.

3. OpenShift

OpenShift is a platform as a service (PaaS) that leverages the core concepts of Kubernetes and packages them in a neat way for developers to deploy applications on the cloud. In short, it’s a modded Kubernetes, and it accepts kubectl commands. ...

[Cloud] Amazon EKS Overview

1. Amazon EKS

1.1. Overview

Amazon EKS is a managed service that makes it easy to run Kubernetes on AWS. The idea is that most applications will run on EKS with minimal mods, if any.

Through EKS, organizations can run Kubernetes without cumbersome steps, such as:
- Creating the Kubernetes master cluster
- Configuring service discovery and Kubernetes primitives
- Porting and creating database instances
- Setting up load balancing (e.g. with HAProxy)
- Security
- Networking
- Hosting control planes across different availability zones to prevent a single point of failure (highly available)
- Managing the control plane, so users do not need to worry about components like etcd, kube-controller-manager, kube-apiserver, cloud-controller-manager, and kube-scheduler

Basically, EKS = Kubernetes-as-a-service.

1.2. Running Kubernetes without EKS: Manual deployment on EC2

IT teams can run a self-hosted Kubernetes environment on an EC2 instance. Deploy wi...

[Cloud] Kubernetes Overview

1. Kubernetes

1.1. Overview

Kubernetes is an open-source system that allows organizations to deploy and manage containerized applications, such as platforms as a service (PaaS), batch processing workers, and microservices, in the cloud at scale. Through an abstraction layer created on top of a group of hosts, development teams can let Kubernetes manage a host of functions, including load balancing, monitoring and limiting resource consumption by team or application, leveraging additional resources from new hosts added to a cluster, and other workflows.

1.2. Kubernetes Architecture (Master, Worker)

The Kubernetes master is responsible for maintaining the desired state for your cluster. The master can also be replicated for availability and redundancy. When you interact with Kubernetes, e.g. via the kubectl command-line interface, you’re communicating with the master. The worker nodes in a cluster are the machines (VMs, ph...
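
"Maintaining the desired state" boils down to a control loop: compare what you want with what you have and emit corrective actions. A minimal sketch in plain Python (the `reconcile` function and the action tuples are illustrative, not actual Kubernetes controller code):

```python
def reconcile(desired, actual):
    """One pass of a toy control loop: return the actions that would move
    the cluster's actual replica counts toward the desired state."""
    actions = []
    for name, want in desired.items():
        have = actual.get(name, 0)
        if have < want:
            actions.append(("scale-up", name, want - have))
        elif have > want:
            actions.append(("scale-down", name, have - want))
    return actions


desired = {"web": 3, "worker": 2}   # what you declared to the master
actual = {"web": 1, "worker": 4}    # what the workers are currently running

plan = reconcile(desired, actual)
# [("scale-up", "web", 2), ("scale-down", "worker", 2)]
```

Real controllers run such loops continuously against the API server, which is why a crashed pod is replaced without any operator action.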